
PhD Defense by Samaneh Ebrahimi


 Dear faculty members and fellow students,


You are cordially invited to attend my upcoming thesis defense.

Title: Modeling High Dimensional Multi-Stream Data for Monitoring and Prediction

 

Advisor: Dr. Kamran Paynabar


Committee Members:
Dr. Jianjun Shi
Dr. Nagi Gebraeel
Dr. Chuck Zhang
Dr. Shawn Mankad (Johnson Graduate School of Management, Cornell University)


Date and Time: Tuesday, Aug 14th, 1:00 PM

 

Location: Groseclose 402


Abstract:

This dissertation addresses the monitoring and prediction of high-dimensional streaming data using novel statistical data mining methods. It focuses on three distinct and critical research problems. Chapter 1 concisely reviews the motivation and challenges behind each problem.

In Chapter 2, a new PCA-based approach for monitoring and diagnosing high-dimensional multi-stream data is proposed. Monitoring and diagnostics (M&D) are important components of Statistical Process Control (SPC); however, little work exists on an integrated M&D approach. The main challenge for M&D is handling the high-dimensional processes commonly found in manufacturing, computer networks, and the Internet industry, and we propose an integrated approach that addresses this challenge. For monitoring, the most commonly used methods in high dimensions are based on finding an underlying lower-dimensional representation. One common approach is Principal Component Analysis (PCA). For PCA-based monitoring, selecting the principal components (PCs) to include in the model is important. While most existing methods focus on the PCs with the highest variance, we argue that this is inappropriate for the purpose of monitoring. On the contrary, we show that adaptively chosen PCs are significantly better for process monitoring. Consequently, we develop a novel monitoring method based on this principle, named Adaptive PC Selection (APCS). More importantly, we integrate a novel diagnostic approach to enable streamlined SPC. The PC-based Signal Recovery (PCSR) diagnostic approach draws inspiration from Compressed Sensing and uses the Adaptive Lasso to identify the sparse change in the process. We theoretically motivate our approaches and evaluate the performance of our integrated M&D method through simulation and case studies.

In Chapter 3, a new methodology for dynamically monitoring sparse networks is proposed. For this, we focus on modeling the network connections among financial institutions. The interconnectedness of financial institutions can function as a mechanism for the propagation and amplification of shocks throughout the economy, thus contributing to financial crises. As such, network analysis has become a critical tool for assessing interconnectedness and systemic risk levels. We create a formal monitoring system to detect changes within a sequence of sparse networks constructed from an interbank lending market in the European Union. The approach combines a state space model with the Hurdle model to capture the temporal dynamics of the edge formation process, which is modeled as a function of node and edge attributes and estimated using an extended Kalman Filter. Elements from statistical process control, such as Exponentially Weighted Moving Average control charts, are used to monitor the network sequence in real time in order to distinguish gradual change resulting from typical edge dynamics from abrupt changes in trading patterns caused by fundamental changes in market conditions. We find that the proposed methodology would have raised alarms to regulators prior to several key events and announcements by the European Central Bank during the 2007-2009 financial crisis, demonstrating the promise of the approach as an early warning system.

In Chapter 4, a novel deep learning approach for predicting multimedia data labels is proposed. The method is an extension of the Classification Restricted Boltzmann Machine (ClassRBM). The Restricted Boltzmann Machine (RBM) is a probabilistic model of the distribution of a visible layer of features using one hidden layer. Deep Boltzmann Machines were developed as an extension of the RBM with multiple hidden layers. These methods have been successfully applied to unsupervised learning. A discriminative RBM for supervised learning, known as the Classification RBM (ClassRBM), was proposed in 2009. Due to estimation intractability, an effective deep extension of ClassRBM has not been used in the literature. In this chapter, we propose a new estimation approach for learning the parameters of a deep ClassRBM (ClassDBM). Moreover, we apply this approach to two benchmark datasets and to advertisement multimedia data for validation.

Status

  • Workflow Status: Published
  • Created By: Tatianna Richardson
  • Created: 08/08/2018
  • Modified By: Tatianna Richardson
  • Modified: 08/08/2018
