
PhD Defense by Chih-Li Sung


Title: Contributions to binary-output computer experiments and large-scale computer experiments

 

Advisors: Dr. C. F. Jeff Wu, Dr. Benjamin Haaland

 

Committee Members:

Dr. Cheng Zhu (Dept. of Biomedical Engineering, GT)

Dr. Roshan Vengazhiyil (ISyE)

Dr. Ying Hung (Dept. of Statistics and Biostatistics, Rutgers University)

 

Date and Time: Tuesday, May 1st, 1:00 PM

 

Location: Groseclose 402

 

Abstract:

Computer experiments play an increasingly important role in science and technology and have received enormous attention from industry and research institutes. One prominent example is the redesign of a rocket engine by the U.S. Air Force (Mak et al., 2018).

 

This dissertation makes contributions in two important aspects of computer experiments:

(i) binary-output computer experiments and (ii) large-scale computer experiments. For (i), the dissertation contains two chapters: a new emulation method is introduced in Chapter 1 and a novel calibration method in Chapter 2. For (ii), the dissertation also contains two chapters: new computationally efficient search-limiting techniques for local Gaussian process approximation are developed in Chapter 3, and a new model, called multi-resolution functional ANOVA, is proposed in Chapter 4.

 

In Chapter 1, we study the emulation problem for computer experiments whose response is binary. Such non-Gaussian observations are common in some computer experiments. Motivated by the analysis of a class of cell adhesion experiments, we introduce a generalized Gaussian process model for binary responses, which shares some common features with standard Gaussian process models. In addition, the proposed model incorporates a flexible mean function that can capture different types of time series structures. Asymptotic properties of the estimators are derived, and an optimal predictor and its predictive distribution are constructed. Their performance is examined via two simulation studies. The methodology is applied to study computer simulations for cell adhesion experiments. The fitted model reveals important biological information about repeated cell bindings, which is not directly observable in lab experiments.

 

In Chapter 2, we develop a calibration method for binary-output computer experiments. Calibration refers to the estimation of unknown parameters in computer experiments. Accurate estimation of these parameters is important because it provides a scientific understanding of the underlying system that is not available from physical experiments. Despite numerous studies on calibration, most of the results are limited to the analysis of continuous responses. Motivated by a study of cell adhesion experiments, we propose a new calibration method for binary responses. This method is shown to be semiparametrically efficient, and the estimated parameters are asymptotically consistent. Numerical examples demonstrate the finite-sample performance. The proposed method is applied to analyze a class of T cell adhesion experiments. The findings shed light on the settings of kinetic parameters in single molecular interactions, which are important in the study of the immune system.

 

In Chapter 3, we develop two computationally efficient search-limiting techniques for local Gaussian process approximation, which can be used in large-scale computer experiments. Gaussian process models are commonly used as emulators for computer experiments. However, fitting a Gaussian process emulator can be computationally prohibitive when the number of experimental samples is even moderately large. Local Gaussian process approximation (Gramacy and Apley, 2015) was proposed as an accurate and computationally feasible alternative. However, constructing local sub-designs specific to predictions at a particular location of interest remains a substantial computational bottleneck for the technique. In this chapter, two computationally efficient neighborhood search-limiting techniques are proposed: a maximum distance method and a feature approximation method. Two examples demonstrate that the proposed methods save substantial computation while retaining emulation accuracy.
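
For readers unfamiliar with the idea of a local sub-design, the following is a minimal Python sketch. It is not the dissertation's maximum distance or feature approximation method, nor the Gramacy and Apley (2015) algorithm. It assumes a much simpler stand-in: keep only design points within a hypothetical cutoff `max_dist` of the prediction location, take the nearest `n_local` of them, and fit a zero-mean Gaussian process with a squared-exponential kernel on that subset alone. All names and default values (`local_gp_predict`, `max_dist`, `n_local`, `lengthscale`, `nugget`) are illustrative assumptions.

```python
import math


def sq_exp_kernel(a, b, lengthscale):
    """Squared-exponential correlation between two points (tuples of floats)."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-d2 / (2.0 * lengthscale ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def local_gp_predict(X, y, x_new, n_local=10, max_dist=0.5,
                     lengthscale=0.1, nugget=1e-6):
    """Predict at x_new from a small local sub-design (illustrative only).

    Distances to all design points are still computed here; the cutoff
    only shrinks the candidate set from which the sub-design is drawn.
    Returns (predicted mean, number of design points used).
    """
    cand = [(math.dist(x, x_new), i) for i, x in enumerate(X)]
    cand = [(d, i) for d, i in cand if d <= max_dist]
    if not cand:
        raise ValueError("no design points within max_dist of x_new")
    idx = [i for _, i in sorted(cand)[:n_local]]  # nearest n_local points
    # Kernel matrix on the sub-design, with a small nugget for stability.
    K = [[sq_exp_kernel(X[i], X[j], lengthscale) + (nugget if i == j else 0.0)
          for j in idx] for i in idx]
    k_star = [sq_exp_kernel(X[i], x_new, lengthscale) for i in idx]
    weights = solve(K, [y[i] for i in idx])  # (K + nugget * I)^{-1} y
    return sum(k * w for k, w in zip(k_star, weights)), len(idx)
```

Calling `local_gp_predict(X, y, x_new)` with `X` a list of coordinate tuples and `y` the matching outputs returns the predicted mean together with the sub-design size. The linear solve involves at most `n_local` points rather than the full design, which is the source of the savings in any local approximation of this kind.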

 

In Chapter 4, we propose a novel model, multi-resolution functional ANOVA, for the large-scale and many-input computer experiments that have become typical. More generally, this model can be used for large-scale and many-input non-linear regression problems. An overlapping group lasso approach is used for estimation, ensuring computational feasibility in a large-scale and many-input setting. New results on consistency and inference for the (potentially overlapping) group lasso in a high-dimensional setting are developed and applied to the proposed multi-resolution functional ANOVA model. Importantly, these results allow us to quantify the uncertainty in our predictions. Numerical examples demonstrate that the proposed model enjoys marked computational advantages. Its data capabilities, in terms of both sample size and dimension, meet or exceed those of the best available emulation tools, while its emulation accuracy meets or exceeds theirs.

 

Status

  • Workflow Status: Published
  • Created By: Tatianna Richardson
  • Created: 04/11/2018
  • Modified By: Tatianna Richardson
  • Modified: 04/11/2018
