Ph.D. Proposal Oral Exam - Miald Ghiasi Rad

Title: Improvements in the Modeling of High Dimension/Low Sample Size Imbalanced Clinical Data Sets

Committee:

Dr. Kamaleswaran, Advisor

Dr. Inan, Co-Advisor

Dr. Anderson, Chair

Dr. Grunwell

Abstract: The objective of the proposed research is to address shortcomings in the analysis of clinical datasets. Clinical research datasets are expensive to collect and process, which prevents researchers from gathering and analyzing large numbers of samples in each trial. Combined with the high dimensionality of features frequently observed in these datasets, this makes their analysis very challenging. Although methods exist to help overcome this problem and make these datasets more suitable for modeling, label imbalance still has a large impact on the outcome of any model developed from them. For example, in most clinical research the majority of patients fall on one side of the outcome, surviving or not surviving, so a model developed from such a dataset is likely to make biased predictions. Models trained on such a population are more prone to errors because of the bias imposed by the majority population, and the minority population is therefore exposed to false predictions. Which population is the minority can change depending on the labels chosen for the analysis, which makes the problem all the more important to solve.

This research aims to propose ways to reduce this impact in clinical research. To this end, verified datasets have been collected and processed, previous research has been studied, and new approaches such as SMOGN have been tried. Promising preliminary results are presented at the end of this proposal to make the case for addressing this problem in imbalanced clinical datasets with small sample sizes and high dimensionality.

Status

  • Workflow Status: Published
  • Created By: Daniela Staiculescu
  • Created: 11/12/2021
  • Modified By: Daniela Staiculescu
  • Modified: 11/12/2021