PhD Defense by Hong Seo Lim

Hong Seo Lim
BME PhD Defense Presentation

Date: 2022-06-27
Time: 10 AM - 12 PM
Location / Meeting Link: EBB Krone 1005 CHOA Seminar Room / https://gatech.zoom.us/j/98561785488

Committee Members:
Peng Qiu, Ph.D. (Advisor); Edward Botchwey, Ph.D.; Kavita Dhodapkar, M.D.; Eva Dyer, Ph.D.; Eberhard Voit, Ph.D.


Title: Developing Graph-based Computational Algorithms for Single-cell Data Science

Abstract:
Rapid advances in single-cell measurement technologies allow in-depth analysis of cellular heterogeneity in biological systems of interest. Single-cell profiling through flow cytometry, mass cytometry, and single-cell RNA sequencing (scRNA-seq) has led to novel discoveries in immunology, virology, neuroscience, and cancer biology. Single-cell data science is a new discipline that applies statistics, mathematics, and machine learning to the computational challenges arising in single-cell profiling data and the subsequent analysis steps. In this thesis, we identify several single-cell challenges that need attention: (1) integration of single-cell datasets acquired with different technologies or affected by batch effects, (2) quantification of cluster-like and trajectory-like characteristics of scRNA-seq datasets to guide algorithm choice, and (3) quantification of cell-type-specific differences across single-cell datasets.

We provide graph-based computational tools to tackle these challenges. (1) We propose a new algorithm, JSOM, that aligns two datasets through jointly evolved self-organizing maps. We demonstrate that the JSOM maps can be used to identify related clusters between the two datasets, and we demonstrate the alignment of various single-cell profiling datasets. (2) We present five scoring metrics and a new pipeline to quantify geometric characteristics of scRNA-seq data, specifically the clusterness and trajectoriness of the data. The proposed scoring metrics are based on pairwise distance distribution, persistent homology, vector magnitude, Ripley's K, and degrees of separation, and we demonstrate that the pipeline can quantify the clusterness and trajectoriness of scRNA-seq data. (3) We present a new pipeline to quantify cell-type-specific differences and identify the features driving the variation. The pipeline exploits the differences visible in low-dimensional UMAP embeddings and uses SHAP analysis to measure them, and we demonstrate its utility in interpreting and quantifying differences in various single-cell profiling data. Overall, the developed computational tools can improve various steps of the single-cell data analysis pipeline and contribute to solving computational challenges in the field of single-cell data science.

Status

  • Workflow Status: Published
  • Created By: Tatianna Richardson
  • Created: 06/13/2022
  • Modified By: Tatianna Richardson
  • Modified: 06/24/2022
