PhD Defense by Hairong Wang

Title: Knowledge-Informed Weakly-Supervised Deep Learning Models for Cancer Applications

Date: June 6, 2025 (Friday)

Time: 1:00 pm – 3:00 pm EDT

Location: Groseclose 403

Zoom link: https://gatech.zoom.us/j/94243279953

 

Hairong Wang

Operations Research PhD Student

H. Milton Stewart School of Industrial and Systems Engineering

Georgia Institute of Technology

 

Committee

Dr. Jing Li (Advisor), H. Milton Stewart School of Industrial and Systems Engineering

Dr. Jianjun Shi, H. Milton Stewart School of Industrial and Systems Engineering

Dr. Xiaochen Xian, H. Milton Stewart School of Industrial and Systems Engineering

Dr. Kristin Swanson, Mayo Clinic

Dr. Shuai Huang, Department of Industrial and Systems Engineering at the University of Washington

 

Abstract

In recent decades, deep learning (DL) has emerged as a promising tool for analyzing complex patterns in large datasets. The computational power and versatility of DL have enabled in-depth analysis of medical imaging, clinical, and molecular data, significantly enhancing diagnosis, prognosis, and treatment planning in healthcare. However, healthcare data acquisition has an intrinsic bottleneck: it is limited by the invasiveness or high expense of sample collection, the need for highly specialized experts to create accurate labels, the rarity of some diseases in the population, and the difficulty of patient recruitment. One approach to addressing these challenges is to enhance sample efficiency and integrate biomedical knowledge into data-driven models. This approach has shown considerable potential for boosting the accuracy, robustness, and interpretability of model outcomes, representing a significant advance in applying DL to cancer applications. To address sample efficiency and biomedical knowledge integration, this thesis focuses on developing knowledge-informed, image-based DL methodologies that are more efficient and generalizable, and on seeking practical solutions for real-world cancer applications.

 

Chapter 2 proposes BioNet, a biologically informed neural network designed to predict the regional distributions of two primary tissue-specific gene modules in recurrent glioblastoma using medical imaging data. BioNet introduces a novel integration of multiple forms of implicit, qualitative biological domain knowledge into the deep learning pipeline to enable pixel-level prediction of gene module distributions. To overcome the challenge posed by the limited availability of image-localized biopsy datasets, BioNet generates virtual biopsy data for pretraining, which supports the learning of more generalizable features. Inspired by the biological relationships between gene modules, BioNet employs a hierarchical architecture and incorporates a knowledge attention loss that combines data-driven and knowledge-driven components to penalize biologically inconsistent predictions.

 

Chapter 3 presents KIMAS, a knowledge-inspired self-supervised masked autoencoder for sparse supervision, designed to address the critical challenge of accurate, spatially dense prediction of genetic alterations from non-invasive imaging in glioblastoma. This challenge arises primarily from the extreme sparsity of biopsy-labeled data and high inter-patient heterogeneity. KIMAS integrates a knowledge-inspired self-supervised pretraining task that reconstructs a synthetic healthy version of the tumoral region of interest, enabling the encoder to learn patient-specific features while excluding pathological variation. It further incorporates a contextualized representation learning approach based on Vision Transformers (ViTs), which captures local features with global context, and a knowledge-informed fine-tuning strategy that combines auxiliary dense-label pretraining with regularization based on organ structural symmetry and prediction map smoothness.

 

Chapter 4 introduces SmoothSegNet, a novel deep learning framework designed to improve imaging-based liver tumor segmentation by addressing key challenges including tumor heterogeneity, vague boundaries, and limited labeled data. The model incorporates a knowledge-informed label smoothing technique that distills clinical insights to generate soft supervision signals, reducing overfitting and enhancing generalization. It employs a global-local segmentation architecture that decomposes the task into two sub-tasks, liver segmentation and tumor segmentation, allowing targeted optimization and training for each. Additionally, customized pre- and post-processing pipelines are integrated to enhance tumor visibility and refine segmentation boundaries, further improving model precision in complex clinical scenarios.

Status

  • Workflow Status: Published
  • Created By: Tatianna Richardson
  • Created: 05/27/2025
  • Modified By: Tatianna Richardson
  • Modified: 05/27/2025