PhD Proposal by Xinyuan Cao

Title: Foundations of Efficient Representation Learning

 

Date: Tuesday, February 11, 2025

Time: 1:30 PM - 3:00 PM EST

Location: (Hybrid) Klaus 3100; Zoom link: https://gatech.zoom.us/j/94019785975

 

Xinyuan Cao

Machine Learning Ph.D. Student

School of Computer Science

Georgia Institute of Technology

Committee:
• Dr. Santosh Vempala (Advisor) | School of Computer Science, Georgia Institute of Technology
• Dr. Jacob Abernethy | School of Computer Science, Georgia Institute of Technology
• Dr. Pan Li | School of Electrical and Computer Engineering, Georgia Institute of Technology
• Dr. Sahil Singla | School of Computer Science, Georgia Institute of Technology

Abstract:

Representation learning refers to a family of machine learning methods that first extract lower-dimensional features from complex, unstructured data and then use the learned features for a variety of downstream tasks. Despite its empirical success across many domains, a rigorous theoretical foundation for representation learning remains underdeveloped. This thesis aims to bridge this gap by developing theoretical guarantees that deepen our understanding of representation learning and by designing practical algorithms grounded in those guarantees.

 

The first part of this proposal focuses on provable algorithms for feature learning. In supervised settings, I analyze the phenomenon of neural collapse and establish conditions under which it emerges in trained neural networks. In unsupervised learning, I present the first polynomial-time algorithm for learning halfspaces with margins from unlabeled data. My ongoing work explores explainable clustering, where I study the trade-off between interpretability and clustering performance.
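
As a concrete illustration of the neural collapse phenomenon mentioned above, the sketch below computes the standard within-class variability metric (often called NC1) on penultimate-layer features. This is a generic diagnostic from the neural collapse literature, not the analysis in the proposal, and all function and variable names here are illustrative.

```python
# Minimal sketch: the NC1 "within-class variability collapse" metric.
# A generic diagnostic from the neural collapse literature; the names
# below are illustrative, not code from the proposal.
import numpy as np

def nc1_metric(features, labels):
    """Tr(Sigma_W @ pinv(Sigma_B)) / C; tends toward 0 under neural collapse."""
    classes = np.unique(labels)
    d = features.shape[1]
    global_mean = features.mean(axis=0)
    sigma_w = np.zeros((d, d))  # within-class covariance
    sigma_b = np.zeros((d, d))  # between-class covariance
    for c in classes:
        x_c = features[labels == c]
        mu_c = x_c.mean(axis=0)
        centered = x_c - mu_c
        sigma_w += centered.T @ centered / len(features)
        diff = (mu_c - global_mean)[:, None]
        sigma_b += (diff @ diff.T) / len(classes)
    return np.trace(sigma_w @ np.linalg.pinv(sigma_b)) / len(classes)

# Toy check: features tightly clustered around class means give a
# near-zero metric, i.e., a collapsed representation.
rng = np.random.default_rng(0)
means = np.array([[5.0, 0.0], [-5.0, 0.0]])
labels = rng.integers(0, 2, size=200)
features = means[labels] + 0.01 * rng.standard_normal((200, 2))
print(nc1_metric(features, labels))  # small value indicates collapse
```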

 

In the second part, I investigate the theory of transfer learning, particularly in the lifelong learning setting, where a model sequentially acquires knowledge from past tasks and transfers it to future ones. I propose an algorithm with nearly tight sample complexity for this setting, improving on results from a decade ago, and develop heuristic variants that demonstrate strong empirical performance on real-world datasets.
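
To make the representation-transfer intuition concrete, here is a minimal sketch under a simple linear model in which all tasks share an unknown low-dimensional subspace. This toy setup, and every name in it, is my own assumption for illustration; it is not the proposal's algorithm or its sample-complexity analysis.

```python
# Toy sketch of lifelong representation transfer under a linear model:
# tasks share a hidden k-dimensional subspace; once it is estimated from
# earlier tasks, each new task needs only a handful of samples.
# Illustrative assumptions only, not the proposal's algorithm.
import numpy as np

rng = np.random.default_rng(1)
d, k = 50, 3                                           # ambient dim, shared rep dim
B_true = np.linalg.qr(rng.standard_normal((d, k)))[0]  # hidden shared subspace

def make_task(n):
    """A random linear-regression task whose weights lie in the shared subspace."""
    w = rng.standard_normal(k)
    X = rng.standard_normal((n, d))
    return X, X @ (B_true @ w)

# Phase 1: estimate the shared subspace from a stream of earlier tasks.
per_task_solutions = []
for _ in range(10):
    X, y = make_task(200)
    per_task_solutions.append(np.linalg.lstsq(X, y, rcond=None)[0])
B_hat = np.linalg.svd(np.stack(per_task_solutions, axis=1),
                      full_matrices=False)[0][:, :k]

# Phase 2: a new task is solved with only O(k) samples in the learned basis.
X_new, y_new = make_task(3 * k)
alpha = np.linalg.lstsq(X_new @ B_hat, y_new, rcond=None)[0]
X_test, y_test = make_task(100)
print(np.mean((X_test @ B_hat @ alpha - y_test) ** 2))  # near-zero test MSE
```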

 

The final part of the proposal explores practical applications of representation learning. To address the scalability limitations of graph contrastive learning (GCL), I propose a simple yet effective GCL framework based on a sparse low-rank approximation of the diffusion matrix. This method significantly improves efficiency while maintaining competitive performance, and can be extended to dynamic graph settings in future work. Additionally, my ongoing work investigates the benefits of next-token prediction in large language models.
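
To illustrate the kind of approximation involved, the sketch below replaces a dense personalized-PageRank diffusion matrix with rank-k factors via truncated SVD. It is a hedged toy: the dense matrix is materialized only for demonstration, the sparsification step and contrastive objective of the actual framework are not shown, and all function names are mine.

```python
# Toy sketch: a rank-k surrogate for a PPR diffusion matrix, the kind of
# object a diffusion-based GCL view is built from. Illustrative only;
# materializing the dense S defeats scalability, and a practical method
# would approximate it directly. Not the proposal's framework.
import numpy as np

def ppr_diffusion(A, alpha=0.15):
    """Dense PPR diffusion S = alpha * (I - (1 - alpha) * A_norm)^-1."""
    n = A.shape[0]
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(A.sum(axis=1), 1e-12))
    A_norm = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return alpha * np.linalg.inv(np.eye(n) - (1 - alpha) * A_norm)

def low_rank_factors(S, k):
    """Rank-k factors (U, V) with S ~= U @ V.T, via truncated SVD."""
    U, s, Vt = np.linalg.svd(S)
    return U[:, :k] * s[:k], Vt[:k].T

# Toy usage on a ring graph: the factors store O(n*k) numbers instead of
# the O(n^2) entries of the dense diffusion matrix.
n = 20
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
S = ppr_diffusion(A)
U, V = low_rank_factors(S, k=6)
print(np.linalg.norm(S - U @ V.T) / np.linalg.norm(S))  # relative error
```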

 
