PhD Proposal by Xinyuan Cao
Title: Foundations of Efficient Representation Learning
Date: Tuesday, February 11, 2025
Time: 1:30 PM - 3:00 PM EST
Location: (Hybrid) Klaus 3100; Zoom link: https://gatech.zoom.us/j/94019785975
Xinyuan Cao
Machine Learning Ph.D. Student
School of Computer Science
Georgia Institute of Technology
Committee:
• Dr. Santosh Vempala (Advisor) | School of Computer Science, Georgia Institute of Technology
• Dr. Jacob Abernethy | School of Computer Science, Georgia Institute of Technology
• Dr. Pan Li | School of Electrical and Computer Engineering, Georgia Institute of Technology
• Dr. Sahil Singla | School of Computer Science, Georgia Institute of Technology
Abstract:
Representation learning refers to a family of machine learning methods that first extract lower-dimensional features from complex, unstructured data and then use the learned features for a variety of downstream tasks. Despite its empirical success across many domains, a rigorous theoretical foundation for representation learning remains underdeveloped. This thesis aims to bridge that gap by developing theoretical guarantees that deepen our understanding of representation learning and by designing practical algorithms grounded in those guarantees.
The first part of this proposal focuses on provable algorithms for feature learning. In the supervised setting, I analyze the phenomenon of neural collapse and establish conditions under which it emerges in trained neural networks. In the unsupervised setting, I present the first polynomial-time algorithm for learning halfspaces with margins from unlabeled data. My ongoing work explores explainable clustering, where I study the trade-off between interpretability and clustering performance.
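For readers less familiar with the term: neural collapse (Papyan, Han, and Donoho, PNAS 2020) describes the terminal phase of classifier training, in which last-layer features concentrate at their class means and the class means arrange into a simplex equiangular tight frame. In the standard formulation, with C classes, class means μ_1, …, μ_C, and global mean μ_G:

```latex
% NC1 (variability collapse): the within-class covariance of the
% last-layer features vanishes.
\Sigma_W \;\to\; 0
% NC2 (simplex ETF): the centered, normalized class means
% \tilde{\mu}_c = (\mu_c - \mu_G) / \lVert \mu_c - \mu_G \rVert
% become equinorm and maximally equiangular:
\langle \tilde{\mu}_c, \tilde{\mu}_{c'} \rangle \;\to\; -\tfrac{1}{C-1},
\qquad c \neq c'.
```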
In the second part, I investigate the theory of transfer learning, particularly in the lifelong learning setting, where a model sequentially acquires and transfers knowledge from past tasks to future ones. I propose an algorithm with nearly tight sample complexity for this setting, improving on work from a decade ago, and extend it to heuristic algorithms that demonstrate strong empirical performance on real-world datasets.
The final part of the proposal explores the practical applications of representation learning. To address the scalability limitation in graph contrastive learning (GCL), I propose a simple yet effective GCL framework based on a sparse low-rank approximation on the diffusion matrix. This method significantly improves efficiency while maintaining competitive performance and can be extended to dynamic graph settings in future work. Additionally, my ongoing work investigates the benefits of next-token prediction in large language models.
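The abstract does not spell out the construction, but a minimal sketch of this general idea, assuming a personalized-PageRank diffusion and a truncated-SVD-plus-thresholding approximation (the function names and the alpha, rank, and threshold parameters below are illustrative assumptions, not details from the proposal), looks like this:

```python
import numpy as np

def diffusion_matrix(adj, alpha=0.15):
    """Personalized-PageRank diffusion S = alpha * (I - (1-alpha) * A_hat)^{-1},
    where A_hat = D^{-1/2} A D^{-1/2} is the symmetrically normalized adjacency.
    (One common choice of diffusion; the proposal's exact choice may differ.)"""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    with np.errstate(divide="ignore"):
        d_inv_sqrt = np.where(deg > 0, deg ** -0.5, 0.0)
    a_hat = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return alpha * np.linalg.inv(np.eye(n) - (1 - alpha) * a_hat)

def sparse_low_rank(S, rank=32, threshold=1e-3):
    """Truncated SVD of S keeping `rank` components, then entrywise
    thresholding to sparsify the resulting dense approximation."""
    u, s, vt = np.linalg.svd(S)
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
    approx[np.abs(approx) < threshold] = 0.0
    return approx

# Toy usage: a random sparse undirected graph on 100 nodes.
rng = np.random.default_rng(0)
adj = (rng.random((100, 100)) < 0.05).astype(float)
adj = np.maximum(adj, adj.T)   # symmetrize
np.fill_diagonal(adj, 0.0)     # no self-loops
S = diffusion_matrix(adj)
S_hat = sparse_low_rank(S, rank=16)
```

The appeal of a sparse low-rank surrogate is that downstream contrastive objectives can operate on a matrix that is cheap to store and multiply, rather than on the dense diffusion matrix itself.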