PhD Proposal by Viraj Prabhu
Title: Visual Domain Adaptation with Flexible Data Assumptions
Date: Monday, November 21, 2022
Time: 4:00 PM - 6:00 PM EST
Location (in-person + virtual): Coda C1115 (Druid Hills) and Zoom (https://gatech.zoom.us/j/93422750042?)
Ph.D. Student in Computer Science
College of Computing
Georgia Institute of Technology
Committee:
Dr. Judy Hoffman (advisor, School of Interactive Computing, Georgia Institute of Technology)
Dr. Dhruv Batra (School of Interactive Computing, Georgia Institute of Technology, Meta)
Dr. James Hays (School of Interactive Computing, Georgia Institute of Technology)
Dr. Zsolt Kira (School of Interactive Computing, Georgia Institute of Technology)
Dr. Sanja Fidler (University of Toronto, NVIDIA)
Abstract:
Modern deep learning-based computer vision models struggle to generalize to domains different from the ones in which they were trained; domain adaptation (DA), the task of adapting trained models to new domains, is therefore critical to their real-world adoption. Such algorithms alleviate the need to label a large corpus for every new deployment, which may be infeasible due to data volume (e.g. autonomous driving) or labeling cost (e.g. medical diagnosis). Further, they are necessary to overcome the natural spatio-temporal distribution shifts that a deployed model will invariably face (e.g. changing geographies and seasons). Finally, such methods unlock the possibility of knowledge transfer from sources of data that are cheaper to label (e.g. transferring embodied agents trained in simulation to reality).
Despite considerable progress in the last decade, modern DA methods make restrictive data assumptions that limit their utility. In this thesis, we propose algorithms for domain adaptation with flexible data assumptions:

i) Existing distribution matching-based unsupervised DA methods focus on data distribution shift and struggle in the presence of additional label distribution shift. In practice, however, assuming away label shift is unrealistic, as we frequently do not have control over the target dataset. We propose a robust selective self-training algorithm based on predictive consistency for adapting object recognition models in this challenging setting that significantly outperforms prior work. Next, we extend this approach to achieve state-of-the-art performance in new settings (test-time adaptation), tasks (dense prediction), architectures (vision transformers), and initializations (self-supervised learning).

ii) Traditional semi-supervised DA assumes labels for a randomly selected subset of the target domain, whereas in practice it is often feasible to select target instances for labeling via active learning (AL). We study the understudied problem of Active DA and find that existing uncertainty-sampling and diversity-based AL paradigms are suboptimal in this setting. We propose an effective hybrid label acquisition strategy that identifies instances that are both uncertain and diverse, leading to improved performance.

iii) Finally, in proposed work we study sim2real adaptation for autonomous driving applications, wherein it is common to have plentiful labels in both the source domain (via driving simulators) and the target domain (via manual and machine labeling). We formulate this as a problem of supervised domain adaptation, characterize obstacles to transfer in the form of cross-domain appearance and content gaps, and propose a theoretically-motivated algorithm to bridge both gaps that yields strong improvements for 2D object detection.
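To make the selective self-training idea in (i) concrete, here is a minimal numpy sketch, not the actual proposed algorithm: it keeps only those unlabeled target samples whose predicted class agrees across two augmented views (predictive consistency) and whose confidence clears a threshold, and pseudo-labels them for self-training. The function name, threshold, and two-view setup are illustrative assumptions.

```python
import numpy as np

def select_pseudo_labels(probs_weak, probs_strong, conf_thresh=0.9):
    """Hypothetical simplification of consistency-based selective
    self-training: given class-probability matrices from two augmented
    views of the same target samples, select samples whose predictions
    agree across views AND are confident, and return their indices
    together with the pseudo-labels to self-train on."""
    pred_w = probs_weak.argmax(axis=1)    # predicted class, view 1
    pred_s = probs_strong.argmax(axis=1)  # predicted class, view 2
    conf = probs_weak.max(axis=1)         # confidence on view 1
    mask = (pred_w == pred_s) & (conf >= conf_thresh)
    return np.where(mask)[0], pred_w[mask]
```

Samples that flip their prediction under augmentation (or are low-confidence) are excluded, which is what makes the self-training selective and more robust under label distribution shift.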
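The hybrid acquisition strategy in (ii) can likewise be sketched in the spirit of uncertainty-weighted clustering, again as an illustrative assumption rather than the proposed method: cluster target features with per-sample weights given by predictive entropy, then label one representative per cluster, so the acquired batch is both uncertain (high entropy) and diverse (spread across clusters).

```python
import numpy as np

def entropy(probs, eps=1e-12):
    """Predictive entropy per sample, a standard uncertainty score."""
    return -(probs * np.log(probs + eps)).sum(axis=1)

def hybrid_acquire(feats, probs, budget, iters=10):
    """Hypothetical hybrid label acquisition sketch: run entropy-weighted
    k-means on target features (k = labeling budget), initialized at the
    highest-entropy points, and return the sample nearest each centroid."""
    w = entropy(probs)
    # Initialize centroids at the most uncertain samples (deterministic).
    centers = feats[np.argsort(-w)[:budget]].astype(float)
    for _ in range(iters):
        # Assign each sample to its nearest centroid.
        d = ((feats[:, None, :] - centers[None]) ** 2).sum(-1)
        assign = d.argmin(axis=1)
        # Recompute centroids as entropy-weighted means of their members.
        for k in range(budget):
            m = assign == k
            if m.any():
                centers[k] = np.average(feats[m], axis=0, weights=w[m] + 1e-12)
    # Acquire the sample closest to each centroid.
    d = ((feats[:, None, :] - centers[None]) ** 2).sum(-1)
    return np.unique(d.argmin(axis=0))
```

Pure uncertainty sampling would pick near-duplicate hard examples, and pure diversity sampling would waste budget on easy ones; weighting the clustering by entropy trades off both criteria in a single objective.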