PhD Defense | Learning with and without human feedback

Austin Xu - Machine Learning PhD Student - School of Electrical and Computer Engineering

Date: April 15th

Time: 12:30 PM – 2:00 PM ET

Location: Online

Meeting Link: https://gatech.zoom.us/j/9026260477?omn=93570001864

Committee

Mark Davenport (Advisor), School of Electrical and Computer Engineering, Georgia Institute of Technology

Christopher Rozell, School of Electrical and Computer Engineering, Georgia Institute of Technology

Ashwin Pananjady, School of Industrial and Systems Engineering, Georgia Institute of Technology

Justin Romberg, School of Electrical and Computer Engineering, Georgia Institute of Technology

Zsolt Kira, School of Interactive Computing, Georgia Institute of Technology

Abstract

Labels and feedback provided by humans play a central role in training contemporary machine learning models, offering models ground truth annotations from which to extract patterns. However, collecting such feedback from humans is a challenging and time-consuming task. As a result, practitioners must be intentional both in how they choose to query humans for feedback and in the problem settings for which they request feedback. This thesis explores learning from human feedback along two fundamental directions. The first part of the thesis takes a mathematically grounded perspective on how we can learn more effectively from human feedback. We first consider how to leverage paired comparisons, a simple mechanism for human feedback, for learning rich models of human preference. We then propose a new mechanism for collecting human feedback aimed at balancing informativeness and cognitive burden. The second part of the thesis focuses on how we can leverage pretrained models to avoid collecting additional human feedback. We consider two specific application settings, retrieval and synthetic dataset generation, and show that existing tools, such as large language models or image editing models, can be used to remove the need for collecting human feedback.

Status

  • Workflow Status: Published
  • Created By: shatcher8
  • Created: 04/09/2024
  • Modified By: shatcher8
  • Modified: 04/09/2024
