PhD Proposal by Hantian Zhang

Title:

Data-Centric Bias Mitigation in Machine Learning

Hantian Zhang

Ph.D. Candidate in Computer Science

School of Computer Science

Georgia Institute of Technology

Date/Time: Nov 16, 2023, 8:00 AM to 10:00 AM Eastern Time (US and Canada)

Location: Klaus 3100, or join via Zoom: https://gatech.zoom.us/j/98209258105?pwd=VWp1ZmhIdlN2dWMzV2EwVnJjc0xmUT09


Committee:

Dr. Xu Chu (co-advisor), School of Computer Science, Georgia Institute of Technology

Dr. Kexin Rong (co-advisor), School of Computer Science, Georgia Institute of Technology

Dr. Joy Arulraj, School of Computer Science, Georgia Institute of Technology

Dr. Shamkant Navathe, School of Computer Science, Georgia Institute of Technology

Dr. Steven Whang, School of Electrical Engineering, KAIST

Abstract:

As Machine Learning (ML) becomes increasingly central to decision-making in our society, it is crucial to recognize that ML models can inadvertently perpetuate biases, disproportionately harming certain demographic groups and individuals. For instance, some ML models used in judicial systems have exhibited bias against African Americans when predicting recidivism rates. Addressing these inherent biases and ensuring fairness in ML models is therefore imperative. While fairness can be improved by changing the ML models directly, I argue that a more foundational solution lies in correcting the data, since biased data is often the root cause of unfairness.

In my proposed thesis, I aim to systematically understand and mitigate bias in ML models across the full ML life-cycle, from data preparation (pre-processing) to model training (in-processing) and model validation (post-processing). First, I develop a pioneering system, iFlipper, that optimizes for individual fairness in ML. iFlipper improves the training data during data preparation by adjusting labels, mitigating the inconsistencies that arise when similar individuals receive different outcomes. Second, I introduce OmniFair, a declarative system for group fairness in ML. OmniFair lets users specify group fairness constraints and adjusts the weight of each training sample during training to satisfy those constraints. Finally, I propose to discover and explain semantically coherent subsets (slices) of unstructured data on which trained ML models underperform. With a clear picture of where a model does poorly, we can improve it by augmenting the dataset with more examples from that specific slice.
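To give a flavor of the sample-reweighting idea, the following is a minimal illustrative sketch in Python. It is not OmniFair's actual algorithm or API: the function names, the demographic-parity metric, and the multiplicative weight-update rule are all hypothetical simplifications. The sketch repeatedly trains a scikit-learn classifier with per-sample weights and upweights the group with the lower positive-prediction rate until the fairness gap falls below a user-chosen threshold.

# Illustrative sketch of group-fairness-aware sample reweighting.
# NOT OmniFair's actual algorithm or API; names and the update rule
# are hypothetical, for intuition only.
import numpy as np
from sklearn.linear_model import LogisticRegression

def demographic_parity_gap(y_pred, group):
    # |P(y_hat = 1 | group = 0) - P(y_hat = 1 | group = 1)|
    # `group` is a binary NumPy array marking group membership.
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

def reweigh_until_fair(X, y, group, eps=0.05, step=0.1, max_iter=50):
    w = np.ones(len(y), dtype=float)  # start with uniform sample weights
    clf = LogisticRegression(max_iter=1000)
    for _ in range(max_iter):
        clf.fit(X, y, sample_weight=w)
        y_pred = clf.predict(X)
        if demographic_parity_gap(y_pred, group) <= eps:
            break  # fairness constraint satisfied
        # Upweight the group currently receiving fewer positive predictions.
        rates = [y_pred[group == g].mean() for g in (0, 1)]
        disadvantaged = int(np.argmin(rates))
        w[group == disadvantaged] *= (1.0 + step)
    return clf, w

# Example usage (X_train, y_train, group_train are the user's data):
# clf, weights = reweigh_until_fair(X_train, y_train, group_train)

A declarative system like OmniFair generalizes this loop: the user states the fairness metric and threshold as constraints, and the system searches for sample weights that satisfy them, without the user hand-coding the update.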
