PhD Defense by Samira Samadi

Title: Human Aspects of Machine Learning

 

Samira Samadi

School of Computer Science

College of Computing

Georgia Institute of Technology

 

Date:  Thursday, March 5th, 2020

Time:  11:00 am - 12:30 pm (EST)

Location: Klaus 1315

 

Committee:

Dr. Santosh Vempala (Advisor, School of Computer Science, Georgia Institute of Technology)

Dr. Adam Kalai (Senior Principal Researcher, Microsoft Research New England)

Dr. Jamie Morgenstern (School of Computer Science & Engineering, University of Washington)

Dr. Vivek Sarkar (School of Computer Science, Georgia Institute of Technology)

Dr. Mohit Singh (School of Industrial & Systems Engineering, Georgia Institute of Technology)

 

Abstract:

As humans are inevitably being influenced by machine learning algorithms, it is crucial to study the human aspects of these algorithms. In my research, I investigate several ML paradigms from the viewpoint of human usability, security, and fairness.


In the first line of work, I study the human usability and security of password strategies: mental algorithms, proposed by Blum and Vempala, that help people compute passwords for different websites in their heads, without depending on external devices. I present the first usability study of two password strategies, the 3-word strategy and the letter-code strategy. Furthermore, I show that, given a limited amount of memorization, there are humanly usable password strategies that achieve the highest information-theoretic security guarantee.

In the second line of work, I investigate different fairness criteria for several machine learning techniques, including principal component analysis (PCA) and spectral clustering. I show on real-world data sets that PCA can inadvertently produce low-dimensional representations with different fidelity for two different populations (e.g., individuals with lighter versus darker skin tones). I define the notion of Fair PCA and present an efficient algorithm for finding a low-dimensional representation of the data that is nearly optimal for this measure. I conclude with a study of spectral clustering under the constraint that every demographic group is proportionally represented in each cluster. To this end, I develop variants of constrained spectral clustering and show that they help find fairer clusterings on real data.
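
The fidelity gap mentioned above can be illustrated with a small sketch. This is not the Fair PCA algorithm from the thesis; it only runs ordinary PCA on pooled data and compares the average reconstruction error for each group. The two synthetic populations, their sizes, and the helper names `pca_components` and `mean_reconstruction_error` are assumptions made purely for illustration.

```python
import numpy as np


def pca_components(centered, d):
    """Return the top-d principal directions of an already-centered matrix."""
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:d]  # shape (d, n_features)


def mean_reconstruction_error(centered, components):
    """Average squared distance between each row and its projection."""
    projected = centered @ components.T @ components
    return float(np.mean(np.sum((centered - projected) ** 2, axis=1)))


rng = np.random.default_rng(0)
n_features = 6

# Hypothetical populations: group A is large and varies mostly along
# features 0-1; group B is small and varies mostly along features 2-3.
scales_a = np.array([3.0, 3.0, 0.3, 0.3, 0.3, 0.3])
scales_b = np.array([0.3, 0.3, 3.0, 3.0, 0.3, 0.3])
group_a = rng.normal(size=(900, n_features)) * scales_a
group_b = rng.normal(size=(100, n_features)) * scales_b

# Ordinary PCA on the pooled data, reduced to two dimensions.
pooled = np.vstack([group_a, group_b])
mean = pooled.mean(axis=0)
components = pca_components(pooled - mean, d=2)

# Per-group reconstruction error under the shared projection.
err_a = mean_reconstruction_error(group_a - mean, components)
err_b = mean_reconstruction_error(group_b - mean, components)
print(f"group A error: {err_a:.2f}, group B error: {err_b:.2f}")
```

In this toy setup the shared projection is dominated by the larger group, so the smaller group is reconstructed with much higher error. A fairness-aware method would instead look for a projection that balances the loss across groups.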

Status

  • Workflow Status: Published
  • Created By: Tatianna Richardson
  • Created: 02/18/2020
  • Modified By: Tatianna Richardson
  • Modified: 02/18/2020
