
PhD Defense by Yizheng Chen

Title: Improving Robustness of DNS Graph Clustering Against Noise

 

Yizheng Chen

Ph.D. Candidate

School of Computer Science

College of Computing

Georgia Institute of Technology

 

Date: Friday, October 13th, 2017

Time: 10 AM - Noon (ET)

Location: Klaus 3126

 

Committee:

------------------------

Dr. Emmanouil Antonakakis (Co-advisor, School of Electrical and Computer Engineering, Georgia Institute of Technology)

Dr. Wenke Lee (Co-advisor, School of Computer Science, Georgia Institute of Technology)

Dr. Mustaque Ahamad (School of Computer Science, Georgia Institute of Technology)

Dr. Raheem Beyah (School of Electrical and Computer Engineering, Georgia Institute of Technology)

Dr. Roberto Perdisci (Dept. of Computer Science, University of Georgia and School of Computer Science, Georgia Tech)

 

Abstract

------------------------

 

Clustering is often the first step in finding structure within unlabeled datasets. Given a small set of labels, clustering can also propagate these labels by discovering groups of objects that are similar to each other. The ever-growing amount of data collected over long periods of time brings both opportunities and challenges for clustering. Analyzing such long-term datasets allows us to address evolving security problems, such as botnet forensic analysis, early warning of new threats, and studying the evolution of security phenomena. However, the analysis must also contend with noise in the data.

 

This thesis improves the robustness of clustering against noise by focusing on DNS graphs. Noise is either inherent in the dataset or injected by adversaries. The first goal of the thesis is to remediate the effect of noise inherent in the data. To that end, we perform measurement studies from two different vantage points in the online advertising ecosystem. As a multi-billion-dollar industry, the online ad ecosystem naturally attracts ad abuse from miscreants. We propose a new clustering technique to automatically analyze the cost to advertisers of impression fraud generated by the botnet TDSS/TDL4 over four years. In addition, from the vantage point of a Demand Side Platform provider, our measurement results show statistically significant differences between blacklisted publishers and those that were never blacklisted.

 

The second goal of the thesis is to increase the robustness of clustering against adversarial noise. Little work has been done on adversarial clustering to understand the weaknesses of clustering systems. We propose two novel attacks: one that injects noise into existing clusters, and one that moves data points to noisy clusters. After analyzing the effectiveness and cost of the attacks, we present defense techniques that improve the robustness of clustering in adversarial settings.

Status

  • Workflow Status: Published
  • Created By: Tatianna Richardson
  • Created: 10/10/2017
  • Modified By: Tatianna Richardson
  • Modified: 10/10/2017
