CSE Seminar: Hanghang Tong

Hanghang Tong

Machine Learning Department at Carnegie Mellon

For more information please contact Dr. Guy Lebanon

Title:

Fast Algorithms for Querying and Mining Large Graphs

Abstract:

Graphs appear in a wide range of settings and pose a wealth of fascinating problems. In this talk, I will present our recent work on two tasks for large graphs: (1) querying (e.g., given a social network, how do we measure the closeness between two people, and how do we track it over time?); and (2) mining (e.g., how do we identify abnormal behavior in computer networks, and in the case of virus attacks, which nodes are the best to immunize?).

For the task of querying, our main finding is that many complex user-specific patterns on large graphs can be answered by means of proximity measurement. In other words, proximity allows us to query large graphs at the atomic level. I will then talk about how to adapt querying tasks to time-evolving graphs. For fast computation, we developed a family of solutions that compute proximity in several different scenarios. By carefully leveraging important properties shared by many real graphs (e.g., block-wise structure, linear correlation, and the skewness of real bipartite graphs), we can often achieve orders-of-magnitude speedups with little or no quality loss.

For the task of mining, I will talk about immunization and anomaly detection. For immunization, we proposed a near-optimal, fast, and scalable algorithm. For anomaly detection, we proposed a family of example-based low-rank matrix approximation methods. The proposed algorithms are provably equal to or better than the best known methods in both space and time, with the same accuracy. On real data sets, they are up to 112x faster than the best competitors at the same accuracy.

Bio:

Hanghang Tong earned his Ph.D. in the Machine Learning Department at Carnegie Mellon University in 2009. He has received best paper awards from SIAM-DM 2008 and ICDM 2006. He holds an M.S. degree and a B.S. degree from Tsinghua University, P.R. China. His research interests include data mining for multimedia.

To receive future announcements, please sign up for the cse-seminar email list:

https://mailman.cc.gatech.edu/mailman/listinfo/cse-seminar

Status

  • Workflow Status: Published
  • Created By: Mike Terrazas
  • Created: 03/11/2010
  • Modified By: Fletcher Moore
  • Modified: 10/07/2016
