PhD Defense by Kisung Lee

Ph.D. Dissertation Defense Announcement

Title: Scalable Big Data Systems: Architectures and Optimizations

Kisung Lee
School of Computer Science
College of Computing
Georgia Institute of Technology

Date: Thursday, April 30, 2015
Time: 10:00 AM - 12:00 PM EDT
Location: KACB 3402

Committee:
Dr. Ling Liu (Advisor, School of Computer Science, Georgia Institute of Technology)
Dr. Ed Omiecinski (School of Computer Science, Georgia Institute of Technology)
Dr. Calton Pu (School of Computer Science, Georgia Institute of Technology)
Dr. Karsten Schwan (School of Computer Science, Georgia Institute of Technology)
Dr. Lakshmish Ramaswamy (Department of Computer Science, University of Georgia)

With continued advances in computing and information technology, digital data have grown at an astonishing rate in volume, variety, and velocity. Such big data have great potential to reveal hidden insights and promote innovation in many business, science, and engineering domains. An important technical challenge is how to build efficient big data processing systems and applications that can scale with the rapid growth of digital data in the 21st century.

This dissertation is dedicated to the development of architectures and optimization techniques for scaling big data processing systems, especially in the era of cloud computing, and makes three unique contributions. First, it introduces a suite of graph partitioning algorithms that run much faster than existing data distribution methods and inherently scale with the growth of big data. The main idea of these approaches is to partition a big graph while preserving its core computational data structure as much as possible, in order to maximize intra-server computation and minimize inter-server communication. Second, it proposes a distributed iterative graph computation framework that effectively utilizes secondary storage to maximize access locality and speed up distributed iterative graph computations. The framework not only considerably reduces the memory requirements of iterative graph algorithms but also significantly improves their performance. Finally, it establishes a suite of optimization techniques for scalable spatial data processing along three orthogonal dimensions: (i) scalable processing of spatial alarms for mobile users traveling on road networks, (ii) scalable location tagging for improving the quality of Twitter data analytics and prediction accuracy, and (iii) lightweight spatial indexing for enhancing the performance of big spatial data queries.
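As a rough illustration of the partitioning goal described above, and not the dissertation's actual algorithm, the following Python sketch keeps each vertex together with all of its outgoing edges on a single server and then counts how many edges would require inter-server communication. The names `owner`, `partition_edges`, and `count_cut_edges`, and the modulo placement rule, are assumptions made only for this example.

```python
from collections import defaultdict


def owner(vertex, num_servers):
    # Hypothetical placement rule: integer vertex IDs are spread by modulo.
    return vertex % num_servers


def partition_edges(edges, num_servers):
    # Keep every vertex together with all of its outgoing edges on one server,
    # so a computation over a vertex's out-edges needs no remote lookups.
    blocks = defaultdict(list)
    for src, dst in edges:
        blocks[owner(src, num_servers)].append((src, dst))
    return dict(blocks)


def count_cut_edges(edges, num_servers):
    # An edge is "cut" when its endpoints live on different servers;
    # each cut edge stands for one unit of inter-server communication.
    return sum(
        1
        for src, dst in edges
        if owner(src, num_servers) != owner(dst, num_servers)
    )


edges = [(0, 2), (2, 4), (1, 3), (0, 1)]
blocks = partition_edges(edges, num_servers=2)
cut = count_cut_edges(edges, num_servers=2)
```

In this toy graph, only the edge (0, 1) crosses servers. A partitioner that aims to minimize inter-server communication would choose the placement rule so that this count stays small while the servers remain reasonably balanced.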

In this defense exam, I will briefly highlight these technical contributions and focus on presenting the distributed system for iterative graph algorithms, including system architecture, optimizations, and experimental evaluation.


  • Workflow Status: Published
  • Created By: Tatianna Richardson
  • Created: 04/13/2015
  • Modified By: Fletcher Moore
  • Modified: 10/07/2016


Target Audience