PhD Defense by Prasun Gera

Event Details
  • Date/Time: Thursday, August 13, 2020, 3:00 pm - 5:00 pm (Eastern Time)
  • Location: Remote (BlueJeans)

Summary Sentence: Overcoming Memory Capacity Constraints for Large Graph Applications on GPUs


Title: Overcoming Memory Capacity Constraints for Large Graph Applications on GPUs

Prasun Gera
Ph.D. candidate
School of Computer Science
Georgia Institute of Technology

Date: Thursday, August 13th, 2020
Time: 3 p.m. - 5 p.m. (Eastern Time)

Committee:
Dr. Hyesoon Kim (Advisor, School of Computer Science, Georgia Institute of Technology)
Dr. Santosh Pande (School of Computer Science, Georgia Institute of Technology)
Dr. Richard Vuduc (School of Computational Science and Engineering, Georgia Institute of Technology)
Dr. Moinuddin Qureshi (School of Computer Science, Georgia Institute of Technology)
Dr. Tushar Krishna (School of Electrical and Computer Engineering, Georgia Institute of Technology)

Graphics Processing Units (GPUs) have been used successfully for accelerating a wide variety of
applications in the domains of scientific computing, machine learning, and data analytics over the
last decade. Two important trends have emerged across these domains: for many real-world problems,
the working set exceeds a GPU's memory capacity, and the data is
sparse. In this dissertation, we focus on graph analytics, for which the majority of prior work has
been restricted to graphs of modest size that fit in memory. Real-world graphs such as social
networks and web graphs require tens to hundreds of gigabytes of storage, whereas GPU memory is
typically on the order of a few gigabytes. We investigate the following question: How can we
accelerate graph applications on GPUs when the graphs do not fit in memory?

This question opens up two lines of inquiry. First, we consider the system architecture where the
GPU can address larger, albeit slower, host memory that is behind an interconnect such as PCI-e.
While this increases the total addressable memory, graph applications have poor locality that makes
efficient use of this architecture challenging. We formulate the locality problem as a graph
ordering problem and propose efficient reordering methods for large graphs. The solution is general
enough to extend to other, similar architectures, and it is beneficial even in cases where graphs
fit in memory.
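The dissertation's specific reordering methods are not described in this abstract. As a minimal illustration of the general idea, the sketch below relabels vertices in breadth-first discovery order so that vertices that are close in the graph receive nearby IDs, which tends to improve memory locality. The `bfs_reorder` helper and the dict-of-lists graph representation are illustrative assumptions, not the thesis's method:

```python
from collections import deque

def bfs_reorder(adj):
    """Relabel vertices in BFS discovery order so that neighboring
    vertices tend to receive nearby IDs, improving access locality.
    adj: dict mapping vertex -> list of neighbor vertices."""
    order = {}          # old label -> new label
    next_id = 0
    for src in adj:     # restart BFS from each unvisited component
        if src in order:
            continue
        order[src] = next_id
        next_id += 1
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in order:
                    order[v] = next_id
                    next_id += 1
                    queue.append(v)
    # Rebuild the adjacency lists under the new labels.
    return {order[u]: [order[v] for v in nbrs] for u, nbrs in adj.items()}
```

After relabeling, traversals that visit vertices in ID order touch memory more sequentially; published locality-oriented orderings (e.g., Reverse Cuthill-McKee or degree-based schemes) refine this basic idea.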

Second, we consider graph compression as a complementary approach. Conventional approaches to graph
compression have been CPU-centric: the decompression stage is sequential and branch-intensive.
We devise an efficient graph compression format for large sparse graphs that is amenable
to run-time decompression on GPUs. The decompression stage is parallel and load balanced so that it
can use the computational resources of GPUs effectively. In effect, analytics kernels decompress
portions of the graph and compute on the fly, which is essential because the fully decompressed
graph would not fit in memory.
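The abstract does not specify the format itself. As a hedged sketch of why such formats favor parallel decompression, the example below delta-encodes a sorted neighbor list and bit-packs the deltas at one fixed width, so that any entry can be decoded independently of the others (unlike sequential variable-length codes). All function names and the encoding are illustrative assumptions, not the thesis's format:

```python
def compress_list(neighbors):
    """Delta-encode a sorted neighbor list, then pack all deltas at one
    fixed bit width. Fixed-width fields put entry i at bit offset
    i * width, so each entry is decodable without scanning its
    predecessors -- the property that enables thread-parallel decoding."""
    deltas = [neighbors[0]] + [b - a for a, b in zip(neighbors, neighbors[1:])]
    width = max(d.bit_length() for d in deltas) or 1
    packed = 0
    for i, d in enumerate(deltas):
        packed |= d << (i * width)
    return packed, width, len(deltas)

def decode_entry(packed, width, i):
    """Random-access decode of the i-th delta (what one GPU thread would do)."""
    return (packed >> (i * width)) & ((1 << width) - 1)

def decompress_list(packed, width, n):
    """Recover the neighbor list as a running sum of decoded deltas."""
    vals, acc = [], 0
    for i in range(n):
        acc += decode_entry(packed, width, i)
        vals.append(acc)
    return vals
```

The running sum here is sequential for clarity; on a GPU it would be computed with a parallel prefix-sum (scan), keeping the whole decompression stage parallel and load balanced.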

Additional Information


Invited Audience
Faculty/Staff, Public, Undergraduate students