PhD Proposal by Seonmyeong Bak

Title: Runtime Approaches for Irregular Parallel Applications on Supercomputers

 

Seonmyeong Bak

Ph.D. Student

School of Computer Science

Georgia Institute of Technology

Email: sbak5@gatech.edu

Homepage: https://sbak5.github.io

 

Date: Thursday, April 16th, 2020
Time: 2:00 pm to 4:00 pm (EST)
Location: *No Physical Location*

BlueJeans:  https://bluejeans.com/sbak3

 

Committee:
Dr. Vivek Sarkar (advisor), School of Computer Science, Georgia Institute of Technology
Dr. Ümit V. Çatalyürek, School of Computational Science and Engineering, Georgia Institute of Technology
Dr. Ada Gavrilovska, School of Computer Science, Georgia Institute of Technology

Abstract:

On-node parallelism has increased significantly on high-performance computing systems. Regular parallel applications can exploit this parallelism relatively easily because their computation patterns and data layouts are inherently parallel. Irregular parallel applications, however, require considerable effort to run efficiently on modern microprocessors with massive intra-node parallelism. Parallel programming models and runtime approaches have been proposed to help programmers write such applications quickly, but writing efficient irregular parallel applications is still difficult. The common challenges for irregular applications are load balancing and the overlapping of computation and communication.

In this thesis proposal, we address the load-balancing and overlapping issues in irregular applications through runtime approaches and APIs with which users give the runtime system minimal information for application-aware scheduling.

 

First, we propose an efficient integrated runtime system to handle load balancing of irregular applications written in hybrid parallel programming models. Our runtime integrates distributed-memory and shared-memory programming models into a unified runtime system. In this runtime system, all cores can be used across the different levels of programming models, which enables more efficient load balancing at the intra-node level and reduces waiting time for global synchronization at the inter-node level.

 

In addition, we provide users with a set of APIs through which they can specify functions that decompose a target loop into subspaces and create chunks within each subspace. Our runtime uses these user functions to create chunks in a user-defined way and stores balanced groups of chunks in a shared data structure indexed by information unique to each loop. The loop reuses the stored information on its next invocation for a better initial load balance.
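
The following is a minimal sketch of the chunking idea described above, not the API from the proposal. All names (`schedule_loop`, `decompose`, `make_chunks`, `cost`, the loop id string) are hypothetical, and the greedy balancing step with a user-supplied cost function is an assumption, since the abstract does not say how the balanced groups are formed.

```python
from typing import Callable, Dict, List, Tuple

Range = Tuple[int, int]  # half-open iteration range [start, end)

# Hypothetical shared structure: loop id -> balanced groups of chunks,
# one group per worker. Stands in for the "shared data structure
# indexed by information unique to each loop".
_chunk_cache: Dict[str, List[List[Range]]] = {}


def balance(chunks: List[Range], cost: Callable[[Range], float],
            workers: int) -> List[List[Range]]:
    """Assumed policy: greedily give the next most expensive chunk
    to the currently least loaded worker."""
    groups: List[List[Range]] = [[] for _ in range(workers)]
    loads = [0.0] * workers
    for chunk in sorted(chunks, key=cost, reverse=True):
        w = loads.index(min(loads))
        groups[w].append(chunk)
        loads[w] += cost(chunk)
    return groups


def schedule_loop(loop_id: str, space: Range,
                  decompose: Callable[[Range], List[Range]],
                  make_chunks: Callable[[Range], List[Range]],
                  cost: Callable[[Range], float],
                  workers: int) -> List[List[Range]]:
    """Build chunk groups with the user functions on the first call;
    reuse the stored groups on later invocations of the same loop."""
    if loop_id not in _chunk_cache:
        chunks = [c for sub in decompose(space) for c in make_chunks(sub)]
        _chunk_cache[loop_id] = balance(chunks, cost, workers)
    return _chunk_cache[loop_id]


# Example user functions for a triangular loop where iteration i costs i.
def halves(space: Range) -> List[Range]:
    lo, hi = space
    mid = (lo + hi) // 2
    return [(lo, mid), (mid, hi)]


def pairs(sub: Range) -> List[Range]:
    lo, hi = sub
    return [(s, min(s + 2, hi)) for s in range(lo, hi, 2)]


def triangular_cost(chunk: Range) -> float:
    return float(sum(range(chunk[0], chunk[1])))


if __name__ == "__main__":
    print(schedule_loop("triangular", (0, 8), halves, pairs,
                        triangular_cost, workers=2))
```

In this toy case the two workers receive chunk groups of equal total cost even though the chunks themselves are uneven, and a second call with the same loop id returns the stored groups without calling the user functions again.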

 

Lastly, we propose scheduling algorithms to improve the execution of irregular task graphs that contain a mixed sequence of communication and computation tasks with data parallelism and blocking operations.

We combine gang scheduling with work stealing for data-parallel tasks with frequent inter-node and intra-node communication in the task graphs, to minimize interference and unnecessary context switching.

We also propose improved victim selection for work stealing to increase the overlapping of ready tasks that have child tasks, for load balancing.

Status

  • Workflow Status:Published
  • Created By:Tatianna Richardson
  • Created:04/13/2020
  • Modified By:Tatianna Richardson
  • Modified:04/13/2020
