PhD Proposal by Seonmyeong Bak
Title: Runtime Approaches for Irregular Parallel Applications on Supercomputers
Seonmyeong Bak
Ph.D. Student
School of Computer Science
Georgia Institute of Technology
Email: sbak5@gatech.edu
Homepage: https://sbak5.github.io
Date: Thursday, April 16th, 2020
Time: 2:00 pm to 4:00 pm (EDT)
Location: *No Physical Location*
BlueJeans: https://bluejeans.com/sbak3
Committee:
Dr. Vivek Sarkar (advisor), School of Computer Science, Georgia Institute of Technology
Dr. Ümit V. Çatalyürek, School of Computational Science and Engineering, Georgia Institute of Technology
Dr. Ada Gavrilovska, School of Computer Science, Georgia Institute of Technology
Abstract:
On-node parallelism has increased significantly in high-performance computing systems. Regular parallel applications can exploit this parallelism relatively easily because their computation patterns and data layouts are inherently parallel. Irregular parallel applications, however, require considerable effort to run efficiently on modern microprocessors with massive intra-node parallelism. Parallel programming models and runtime approaches have been proposed to help programmers write such applications quickly, but writing efficient irregular parallel applications remains difficult. The common challenges for irregular applications are load balancing and the overlap of computation with communication.
In this thesis proposal, we address the load-balancing and overlap issues in irregular applications through runtime approaches and APIs with which users provide the runtime system with minimal information for application-aware scheduling.
First, we propose an efficient integrated runtime system that handles load balancing for irregular applications written in hybrid parallel programming models. Our runtime integrates distributed-memory and shared-memory programming models into a unified runtime system. In this runtime system, all cores can be used across the different levels of programming models, which enables more efficient load balancing at the intra-node level and reduces waiting time for global synchronization at the inter-node level.
In addition, we provide users with a set of APIs through which they can specify functions that decompose a target loop into subspaces and create chunks within each subspace. Our runtime uses these user functions to create chunks in a user-defined way and stores balanced groups of chunks in a shared data structure indexed by information unique to each loop. The loop reuses the stored information on its next invocation to obtain a better initial load balance.
Lastly, we propose scheduling algorithms that improve the execution of irregular task graphs containing a mixed sequence of communication and computation tasks with data parallelism and blocking operations.
We combine gang scheduling with work stealing for data-parallel tasks that communicate frequently within and across nodes in the task graphs, in order to minimize interference and unnecessary context switching.
We also propose an improved victim-selection policy for work stealing that increases the overlap of ready tasks that have child tasks, for better load balancing.
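The loop-chunking idea described in the abstract can be illustrated with a minimal Python sketch. It is not the proposal's actual API: every name here (`schedule_loop`, `balance`, `halves`, `fixed_chunks`, `triangular_cost`) is hypothetical, and the user-supplied cost estimate and the greedy grouping are assumptions made only for illustration. The sketch shows user functions decomposing a loop into subspaces and chunks, the runtime storing balanced chunk groups under a loop identifier, and a later invocation reusing them.

```python
from typing import Callable, Dict, List, Tuple

Chunk = Tuple[int, int]  # half-open iteration range [start, end)

# Hypothetical shared store: loop identifier -> balanced chunk groups (one per worker).
_chunk_cache: Dict[str, List[List[Chunk]]] = {}


def balance(chunks: List[Chunk], cost: Callable[[Chunk], float],
            workers: int) -> List[List[Chunk]]:
    """Greedy grouping (an assumption): heaviest chunk goes to the lightest group."""
    groups: List[List[Chunk]] = [[] for _ in range(workers)]
    loads = [0.0] * workers
    for chunk in sorted(chunks, key=cost, reverse=True):
        target = loads.index(min(loads))
        groups[target].append(chunk)
        loads[target] += cost(chunk)
    return groups


def schedule_loop(loop_id: str, n: int,
                  decompose: Callable[[int], List[Chunk]],
                  make_chunks: Callable[[Chunk], List[Chunk]],
                  cost: Callable[[Chunk], float],
                  workers: int) -> List[List[Chunk]]:
    """Build balanced groups on the first invocation; reuse them afterwards."""
    if loop_id not in _chunk_cache:
        chunks = [c for sub in decompose(n) for c in make_chunks(sub)]
        _chunk_cache[loop_id] = balance(chunks, cost, workers)
    return _chunk_cache[loop_id]


# Example user functions (hypothetical).
def halves(n: int) -> List[Chunk]:
    """Split the iteration space [0, n) into two subspaces."""
    return [(0, n // 2), (n // 2, n)]


def fixed_chunks(sub: Chunk, size: int = 4) -> List[Chunk]:
    """Cut one subspace into chunks of at most `size` iterations."""
    start, end = sub
    return [(s, min(s + size, end)) for s in range(start, end, size)]


def triangular_cost(chunk: Chunk) -> float:
    """Pretend iteration i costs i units, as in a triangular loop nest."""
    start, end = chunk
    return float(sum(range(start, end)))


groups = schedule_loop("loop_a", 16, halves, fixed_chunks, triangular_cost, 2)
```

With this triangular cost, equal-sized contiguous halves would be badly unbalanced, whereas the grouped chunks give each of the two workers the same estimated load. A second call with the same `loop_id` returns the stored groups without recomputing them.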
Status
- Workflow Status: Published
- Created By: Tatianna Richardson
- Created: 04/13/2020
- Modified By: Tatianna Richardson
- Modified: 04/13/2020