
PhD Defense by Seonmyeong Bak

Title: Runtime Approaches to Improve the Efficiency of Hybrid and Irregular Applications

Seonmyeong Bak

Ph.D. Candidate

School of Computer Science

Georgia Institute of Technology

Date: Tuesday, November 3rd, 2020
Time: 2:00 pm to 4:00 pm (EST)
Location: *No Physical Location*

BlueJeans: https://bluejeans.com/sbak3

Committee:
Dr. Vivek Sarkar (advisor), School of Computer Science, Georgia Institute of Technology
Dr. Ümit V. Çatalyürek, School of Computational Science and Engineering, Georgia Institute of Technology
Dr. Ada Gavrilovska, School of Computer Science, Georgia Institute of Technology
Dr. Tushar Krishna, School of Electrical and Computer Engineering, Georgia Institute of Technology
Dr. Alexey Tumanov, School of Computer Science, Georgia Institute of Technology

Abstract:
On-node parallelism has increased significantly in high-performance computing systems. Regular parallel applications can readily exploit this parallelism because straightforward approaches usually suffice to map their computation patterns and data layouts onto the available on-node parallelism. However, irregular parallel applications require considerable effort to run efficiently on modern processors with massive amounts of intra-node parallelism. Parallel programming models and runtime approaches have been proposed to help programmers write such applications quickly, but it is still not easy to write efficient irregular parallel applications. Two key challenges in mapping irregular applications onto on-node parallelism are load balance and computation-communication overlap. In this thesis defense, we address these challenges through new runtime approaches and new APIs that enable users to provide minimal information for application-aware scheduling.

First, we introduce new algorithms to improve the scheduling of irregular task graphs containing a mix of communication and computation tasks with data parallelism and blocking operations. We combine gang-scheduling with work-stealing for data-parallel tasks with frequent inter/intra-node communication in the task graphs, so as to reduce interference and expensive context-switching operations. We also propose improved victim selection policies for work-stealing to improve the load balance and overlap of ready tasks that have child tasks.

Next, we propose an efficient integrated runtime system to handle load balancing of irregular applications written in hybrid parallel programming models. We introduce a unified runtime system that integrates distributed and shared-memory programming, as exemplified by the combination of Charm++ and OpenMP. In this approach, all processing resources (cores) can be used flexibly across both the distributed and shared-memory levels, thereby enabling more efficient load balancing at the intra-node level and reduced waiting times for global synchronization at the inter-node level.

Finally, we propose a set of APIs that enable users to specify functions used to decompose a target loop into subspaces and to create chunks within each subspace for application-specific load balancing. Our runtime leverages the information provided in the APIs to create user-defined chunks and store balanced groups of chunks in a shared data structure indexed by static loop constructs. In this way, the stored information from one invocation of a loop can be reused in following invocations for an improved initial load balance.

Status

  • Workflow Status: Published
  • Created By: Tatianna Richardson
  • Created: 10/26/2020
  • Modified By: Tatianna Richardson
  • Modified: 10/26/2020
