Ph.D. Dissertation Defense - Eric Qin

Title: Building Efficient Tensor Accelerators for Sparse and Irregular Workloads

Committee:

Dr. Tushar Krishna, ECE, Chair, Advisor

Dr. Hyesoon Kim, ECE

Dr. Richard Vuduc, CoC

Dr. Sivasankaran Rajamanickam, Sandia National Labs

Dr. Callie Hao, ECE

Abstract: Popular Machine Learning (ML) and High Performance Computing (HPC) workloads account for a significant portion of runtime in data centers. Applications include image classification, speech recognition, recommendation systems, social network analysis, robotics, and chemical process simulations. Recently, driven by the large computational demands of emerging workloads, there has been a surge in the development of custom hardware accelerators that compute tensor kernels with high performance and energy efficiency. For example, the Google Tensor Processing Unit (TPU) is a custom hardware accelerator targeting efficient matrix multiplications for Deep Neural Networks (DNNs). However, state-of-the-art accelerators have limitations stemming from (1) the vast spectrum of sparsity across workloads and (2) the irregularity of tensor dimensions (e.g., tall-skinny matrices). This thesis explores novel methodologies and architectures for building efficient accelerators for sparse tensor algebra. The first major contribution of this thesis is the proposal to use specialized on-chip interconnects to provide flexible computational mappings of sparse and irregular matrices onto processing elements (PEs). Building on these interconnects, this thesis presents SIGMA, a new sparse DNN accelerator targeting workloads with 30% to 100% density (percentage of nonzeros). Unlike popular DNNs, HPC workloads use tensors ranging from 10^-6% dense to fully dense. The second major contribution explores the system impact of using various compression formats across all sparsity regions. This thesis proposes a predictor that determines the best combination of compression formats, along with MINT, a custom hardware compression format converter. Together, they provide significant energy-delay product (EDP) improvement over state-of-the-art accelerators. The third major contribution analyzes popular state-of-the-art sparse accelerators using a new tool named Hard TACO, which allows realistic architectural exploration of homogeneous and heterogeneous accelerators.

Status

  • Workflow Status: Published
  • Created By: Daniela Staiculescu
  • Created: 02/03/2022
  • Modified By: Daniela Staiculescu
  • Modified: 02/03/2022
