PhD Defense by Hyoukjun Kwon

Title: Data- and Communication-centric Approaches to Model and Design Flexible Deep Neural Network Accelerators

Hyoukjun Kwon
PhD Candidate 
School of Computer Science
Georgia Institute of Technology 
http://hyoukjunkwon.com/

Date: Thursday, July 16th, 2020

Time: 3-5 pm
Location: https://bluejeans.com/219451978 (remote)

Committee:
Dr. Tushar Krishna (advisor), School of Electrical and Computer Engineering, Georgia Institute of Technology
Dr. Vivek Sarkar, School of Computer Science, Georgia Institute of Technology
Dr. Hyesoon Kim, School of Computer Science, Georgia Institute of Technology
Dr. Alexey Tumanov, School of Computer Science, Georgia Institute of Technology
Dr. Michael Pellauer, Architecture Research Group, NVIDIA

Abstract:

Deep neural network (DNN) acceleration has emerged as an enabler of many applications, such as image classification, face recognition, and natural language processing, in which high operational performance (i.e., accuracy or quality of outputs) was previously challenging to achieve. Since recent DNNs involve billions of multiply-and-accumulate (MAC) operations with millions of parameters, DNN accelerators, specialized hardware for DNN computation, have emerged. However, designing dedicated hardware for each DNN model incurs high development costs while DNN models and algorithms evolve rapidly. In addition, specializing a DNN accelerator for one DNN model with limited support for compiler mappings often leads to inefficiency on other DNN models. Therefore, this thesis explores flexible DNN accelerator designs that support diverse compiler mappings (i.e., dataflow + tile sizes for each data dimension) to adapt to new DNN models without redesigning hardware.

This thesis first focuses on modeling mapping choices to quantify their potential costs and benefits on the underlying hardware. We codify the cost model, implement it as MAESTRO, and perform case studies that show no single mapping is ideal for all layers. For flexible DNN accelerator designs, this thesis addresses the challenge from two perspectives: reconfigurability and heterogeneity.

For the reconfigurability approach, this thesis focuses on data movement, since the cost of data movement dominates in DNN accelerators and rearranging the data movement is effectively equivalent to programming a DNN accelerator, given the predefined nature of the target application. We propose a lightweight network-on-chip (NoC) architecture, Microswitch NoC, specialized for DNN accelerator traffic while providing sufficient flexibility for any dataflow. We also present a reconfigurable DNN accelerator design, MAERI, that employs reconfigurable data distribution and reduction NoCs, which support all the communication patterns in DNN accelerators and perform reduction inside NoC switches (i.e., in-network processing). MAERI enables mapping computations onto compute units without underutilizing processing elements (PEs), even for the irregular DNN computations that result from diverse layers and various optimizations (e.g., cross-layer mapping and sparsity).

For the heterogeneity approach, this thesis explores heterogeneous DNN accelerators (HDAs), which contain multiple sub-accelerators that have different amounts of hardware resources and run different dataflows. For the HDA-based approach, this thesis proposes a comprehensive HDA optimization framework, Herald, that automatically explores two optimization opportunities: mapping each DNN layer to the sub-accelerator with the lowest energy-delay product (EDP) at run time, and partitioning hardware resources properly at design time. Finally, we formally define mapping flexibility so that we can quantify the degree of flexibility of flexible accelerators, which enables comprehensive comparison across flexible DNN accelerators.

Status

  • Workflow Status: Published
  • Created By: Tatianna Richardson
  • Created: 07/07/2020
  • Modified By: Tatianna Richardson
  • Modified: 07/07/2020
