Ph.D. Proposal by Hyoukjun Kwon
Title: Data-centric approaches to model and design flexible deep neural network accelerators

Hyoukjun Kwon
Ph.D. Student
School of Computer Science
Georgia Institute of Technology
https://hyoukjunkwon.com

Date: Monday, Nov 11, 2019
Time: Noon-2pm (EST)
Location: KACB 3100

Committee:
——
Dr. Tushar Krishna (Advisor, School of Electrical and Computer Engineering, Georgia Institute of Technology)
Dr. Vivek Sarkar (School of Computer Science, Georgia Institute of Technology)
Dr. Hyesoon Kim (School of Computer Science, Georgia Institute of Technology)
Dr. Michael Pellauer (Senior Research Scientist, Architecture Research Group, NVIDIA)
——

Abstract:
——
Deep neural networks (DNNs) have emerged as an enabler of many applications, such as image classification, face recognition, and natural language processing, in which achieving high operational performance (i.e., accuracy or quality of outputs) was previously challenging. Since DNNs involve billions of multiply-and-accumulate (MAC) operations over millions of parameters, DNN accelerators, specialized hardware for DNN computation, have emerged to handle this heavy computation. However, designing a dedicated accelerator for each DNN model incurs high development costs, and an accelerator under development can quickly become outdated because DNN models and algorithms evolve rapidly. In addition, specializing a DNN accelerator for one DNN model often leads to inefficiency on other DNN models. Therefore, this proposal explores flexible DNN accelerator designs that support diverse compiler mappings (or dataflows) to adapt to new DNN models without redesigning the hardware. This proposal addresses the challenge from two perspectives: reconfigurability and heterogeneity. For the reconfigurability approaches, this proposal focuses on data movement, since the cost of data movement dominates in DNN accelerators.
We propose a lightweight network-on-chip (NoC) architecture, Microswitch NoC, specialized for DNN accelerator traffic while providing sufficient flexibility for any dataflow. We also present a reconfigurable DNN accelerator design, MAERI, which employs a reconfigurable reduction NoC that performs reduction inside the NoC switches. MAERI provides near-100% compute unit utilization for irregular DNN computations resulting from diverse layers and various optimizations (e.g., cross-layer mapping, sparsity, etc.). For the heterogeneity approach, this proposal explores heterogeneous DNN accelerators (HDAs), which contain multiple sub-accelerators with different amounts of hardware resources running different dataflows. For the HDA-based approach, this proposal presents a comprehensive HDA optimization framework, HERALD, which automatically explores two optimization opportunities: mapping each DNN layer to the sub-accelerator with the lowest energy-delay product (EDP) at run time, and choosing a proper hardware resource partitioning at design time.
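To give a sense of the computational scale the abstract refers to, the sketch below counts MAC operations for a single convolutional layer. The layer dimensions (3x3 filters, 256 input and output channels, 14x14 output) are illustrative values chosen here, not figures from the proposal.

```python
def conv_macs(R, S, C, K, P, Q):
    """MAC count for one convolutional layer: each of the
    K * P * Q output activations requires R * S * C MACs
    (filter height x filter width x input channels)."""
    return R * S * C * K * P * Q

# A ResNet-style 3x3 layer (hypothetical sizes for illustration):
macs = conv_macs(R=3, S=3, C=256, K=256, P=14, Q=14)
print(macs)  # 115605504, i.e., ~116 million MACs for a single layer
```

Summed over the dozens of layers in a modern model, such per-layer counts reach the billions of MACs per inference mentioned above, which motivates dedicated accelerator hardware.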
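The run-time half of the HERALD idea, assigning each layer to the sub-accelerator with the lowest energy-delay product, can be sketched as a simple per-layer minimization. All accelerator names and (energy, delay) numbers below are hypothetical placeholders, not results from the proposal.

```python
# Hypothetical per-layer (energy, delay) estimates for two sub-accelerators
# running different dataflows; all names and numbers are illustrative only.
costs = {
    "conv_early": {"acc_a": (2.0, 5.0), "acc_b": (3.0, 2.0)},
    "conv_late":  {"acc_a": (4.0, 1.5), "acc_b": (2.5, 4.0)},
    "fully_conn": {"acc_a": (1.0, 6.0), "acc_b": (1.5, 3.0)},
}

def assign_by_edp(layer_costs):
    """Per-layer scheduling sketch in the spirit of HERALD: send each
    layer to the sub-accelerator with the lowest energy-delay product
    (EDP = energy * delay)."""
    return {
        layer: min(per_acc, key=lambda acc: per_acc[acc][0] * per_acc[acc][1])
        for layer, per_acc in layer_costs.items()
    }

print(assign_by_edp(costs))
# {'conv_early': 'acc_b', 'conv_late': 'acc_a', 'fully_conn': 'acc_b'}
```

The design-time half, partitioning hardware resources among the sub-accelerators, would wrap a search loop around such an evaluation, which is what a framework like HERALD automates.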
- Created By: Tatianna Richardson
- Created: 11/04/2019
- Modified By: Tatianna Richardson
- Modified: 11/04/2019