Ph.D. Proposal Oral Exam - Foroozan Karimzadeh

Title: Hardware-Friendly Model Compression for DNN Accelerators

Committee: 

Dr. Raychowdhury, Advisor    

Dr. Yu, Chair

Dr. Romberg

Abstract: The objective of the proposed research is to make powerful deep neural network (DNN) algorithms deployable on edge devices by developing hardware-aware DNN compression methods. The rising popularity of intelligent mobile devices and the computational cost of deep learning-based models call for efficient and accurate on-device inference schemes. In particular, we propose two compression techniques. The first, LGPS, is a hardware-aware pruning method in which the locations of non-zero weights are derived in real time from a linear-feedback shift register (LFSR). Using the proposed method, we demonstrate total energy and area savings of up to 63.96% and 64.23% for the VGG-16 network on down-sampled ImageNet, for iso-compression-rate and iso-accuracy, respectively. Second, we propose a novel model compression scheme that allows inference to be carried out using bit-level sparsity, which can be efficiently implemented with in-memory computing macros. We introduce a method called BitS-Net that leverages bit-sparsity (where zeros outnumber ones in the binary representation of weight/activation values) in Compute-In-Memory (CIM) with Resistive Random-Access Memory (RRAM) to build energy-efficient DNN accelerators operating in inference mode. We demonstrate that BitS-Net improves energy efficiency by up to 5x for ResNet models on the ImageNet dataset.
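To make the LFSR-derived indexing concrete, the Python sketch below regenerates the kept weight positions from an LFSR state sequence instead of storing an index list; the seed, tap polynomial, bit width, and modulo index mapping are illustrative assumptions, not the actual LGPS hardware design.

    import numpy as np

    def lfsr_sequence(seed, taps, width, length):
        # Fibonacci LFSR over `width` bits: XOR the tap bits to form the
        # feedback bit, shift it in, and record each successive state.
        state = seed
        out = []
        for _ in range(length):
            feedback = 0
            for t in taps:
                feedback ^= (state >> t) & 1
            state = ((state << 1) | feedback) & ((1 << width) - 1)
            out.append(state)
        return out

    # Keep weights only at LFSR-derived positions (illustrative 4x pruning).
    # The accelerator can replay the same sequence at inference time, so no
    # index list needs to be stored alongside the compressed weights.
    weights = np.random.randn(256)
    states = lfsr_sequence(seed=0b10110001, taps=(7, 5, 4, 3), width=8, length=64)
    keep = np.isin(np.arange(weights.size), [s % weights.size for s in states])
    pruned = np.where(keep, weights, 0.0)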
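The bit-sparsity premise behind BitS-Net can be illustrated in the same spirit. The sketch below assumes unsigned 8-bit quantization (the actual BitS-Net encoding may differ) and measures the fraction of zero bits, which is the quantity a CIM/RRAM macro can exploit because a stored zero bit contributes no bitline current during a multiply-accumulate.

    import numpy as np

    def bit_sparsity(values, bits=8):
        # Fraction of zero bits in an unsigned fixed-point encoding of `values`.
        q = np.clip(np.round(values), 0, 2**bits - 1).astype(np.uint64)
        ones = sum(bin(int(v)).count("1") for v in q)
        return 1.0 - ones / (q.size * bits)

    # ReLU-style activations are dominated by small magnitudes, so their
    # binary representations contain far more zeros than ones.
    acts = np.random.exponential(scale=8.0, size=1024)
    print(f"bit sparsity: {bit_sparsity(acts):.2%}")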
