
Maximizing CNN accelerator efficiency through resource partitioning

  • Stony Brook University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

265 Scopus citations

Abstract

Convolutional neural networks (CNNs) are revolutionizing machine learning, but they present significant computational challenges. Recently, many FPGA-based accelerators have been proposed to improve the performance and efficiency of CNNs. Current approaches construct a single processor that computes the CNN layers one at a time; the processor is optimized to maximize the throughput at which the collection of layers is computed. However, this approach leads to inefficient designs because the same processor structure is used to compute CNN layers of radically varying dimensions. We present a new CNN accelerator paradigm and an accompanying automated design methodology that partitions the available FPGA resources into multiple processors, each of which is tailored for a different subset of the CNN convolutional layers. Using the same FPGA resources as a single large processor, multiple smaller specialized processors increase computational efficiency and lead to a higher overall throughput. Our design methodology achieves 3.8x higher throughput than the state-of-the-art approach on evaluating the popular AlexNet CNN on a Xilinx Virtex-7 FPGA. For the more recent SqueezeNet and GoogLeNet, the speedups are 2.2x and 2.0x.
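The mismatch the abstract describes can be illustrated with a toy cost model. The sketch below is not the paper's methodology: the layer dimensions, the array shapes, the cycle formula, and the assumption that partitioned processors work concurrently on different images are all hypothetical choices made for illustration. It models a processor as a `tn x tm` array of multipliers and counts how many cycles each layer needs.

```python
from math import ceil

def cycles(layer, tn, tm):
    """Cycles for a tn x tm multiplier array to compute one layer.

    layer = (n, m, work): n input maps, m output maps, and `work`
    multiply-accumulates per (input map, output map) pair.
    This cost model is an assumption, not taken from the paper.
    """
    n, m, work = layer
    return ceil(n / tn) * ceil(m / tm) * work

def utilization(layer, tn, tm):
    """Useful multiply-accumulates divided by multiplier-cycles available."""
    n, m, work = layer
    return (n * m * work) / (tn * tm * cycles(layer, tn, tm))

# Hypothetical layers (not from the paper): few input maps vs. many.
early = (3, 64, 1300)
late = (64, 64, 100)

# One 8 x 32 array (256 multipliers) computes both layers in sequence.
single_cycles = cycles(early, 8, 32) + cycles(late, 8, 32)

# The same 256 multipliers split into a 3 x 32 array and a 5 x 32 array.
# Assumed: they run concurrently on different images, so the slower
# array sets the steady-state time per image.
split_cycles = max(cycles(early, 3, 32), cycles(late, 5, 32))

print(utilization(early, 8, 32), utilization(early, 3, 32))
print(single_cycles, split_cycles)
```

In this toy setting the single array leaves most of its multipliers idle on the layer with three input maps, while the two smaller arrays stay close to fully busy and finish an image in fewer cycles. The specific ratio is a product of the hand-picked numbers and says nothing about the speedups reported in the abstract.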

Original language: English
Title of host publication: ISCA 2017 - 44th Annual International Symposium on Computer Architecture - Conference Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 535-547
Number of pages: 13
ISBN (Electronic): 9781450348928
State: Published - Jun 24 2017
Event: 44th Annual International Symposium on Computer Architecture - ISCA 2017 - Toronto, Canada
Duration: Jun 24 2017 to Jun 28 2017

Publication series

Name: Proceedings - International Symposium on Computer Architecture
Volume: Part F128643

Conference

Conference: 44th Annual International Symposium on Computer Architecture - ISCA 2017
Country/Territory: Canada
City: Toronto
Period: 06/24/17 to 06/28/17

Keywords

  • Accelerator
  • Convolutional Neural Network
  • FPGA
