Fast convolutional neural networks on FPGAs with hls4ml
Journal article, 2021

We introduce an automated tool for deploying ultra-low-latency, low-power deep neural networks with convolutional layers on field-programmable gate arrays (FPGAs). By extending the hls4ml library, we demonstrate an inference latency of 5 µs using convolutional architectures, targeting microsecond-latency applications like those at the CERN Large Hadron Collider. Considering benchmark models trained on the Street View House Numbers dataset, we demonstrate various methods for model compression in order to fit the computational constraints of a typical FPGA device used in trigger and data acquisition systems of particle detectors. In particular, we discuss pruning and quantization-aware training, and demonstrate how resource utilization can be significantly reduced with little to no loss in model accuracy. We show that the FPGA critical resource consumption can be reduced by 97% with zero loss in model accuracy, and by 99% when tolerating a 6% accuracy degradation.
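As a rough illustration of the two compression ideas named in the abstract, the sketch below applies magnitude pruning and signed fixed-point rounding to a plain list of weights. It is not the authors' implementation: the abstract describes pruning and quantization applied during training, whereas this sketch applies them after the fact. The function names, the `sparsity` parameter, the `(total_bits, int_bits)` convention, and the choice of round-to-nearest with saturation are all assumptions made for the example.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by ascending |w|; the first n_prune entries are zeroed.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_fixed_point(weights, total_bits, int_bits):
    """Round weights to a signed fixed-point grid.

    Assumed convention: `total_bits` is the full word width and `int_bits`
    counts the integer bits including the sign bit. Values are rounded to
    the nearest grid point and saturated at the representable range.
    """
    frac_bits = total_bits - int_bits
    scale = 2 ** frac_bits
    q_min = -(2 ** (total_bits - 1))
    q_max = 2 ** (total_bits - 1) - 1
    quantized = []
    for w in weights:
        q = round(w * scale)            # nearest integer code
        q = max(q_min, min(q_max, q))   # saturate instead of wrapping
        quantized.append(q / scale)
    return quantized


# Hypothetical weights, chosen only for illustration.
weights = [0.8, -0.05, 0.3, -0.6, 0.01, 0.2]
pruned = magnitude_prune(weights, sparsity=0.5)
compressed = quantize_fixed_point(pruned, total_bits=6, int_bits=2)
print(compressed)  # [0.8125, 0.0, 0.3125, -0.625, 0.0, 0.0]
```

On an FPGA, a zeroed weight needs no multiplier at all and a narrower word needs a smaller one, which is the intuition behind the resource reductions reported above.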

FPGA

convolutional neural network

deep learning

Authors

Thea Aarrestad

CERN

Vladimir Loncar

CERN

Univerzitet u Beogradu

Nicolo Ghielmetti

Politecnico di Milano

CERN

Maurizio Pierini

CERN

Sioni Summers

CERN

Jennifer Ngadiuba

California Institute of Technology (Caltech)

Christoffer Petersson

Chalmers, Mathematical Sciences, Algebra and Geometry

Zenseact AB

Hampus Linander

Zenseact AB

Yutaro Iiyama

University of Tokyo

Giuseppe Di Guglielmo

Columbia University

Javier Duarte

University of California

Philip Harris

Massachusetts Institute of Technology (MIT)

Dylan Rankin

Massachusetts Institute of Technology (MIT)

Sergo Jindariani

Fermi National Accelerator Laboratory

Kevin Pedro

Fermi National Accelerator Laboratory

Nhan Tran

Fermi National Accelerator Laboratory

Mia Liu

Purdue University

Edward Kreinar

HawkEye 360

Zhenbin Wu

University of Illinois

Duc Hoang

Rhodes College

Machine Learning: Science and Technology

2632-2153 (eISSN)

Vol. 2, Issue 4, 045015

Subject categories

Computer Engineering

Bioinformatics (Computational Biology)

Embedded Systems

DOI

10.1088/2632-2153/ac0ea1

More information

Last updated

2023-04-21