Paper Information - Kibo: An Open-Source Fixed-Point Tool-kit for Training and Inference in FPGA-Based Deep Learning Networks

Kibo: An Open-Source Fixed-Point Tool-kit for Training and Inference in FPGA-Based Deep Learning Networks

Field-Programmable Gate Arrays (FPGAs) have become an essential component of the deep learning landscape, providing a balance between flexibility, customization, and efficiency. One of the key optimizations afforded by FPGA technology is the ability to customize the bit width of fixed-point weights and activations within deep learning networks. In this paper, we present an open-source tool-kit that allows a researcher to investigate different fixed-point representations and saturating arithmetic operations in Python. The tool-kit overrides arithmetic and comparison functions commonly used in deep learning structures, allowing a researcher to quickly evaluate the impact of alternative numeric representations. Compared to higher-level frameworks such as TensorFlow or PyTorch, a much wider set of numeric precisions can be modeled. Unlike lower-level C-synthesis tools, our tool-kit is written in Python, which makes it possible to explore architectural alternatives much more rapidly. Our framework is open-source and is available online.
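
To make the idea of overriding arithmetic and comparison operators with saturating fixed-point behavior concrete, here is a minimal sketch in Python. It is not the tool-kit's actual API: the class name `Fixed`, its fields, and the default format of 4 integer bits (including sign) and 4 fractional bits are all assumptions chosen for illustration, and both operands are assumed to share the same format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fixed:
    """Hypothetical signed fixed-point value (illustrative, not Kibo's API)."""
    raw: int            # stored integer; real value = raw / 2**frac_bits
    int_bits: int = 4   # assumed integer bits, including the sign bit
    frac_bits: int = 4  # assumed fractional bits

    @staticmethod
    def _saturate(raw, int_bits, frac_bits):
        # Clamp to the representable two's-complement range instead of wrapping.
        total = int_bits + frac_bits
        lo, hi = -(1 << (total - 1)), (1 << (total - 1)) - 1
        return max(lo, min(hi, raw))

    @classmethod
    def from_float(cls, x, int_bits=4, frac_bits=4):
        # Quantize using Python's round() (round half to even), then saturate.
        raw = round(x * (1 << frac_bits))
        return cls(cls._saturate(raw, int_bits, frac_bits), int_bits, frac_bits)

    def _wrap(self, raw):
        # Build a new value in this operand's format, saturating the result.
        return Fixed(self._saturate(raw, self.int_bits, self.frac_bits),
                     self.int_bits, self.frac_bits)

    # Overridden arithmetic: operands are assumed to share one format.
    def __add__(self, other):
        return self._wrap(self.raw + other.raw)

    def __sub__(self, other):
        return self._wrap(self.raw - other.raw)

    def __mul__(self, other):
        # The product carries 2*frac_bits fractional bits; shift back down.
        # The arithmetic right shift truncates toward negative infinity.
        return self._wrap((self.raw * other.raw) >> self.frac_bits)

    # Overridden comparison on the stored integers.
    def __lt__(self, other):
        return self.raw < other.raw

    def to_float(self):
        return self.raw / (1 << self.frac_bits)


a = Fixed.from_float(7.5)
b = Fixed.from_float(1.0)
print((a + b).to_float())  # 7.9375: the sum saturates at the largest value
```

With an 8-bit format the largest representable value is 127/16 = 7.9375, so `7.5 + 1.0` clamps there rather than wrapping to a negative number. Changing `int_bits` and `frac_bits` is the kind of bit-width exploration the abstract describes, although the real tool-kit may expose it quite differently.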

Philip Heng Wai Leong | Steven J. E. Wilton | Daniel Holanda Noronha
