A Twofold Lookup Table Architecture for Efficient Approximation of Activation Functions

In this article, we propose a novel approach to reducing hardware resource consumption when neural networks (NNs) are deployed on field-programmable gate array (FPGA) boards. Rather than approximating the activation functions of an NN with a classical single lookup table (LUT), the proposed solution uses a twofold LUT (t-LUT) architecture, which comprises an error-LUT (e-LUT) and a data-LUT (d-LUT), to achieve high precision and speed together with low hardware resource consumption. The efficiency of the proposed approach was tested against multiple earlier approaches. In the examined case of a hyperbolic tangent (tanh) activation function, our solution improved on the compressibility of earlier works based on single LUTs by up to 94.44%, and on that of works based on a range addressable LUT (RALUT) by up to 6.35%. Moreover, combining RALUT with our architecture improved the compressibility of the RALUT-based result by up to an additional 10.21% for the tanh activation function. The designed architecture had an initial latency of 39.721 ns, when tested with a 50-MHz clock, to simultaneously retrieve data from the d-LUT and t-LUTs.
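The abstract does not spell out how the d-LUT and e-LUT are filled or combined, so the following is only a minimal sketch of the general idea of splitting one table into a coarse table plus a narrow correction table. It assumes a base-plus-offset decomposition, which may differ from the scheme used in the article. All names and parameters (`IN_BITS`, `HI_BITS`, `OUT_BITS`, `X_MAX`, and the helper functions) are hypothetical and chosen for illustration.

```python
import math

# Hypothetical parameters, not taken from the article.
IN_BITS = 8    # width of the input address
HI_BITS = 4    # high address bits used to index the d-LUT
OUT_BITS = 8   # width of a full-precision output word
X_MAX = 4.0    # tanh is tabulated on [0, X_MAX); odd symmetry covers x < 0


def build_single_lut():
    """Classical single LUT: one full-width word per input address."""
    n = 1 << IN_BITS
    scale = (1 << OUT_BITS) - 1
    return [round(math.tanh(i * X_MAX / n) * scale) for i in range(n)]


def build_twofold(single):
    """Split the single LUT into a coarse d-LUT and a narrow e-LUT.

    Assumption: the d-LUT stores the first value of each segment and the
    e-LUT stores the non-negative offset of every entry from that value.
    """
    seg = 1 << (IN_BITS - HI_BITS)
    d_lut = [single[h * seg] for h in range(1 << HI_BITS)]
    e_lut = [single[i] - d_lut[i // seg] for i in range(len(single))]
    return d_lut, e_lut


def lookup(d_lut, e_lut, addr):
    """Read both tables for the same address and add the results."""
    return d_lut[addr >> (IN_BITS - HI_BITS)] + e_lut[addr]


def storage_bits(d_lut, e_lut):
    """Total storage: full-width d-LUT words plus narrow e-LUT words."""
    e_width = max(e_lut).bit_length()
    return len(d_lut) * OUT_BITS + len(e_lut) * e_width


single = build_single_lut()
d_lut, e_lut = build_twofold(single)
single_bits = len(single) * OUT_BITS
twofold_bits = storage_bits(d_lut, e_lut)
print(f"single LUT: {single_bits} bits, twofold sketch: {twofold_bits} bits")
```

In this sketch the two tables reproduce the single LUT exactly, and the saving comes only from the e-LUT entries needing fewer bits than a full output word. The figures it prints are specific to these toy parameters and are not comparable to the percentages reported in the abstract.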
