Multiplierless and Sparse Machine Learning based on Margin Propagation Networks

The new generation of machine learning (ML) processors has evolved from multi-core and parallel architectures (for example, graphics processing units) that were designed to efficiently implement matrix-vector multiplications (MVMs). This is because, at the fundamental level, neural network and ML operations make extensive use of MVMs, and hardware compilers exploit the inherent parallelism of MVMs to achieve hardware acceleration on GPUs, TPUs and FPGAs. A natural question to ask is whether MVM operations are even necessary to implement ML algorithms, and whether simpler hardware primitives can be used to implement an ultra-energy-efficient ML processor or architecture. In this paper we propose an alternative hardware-software codesign of ML and neural network architectures in which, instead of MVM operations and non-linear activation functions, the architecture uses only simple addition and thresholding operations to implement inference and learning. At the core of the proposed approach is margin-propagation-based computation, which maps multiplications into additions and additions into dynamic rectifying-linear-unit (ReLU) operations. This mapping results in a significant reduction in computational cost and hence in energy cost. The training of a margin-propagation (MP) network involves optimizing an $L_1$ cost function, which, in conjunction with the ReLU operations, leads to network sparsity and to weight updates that use only Boolean predicates. We show how the MP network formulation can be applied to the design of linear classifiers, multi-layer perceptrons and support vector networks.
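
The abstract does not spell out the MP operation itself. As a minimal sketch, assume the reverse water-filling form commonly used in the margin-propagation literature: given scores $x_i$ and a hyperparameter $\gamma > 0$, find the threshold $z$ such that $\sum_i \max(0, x_i - z) = \gamma$. The function names, the bisection solver and the `mp_unit` helper below are illustrative assumptions, not the authors' implementation. The solver uses only additions, comparisons and halving.

```python
def relu_sum(scores, z):
    """Sum of the rectified margins max(0, x - z) over all scores."""
    return sum(x - z for x in scores if x > z)


def margin_propagation(scores, gamma, iterations=40):
    """Approximate z such that sum_i max(0, x_i - z) == gamma.

    Assumed reverse water-filling form; solved by bisection, which needs
    only additions, comparisons and halving.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    hi = max(scores)   # rectified sum is 0 here, i.e. below gamma
    lo = hi - gamma    # rectified sum is at least gamma here
    for _ in range(iterations):
        mid = lo + (hi - lo) / 2
        if relu_sum(scores, mid) > gamma:
            lo = mid   # threshold too low: too much margin passes
        else:
            hi = mid   # threshold high enough: tighten from above
    return lo + (hi - lo) / 2


def mp_unit(weights, inputs, gamma):
    """Hypothetical multiplierless unit: each product w_i * x_i is replaced
    by the sum w_i + x_i (log-domain view), and the accumulation is replaced
    by the MP threshold."""
    return margin_propagation([w + x for w, x in zip(weights, inputs)], gamma)


if __name__ == "__main__":
    print(margin_propagation([3.0, 1.0, 0.0], gamma=1.0))
    print(mp_unit([1.0, 0.0], [2.0, 1.0], gamma=1.0))
```

In this sketch only the scores above the threshold $z$ contribute to the result, which is one way to see why the rectification can yield sparse activity; the differential encoding of signed quantities and the training rule are not covered here.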
