OPTIMO: A 65-nm 279-GOPS/W 16-b Programmable Spatial-Array Processor with On-Chip Network for Solving Distributed Optimizations via the Alternating Direction Method of Multipliers

This article presents OPTIMO, a 65-nm, 16-b, fully programmable spatial-array processor with 49 cores and a hierarchical multicast network for solving distributed optimizations via the alternating direction method of multipliers (ADMM). ADMM is a projection-based method for solving generic constrained optimization problems. In essence, it decomposes the decision vector into subvectors, updates them sequentially by minimizing an augmented Lagrangian function, and then updates the Lagrange multiplier. The ADMM algorithm has typically been used for problems in which the decision variable is decomposed into two or more subvectors. We demonstrate six template algorithms and their applications and measure a peak energy efficiency of 279 GOPS/W.
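The ADMM update pattern described above (sequential subvector updates followed by a multiplier update) can be illustrated with a minimal software sketch. The example below is not one of OPTIMO's six template algorithms and says nothing about the hardware mapping. It assumes a toy two-block problem, minimize (1/2)||x - a||^2 + lam*||z||_1 subject to x = z, written in the scaled form where u is the scaled Lagrange multiplier. The function names and the parameters `lam`, `rho`, and `iters` are hypothetical choices made for illustration.

```python
def soft_threshold(v, kappa):
    """Proximal operator of kappa * |.| applied to one scalar."""
    if v > kappa:
        return v - kappa
    if v < -kappa:
        return v + kappa
    return 0.0


def admm_l1_denoise(a, lam=1.0, rho=1.0, iters=100):
    """Two-block scaled-form ADMM for
        minimize 0.5 * ||x - a||^2 + lam * ||z||_1  subject to  x = z.
    Illustrative toy problem (an assumption, not taken from the article).
    """
    n = len(a)
    x = [0.0] * n  # first subvector
    z = [0.0] * n  # second subvector
    u = [0.0] * n  # scaled Lagrange multiplier
    for _ in range(iters):
        # x-update: minimize the augmented Lagrangian over x (closed form).
        x = [(a[i] + rho * (z[i] - u[i])) / (1.0 + rho) for i in range(n)]
        # z-update: minimize over z, which reduces to soft-thresholding.
        z = [soft_threshold(x[i] + u[i], lam / rho) for i in range(n)]
        # Multiplier update: accumulate the constraint residual x - z.
        u = [u[i] + x[i] - z[i] for i in range(n)]
    return x, z, u


if __name__ == "__main__":
    x, z, u = admm_l1_denoise([3.0, -0.5, -2.0], lam=1.0)
    print(z)
```

For this toy problem the exact minimizer is the soft-thresholded input, so the iterates can be checked against a known answer. In a distributed setting each subvector update would run on a separate processing element, with only the shared variables exchanged between them.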
