Efficient Signal Reconstruction via Distributed Least Square Optimization on a Systolic FPGA Architecture

Optimization problems underlie a wide range of computationally challenging tasks in signal processing, machine learning, resource planning, and related fields. Among these, convex optimization, and least-squares optimization in particular, accounts for a large share, and recent iterative algorithms for solving such problems at large dimensions have gained traction. Multi-core designs with systolic or semi-systolic architectures can be a key enabler for implementing discrete dynamical systems and realizing massively scalable architectures that run such optimization algorithms. In this paper, we present a platform architecture, implemented in programmable FPGA hardware, that solves a template problem in distributed optimization: signal reconstruction from non-uniform sampling. This is a quintessential problem with widespread applications in signal processing, computational imaging, and other areas. We expect this architectural exploration to open up promising opportunities for solving the distributed optimization problems that are becoming increasingly important in real-world applications. We present the complete system design, its mapping and optimization onto an FPGA architecture, and an analysis of convergence and scalability.
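
To make the template problem concrete, the following is a minimal software sketch of signal reconstruction from non-uniform samples posed as an iterative least-squares problem. It is not the architecture or the algorithm of this paper. The signal model (a short real Fourier basis), the irregular sampling pattern, the unit step size, and the names `fourier_basis` and `reconstruct` are all assumptions made for illustration.

```python
import math


def fourier_basis(n, harmonics):
    """Orthonormal real Fourier basis on n points (assumed signal model).

    Each row holds the DC term followed by cos/sin pairs for the given
    number of harmonics; requires harmonics < n / 2.
    """
    rows = []
    for t in range(n):
        row = [1.0 / math.sqrt(n)]
        for k in range(1, harmonics + 1):
            angle = 2.0 * math.pi * k * t / n
            row.append(math.sqrt(2.0 / n) * math.cos(angle))
            row.append(math.sqrt(2.0 / n) * math.sin(angle))
        rows.append(row)
    return rows


def reconstruct(basis, sample_idx, samples, iterations=200, step=1.0):
    """Fit coefficients c by gradient descent on 0.5 * sum_i (basis[i].c - y_i)^2.

    Only the rows listed in sample_idx are observed. Because the full basis
    is orthonormal, the observed sub-matrix has singular values <= 1, so a
    step of 1.0 keeps the iteration stable.
    """
    k = len(basis[0])
    c = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        for i, y in zip(sample_idx, samples):
            residual = sum(b * cj for b, cj in zip(basis[i], c)) - y
            for j in range(k):
                grad[j] += basis[i][j] * residual
        c = [cj - step * g for cj, g in zip(c, grad)]
    # Evaluate the fitted model on the full grid, including unobserved points.
    return [sum(b * cj for b, cj in zip(row, c)) for row in basis]


# Hypothetical test signal: 64 points, two harmonics, arbitrary coefficients.
n = 64
basis = fourier_basis(n, harmonics=2)
true_c = [1.0, 0.5, -0.3, 0.2, 0.4]
signal = [sum(b * cj for b, cj in zip(row, true_c)) for row in basis]

# Irregular sampling pattern (an assumption): drop two interleaved index sets.
kept = [t for t in range(n) if t % 5 != 2 and t % 7 != 3]
estimate = reconstruct(basis, kept, [signal[t] for t in kept])
max_error = max(abs(a - b) for a, b in zip(signal, estimate))
```

Each iteration reduces to multiply-accumulate operations over the observed samples, which is the kind of regular, local computation that systolic arrays are generally suited to. How the work is actually partitioned across processing elements in the presented design is not described in this abstract.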
