RSQP: Problem-specific Architectural Customization for Accelerated Convex Quadratic Optimization

Convex optimization is at the heart of many performance-critical applications across a wide range of domains. Although many high-performance hardware accelerators have been developed for specific optimization problems, designing such an accelerator is a challenging task, and the resulting computing architecture is often so specific to the targeted application that it can hardly be reused even in a related application within the same domain. To accelerate general-purpose optimization solvers that must operate on diverse user input at run time, an ideal hardware solver should adapt dynamically to the provided optimization problem while achieving high performance and power efficiency. This work presents RSQP, a hardware-accelerated general-purpose quadratic program solver with reconfigurable functional units and a reconfigurable datapath that facilitate problem-specific customization. RSQP uses a string-based encoding to describe the problem structure with fine granularity. Based on this encoding, functional units and a datapath customized to the sparsity pattern of the problem are created by solving a dictionary-based lossless string compression problem and a mixed-integer linear program, respectively. RSQP has been integrated to accelerate the general-purpose quadratic programming solver OSQP and has been tested on an extensive benchmark of 120 optimization problems from 6 application domains. Through architectural customization, RSQP achieves up to 7× performance improvement over its baseline generic design. Furthermore, compared with a CPU implementation and a GPU-accelerated implementation, RSQP achieves up to 31.2× and 6.9× end-to-end speedup on these benchmark problems, respectively. Finally, the FPGA accelerator operates at up to 6.6× lower dynamic power consumption and up to 22.7× higher power efficiency than the GPU implementation, making it an attractive solution for power-conscious datacenter applications.
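
For context, the problem class handled by OSQP, the solver that RSQP accelerates, is the convex quadratic program. A sketch of its standard form follows; the symbols P, q, A, l, u, n, and m do not appear in the abstract and are introduced here only for illustration.

```latex
\begin{aligned}
\underset{x \in \mathbb{R}^{n}}{\text{minimize}} \quad & \tfrac{1}{2}\, x^{\top} P x + q^{\top} x \\
\text{subject to} \quad & l \le A x \le u,
\end{aligned}
\qquad P \in \mathbb{S}^{n}_{+},\; A \in \mathbb{R}^{m \times n},\; q \in \mathbb{R}^{n},\; l, u \in \mathbb{R}^{m}.
```

In this form the problem data are the matrices P and A and the vectors q, l, and u. The sparsity pattern mentioned in the abstract presumably refers to the nonzero structure of such problem matrices, although the abstract does not say which matrices are encoded.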
