Penalty Dual Decomposition Method for Nonsmooth Nonconvex Optimization—Part I: Algorithms and Convergence Analysis

Many contemporary signal processing, machine learning, and wireless communication applications can be formulated as nonconvex nonsmooth optimization problems. Efficient algorithms for these problems are often lacking, especially when the optimization variables are nonlinearly coupled in some nonconvex constraints. In this work, we propose an algorithm named penalty dual decomposition (PDD) for these difficult problems and discuss its various applications. PDD is a double-loop iterative algorithm. Its inner iteration inexactly solves a nonconvex nonsmooth augmented Lagrangian problem via block-coordinate-descent-type methods, while its outer iteration updates the dual variables and/or a penalty parameter. In Part I of this work, we describe the PDD algorithm and establish its convergence to KKT solutions. In Part II, we evaluate the performance of PDD by customizing it to three applications arising from signal processing and wireless communications.
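The double-loop structure described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the abstract: the toy problem (minimize x^2 + y^2 subject to the bilinear coupling x*y = 1), the augmented Lagrangian scaling with a 1/(2*rho) penalty term, the threshold rule that chooses between a dual update and a penalty update, and the schedules for `eps` and `eta`. The function name `pdd_toy` is hypothetical.

```python
def pdd_toy(outer_iters=40, rho=0.2, shrink=0.5):
    """Double-loop sketch on a hypothetical toy problem:
        minimize x^2 + y^2  subject to  x*y = 1.
    Assumed augmented Lagrangian (not taken from the abstract):
        L(x, y) = x^2 + y^2 + lam*(x*y - 1) + (x*y - 1)^2 / (2*rho)
    """
    x, y, lam = 2.0, 0.5, 0.0  # arbitrary starting point, zero multiplier

    for k in range(outer_iters):
        # Assumed schedules: inner accuracy and violation threshold both shrink.
        eps = max(1e-3 * 0.5 ** k, 1e-12)
        eta = 0.5 * 0.7 ** k

        # Inner loop: block coordinate descent on L, solved inexactly.
        # With the other block fixed, L is a convex quadratic in x (or y),
        # so each block update has the closed form used below.
        a = 1.0 / rho - lam
        for _ in range(200):
            x_new = a * y / (2.0 + y * y / rho)
            y_new = a * x_new / (2.0 + x_new * x_new / rho)
            step = max(abs(x_new - x), abs(y_new - y))
            x, y = x_new, y_new
            if step < eps:
                break

        # Outer loop: update the multiplier OR the penalty parameter,
        # depending on the constraint violation (assumed rule).
        h = x * y - 1.0
        if abs(h) <= eta:
            lam += h / rho   # dual ascent step
        else:
            rho *= shrink    # tighten the penalty

    return x, y, lam, rho
```

Calling `pdd_toy()` returns the final iterate, multiplier, and penalty parameter. On this toy problem the iterates should approach the KKT point x = y = 1 with a multiplier near -2, which follows from the stationarity condition 2x + lam*y = 0 at that point. The closed-form block updates are specific to this example; in general each block subproblem would need its own solver.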
