Stability and Performance Limits of Adaptive Primal-Dual Networks

This paper studies distributed primal-dual strategies for adaptation and learning over networks from streaming data. Two first-order methods are considered, based on the Arrow-Hurwicz (AH) and augmented Lagrangian (AL) techniques. The analysis reveals several results about the performance and stability of these strategies when employed over adaptive networks. The conclusions establish that the advantages these methods exhibit for deterministic optimization problems do not necessarily carry over to stochastic optimization problems. Specifically, they have narrower stability ranges and worse steady-state mean-square-error performance than primal methods of the consensus and diffusion type. The AH technique can also become unstable under a partial observation model, while the other techniques are able to recover the unknown under this scenario. A method to enhance the performance of AL strategies is proposed by tying the selection of the step-size to their regularization parameter. This method allows the AL algorithm to approach the performance of consensus and diffusion strategies, although it remains less stable than these other strategies.
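
For orientation, the two first-order methods named above can be sketched in their generic textbook form for an equality-constrained problem. The notation here is assumed for illustration and is not taken from the paper: f is the cost, C is a constraint matrix (for example, one that encodes agreement among neighboring agents), \mu is the step-size, and \eta is the regularization parameter. The paper's exact recursions may differ.

```latex
% Generic sketch with assumed notation: minimize f(w) subject to C w = 0.
% \mu = step-size, \eta = regularization parameter (both symbols are assumptions).
\begin{align}
\mathcal{L}(w,\lambda) &= f(w) + \lambda^{\mathsf T} C w + \frac{\eta}{2}\,\lVert C w \rVert^2, \\
% Primal step: gradient descent on the Lagrangian with respect to w
w_i &= w_{i-1} - \mu\,\bigl[\nabla_w f(w_{i-1}) + C^{\mathsf T}\lambda_{i-1} + \eta\, C^{\mathsf T} C\, w_{i-1}\bigr], \\
% Dual step: gradient ascent on the Lagrangian with respect to \lambda
\lambda_i &= \lambda_{i-1} + \mu\, C\, w_{i-1}.
\end{align}
```

In this sketch, setting \eta = 0 gives an Arrow-Hurwicz-type recursion, while \eta > 0 gives a first-order augmented Lagrangian recursion. With streaming data, the exact gradient \nabla_w f would typically be replaced by an instantaneous approximation computed from the current sample, which is where the stochastic behavior discussed in the abstract enters.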
