Online and statistical learning in networks

Shahin Shahrampour, Ali Jadbabaie, Alexander Rakhlin

Learning, prediction, and identification have been central topics in science and engineering for many years. Common to all these problems is an agent that receives data and uses it for prediction and identification. The agent might process the data individually or interact with a network of agents. The goal of this thesis is to address problems at the interface of statistical data processing, online learning, and network science, with a focus on developing distributed algorithms. These problems have widespread applications in several domains of systems engineering and computer science.

Whether acting individually or in a group, the agent's main task is to understand how to treat data in order to infer the unknown parameters of the problem. To this end, the first part of this thesis addresses statistical processing of data. We start with the problem of distributed detection in multi-agent networks. In contrast to the existing literature, which focuses on asymptotic learning, we provide a finite-time analysis using a notion of Kullback-Leibler cost. We derive bounds on the cost in terms of the network size, the spectral gap, and the relative entropy of the data distribution. Next, we turn to an inverse-type problem in which the network structure is unknown and the outputs of a dynamics (e.g., consensus dynamics) are given. We propose several network reconstruction algorithms that measure the network response to inputs. Our algorithm reconstructs the Boolean structure (i.e., existence and directions of links) of a directed network from a series of dynamical responses.

The second part of the thesis centers on online learning, where data is received sequentially. As an example of collaborative learning, we consider the stochastic multi-armed bandit problem in a multi-player network. Players explore a pool of arms with payoffs generated from player-dependent distributions. On pulling an arm, each player observes only a noisy payoff of the chosen arm. The goal is to maximize a global welfare or to find the best global arm. Hence, players exchange information locally to benefit from side observations. We develop a distributed online algorithm with logarithmic regret with respect to the best global arm, and generalize our results to the case where the availability of arms varies over time.

We then return to individual online learning, where one learner plays against an adversary. We develop a fully adaptive algorithm that takes advantage of regularity in the sequence of observations, retains worst-case performance guarantees, and performs well against complex benchmarks. Our method competes with dynamic benchmarks, with a regret guarantee that scales with the regularity of the sequence of cost functions and comparators. Notably, the regret bound adapts to the smaller complexity measure in the problem environment.
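To make the stochastic multi-armed bandit setting concrete, the following is a minimal sketch of the standard single-player UCB1 index policy. It is not the distributed algorithm developed in the thesis. The Bernoulli payoff model, the function name `ucb1`, and the arm means are assumptions chosen purely for illustration.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Single-player UCB1 on Bernoulli arms (illustrative assumption).

    Returns the pull count per arm and the cumulative pseudo-regret
    with respect to the best arm.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of pulls per arm
    sums = [0.0] * n_arms   # cumulative observed payoff per arm
    best_mean = max(arm_means)
    regret = 0.0

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Pull every arm once so each empirical mean is defined.
            arm = t - 1
        else:
            # Pick the arm with the largest empirical mean plus confidence bonus.
            arm = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        # The player only observes a noisy payoff of the chosen arm.
        payoff = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += payoff
        regret += best_mean - arm_means[arm]

    return counts, regret


counts, regret = ucb1([0.2, 0.5, 0.8], horizon=5000)
print(counts, round(regret, 1))
```

In this sketch the pseudo-regret grows only through pulls of suboptimal arms, which is the quantity that a logarithmic regret guarantee controls. A networked variant would additionally have players exchange estimates with their neighbors, which is not shown here.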
