The power of online learning in stochastic network optimization

In this paper, we investigate the power of online learning in stochastic network optimization when system statistics are unknown <i>a priori</i>. We are interested in understanding how information and learning can be efficiently incorporated into system control techniques, and what the fundamental benefits of doing so are. We propose two <i>Online Learning-Aided Control</i> techniques, <b>OLAC</b> and <b>OLAC2</b>, that explicitly utilize past system information in current system control via a learning procedure called <i>dual learning</i>. We prove strong performance guarantees for the proposed algorithms: <b>OLAC</b> and <b>OLAC2</b> achieve the near-optimal [<i>O</i>(ε), <i>O</i>([log(1/ε)]<sup>2</sup>)] utility-delay tradeoff, and <b>OLAC2</b> possesses an <i>O</i>(ε<sup>-2/3</sup>) convergence time. Simulation results confirm the superior performance of the proposed algorithms in practice. To the best of our knowledge, <b>OLAC</b> and <b>OLAC2</b> are the first algorithms that simultaneously possess an explicit near-optimal delay guarantee and sub-linear convergence time, and our work is the first to explicitly incorporate online learning into stochastic network optimization and to demonstrate its power in both theory and practice.
