Paper information - The power of online learning in stochastic network optimization

In this paper, we investigate the power of online learning in stochastic network optimization when the system statistics are unknown a priori. We are interested in understanding how information and learning can be efficiently incorporated into system control techniques, and what the fundamental benefits of doing so are. We propose two Online Learning-Aided Control techniques, OLAC and OLAC2, that explicitly use past system information in current system control via a learning procedure called dual learning. We prove strong performance guarantees for the proposed algorithms: OLAC and OLAC2 achieve the near-optimal [O(ε), O([log(1/ε)]^2)] utility-delay tradeoff, and OLAC2 has an O(ε^(-2/3)) convergence time. Simulation results also confirm the superior performance of the proposed algorithms in practice. To the best of our knowledge, OLAC and OLAC2 are the first algorithms that simultaneously possess an explicit near-optimal delay guarantee and sub-linear convergence time, and our attempt is the first to explicitly incorporate online learning into stochastic network optimization and to demonstrate its power in both theory and practice.
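To give a feel for the orders quoted in the abstract, the following minimal sketch evaluates them for a few values of ε. It is only an illustration of the scaling, not the OLAC or OLAC2 algorithm: the function name `tradeoff_scaling` is hypothetical, and all constants hidden by the O(·) notation are assumed to be 1.

```python
import math


def tradeoff_scaling(eps):
    """Evaluate the orders from the abstract for a given eps.

    Assumption: every constant hidden by the O(.) notation is set to 1,
    so the numbers show growth rates only, not actual performance.
    """
    utility_gap = eps                        # O(eps) utility gap
    delay = math.log(1.0 / eps) ** 2         # O([log(1/eps)]^2) delay
    convergence = eps ** (-2.0 / 3.0)        # O(eps^(-2/3)) convergence time
    return utility_gap, delay, convergence


if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        gap, delay, conv = tradeoff_scaling(eps)
        print(f"eps={eps:g}  gap~{gap:g}  delay~{delay:.1f}  "
              f"convergence~{conv:.1f}  1/eps={1.0 / eps:g}")
```

Under these assumptions, both the delay order and the convergence-time order grow more slowly than 1/ε as ε shrinks, which is the sense in which the convergence time is sub-linear.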

Longbo Huang | Xiaohong Hao | Xin Liu
