Coded Retransmission in Wireless Networks Via Abstract MDPs: Theory and Algorithms

Consider a transmission scheme with a single transmitter and multiple receivers over a faulty broadcast channel. For each receiver, the transmitter has a unique infinite stream of packets, and its goal is to deliver them at the highest possible throughput. While such multiple-unicast models are unsolved in general, several network-coding-based schemes have been suggested. In such schemes, the transmitter can send either an uncoded packet or a coded packet, which is a function of a few packets. A transmitted packet can be received by its designated receiver (with some probability) or overheard and stored by other receivers. Two operating modes are considered: in the first, the storage time is unlimited; in the second, it is limited by a given time-to-expire (TTE) parameter. We model the transmission process as an infinite-horizon Markov decision process (MDP). Since the large state space renders exact solutions computationally impractical, we introduce policy-restricted and induced MDPs with significantly reduced state spaces, and prove that, with a proper reward function, they have equal optimal value functions (hence equal optimal throughput). We then derive a reinforcement learning algorithm that learns the optimal policy for the induced MDP. This optimal strategy, once applied to the policy-restricted MDP, significantly improves over uncoded schemes. Next, we enhance the algorithm by analyzing the structural properties of the resulting cost functional. We demonstrate that our method scales well in the number of users and automatically adapts to packet loss rates that are unknown in advance. In addition, the performance is compared to the recent bound by Wang, which assumes much stronger coding (e.g., intrasession coding and buffering of coded packets), yet is shown to be comparable.
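To make the notion of a coded packet concrete, the following is a minimal sketch of one coded retransmission for two receivers. It assumes the coding function is a bitwise XOR of equal-length packets, which is a common choice in this literature but is not stated in the abstract; the packet contents and all names (`xor_bytes`, `stored_at_r1`, and so on) are hypothetical and chosen only for illustration.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length packets (assumed coding function)."""
    if len(a) != len(b):
        raise ValueError("packets must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical scenario: each packet missed its designated receiver
# but was overheard and stored by the other receiver.
p1 = b"for-R1"  # destined for receiver 1, overheard only by receiver 2
p2 = b"for-R2"  # destined for receiver 2, overheard only by receiver 1

stored_at_r1 = p2  # receiver 1 holds the packet meant for receiver 2
stored_at_r2 = p1  # receiver 2 holds the packet meant for receiver 1

# A single coded transmission serves both receivers at once.
coded = xor_bytes(p1, p2)

# Each receiver cancels the packet it already stores to recover its own.
decoded_at_r1 = xor_bytes(coded, stored_at_r1)
decoded_at_r2 = xor_bytes(coded, stored_at_r2)
```

In this toy case one coded transmission replaces two uncoded retransmissions, which is the kind of gain the transmitter's policy tries to exploit; whether coding or an uncoded packet is the better action in a given state depends on what each receiver has stored and on the loss rates.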
