RLProph: a dynamic programming based reinforcement learning approach for optimal routing in opportunistic IoT networks

Routing in Opportunistic Internet of Things networks (OppIoTs) is challenging because connectivity between devices is intermittent and no fixed path exists between the source and destination of a message. Recently, machine learning (ML) and reinforcement learning (RL) have been used with great success to automate processes across many problem domains. In this paper, we seek to fully automate the OppIoT routing process by using the Policy Iteration algorithm to maximize the probability of message delivery. To this end, we model the OppIoT environment as a Markov decision process (MDP) defined by states, actions, rewards, and transition probabilities. The proposed routing protocol, RLProph, optimizes the routing process by following the optimal policy obtained from solving this MDP with Policy Iteration. Through extensive simulations, we show that RLProph outperforms a number of ML-based and context-aware routing protocols on multiple performance criteria.
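To make the Policy Iteration step concrete, the following is a minimal sketch on a toy three-state MDP. It is not the RLProph formulation: the states, the `KEEP` and `FORWARD` actions, the transition probabilities, the rewards, and the discount factor `GAMMA` are all hypothetical values chosen for illustration only.

```python
# Policy Iteration on a toy MDP (all numbers are illustrative assumptions).
# States: 0 = message at source, 1 = message at relay, 2 = delivered (absorbing).
GAMMA = 0.9
KEEP, FORWARD = 0, 1

# P[state][action] -> list of (probability, next_state, reward)
P = {
    0: {KEEP: [(1.0, 0, 0.0)], FORWARD: [(0.6, 1, 0.0), (0.4, 0, 0.0)]},
    1: {KEEP: [(1.0, 1, 0.0)], FORWARD: [(0.8, 2, 1.0), (0.2, 1, 0.0)]},
    2: {KEEP: [(1.0, 2, 0.0)], FORWARD: [(1.0, 2, 0.0)]},
}


def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def evaluate_policy(P, policy, gamma, theta=1e-10):
    """Iterative policy evaluation: sweep until the largest update is below theta."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            v_new = q_value(P, V, s, policy[s], gamma)
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < theta:
            return V


def policy_iteration(P, gamma=GAMMA):
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = {s: KEEP for s in P}  # arbitrary initial policy
    while True:
        V = evaluate_policy(P, policy, gamma)
        stable = True
        for s in P:
            q = {a: q_value(P, V, s, a, gamma) for a in P[s]}
            best = max(q, key=q.get)
            # Switch only on a strict improvement, so ties cannot cause cycling.
            if q[best] > q[policy[s]] + 1e-12:
                policy[s] = best
                stable = False
        if stable:
            return policy, V


policy, V = policy_iteration(P)
```

In this toy setting the resulting policy forwards the message from both the source and the relay, since forwarding is the only action that can reach the rewarded delivery state.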
