An Online Method for Opportunistic Task Replications

We study the online optimization of redundant copies (replicas) of computing tasks. The method helps overcome the limitations of cloud and edge computing for mobile applications, where the shared use of heterogeneous, distributed processors often yields unpredictable execution times. The aim of replication is to obtain the earliest response and discard the rest. Without careful control of the replication process, however, redundant executions can cause excessive processor contention, with undesired effects. The current state of practice assumes homogeneous processors, relies on heuristics, and presumes perfect knowledge of future task runtimes, all of which limit the scope of application of the idea. Through reinforcement learning, the proposed method autonomously discovers how to optimally select, for each new request, both the number of replicas and their processor assignment. The method can operate effectively without full knowledge of system features and regardless of changes in the system state. An extensive simulation study over different scenarios confirms the performance of the proposal and offers quantitative insight into its advantages and limitations. Diverse mobile applications stand to benefit from this approach as computer and communication networks become larger and more diverse, and as mobile applications demand increasingly higher reliability and lower latency.
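The selection problem described above can be illustrated with a small sketch. The abstract does not specify the learning algorithm, so this example substitutes a simple epsilon-greedy bandit, which is an assumption and not the authors' method. Each action is a set of processors that receive a replica, so the action fixes both the number of replicas and their assignment. The environment model (exponential service times and a `contention` slowdown factor) and all function names are hypothetical and serve only to make the sketch runnable.

```python
import itertools
import random

def all_replica_sets(n_processors):
    """Every non-empty subset of processors is one candidate action."""
    procs = range(n_processors)
    return [c for k in range(1, n_processors + 1)
            for c in itertools.combinations(procs, k)]

def simulate_latency(action, mean_times, contention, rng):
    """Hypothetical environment: the earliest replica wins, and every
    extra replica slows all replicas down by a contention factor."""
    slowdown = 1.0 + contention * (len(action) - 1)
    return min(rng.expovariate(1.0 / (mean_times[p] * slowdown))
               for p in action)

def learn_replication_policy(mean_times, contention=0.5, steps=5000,
                             epsilon=0.1, seed=0):
    """Epsilon-greedy learning of the replica set with the lowest
    observed mean latency. The learner never reads mean_times directly;
    it only observes the latency of each request it dispatches."""
    rng = random.Random(seed)
    actions = all_replica_sets(len(mean_times))
    avg_latency = {a: 0.0 for a in actions}
    counts = {a: 0 for a in actions}
    for _ in range(steps):
        untried = [a for a in actions if counts[a] == 0]
        if untried:
            action = untried[0]                       # try each action once
        elif rng.random() < epsilon:
            action = rng.choice(actions)              # explore
        else:
            action = min(actions, key=avg_latency.get)  # exploit
        latency = simulate_latency(action, mean_times, contention, rng)
        counts[action] += 1
        # Incremental update of the running mean latency for this action.
        avg_latency[action] += (latency - avg_latency[action]) / counts[action]
    best = min(actions, key=avg_latency.get)
    return best, avg_latency

if __name__ == "__main__":
    best, estimates = learn_replication_policy([1.0, 2.0, 4.0])
    print("chosen processors:", best)
    print("estimated mean latency:", round(estimates[best], 3))
```

Under this toy model, raising `contention` makes larger replica sets less attractive, which mirrors the trade-off the abstract describes between earlier responses and processor contention. Enumerating all subsets grows exponentially with the number of processors, so the sketch is only practical for small systems.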
