A set of novel continuous action-set reinforcement learning automata models to optimize continuous functions

Learning automata (LA), a powerful reinforcement learning tool within the field of artificial intelligence, can search adaptively for the optimal state in a random environment. Over the past decades, quite a few finite action-set learning automata (FALA) algorithms have matured, but they expose critical defects when applied to the optimization of continuous functions. To overcome these shortcomings and obtain a higher-performance LA, we propose a novel continuous action-set learning automata (CALA) algorithm that solves function optimization problems through one kind of LA prototype, namely the continuous action-set reinforcement learning automaton, abbreviated as CARLA. The key mechanism of the proposed algorithm is a combination of equidistant discretization and linear interpolation. Specifically, four categories of application models are constructed. Two of them obtain continuous actions when the prior information is finite, thus avoiding the drawbacks of FALA; they rely, respectively, on the cumulative distribution function (CDF) and on a new concept, the area surrounded by curves (AsbC). The other two models are modified versions that balance the trade-off between accuracy and speed. Moreover, these models are extended to generalized versions so that multidimensional function optimization problems can be handled as well. Extensive experiments, covering four benchmarks and three scenarios, are designed to demonstrate the effectiveness and efficiency of the proposed application models. The proposed algorithm outperforms state-of-the-art LA and optimization algorithms, with a high accuracy rate, fast convergence, and competitive running time, especially in noisy environments.
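The abstract does not spell out how the CDF-based model works, so the following is only a minimal sketch of the general idea of combining equidistant discretization with linear interpolation, not the authors' algorithm. It assumes the finite prior information is a probability for each of a fixed number of equal-width cells, and that a continuous action is drawn by inverting the piecewise-linear CDF. The names `equidistant_grid`, `cell_cdf`, and `continuous_action` are hypothetical.

```python
import bisect
import random


def equidistant_grid(lo, hi, n_cells):
    """Split [lo, hi] into n_cells equal cells; return the n_cells + 1 edges."""
    step = (hi - lo) / n_cells
    return [lo + i * step for i in range(n_cells)] + [hi]


def cell_cdf(probs):
    """CDF values at the cell edges, starting at 0.0 and ending at 1.0."""
    total = sum(probs)
    cdf, acc = [0.0], 0.0
    for p in probs:
        acc += p
        cdf.append(acc / total)
    cdf[-1] = 1.0  # guard against floating-point drift
    return cdf


def continuous_action(edges, cdf, u):
    """Invert the piecewise-linear CDF at u in [0, 1) (assumed model)."""
    k = bisect.bisect_right(cdf, u) - 1  # last edge with cdf[k] <= u
    # Linear interpolation inside cell k; cdf[k + 1] > u, so no zero division.
    frac = (u - cdf[k]) / (cdf[k + 1] - cdf[k])
    return edges[k] + frac * (edges[k + 1] - edges[k])


# Illustrative values only: four cells on [0, 1] with made-up probabilities.
edges = equidistant_grid(0.0, 1.0, 4)
cdf = cell_cdf([0.1, 0.2, 0.3, 0.4])
action = continuous_action(edges, cdf, random.random())
```

Under these assumptions the automaton stores only a finite probability vector, yet every point of the interval can be selected, which is the kind of property the abstract contrasts with FALA. How the probabilities are updated from the environment's feedback is not described in the abstract and is left out here.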
