An Adaptive Strategy Selection Method With Reinforcement Learning for Robotic Soccer Games

Robotic soccer games, which have become popular, require timely and precise decision-making in a dynamic environment. To cope with complex, critical situations, the decision policy in robotic soccer games must improve over time. This paper proposes an adaptive decision-making method that uses reinforcement learning (RL). The decision-making system for a robotic soccer game is composed of two subsystems: the first evaluates the current situation, and the second implements the decision-making policy. Inspired by the support vector machine (SVM), a situation classification method, called an improved SVM, embeds a decision tree structure and handles large-scale and multiclass classification problems simultaneously. Once the situations collected in the field have been classified and grouped into the tree structure, the selection of a local strategy for each class of situations over time is treated as an RL problem and solved using Q-learning. The results of simulations and experiments demonstrate that the proposed method allows satisfactory decision-making.
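As an illustration of the second subsystem, the following is a minimal sketch of tabular Q-learning for strategy selection, assuming the situation classifier has already assigned each situation a discrete class label. The class name `StrategySelector`, the strategy names, the reward values, and the parameters `alpha`, `gamma`, and `epsilon` are hypothetical choices for this example and are not taken from the paper.

```python
import random
from collections import defaultdict


class StrategySelector:
    """Tabular Q-learning over (situation class, strategy) pairs (illustrative only)."""

    def __init__(self, strategies, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.strategies = list(strategies)   # hypothetical strategy labels
        self.alpha = alpha                   # learning rate (assumed value)
        self.gamma = gamma                   # discount factor (assumed value)
        self.epsilon = epsilon               # exploration probability (assumed value)
        self.rng = random.Random(seed)
        self.q = defaultdict(float)          # (situation_class, strategy) -> Q-value

    def select(self, situation_class):
        """Epsilon-greedy choice of a strategy for the given situation class."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.strategies)
        return max(self.strategies, key=lambda s: self.q[(situation_class, s)])

    def update(self, situation_class, strategy, reward, next_class):
        """Standard Q-learning update toward reward + gamma * max Q(next_class, .)."""
        best_next = max(self.q[(next_class, s)] for s in self.strategies)
        key = (situation_class, strategy)
        td_target = reward + self.gamma * best_next
        self.q[key] += self.alpha * (td_target - self.q[key])


# Usage: one learning step with made-up class labels and reward.
selector = StrategySelector(["attack", "defend"], alpha=0.5, epsilon=0.0)
selector.update("ball_in_own_half", "attack", reward=1.0, next_class="ball_in_opp_half")
chosen = selector.select("ball_in_own_half")
```

Keeping one Q-table entry per situation class, rather than per raw field state, reflects the idea in the abstract that classification reduces the decision problem to a small set of local strategy-selection problems.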
