A self-learning strategy for artificial cognitive control systems

This paper presents a self-learning strategy for artificial cognitive control based on reinforcement learning, specifically an online version of the Q-learning algorithm. An architecture for artificial cognitive control was initially reported in [1], but it lacked an effective self-learning strategy for dealing with nonlinear and time-variant behavior. The architecture has two operating modes: an anticipation mode (i.e., inverse-model control) and a single-loop mode. The main goal of the Q-learning algorithm is to cope, at run time, with the intrinsic uncertainty, nonlinearities, and noisy behavior of the process. To validate the proposed method, experimental work is carried out on measuring and controlling the microdrilling process. A real-time application that controls the drilling force is presented as a proof of concept. Reinforcement learning improves the performance of the artificial cognitive control system, yielding good transient responses and an acceptable steady-state error. The Q-learning mechanism, built into a low-cost computing platform, demonstrates the suitability of the implementation in an industrial setup.
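The online Q-learning mechanism described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the discretization of the force-tracking error into state bins, the three feed-rate actions, and all parameter values (`ALPHA`, `GAMMA`, `EPSILON`) are assumptions chosen for clarity.

```python
import random

# Illustrative sketch of an online tabular Q-learning update for a
# force-tracking controller. State bins, actions, and parameter values
# are assumptions, not taken from the paper.

ALPHA = 0.1    # learning rate
GAMMA = 0.9    # discount factor
EPSILON = 0.2  # exploration probability

STATES = range(5)       # discretized force-error bins (assumed)
ACTIONS = (-1, 0, +1)   # decrease / hold / increase feed rate (assumed)

# Q-table initialized to zero for every state-action pair.
Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}

def choose_action(state):
    """Epsilon-greedy selection: explore with probability EPSILON,
    otherwise pick the action with the highest Q-value."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def q_update(state, action, reward, next_state):
    """One online Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```

Because the update uses only the current transition (state, action, reward, next state), it can run at each sampling instant of the control loop, which is what makes it suitable for run-time adaptation on a low-cost platform.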

[1] Kwangyeol Ryu et al., Reinforcement learning approach to goal-regulation in a self-evolutionary manufacturing system, 2012, Expert Syst. Appl.

[2] Qing Hu et al., Application of Fuzzy Self-learning Sliding Mode Variable Structure Control in Linear AC Servo System, 2006, 2006 CES/IEEE 5th International Power Electronics and Motion Control Conference.

[3] Pierre-Yves Glorennec et al., Tuning fuzzy PD and PI controllers using reinforcement learning, 2010, ISA Transactions.

[4] Jianguo Jiang et al., Path selection in disaster response management based on Q-learning, 2011, Int. J. Autom. Comput.

[5] Peter Ford Dominey et al., Robot Cognitive Control with a Neurophysiologically Inspired Reinforcement Learning Model, 2011, Front. Neurorobot.

[6] Simon Haykin et al., Cognitive Control: Theory and Application, 2014, IEEE Access.

[7] Minghai Wang et al., An examination of the fundamental mechanics of cutting force coefficients, 2014.

[8] Youtong Zhang et al., Simulation research of idle damping control for vehicle engine based on PMSM, 2014, 2014 IEEE Conference and Expo Transportation Electrification Asia-Pacific (ITEC Asia-Pacific).

[9] Shaoping Wang et al., Experimental analysis of performance degradation of solid lubricated bearings with vibration and friction torque signal, 2012, IEEE 10th International Conference on Industrial Informatics.

[10] Chieh-Li Chen et al., Application of fuzzy logic controllers in single-loop tuning of multivariable system design, 1991.

[11] Andrew W. Moore et al., Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.

[12] Wen Xuhui et al., Speed ripple minimization for interior-type PMSM using self-learning fuzzy control strategy, 2014, 2014 IEEE Conference and Expo Transportation Electrification Asia-Pacific (ITEC Asia-Pacific).

[13] Ernesto Martínez et al., Agent learning in autonomic manufacturing execution systems for enterprise networking, 2012, Comput. Ind. Eng.

[14] Agustín Gajate et al., Artificial cognitive control system based on the shared circuits model of sociocognitive capacities. A first approach, 2011, Eng. Appl. Artif. Intell.

[15] Luis Miramontes Hercog et al., Better manufacturing process organization using multi-agent self-organization and co-evolutionary classifier systems: The multibar problem, 2013, Appl. Soft Comput.

[16] O. P. Malik et al., Self-Learning Knowledge Systems and Fuzzy Systems and Their Applications, 2000.

[17] Yaochu Jin et al., Techniques in Neural-Network-Based Fuzzy System Identification and Their Application to Control of Complex Systems, 1999.

[18] Shalabh Bhatnagar et al., A novel Q-learning algorithm with function approximation for constrained Markov decision processes, 2012, 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[19] Zhiguo Shi et al., The optimization of path planning for multi-robot system using Boltzmann Policy based Q-learning algorithm, 2013, 2013 IEEE International Conference on Robotics and Biomimetics (ROBIO).

[20] Leopoldo J. Gutierrez Gutierrez et al., The Relationship between Exploration and Exploitation Strategies, Manufacturing Flexibility, and Organizational Learning: An Empirical Comparison between Non-ISO and ISO Certified Firms, 2014, Eur. J. Oper. Res.

[21] Hong-Sen Yan et al., An interoperable adaptive scheduling strategy for knowledgeable manufacturing based on SMGWQ-learning, 2016, J. Intell. Manuf.