SVM-Based Tree-Type Neural Networks as a Critic in Adaptive Critic Designs for Control

In this paper, we apply adaptive critic design (ACD) to control, specifically the action-dependent heuristic dynamic programming (ADHDP) method. A least squares support vector machine (SVM) regressor generates the control actions, while an SVM-based tree-type neural network (NN) serves as the critic. After a failure occurs, the critic and the action regressor are retrained in tandem using the failure data. Failure data are binary classification data in which failure states are far fewer than no-failure states. Conventional multilayer feedforward NNs have difficulty learning this type of imbalanced classification data. The SVM-based tree-type NN overcomes this difficulty: because it adds neurons to learn misclassified data, it can learn any binary classification data without an a priori choice of the number of neurons or the structure of the network. The capability of the trained controller to handle unforeseen situations is demonstrated.
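For orientation, the quantity an ADHDP critic is usually trained to approximate can be written out as follows. This is a sketch of the standard textbook formulation, not necessarily the exact formulation used in this paper. The symbols are assumptions introduced here: $x(t)$ is the plant state, $u(t)$ the control action, $U(t)$ the utility (for example, a failure signal), $\gamma$ a discount factor, and $\hat{J}$ the critic output.

```latex
\begin{align}
  % Discounted cost-to-go that the critic estimates (assumes 0 < \gamma < 1)
  J(t) &= \sum_{k=0}^{\infty} \gamma^{k}\, U(t+k) \\
  % Bellman recursion implied by the definition above
  J(t) &= U(t) + \gamma\, J(t+1) \\
  % Action-dependent critic: the action u(t) is an input alongside the state x(t);
  % e_c(t) is the temporal-difference error of the critic estimate \hat{J}
  e_c(t) &= \hat{J}\bigl(x(t), u(t)\bigr)
            - \gamma\, \hat{J}\bigl(x(t+1), u(t+1)\bigr) - U(t) \\
  % Critic training objective over the recorded trajectory
  E_c &= \tfrac{1}{2} \sum_{t} e_c(t)^{2}
\end{align}
```

In this formulation the action-generating module is adjusted to reduce $\hat{J}\bigl(x(t), u(t)\bigr)$. When the utility is a binary failure indicator, the critic's learning problem becomes the imbalanced failure/no-failure classification task described above.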
