Support vector learning for fuzzy rule-based classification systems

Designing a fuzzy rule-based classification system (fuzzy classifier) with good generalization ability in a high-dimensional feature space has long been an active research topic. As a powerful machine learning approach for pattern recognition problems, the support vector machine (SVM) is known to have good generalization ability. More importantly, an SVM can work very well in a high-dimensional (or even infinite-dimensional) feature space. This paper investigates the connection between fuzzy classifiers and kernel machines, establishes a link between fuzzy rules and kernels, and proposes a learning algorithm for fuzzy classifiers. We first show that a fuzzy classifier implicitly defines a translation-invariant kernel under the assumption that all membership functions associated with the same input variable are generated from a reference function by location transformation. Fuzzy inference on the IF-part of a fuzzy rule can then be viewed as evaluating the kernel function. The kernel function is proven to be a Mercer kernel if the reference functions meet a certain spectral requirement. The corresponding fuzzy classifier is called a positive definite fuzzy classifier (PDFC). A PDFC can be built from the given training samples by a support vector learning approach, with the IF-parts of the fuzzy rules given by the support vectors. Since the learning process minimizes an upper bound on the expected risk (expected prediction error) instead of the empirical risk (training error), the resulting PDFC usually generalizes well. Moreover, because of the sparsity of SVM solutions, the number of fuzzy rules does not depend on the dimension of the input space. In this sense, we avoid the "curse of dimensionality." Finally, PDFCs with different reference functions are constructed using the support vector learning approach. Their performance is illustrated by extensive experimental results, and comparisons with other methods are provided.
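The link between fuzzy rules and kernels described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The Gaussian reference function, the product used to combine memberships, the sign-based decision rule, and all function names (`gaussian_ref`, `rule_firing_strength`, `classify`, `is_positive_definite`) are assumptions made for illustration. The rule weights and bias are hypothetical inputs here; in a PDFC they would come from SVM training, which the sketch omits.

```python
import math


def gaussian_ref(u, width=1.0):
    """Reference function a(u); the Gaussian shape is an assumed choice."""
    return math.exp(-(u / width) ** 2)


def membership(x_k, center_k, ref=gaussian_ref):
    """Membership function obtained by shifting the reference function."""
    return ref(x_k - center_k)


def rule_firing_strength(x, center, ref=gaussian_ref):
    """IF-part of one rule: product of per-variable memberships.

    Because every membership is a shifted copy of the reference function,
    this value depends only on x - center, i.e. it is a translation-invariant
    kernel K(x, center).
    """
    strength = 1.0
    for x_k, c_k in zip(x, center):
        strength *= membership(x_k, c_k, ref)
    return strength


def classify(x, centers, weights, bias, ref=gaussian_ref):
    """Assumed decision rule: sign of bias plus weighted firing strengths."""
    score = bias + sum(
        w * rule_firing_strength(x, c, ref) for c, w in zip(centers, weights)
    )
    return 1 if score >= 0 else -1


def gram_matrix(points, ref=gaussian_ref):
    """Kernel (Gram) matrix of the rule kernel over a list of points."""
    return [[rule_firing_strength(p, q, ref) for q in points] for p in points]


def is_positive_definite(gram, tol=1e-12):
    """Try a Cholesky factorization; it succeeds only for positive definite input."""
    n = len(gram)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = gram[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False
                lower[i][i] = math.sqrt(s)
            else:
                lower[i][j] = s / lower[j][j]
    return True


if __name__ == "__main__":
    # Hypothetical rule centers (stand-ins for support vectors) and weights.
    centers = [(0.0, 0.0), (2.0, 2.0)]
    weights = [1.0, -1.0]
    print(classify((0.1, -0.2), centers, weights, bias=0.0))
    print(is_positive_definite(gram_matrix([(0, 0), (1, 0), (0, 1), (1, 1)])))
```

The positive definiteness check on a small Gram matrix is only a numerical sanity test on one set of points; the Mercer property itself is established in the paper through the spectral requirement on the reference functions.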
