A Fast and Efficient Method for Training Categorical Radial Basis Function Networks

This brief presents a novel learning scheme for categorical data based on radial basis function (RBF) networks. The proposed approach replaces the numerical vectors that serve as RBF centers with categorical tuple centers, and employs specially designed measures for calculating the distance between the centers and the input tuples. A fast noniterative categorical clustering algorithm is proposed for the first stage of RBF training, the selection of categorical centers, whereas the weights are calculated in the second stage through linear regression. The method is applied to 22 categorical data sets and compared with several other learning schemes, including neural networks, support vector machines, naïve Bayes classifiers, and decision trees. Results show that the proposed method is very competitive, outperforming its rivals in predictive performance in the majority of the tested cases.
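The two-stage structure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the simple mismatch (overlap) distance stands in for the specially designed measures, the centers are hand-picked rather than chosen by the proposed clustering algorithm, and the Gaussian basis function, the `width` parameter, the small `ridge` term, and all function names are assumptions made for illustration only.

```python
import math

def overlap_distance(a, b):
    """Fraction of attributes on which two categorical tuples differ (assumed measure)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def hidden_layer(rows, centers, width=0.5):
    """Gaussian response of every categorical center to every input tuple, plus a bias column."""
    return [[math.exp(-(overlap_distance(r, c) / width) ** 2) for c in centers] + [1.0]
            for r in rows]

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_weights(H, y, ridge=1e-8):
    """Second stage: least-squares output weights from the normal equations.
    The tiny ridge term is an assumption added for numerical safety."""
    k = len(H[0])
    A = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0) for j in range(k)]
         for i in range(k)]
    b = [sum(h[i] * t for h, t in zip(H, y)) for i in range(k)]
    return solve(A, b)

def predict(rows, centers, weights, width=0.5):
    """Network output: weighted sum of hidden-layer responses for each input tuple."""
    return [sum(w * h for w, h in zip(weights, row))
            for row in hidden_layer(rows, centers, width)]

# Toy data: three categorical attributes, binary target.
rows = [("red", "small", "round"), ("red", "small", "oval"),
        ("blue", "large", "square"), ("blue", "large", "oval")]
y = [1.0, 1.0, 0.0, 0.0]

# First stage (stand-in): hand-picked categorical tuple centers, one per class.
centers = [rows[0], rows[2]]

weights = fit_weights(hidden_layer(rows, centers), y)
preds = predict(rows, centers, weights)
labels = [1 if p > 0.5 else 0 for p in preds]
```

Because the centers are tuples of category values rather than numerical vectors, only the distance function has to change to work with categorical inputs; the weight calculation remains an ordinary linear least-squares problem.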
