A fast feature weighting algorithm of data gravitation classification

The data gravitation classification (DGC) model is a recently proposed classification method that has attracted considerable research interest. Feature weights are the key parameters of DGC models, because the classification performance of these models is highly sensitive to them. Existing DGC models obtain their optimised feature weights through wrapper-like algorithms; although such algorithms produce high classification accuracies, they also make the DGC models computationally expensive. In this study, we propose a fast feature weighting algorithm for DGC models, called FFW-DGC. We measure the importance of a feature using the concepts of feature discrimination and feature redundancy, and construct two fuzzy subsets to represent these two concepts respectively. We then combine the two fuzzy subsets to compute the feature weights used in gravitational computing. We conduct experiments on 25 standard data sets and 22 imbalanced data sets, comparing FFW-DGC with 11 classifiers, including the swarm-intelligence-based DGC model (PSO-DGC). The competitive results demonstrate that FFW-DGC not only obtains high classification accuracies but also achieves speedups of hundreds of times over PSO-DGC.
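To make the role of feature weights in gravitational computing concrete, the following is a minimal sketch of weighted data gravitation classification. It assumes a common DGC formulation in which each training sample acts as a unit-mass data particle exerting a force of 1/d² on the query, with d the feature-weighted Euclidean distance; the paper's exact force law, particle construction, and the FFW-DGC weight computation itself are not reproduced here, and the function name and placeholder weights are illustrative.

```python
import numpy as np

def weighted_gravitation_classify(X_train, y_train, weights, x_query, eps=1e-12):
    """Classify x_query by the class exerting the largest total gravitation.

    Each training sample is treated as a unit-mass data particle whose
    gravitation on the query is 1 / d^2, where d is the feature-weighted
    Euclidean distance (a common DGC formulation; details may differ
    from the paper's exact model).
    """
    forces = {}
    for c in np.unique(y_train):
        diff = X_train[y_train == c] - x_query       # (n_c, n_features)
        d2 = (weights * diff ** 2).sum(axis=1)       # weighted squared distances
        forces[c] = (1.0 / (d2 + eps)).sum()         # total gravitation from class c
    return max(forces, key=forces.get)

# Toy usage: two well-separated 2-D classes with placeholder weights
# (FFW-DGC would instead derive the weights from feature discrimination
# and redundancy).
X = np.array([[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
w = np.array([1.0, 1.0])
print(weighted_gravitation_classify(X, y, w, np.array([0.05, 0.0])))  # → 0
```

Larger feature weights amplify a feature's contribution to the distance, so informative features dominate the gravitational pull while noisy ones are suppressed; this is why classification performance is so sensitive to the weight vector.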
