Automated feature weighting in naive Bayes for high-dimensional data classification

Naive Bayes (NB) is a popular method for supervised classification in knowledge management systems. In many real-world applications, however, high-dimensional data pose a major challenge to conventional NB classifiers because of noisy or redundant features and the locally varying relevance of features to classes. This paper proposes an automated feature weighting solution that makes NB effective on high-dimensional data. We first propose a locally weighted probability model for Bayesian modeling in high-dimensional spaces, which implements a soft feature-selection scheme. We then propose an optimization algorithm that finds the weights in linear time, based on the logit-normal prior distribution and the maximum a posteriori principle. Experimental studies demonstrate the effectiveness and suitability of the proposed model for high-dimensional data classification.
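The core idea of feature-weighted NB can be illustrated with a minimal sketch. The snippet below is not the paper's method (the abstract does not specify the likelihood family or the weight optimizer); it is a hypothetical Gaussian NB in which each feature's log-likelihood contribution is scaled by a weight w_j in [0, 1], so that w_j = 0 effectively drops the feature (soft feature selection) and w_j = 1 recovers standard NB. The class name `WeightedGaussianNB` and the fixed-weight interface are assumptions for illustration only; the paper instead learns the weights automatically under a logit-normal prior.

```python
import numpy as np


class WeightedGaussianNB:
    """Gaussian naive Bayes with per-feature weights w_j in [0, 1].

    Each feature's log-likelihood term is multiplied by its weight,
    so the joint log-likelihood becomes sum_j w_j * log p(x_j | c).
    This is an illustrative sketch, not the paper's exact model.
    """

    def fit(self, X, y, weights=None):
        X, y = np.asarray(X, float), np.asarray(y)
        self.classes_ = np.unique(y)
        d = X.shape[1]
        # Fixed weights here; the paper learns them via MAP estimation.
        self.w_ = np.ones(d) if weights is None else np.asarray(weights, float)
        theta, var, log_prior = [], [], []
        for c in self.classes_:
            Xc = X[y == c]
            theta.append(Xc.mean(axis=0))
            var.append(Xc.var(axis=0) + 1e-9)  # variance smoothing
            log_prior.append(np.log(len(Xc) / len(X)))
        self.theta_ = np.array(theta)          # (C, D) per-class means
        self.var_ = np.array(var)              # (C, D) per-class variances
        self.log_prior_ = np.array(log_prior)  # (C,)  log class priors
        return self

    def predict(self, X):
        X = np.asarray(X, float)
        # Per-feature Gaussian log-likelihoods, shape (C, N, D).
        ll = -0.5 * (np.log(2 * np.pi * self.var_[:, None, :])
                     + (X[None, :, :] - self.theta_[:, None, :]) ** 2
                     / self.var_[:, None, :])
        # Weighted sum over features + log prior -> class scores (C, N).
        scores = self.log_prior_[:, None] + (self.w_ * ll).sum(axis=2)
        return self.classes_[np.argmax(scores, axis=0)]
```

With weights such as `[1.0, 0.0, 0.0]` on a dataset whose last two features are pure noise, the classifier ignores the noisy dimensions entirely, which is the behavior the soft feature-selection scheme aims for in high-dimensional spaces.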
