Self-adaptive attribute weighting for Naive Bayes classification

Highlights: Using Artificial Immune Systems (AIS) for attribute weighting. Seamlessly integrating the learning objective and the AIS affinity function for attribute weighting. Experiments on 42 real-world datasets demonstrating a substantial performance gain.

Naive Bayes (NB) is a popular machine learning tool for classification, due to its simplicity, high computational efficiency, and good classification accuracy, especially for high-dimensional data such as texts. In practice, these advantages are often undermined by NB's strong assumption of conditional independence between attributes, which can degrade classification performance when it does not hold. Accordingly, numerous efforts have been made to improve NB, using approaches such as structure extension, attribute selection, attribute weighting, instance weighting, and local learning. In this paper, we propose a new Artificial Immune System (AIS) based self-adaptive attribute weighting method for Naive Bayes classification. The proposed method, named AISWNB, uses immunity theory in Artificial Immune Systems to search for optimal attribute weight values, where the self-adjusted weights alleviate the conditional independence assumption and help estimate the conditional probabilities more accurately. One notable advantage of AISWNB is that its immune-system-based evolutionary computation process, including initialization, cloning, selection, and mutation, allows AISWNB to adjust itself to the data without explicit specification of the functional or distributional form of the underlying model. As a result, AISWNB can obtain good attribute weight values during the learning process. Experiments and comparisons on 36 machine learning benchmark data sets and six image classification data sets demonstrate that AISWNB significantly outperforms its peers in classification accuracy, class probability estimation, and class ranking performance.
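To make the idea concrete, the following is a minimal sketch of attribute-weighted Naive Bayes with a clonal-selection style weight search. It is not the AISWNB implementation: the abstract does not give the algorithm's details, so the function names (`fit_counts`, `predict`, `accuracy`, `clonal_search`), the use of training accuracy as the affinity function, the Gaussian mutation, the weight range [0, 1], and all parameter defaults are assumptions made for illustration. Each attribute weight is applied as an exponent on the corresponding conditional probability, which is the usual form of weighted NB.

```python
import math
import random
from collections import Counter

def fit_counts(X, y):
    """Collect class counts and per-attribute value counts (categorical data)."""
    class_counts = Counter(y)
    value_counts = [Counter() for _ in X[0]]  # value_counts[j][(value, label)]
    values = [set() for _ in X[0]]            # distinct values seen per attribute
    for row, label in zip(X, y):
        for j, v in enumerate(row):
            value_counts[j][(v, label)] += 1
            values[j].add(v)
    return class_counts, value_counts, values

def predict(model, weights, row):
    """Return argmax_c of log P(c) + sum_j w_j * log P(a_j | c), Laplace-smoothed."""
    class_counts, value_counts, values = model
    n = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c, nc in sorted(class_counts.items()):
        score = math.log((nc + 1) / (n + len(class_counts)))
        for j, v in enumerate(row):
            p = (value_counts[j][(v, c)] + 1) / (nc + len(values[j]))
            score += weights[j] * math.log(p)  # weight acts as an exponent on P(a_j | c)
        if score > best_score:
            best, best_score = c, score
    return best

def accuracy(model, weights, X, y):
    """Affinity used in this sketch (an assumption): accuracy on the given data."""
    return sum(predict(model, weights, r) == t for r, t in zip(X, y)) / len(y)

def clonal_search(model, X, y, pop=6, clones=3, gens=10, seed=0):
    """Initialization, cloning, mutation, and selection over weight vectors."""
    rng = random.Random(seed)
    d = len(X[0])
    # Initialization: plain NB (all weights 1) plus random weight vectors.
    population = [[1.0] * d] + [[rng.random() for _ in range(d)] for _ in range(pop - 1)]
    for _ in range(gens):
        candidates = list(population)
        for w in population:
            for _ in range(clones):
                # Cloning followed by Gaussian mutation, clipped to [0, 1].
                candidates.append([min(1.0, max(0.0, wi + rng.gauss(0, 0.2))) for wi in w])
        # Selection: keep the highest-affinity vectors (parents are retained).
        candidates.sort(key=lambda w: accuracy(model, w, X, y), reverse=True)
        population = candidates[:pop]
    return population[0]
```

Because parents stay in the candidate pool and the all-ones vector is part of the initial population, the returned weights never score worse than unweighted NB on the data used for the affinity. In practice a held-out set would be a more sensible choice for the affinity than the training data.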
