Local Probabilistic Model for Bayesian Classification: A Generalized Local Classification Model

Bayesian classification requires a probabilistic model of each class for likelihood estimation. Most previous methods model the class-conditional probability distribution over the whole sample space. However, real-world problems are usually too complex to model globally, so simplifying assumptions must be imposed on the global model, for example, the class-conditional independence assumption of naive Bayesian classification. In this paper, starting from the insight that the distribution in a local region of the sample space should be simpler than that over the whole sample space, we argue that a probabilistic model fitted to a local region can be much simpler and can relax fundamental assumptions that may not hold globally. Based on these advantages, we propose establishing local probabilistic models for Bayesian classification. Moreover, a Bayesian classifier that adopts a local probabilistic model can be viewed as a generalized local classification model: by tuning the size of the local region and the corresponding local model assumption, a model suited to a particular classification problem can be obtained. Experimental results on several real-world datasets demonstrate the effectiveness of local probabilistic models for Bayesian classification.
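
To make the idea concrete, the following is a minimal sketch of one way such a classifier could operate, assuming the local region is taken to be the k nearest neighbors of the query point and that a diagonal-covariance Gaussian (a "local naive Bayes" assumption) serves as the local class-conditional model. The function name, parameters, and these modeling choices are illustrative assumptions for exposition, not the paper's exact method.

```python
# Minimal sketch of Bayesian classification with a local probabilistic model.
# Assumption: the local region is the k nearest neighbors of the query point,
# and the local class-conditional model is a diagonal-covariance Gaussian
# ("local naive Bayes"). Names and parameters are illustrative only.
import numpy as np

def local_bayes_predict(X_train, y_train, x_query, k=50, eps=1e-9):
    """Classify x_query using class-conditional likelihoods fitted only on
    its k nearest training neighbors (Euclidean distance)."""
    # 1. Restrict attention to the local region around the query point.
    dists = np.linalg.norm(X_train - x_query, axis=1)
    local_idx = np.argsort(dists)[:k]
    X_local, y_local = X_train[local_idx], y_train[local_idx]

    classes = np.unique(y_train)
    log_posteriors = []
    for c in classes:
        X_c = X_local[y_local == c]
        if len(X_c) == 0:
            log_posteriors.append(-np.inf)  # class absent from the neighborhood
            continue
        # 2. Local prior: class frequency inside the neighborhood.
        log_prior = np.log(len(X_c) / k)
        # 3. Simple local likelihood: independent Gaussians per feature,
        #    a much weaker assumption when imposed only on the local region
        #    than when imposed on the whole sample space.
        mu = X_c.mean(axis=0)
        var = X_c.var(axis=0) + eps
        log_lik = -0.5 * np.sum(np.log(2 * np.pi * var)
                                + (x_query - mu) ** 2 / var)
        log_posteriors.append(log_prior + log_lik)
    return classes[int(np.argmax(log_posteriors))]
```

In this sketch, letting k grow to the full training set recovers an ordinary global Gaussian naive Bayes classifier, while a small k with a very simple local model behaves much like a nearest-neighbor rule; this is the sense in which the local-region size and the local model assumption act together as tuning knobs of a generalized local classification model.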
