Incremental Bayesian classification for multivariate normal distribution data

The Bayesian classifier is an effective and fundamental methodology for solving classification problems. It is computationally efficient when all features are considered simultaneously, but in practice not all features contribute significantly to classification, and noisy attributes may decrease the classifier's accuracy. Feature selection is therefore applied as a pre-processing step before classification. When features are added to a Bayesian classifier one by one in batch mode, as in the forward selection method, a huge amount of computation is involved, because the classifier must be rebuilt from scratch at every step. In this paper, an incremental Bayesian classifier for multivariate normal distribution datasets is proposed. The proposed incremental Bayesian classifier is computationally more efficient than the batch Bayesian classifier in terms of time. Its effectiveness has been demonstrated through experiments on several datasets. These experiments show that the incremental Bayesian classifier is equivalent to the batch Bayesian classifier in terms of classification accuracy, while offering much higher speed.
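
The abstract does not spell out the update rule, but for multivariate normal class models the natural way to make forward selection incremental is the block (bordering) inversion formula: when a feature is appended, each class's covariance matrix grows by one row and column, and its inverse and log-determinant can be updated in O(k^2) time instead of being recomputed from scratch in O(k^3). The Python sketch below is a minimal illustration of that idea; the class name IncrementalGaussianBayes, its method names, and all variable names are hypothetical, and the code is an assumed realization of the technique rather than the paper's actual algorithm.

import numpy as np

class IncrementalGaussianBayes:
    """Per-class Gaussian (quadratic discriminant) Bayes classifier.

    When a feature is appended, each class covariance matrix Sigma grows
    from k x k to (k+1) x (k+1) as [[Sigma, b], [b^T, c]].  With the Schur
    complement s = c - b^T Sigma^{-1} b, the bordering formula yields the
    new inverse and log|Sigma_new| = log|Sigma| + log(s) in O(k^2) time.
    (Hypothetical sketch; not the paper's verbatim algorithm.)
    """

    def __init__(self, X, y):
        self.X = np.asarray(X, dtype=float)
        self.y = np.asarray(y)
        self.classes = np.unique(self.y)
        self.feats = []     # selected feature indices, in selection order
        self.state = {}     # class -> (mean, Sigma^{-1}, log|Sigma|, log prior)

    def add_feature(self, j):
        """Append feature j, updating each class's Gaussian incrementally."""
        for c in self.classes:
            Xc = self.X[self.y == c]
            log_prior = np.log(len(Xc) / len(self.X))
            col = Xc[:, j]
            var_j = col.var(ddof=1)        # sample variance of the new feature
            if not self.feats:             # first feature: 1-D Gaussian
                mu = np.array([col.mean()])
                Sinv = np.array([[1.0 / var_j]])
                logdet = np.log(var_j)
            else:
                mu_old, Sinv_old, logdet_old, _ = self.state[c]
                old = Xc[:, self.feats]
                # cross-covariances b between the old features and feature j
                b = np.array([np.cov(old[:, i], col)[0, 1]
                              for i in range(len(self.feats))])
                u = Sinv_old @ b           # Sigma^{-1} b
                s = var_j - b @ u          # Schur complement (a scalar)
                top = Sinv_old + np.outer(u, u) / s
                Sinv = np.block([[top,             -u[:, None] / s],
                                 [-u[None, :] / s, np.array([[1.0 / s]])]])
                logdet = logdet_old + np.log(s)
                mu = np.append(mu_old, col.mean())
            self.state[c] = (mu, Sinv, logdet, log_prior)
        self.feats.append(j)

    def predict(self, X):
        """Assign each row to the class with the highest log posterior."""
        X = np.asarray(X, dtype=float)[:, self.feats]
        scores = []
        for c in self.classes:
            mu, Sinv, logdet, log_prior = self.state[c]
            d = X - mu
            q = np.einsum('ij,jk,ik->i', d, Sinv, d)  # d^T Sigma^{-1} d per row
            scores.append(log_prior - 0.5 * (logdet + q))
        return self.classes[np.argmax(np.vstack(scores), axis=0)]

Used inside a wrapper-style forward-selection loop, each call to add_feature costs O(k^2) per class, whereas refitting a batch Gaussian classifier pays O(k^3) for the matrix inversion alone at every step. Since the model after k steps is mathematically the same Gaussian either way, this is consistent with the abstract's claim of equivalent accuracy at a much lower running time.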
