Weighted Bagging for Graph Based One-Class Classifiers

Most conventional learning algorithms require both positive and negative training data to achieve accurate classification results. However, the problem of learning classifiers from only positive data arises in many applications where negative data are too costly, too difficult to obtain, or not available at all. The Minimum Spanning Tree Class Descriptor (MST_CD) was proposed as a method that achieves higher accuracy than other one-class classifiers on high-dimensional data. However, the presence of outliers in the target class severely harms the performance of this classifier. In this paper we propose two bagging strategies for MST_CD that reduce the influence of outliers in the training data. We demonstrate the improved performance on both real and artificially contaminated data.
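As a rough illustration of the general idea, the sketch below wraps a toy MST-based one-class classifier in plain bootstrap bagging: each ensemble member is trained on a resample of the target class, so members whose resamples miss the outliers can outvote those that fit them. This is a minimal sketch under stated assumptions, not the authors' implementation: the class names (MSTClassDescriptor, BaggedMST), the edge-length-quantile threshold, the majority vote, and all parameter values are illustrative, and the weighted bagging schemes proposed in the paper are not reproduced here.

```python
# Minimal sketch of bagging an MST-based one-class classifier.
# All names and parameters are illustrative assumptions, not the paper's method.
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import cdist


class MSTClassDescriptor:
    """Toy MST-based one-class classifier: a test point is accepted when its
    distance to the nearest MST edge (treated as a segment) is small enough."""

    def __init__(self, threshold_quantile=0.95):
        self.threshold_quantile = threshold_quantile

    def fit(self, X):
        self.X_ = np.asarray(X, dtype=float)
        D = cdist(self.X_, self.X_)                       # pairwise distances
        mst = minimum_spanning_tree(D).tocoo()            # sparse MST over the distance graph
        self.edges_ = list(zip(mst.row, mst.col))
        # Cheap proxy for an acceptance threshold: a quantile of the MST edge lengths.
        self.threshold_ = np.quantile(mst.data, self.threshold_quantile)
        return self

    def _dist_to_edge(self, x, a, b):
        # Euclidean distance from x to the segment [a, b].
        ab, ax = b - a, x - a
        t = np.clip(ab @ ax / (ab @ ab + 1e-12), 0.0, 1.0)
        return np.linalg.norm(x - (a + t * ab))

    def decision_function(self, X):
        X = np.asarray(X, dtype=float)
        d = np.array([min(self._dist_to_edge(x, self.X_[a], self.X_[b])
                          for a, b in self.edges_) for x in X])
        return self.threshold_ - d                        # positive -> accepted as target

    def predict(self, X):
        return (self.decision_function(X) >= 0).astype(int)


class BaggedMST:
    """Plain (unweighted) bootstrap bagging of MST descriptors; members trained
    on resamples that omit the outliers can outvote the contaminated ones."""

    def __init__(self, n_estimators=25, random_state=0, **mst_kwargs):
        self.n_estimators = n_estimators
        self.random_state = random_state
        self.mst_kwargs = mst_kwargs

    def fit(self, X):
        rng = np.random.default_rng(self.random_state)
        X = np.asarray(X, dtype=float)
        self.members_ = []
        for _ in range(self.n_estimators):
            idx = rng.integers(0, len(X), size=len(X))    # bootstrap resample
            self.members_.append(
                MSTClassDescriptor(**self.mst_kwargs).fit(X[np.unique(idx)]))
        return self

    def predict(self, X):
        votes = np.mean([m.predict(X) for m in self.members_], axis=0)
        return (votes >= 0.5).astype(int)                 # majority vote
```

A hypothetical usage would be along the lines of `BaggedMST(n_estimators=25).fit(X_target).predict(X_test)`, where `X_target` contains only target-class samples.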
