The Novelty Detection Approach for Different Degrees of Class Imbalance

We show that novelty detection is a viable solution to the class imbalance problem and examine which approach suits different degrees of imbalance. In experiments with SVM-based classifiers, novelty detectors are more accurate than both balanced and unbalanced binary classifiers when the imbalance is extreme. When the imbalance is relatively moderate, however, balanced binary classifiers should be employed. In addition, novelty detectors are more effective when the relationship between the classes is non-symmetrical.
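To make the novelty detection idea concrete, here is a minimal sketch of a detector that is trained on the majority class only and flags anything far from it as the minority class. It is not the SVM-based method used in the experiments: a simple centroid-distance rule stands in for the one-class SVM, and the synthetic Gaussian data, the 0.95 quantile, and all function names are assumptions made for illustration.

```python
import math
import random
import statistics


def fit_novelty_detector(majority_points, quantile=0.95):
    """Learn a centroid and a distance threshold from majority-class data only."""
    dim = len(majority_points[0])
    centroid = tuple(
        statistics.fmean(p[i] for p in majority_points) for i in range(dim)
    )
    distances = sorted(math.dist(p, centroid) for p in majority_points)
    # Threshold: the distance below which roughly `quantile` of the training points fall.
    index = min(len(distances) - 1, int(quantile * len(distances)))
    return centroid, distances[index]


def is_novel(point, centroid, threshold):
    """Flag a point as novel (minority class) if it lies beyond the threshold."""
    return math.dist(point, centroid) > threshold


def balanced_accuracy(y_true, y_pred):
    """Mean of minority recall and majority specificity (True = minority)."""
    pos = [p for t, p in zip(y_true, y_pred) if t]
    neg = [p for t, p in zip(y_true, y_pred) if not t]
    recall = sum(pos) / len(pos)
    specificity = sum(1 for p in neg if not p) / len(neg)
    return (recall + specificity) / 2


def sample(rng, center, n):
    """Draw n points from a unit-variance 2-D Gaussian (illustrative data only)."""
    return [(rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)) for _ in range(n)]


rng = random.Random(0)
train_majority = sample(rng, (0.0, 0.0), 1000)  # no minority examples are used for training
test_majority = sample(rng, (0.0, 0.0), 200)
test_minority = sample(rng, (6.0, 6.0), 10)     # extreme imbalance at test time

centroid, threshold = fit_novelty_detector(train_majority)

test_points = test_majority + test_minority
y_true = [False] * len(test_majority) + [True] * len(test_minority)
y_pred = [is_novel(p, centroid, threshold) for p in test_points]
score = balanced_accuracy(y_true, y_pred)
print(f"balanced accuracy: {score:.3f}")
```

Because the detector never sees minority examples, its behaviour does not depend on how few of them exist, which is one intuition for why such models may hold up under extreme imbalance. Balanced accuracy is used here because plain accuracy can look high even when every minority example is missed.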
