An Approach to Imbalanced Data Sets Based on Changing Rule Strength

This chapter describes experiments with a challenging data set describing preterm births. The data set, collected at the Duke University Medical Center, was large, but many of its attribute values were missing. The main problem, however, was that only 20.7% of the cases represented the important preterm birth class, so the data set was imbalanced. For comparison, we include results of experiments on another imbalanced data set, the well-known breast cancer data set.
