On the influence of an adaptive inference system in fuzzy rule based classification systems for imbalanced data-sets

Classification with imbalanced data-sets poses a new challenge for researchers in the framework of data mining. This problem appears when the number of examples that represent one of the classes of the data-set (usually the concept of interest) is much lower than that of the other classes. The learning model must therefore be adapted to this situation, which is very common in real applications. In this paper, we work with fuzzy rule based classification systems, using a preprocessing step to deal with the class imbalance. Our aim is to analyze the behaviour of fuzzy rule based classification systems in the framework of imbalanced data-sets when an adaptive inference system with parametric conjunction operators is applied. Our results show empirically that the use of these parametric conjunction operators implies a higher performance for all data-sets with different imbalance ratios.
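
To make the idea of a parametric conjunction operator concrete, the following is a minimal sketch. The abstract does not name the operator used, so the Dubois t-norm family, T(x, y; alpha) = x*y / max(x, y, alpha), is assumed here purely for illustration; the function names and the sample membership values are likewise hypothetical. The parameter alpha moves the conjunction between the minimum (alpha = 0) and the product (alpha = 1), which is the kind of degree of freedom an adaptive inference system could tune.

```python
from functools import reduce
from typing import Sequence


def dubois_t_norm(x: float, y: float, alpha: float) -> float:
    """Assumed parametric conjunction: x*y / max(x, y, alpha), alpha in [0, 1]."""
    denom = max(x, y, alpha)
    # All three values are zero only when x = y = alpha = 0; the conjunction is then 0.
    return 0.0 if denom == 0.0 else (x * y) / denom


def matching_degree(memberships: Sequence[float], alpha: float) -> float:
    """Combine the antecedent membership degrees of one rule with the t-norm.

    1.0 is the neutral element of any t-norm, so it is a safe starting value.
    """
    return reduce(lambda a, b: dubois_t_norm(a, b, alpha), memberships, 1.0)


if __name__ == "__main__":
    degrees = [0.5, 0.8, 0.9]  # hypothetical membership degrees of one example
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, matching_degree(degrees, alpha))
```

With alpha = 0 the rule fires at the smallest membership degree, while alpha = 1 yields the product of all degrees; intermediate values interpolate between the two behaviours.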
