Effective classification using feature selection and fuzzy integration

Many classification problems involve features whose specificity demands some form of feature space transformation (preprocessing), coupled with post-processing consensus analysis, to discriminate successfully between classes. In this study, we present a new methodology that systematically addresses these classification design issues. In the preprocessing phase, we offer a new approach to stochastic feature selection, which collates quadratically transformed feature subsets for presentation to a collection of respective classifiers. Subsequently, the independent classification outcomes are aggregated through fuzzy integration. The motivation behind the proposed methodology is twofold. First, often only a subset of features possesses discriminatory power, while the remainder tends to confound the underlying classifier. Second, classification based on a consensus of outcomes from a set of classifiers operating on different feature subsets is quite commonly more accurate than the results produced by any individual classifier. To illustrate this design methodology, we discuss a classification problem from software engineering, concerning a dataset composed of features describing a collection of qualitative attributes of a software system. The experiments demonstrate that the aggregated classification results obtained using fuzzy integration are superior to the predictions of the respective best single classifiers.
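As a rough illustration of the aggregation step, the sketch below fuses per-class supports from several classifiers with a Sugeno fuzzy integral taken with respect to a Sugeno λ-measure. The abstract does not state which fuzzy integral or fuzzy measure is used, so both choices are assumptions; the function names (`solve_lambda`, `sugeno_integral`, `fuse`) and the density values are hypothetical, and the stochastic feature selection and quadratic transformation stages are not shown.

```python
# Minimal sketch (assumed technique): Sugeno integral over a lambda-measure.
# Each classifier k reports a support in [0, 1] for each class and has a
# "density" g_k in (0, 1) expressing how much that classifier is trusted.

def solve_lambda(densities):
    """Solve prod(1 + lam * g_k) = 1 + lam for lam > -1, lam != 0, by bisection."""
    if len(densities) < 2 or not all(0.0 < g < 1.0 for g in densities):
        raise ValueError("need at least two densities, each strictly in (0, 1)")
    total = sum(densities)
    if abs(total - 1.0) < 1e-9:
        return 0.0  # densities already sum to one: the measure is additive

    def f(lam):
        prod = 1.0
        for g in densities:
            prod *= 1.0 + lam * g
        return prod - (1.0 + lam)

    if total > 1.0:
        lo, hi = -1.0 + 1e-9, -1e-9   # root lies in (-1, 0)
    else:
        lo, hi = 1e-9, 1.0            # root lies in (0, inf); grow the bracket
        while f(hi) < 0.0:
            hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def sugeno_integral(supports, densities):
    """Sugeno integral of one class's supports with respect to the lambda-measure."""
    lam = solve_lambda(densities)
    # Visit classifiers from strongest to weakest support.
    order = sorted(range(len(supports)), key=lambda k: supports[k], reverse=True)
    measure = 0.0
    best = 0.0
    for k in order:
        g = densities[k]
        # Measure of the growing set of classifiers seen so far.
        measure = g + measure + lam * g * measure
        best = max(best, min(supports[k], measure))
    return best


def fuse(support_matrix, densities):
    """Return the index of the class with the largest fused support.

    support_matrix[k][c] is classifier k's support for class c.
    """
    n_classes = len(support_matrix[0])
    scores = [
        sugeno_integral([row[c] for row in support_matrix], densities)
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=lambda c: scores[c])


if __name__ == "__main__":
    # Three hypothetical classifiers, two classes.
    supports = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]
    print(fuse(supports, [0.3, 0.3, 0.3]))
```

In practice the densities might be derived from each classifier's validation accuracy, but that is a design choice the abstract leaves open.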
