On Feature Selection and Rule Extraction for High Dimensional Data: A Case of Diffuse Large B-Cell Lymphomas Microarrays Classification

Neurofuzzy methods that can select a handful of informative features are valuable for analyzing high-dimensional datasets. We present a neurofuzzy classification scheme that creates linguistic features and simultaneously selects informative features for a high-dimensional dataset, and we apply it to the diffuse large B-cell lymphoma (DLBCL) microarray classification problem. The scheme combines an embedded linguistic feature creation and tuning algorithm, feature selection, and rule-based classification in a single neural network framework. The adjustable linguistic features are embedded in the network structure via fuzzy membership functions. The network classifies the high-dimensional DLBCL microarray dataset either by direct calculation or by the rule-based approach. The results are validated with 10-fold cross-validation. Both direct calculation and the logical rules achieve very good results, and the network selects a small set of informative features from this high-dimensional dataset. Compared with previously proposed methods, our method yields better classification performance.
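
To make two ideas from the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes Gaussian membership functions with three linguistic terms per feature (for example "low", "medium", "high"); the abstract does not state the shape or number of the membership functions. The function names `gaussian_membership`, `linguistic_features`, and `kfold_indices` are hypothetical, and the data are random placeholders rather than DLBCL measurements.

```python
import numpy as np


def gaussian_membership(x, center, width):
    """Degree in [0, 1] to which x belongs to a fuzzy set centred at `center`.

    Assumption: a Gaussian shape; the source does not specify one.
    """
    return np.exp(-0.5 * ((x - center) / width) ** 2)


def linguistic_features(X, centers, widths):
    """Map each raw feature to its memberships in the linguistic terms.

    X:       (n_samples, n_features) raw feature values
    centers: (n_features, n_terms) membership-function centres (tunable)
    widths:  (n_features, n_terms) membership-function widths (tunable)
    Returns an array of shape (n_samples, n_features, n_terms).
    """
    return gaussian_membership(X[:, :, None], centers[None, :, :], widths[None, :, :])


def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index arrays for a shuffled k-fold split."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, folds[i]


# Placeholder data: 20 samples, 5 features (hypothetical, not microarray data).
rng = np.random.default_rng(42)
X = rng.normal(size=(20, 5))

# Initial centres at each feature's min, mean, and max; a network of the kind
# described would tune these during training.
centers = np.stack([X.min(axis=0), X.mean(axis=0), X.max(axis=0)], axis=1)
widths = np.tile(((X.max(axis=0) - X.min(axis=0)) / 4)[:, None], (1, 3))

M = linguistic_features(X, centers, widths)
print(M.shape)  # (samples, features, linguistic terms)

for train_idx, test_idx in kfold_indices(len(X), k=10):
    # Fit on X[train_idx], evaluate on X[test_idx]; the model itself is omitted.
    pass
```

In this sketch the centres and widths are plain arrays; in a scheme like the one described they would be trainable parameters, so that the linguistic features are tuned together with the classifier.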
