The Use of Rough Sets as a Data Mining Tool for Experimental Bio-data

The Rough Sets methodology has great potential for mining experimental data. Since its introduction by Pawlak, it has received considerable attention in the computing community. However, because the methodology is mathematical in nature, many experimental scientists without a sufficient mathematical background have been hesitant to use it. The goal of this chapter is twofold: (1) to introduce the “Rough Sets” methodology, along with one of its derivatives, “Modified Rough Sets”, in a non-mathematical fashion, in the hope of sharing the potential of this approach with a larger group of non-computationally oriented scientists (the mining of one specific form of implicit data within a bio-dataset is also discussed); and (2) to apply this methodology to a dataset of children with and without Attention Deficit/Hyperactivity Disorder (ADHD), to demonstrate the usefulness of the approach in patient differentiation. For comparison, Discriminant Analysis (a statistical approach) and the ID3 approach were also applied to the same dataset to determine which approach is most effective.
