Knowledge-based Systems and Interestingness Measures: Analysis with Clinical Datasets

Knowledge mined from clinical data can be used for medical diagnosis and prognosis. Improving the quality of the knowledge base can enhance the prediction efficiency of a knowledge-based system. Designing accurate and precise clinical decision support systems that use the mined knowledge is still a broad area of research. This work analyzes the variation in classification accuracy of such knowledge-based systems when different rule lists are used. The purpose of this work is not to improve the prediction accuracy of a decision support system, but to analyze the factors that influence the efficiency and design of the knowledge base in a rule-based decision support system. Three benchmark medical datasets are used. Rules are extracted using a supervised machine learning algorithm (PART). Each rule in the ruleset is validated using nine frequently used rule interestingness measures. After the measure values are calculated, the resulting rule lists are used for performance evaluation. Experimental results show that classification accuracy varies across rule lists. The Confidence and Laplace measures yield relatively superior accuracy: 81.188% for the heart disease dataset and 78.255% for the diabetes dataset. The accuracy of the knowledge-based prediction system depends predominantly on the organization of the ruleset. Rule length needs to be considered when deciding the rule ordering. A subset of a rule, or a combination of rule elements, may form a new rule and may itself appear in the rule list. Redundant rules should be eliminated. Prior knowledge about the domain will enable knowledge engineers to design a better knowledge base.

ACM CCS (2012) Classification: Information systems→Information systems applications→Decision support systems→Expert systems

To cite this article: J. J. Christopher et al., "Knowledge-based Systems and Interestingness Measures: Analysis with Clinical Datasets", CIT. Journal of Computing and Information Technology, vol. 24, no. 1, pp. 65-78, 2016.
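The idea of scoring rules with an interestingness measure and then ordering the rule list by that score can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the commonly used definitions confidence = n(A and C) / n(A) and Laplace = (n(A and C) + 1) / (n(A) + k), where k is the number of classes, and it assumes first-match classification over the ordered list. The `Rule` class, the rule names, the `glucose` attribute, and the toy records are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Record = Dict[str, float]


@dataclass
class Rule:
    """Hypothetical rule: IF condition(record) THEN label."""
    name: str
    condition: Callable[[Record], bool]  # antecedent test
    label: str                           # consequent class


def counts(rule: Rule, records: List[Record], labels: List[str]) -> Tuple[int, int]:
    """Return (records covered by the antecedent, covered records with the rule's class)."""
    covered = correct = 0
    for rec, lab in zip(records, labels):
        if rule.condition(rec):
            covered += 1
            if lab == rule.label:
                correct += 1
    return covered, correct


def confidence(rule: Rule, records: List[Record], labels: List[str]) -> float:
    """Confidence = n(A and C) / n(A); 0.0 if the rule covers nothing."""
    covered, correct = counts(rule, records, labels)
    return correct / covered if covered else 0.0


def laplace(rule: Rule, records: List[Record], labels: List[str], n_classes: int = 2) -> float:
    """Laplace = (n(A and C) + 1) / (n(A) + k), with k classes (assumed form)."""
    covered, correct = counts(rule, records, labels)
    return (correct + 1) / (covered + n_classes)


def order_rules(rules, measure, records, labels) -> List[Rule]:
    """Sort rules by descending measure value; ties keep their original order."""
    return sorted(rules, key=lambda r: measure(r, records, labels), reverse=True)


def classify(ordered_rules: List[Rule], record: Record, default: str) -> str:
    """First-match classification: the first rule that fires decides the class."""
    for rule in ordered_rules:
        if rule.condition(record):
            return rule.label
    return default


# Toy data, invented for illustration only.
records = [{"glucose": 150}, {"glucose": 160}, {"glucose": 90},
           {"glucose": 100}, {"glucose": 130}]
labels = ["pos", "pos", "neg", "neg", "neg"]

rules = [
    Rule("high_glucose", lambda r: r["glucose"] > 120, "pos"),
    Rule("high_glucose_strict", lambda r: r["glucose"] > 140, "pos"),
    Rule("low_glucose", lambda r: r["glucose"] <= 120, "neg"),
]

by_confidence = order_rules(rules, confidence, records, labels)
by_laplace = order_rules(rules, laplace, records, labels)

for rule in by_confidence:
    print(rule.name, confidence(rule, records, labels), laplace(rule, records, labels))
```

Because classification stops at the first matching rule, the position of each rule in the list affects the prediction, which is one way to read the abstract's point that accuracy depends on how the ruleset is organized.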
