A case-based reasoning system for supervised classification problems in the medical field

Abstract: A Case-Based Reasoning (CBR) system relies on reuse to solve new problems: it draws on the experiences it has previously acquired and stored in its case base to address each newly encountered problem. A static, non-evolving case base limits the accuracy of CBR problem-solving, while a massive case base can increase resolution time. Randomization offers a way to generate data without degrading the spatial image of the case base and, by extension, the search time. However, the cases generated by randomization are not necessarily valid and require a thorough validation process to assess their validity. This paper presents a new amplification technique based on randomization for a CBR system that incorporates a structured case base, which speeds up case retrieval while supporting case retention. The data generated by randomization is validated through a three-layer validation process: coherence verification, stochastic validation, and absolute validation. Furthermore, we propose a new way to segment the case base, along with new similarity functions based on feature weights, to speed up CBR retrieval. To validate our approach, we carried out experiments on the mammographic mass and thyroid disease datasets, comparing it with several popular supervised machine-learning methods and with related works that use the same datasets. The experiments show that our approach can generate relevant data, which significantly improves resolution accuracy and makes CBR competitive with established classification methods.
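The abstract mentions retrieval driven by similarity functions based on feature weights but does not give their form. As a rough illustration only, the sketch below shows a generic weighted nearest-case retrieval over numeric features. The names `weighted_similarity` and `retrieve`, the range-normalized local similarity, and the toy cases and weights are all assumptions made for this example, not the paper's actual functions or data.

```python
from typing import Sequence, Tuple

Case = Tuple[Sequence[float], str]  # (feature vector, class label)


def weighted_similarity(
    query: Sequence[float],
    case: Sequence[float],
    weights: Sequence[float],
    ranges: Sequence[float],
) -> float:
    """Weighted mean of per-feature similarities, each in [0, 1].

    Assumption: the local similarity of a feature is 1 minus the absolute
    difference divided by that feature's value range.
    """
    score = 0.0
    for q, c, w, r in zip(query, case, weights, ranges):
        # A zero range means the feature is constant; fall back to equality.
        local = 1.0 - abs(q - c) / r if r > 0 else float(q == c)
        score += w * local
    return score / sum(weights)


def retrieve(
    query: Sequence[float],
    case_base: Sequence[Case],
    weights: Sequence[float],
    ranges: Sequence[float],
) -> Case:
    """Return the stored case most similar to the query."""
    return max(
        case_base,
        key=lambda case: weighted_similarity(query, case[0], weights, ranges),
    )


# Hypothetical two-feature case base, for illustration only.
case_base = [((1.0, 10.0), "benign"), ((5.0, 2.0), "malignant")]
ranges = (4.0, 8.0)
query = (4.0, 9.0)

heavy_first = retrieve(query, case_base, (0.8, 0.2), ranges)
equal = retrieve(query, case_base, (0.5, 0.5), ranges)
print(heavy_first[1], equal[1])
```

With these toy values, weighting the first feature heavily retrieves a different case than equal weights do, which is the effect that feature weighting is meant to have on retrieval.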
