Data-driven decision model based on dynamical classifier selection

Abstract In the era of big data, large volumes of data have accumulated in many fields. To support decision making with such data, this paper proposes a data-driven decision model based on dynamical classifier selection (DCS). Within the framework of multi-criteria decision making, historical data comprising individual assessments on criteria and overall assessments are collected and represented as interval-valued numbers. A set of base classifiers is selected and trained on the historical data. For each new alternative, its similar historical alternatives are identified using the prediction that each trained classifier produces for it. A new DCS strategy then selects an appropriate base classifier for the new alternative by combining the predictive accuracy of each base classifier with the average similarity between the new alternative and its similar historical alternatives. This strategy avoids the subjective determination of the size of the local region in DCS. Based on the similar historical alternatives determined by the selected base classifier, an optimization model is constructed to learn criterion weights from the individual assessments of the alternatives on criteria and the predicted overall assessments derived from the selected base classifier. Explainable decisions are then generated with the learned criterion weights. The proposed decision model is applied to aid the diagnosis of thyroid nodules. A comparison with a traditional decision model and three representative DCS methods validates its advantage in balancing accuracy and interpretability.
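The selection step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration only: the similar historical alternatives are taken to be those whose overall assessment equals the classifier's prediction, similarity is one minus a mean absolute endpoint difference on intervals scaled to [0, 1], and the selection score is the product of accuracy and average similarity. The names `interval_similarity` and `select_classifier`, the classifier labels, and the toy data are all hypothetical and are not the paper's method.

```python
import numpy as np


def interval_similarity(a, b):
    """Similarity in [0, 1] between two alternatives (assumed measure).

    a, b: arrays of shape (n_criteria, 2) holding [lower, upper] bounds,
    with all bounds already scaled to [0, 1].
    """
    return 1.0 - float(np.mean(np.abs(a - b)))


def select_classifier(x_new, X_hist, y_hist, predictions, accuracies):
    """Pick the base classifier with the highest accuracy * average similarity.

    predictions: classifier name -> predicted overall assessment for x_new.
    accuracies:  classifier name -> predictive accuracy on historical data.
    Returns (name, score, prediction), or None if no classifier has
    any similar historical alternative.
    """
    best = None
    for name, pred in predictions.items():
        # Assumption: "similar" alternatives share the predicted overall assessment.
        mask = y_hist == pred
        if not mask.any():
            continue
        avg_sim = np.mean([interval_similarity(x_new, h) for h in X_hist[mask]])
        score = accuracies[name] * avg_sim
        if best is None or score > best[1]:
            best = (name, score, pred)
    return best


# Toy data: 4 historical alternatives, 2 criteria, interval assessments.
X_hist = np.array([
    [[0.1, 0.2], [0.0, 0.1]],
    [[0.2, 0.3], [0.1, 0.2]],
    [[0.8, 0.9], [0.7, 0.9]],
    [[0.7, 0.8], [0.9, 1.0]],
])
y_hist = np.array([0, 0, 1, 1])  # overall assessments
x_new = np.array([[0.75, 0.85], [0.8, 0.9]])

# Hypothetical outputs of two already-trained base classifiers.
predictions = {"clf_a": 0, "clf_b": 1}
accuracies = {"clf_a": 0.9, "clf_b": 0.8}

name, score, pred = select_classifier(x_new, X_hist, y_hist, predictions, accuracies)
print(name, pred, round(score, 4))
```

In this toy case the less accurate classifier is selected, because the historical alternatives consistent with its prediction lie much closer to the new alternative. No neighbourhood size is set by hand, which mirrors the motivation stated in the abstract.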
