Multi-criterion mammographic risk analysis supported with multi-label fuzzy-rough feature selection

CONTEXT AND BACKGROUND Breast cancer is one of the most common diseases threatening human lives globally. It calls for effective, early risk analysis, for which learning classifiers supported by automated feature selection offer a potentially robust solution.

MOTIVATION Computer-aided risk analysis of breast cancer typically works with a set of extracted mammographic features that may contain significant redundancy and noise. Technical developments are therefore required to improve both computational efficiency and classification accuracy.

HYPOTHESIS The use of advanced feature selection methods based on multiple diagnostic criteria may lead to improved results for mammographic risk analysis.

METHODS An approach to multi-criterion mammographic risk analysis is proposed by adapting the recently developed multi-label fuzzy-rough feature selection mechanism.

RESULTS A system for multi-criterion mammographic risk analysis is implemented with the aid of multi-label fuzzy-rough feature selection, and its performance is shown experimentally to compare favourably with representative popular mechanisms.

CONCLUSIONS The novel approach to mammographic risk analysis based on multiple criteria helps improve classification accuracy by using selected informative features, without suffering from the redundancy that such complex criteria introduce. The implemented system demonstrates practical efficacy.
