Quantitative analysis of breast cancer diagnosis using a probabilistic modelling approach

BACKGROUND Breast cancer is the most prevalent cancer among women in most countries. Many computer-aided diagnostic methods have been proposed, but few studies quantify the probabilistic dependencies among breast cancer data features or identify the contribution of each feature to diagnosis.

METHODS This study aims to fill that gap using a Bayesian network (BN) modelling approach. The K2 learning algorithm and statistical computation methods are used to construct the BN structure and to assess the resulting BN model. The data comprise a clinical ultrasound dataset collected at a local hospital in China and a fine-needle aspiration cytology (FNAC) dataset from the UCI Machine Learning Repository.

RESULTS For the ultrasound data, the study suggests that cell shape is the most significant feature for breast cancer diagnosis and that the resistance index shows a strong probabilistic dependency on blood signals. For the FNAC data, bare nuclei are the most important feature for discriminating malignant from benign breast tumours, and uniformity of cell size and uniformity of cell shape are tightly interdependent.

CONTRIBUTIONS The BN modelling approach can support clinicians in making diagnostic decisions based on the significant features identified by the model, especially when other features are missing for specific patients. The approach is also applicable to other healthcare data analytics and to data modelling for disease diagnosis.
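To make the K2 structure search named in METHODS more concrete, the following is a minimal sketch, not the authors' implementation. It assumes fully observed discrete data, a fixed variable ordering, and the Cooper-Herskovits K2 metric computed in log space. The function names, the `max_parents` limit and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import lgamma


def k2_log_score(data, child, parents, arity):
    """Log of the Cooper-Herskovits K2 metric for one node and one parent set."""
    r = arity[child]
    # N_ijk: rows with parent configuration j and child state k.
    n_ijk = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    # N_ij: rows with parent configuration j.
    n_ij = Counter()
    for (cfg, _state), n in n_ijk.items():
        n_ij[cfg] += n
    score = 0.0
    for n in n_ij.values():
        score += lgamma(r) - lgamma(n + r)  # log[(r - 1)! / (N_ij + r - 1)!]
    for n in n_ijk.values():
        score += lgamma(n + 1)  # log(N_ijk!)
    return score


def k2_search(data, order, arity, max_parents=2):
    """Greedy K2 search: each node may only take parents that precede it in `order`."""
    parents = {v: [] for v in order}
    for idx, child in enumerate(order):
        best = k2_log_score(data, child, parents[child], arity)
        candidates = list(order[:idx])
        while candidates and len(parents[child]) < max_parents:
            scored = [
                (k2_log_score(data, child, parents[child] + [c], arity), c)
                for c in candidates
            ]
            new_score, new_parent = max(scored, key=lambda t: t[0])
            if new_score <= best:
                break  # no candidate improves the score; stop adding parents
            best = new_score
            parents[child].append(new_parent)
            candidates.remove(new_parent)
    return parents


# Toy data (hypothetical): column 1 copies column 0, column 2 is unrelated.
toy_rows = [(a, a, c) for a in (0, 0, 1, 1) for c in (0, 1)]
toy_arity = {0: 2, 1: 2, 2: 2}
learned = k2_search(toy_rows, order=[0, 1, 2], arity=toy_arity)
```

On this toy input the search links column 1 to column 0 and leaves column 2 without parents. On real data the result depends heavily on the chosen variable ordering and on how continuous features are discretised, neither of which is specified in the abstract.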
