Prediction of breast cancer risk using a machine learning approach embedded with a locality preserving projection algorithm

To automatically identify a set of effective mammographic image features and build an optimal breast cancer risk stratification model, this study investigates the advantages of applying a machine learning approach embedded with a locality preserving projection (LPP) based feature combination and regeneration algorithm to predict short-term breast cancer risk. A dataset of negative mammograms acquired from 500 women was assembled and divided into two age-matched classes: 250 high-risk cases, in which cancer was detected at the next mammography screening, and 250 low-risk cases, which remained negative. First, a computer-aided image processing scheme was applied to segment the fibro-glandular tissue depicted on the mammograms and to compute an initial set of 44 features related to the bilateral asymmetry of mammographic tissue density distribution between the left and right breasts. Next, a multi-feature fusion based machine learning classifier was built to predict the risk of cancer detection at the next mammography screening. A leave-one-case-out (LOCO) cross-validation method was used to train and test the classifier embedded with the LPP algorithm, which generated a new operational vector of 4 features using a maximal variance approach in each LOCO iteration. Results showed a 9.7% increase in risk prediction accuracy when using this LPP-embedded machine learning approach. An increasing trend in adjusted odds ratios was also detected, with odds ratios rising from 1.0 to 11.2. This study demonstrated that applying the LPP algorithm effectively reduced feature dimensionality and yielded higher and potentially more robust performance in predicting short-term breast cancer risk.
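The idea of embedding LPP inside each leave-one-case-out iteration can be illustrated with a minimal sketch. It is not the study's implementation: the data are synthetic, the function names `fit_lpp` and `loco_scores` are hypothetical, and the neighborhood size, heat-kernel width, regularization, and the k-nearest-neighbor scoring rule are all assumptions made for illustration. The abstract's "maximal variance approach" is not specified in enough detail to reproduce, so the sketch uses the standard LPP criterion (the eigenvectors with the smallest generalized eigenvalues) instead. The key point it shows is that the 44-to-4 projection is refit on the training cases only, so the held-out case never influences the feature regeneration.

```python
import numpy as np


def fit_lpp(X, n_components=4, n_neighbors=5, reg=1e-6):
    """Fit a basic locality preserving projection (rows of X are cases).

    Returns the training mean and a (n_features, n_components) projection.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    n, d = Xc.shape

    # Pairwise squared Euclidean distances between cases.
    sq = np.sum(Xc ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Xc @ Xc.T, 0.0)
    t = d2.mean() + 1e-12          # heat-kernel width (assumed heuristic)
    np.fill_diagonal(d2, np.inf)   # a case is not its own neighbor

    # Symmetric k-nearest-neighbor graph with heat-kernel weights.
    idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = idx.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / t)
    W = np.maximum(W, W.T)

    D = np.diag(W.sum(axis=1))
    L = D - W                      # graph Laplacian

    # Generalized eigenproblem: (X^T L X) a = lambda (X^T D X) a.
    A = Xc.T @ L @ Xc
    B = Xc.T @ D @ Xc
    B = B + reg * np.trace(B) / d * np.eye(d)  # keep B positive definite

    # Reduce to a symmetric problem through B^(-1/2).
    evals_b, evecs_b = np.linalg.eigh(B)
    evals_b = np.maximum(evals_b, 1e-12)
    B_inv_sqrt = evecs_b @ np.diag(1.0 / np.sqrt(evals_b)) @ evecs_b.T
    M = B_inv_sqrt @ A @ B_inv_sqrt
    M = (M + M.T) / 2.0
    _, evecs = np.linalg.eigh(M)   # eigenvalues in ascending order

    # Smallest eigenvalues preserve local neighborhoods best.
    P = B_inv_sqrt @ evecs[:, :n_components]
    return mean, P


def loco_scores(X, y, n_components=4, k=5):
    """Leave-one-case-out risk scores with LPP refit in every iteration."""
    n = X.shape[0]
    scores = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i
        X_tr, y_tr = X[train], y[train]

        # Standardize using training cases only.
        mu = X_tr.mean(axis=0)
        sd = X_tr.std(axis=0) + 1e-12
        Z_tr = (X_tr - mu) / sd

        mean, P = fit_lpp(Z_tr, n_components=n_components)
        T_tr = (Z_tr - mean) @ P
        t_te = ((X[i] - mu) / sd - mean) @ P

        # Stand-in classifier: fraction of high-risk cases among the
        # k nearest training cases in the projected space.
        dist = np.sum((T_tr - t_te) ** 2, axis=1)
        nearest = np.argsort(dist)[:k]
        scores[i] = y_tr[nearest].mean()
    return scores


# Synthetic stand-in data: 60 cases, 44 features, balanced classes.
rng = np.random.default_rng(0)
n_cases, n_features = 60, 44
y = np.repeat([0, 1], n_cases // 2)
X = rng.normal(size=(n_cases, n_features))
X[:, :5] += 0.8 * y[:, None]       # inject a weak class signal

scores = loco_scores(X, y)
accuracy = np.mean((scores > 0.5).astype(int) == y)
print(f"LOCO accuracy on synthetic data: {accuracy:.2f}")
```

Because the data are random, the printed accuracy says nothing about the reported 9.7% improvement; the sketch only demonstrates where the projection sits in the cross-validation loop.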
