Data Analysis Strategies in Medical Imaging

Radiographic imaging continues to be one of the most effective and clinically useful tools within oncology. Advances in artificial intelligence have enabled detailed quantification of the radiographic characteristics of tissues, using either predefined engineered algorithms or deep learning methods. Precedents in radiology, together with a wealth of research studies, suggest that these characteristics are clinically relevant. However, the analysis of medical imaging data poses critical challenges. Although some of these challenges are specific to the imaging field, many others, such as reproducibility and batch effects, are generic and have already been addressed in other quantitative fields such as genomics. Here, we identify these pitfalls and provide recommendations for analysis strategies for medical imaging data, including data normalization, development of robust models, and rigorous statistical analyses. Adhering to these recommendations will not only improve analysis quality but also enhance precision medicine by allowing better integration of imaging data with other biomedical data sources. Clin Cancer Res; 24(15); 3492–9. ©2018 AACR.
