A Multicenter, Scan-Rescan, Human and Machine Learning CMR Study to Test Generalizability and Precision in Imaging Biomarker Analysis.

BACKGROUND Automated analysis of cardiac structure and function using machine learning (ML) has great potential but is currently hindered by poor generalizability. ML is traditionally compared against clinicians as the reference standard, which ignores inherent human inter- and intraobserver error and ensures that ML can never demonstrate superiority. Measuring precision (scan:rescan reproducibility) addresses this. We compared the precision of ML and humans using a multicenter, multi-disease, scan:rescan cardiovascular magnetic resonance data set.

METHODS One hundred ten patients (5 disease categories, 5 institutions, 2 scanner manufacturers, and 2 field strengths) underwent scan:rescan cardiovascular magnetic resonance (96% within one week). After identification of the most precise human technique, left ventricular chamber volumes, mass, and ejection fraction were measured by an expert, a trained junior clinician, and a fully automated convolutional neural network trained on 599 independent multicenter disease cases. Scan:rescan coefficients of variation, with 95% CIs from 1000 bootstrap resamples, were calculated and compared using linear mixed-effects models.

RESULTS Clinicians can be confident in detecting a 9% change in left ventricular ejection fraction, with more than half of the coefficient of variation attributable to intraobserver variation. Expert, trained junior, and automated scan:rescan precision were similar (for left ventricular ejection fraction, coefficient of variation 6.1% [5.2%-7.1%], P=0.2581; 8.3% [5.6%-10.3%], P=0.3653; and 8.8% [6.1%-11.1%], P=0.8620, respectively). Automated analysis was 186× faster than humans (0.07 versus 13 minutes).

CONCLUSIONS Automated ML analysis is faster than the most precise human techniques and similarly precise, even when challenged with real-world scan:rescan data. Assessment of multicenter, multi-vendor, multi-field strength scan:rescan data (available at www.thevolumesresource.com) permits a generalizable assessment of ML precision and may facilitate direct translation of ML to clinical practice.
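
The scan:rescan coefficient of variation and its bootstrapped 95% CI, as described in METHODS, could be computed along the following lines. This is a minimal sketch, not the study's implementation: the abstract does not state the exact formula, so the code assumes the common within-subject definition (within-subject SD divided by the grand mean) and a percentile bootstrap that resamples patients. The function names `scan_rescan_cov` and `bootstrap_ci` are hypothetical, and the LVEF values are synthetic, not study data.

```python
import math
import random


def scan_rescan_cov(scan, rescan):
    """Within-subject coefficient of variation (%) for paired scan:rescan values.

    Assumed definition: sqrt(mean(d^2 / 2)) / grand mean * 100, where d is the
    scan-minus-rescan difference for each patient.
    """
    n = len(scan)
    # With two repeats per patient, the within-subject variance is d^2 / 2.
    within_var = sum((a - b) ** 2 / 2.0 for a, b in zip(scan, rescan)) / n
    grand_mean = sum(a + b for a, b in zip(scan, rescan)) / (2.0 * n)
    return 100.0 * math.sqrt(within_var) / grand_mean


def bootstrap_ci(scan, rescan, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI, resampling patients (pairs) with replacement."""
    rng = random.Random(seed)
    n = len(scan)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(
            scan_rescan_cov([scan[i] for i in idx], [rescan[i] for i in idx])
        )
    stats.sort()
    lo = stats[int(n_boot * alpha / 2)]
    hi = stats[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi


# Synthetic LVEF values (%) for 110 patients, for illustration only.
rng = random.Random(42)
true_ef = [rng.gauss(55, 10) for _ in range(110)]
scan = [t + rng.gauss(0, 3) for t in true_ef]
rescan = [t + rng.gauss(0, 3) for t in true_ef]

cov = scan_rescan_cov(scan, rescan)
lo, hi = bootstrap_ci(scan, rescan)
print(f"CoV {cov:.1f}% (95% CI {lo:.1f}%-{hi:.1f}%)")
```

Resampling whole patients rather than individual scans keeps each scan:rescan pair intact, which is what a within-subject precision estimate depends on. Comparing observers with linear mixed-effects models, as the study did, would need a statistics package and is not shown here.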
