Why Does Synthesized Data Improve Multi-sequence Classification?

The classification and registration of incomplete multi-modal medical images, such as multi-sequence MRI with missing sequences, can sometimes be improved by replacing the missing modalities with synthetic data. This may seem counter-intuitive: synthetic data is derived from data that is already available, so it does not add new information. Why can it still improve performance? In this paper we discuss possible explanations. If the synthesis model is more flexible than the classifier, the synthesis model can provide features that the classifier could not have extracted from the original data. In addition, using synthetic information to complete incomplete samples increases the size of the training set. We present experiments with two classifiers, linear support vector machines (SVMs) and random forests, together with two synthesis methods that can replace missing data in an image classification problem: neural networks and restricted Boltzmann machines (RBMs). We used data from the BRATS 2013 brain tumor segmentation challenge, which includes multi-modal MRI scans with T1, T1 post-contrast, T2 and FLAIR sequences. The linear SVMs appear to benefit from the complex transformations offered by the synthesis models, whereas the random forests mostly benefit from having more training data. Training on the hidden representation from the RBM brought the accuracy of the linear SVMs close to that of random forests.
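The completion idea described above can be sketched on toy data. This is a minimal illustration only, not the experimental setup of the paper: the data are random stand-ins for two MRI sequences rather than BRATS scans, scikit-learn's `MLPRegressor` is assumed as the neural-network synthesis model, and all variable names (`seq_a`, `seq_b`, `synth`, and so on) are hypothetical.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Toy stand-in for two sequences (assumption, not BRATS data):
# sequence A is always available, sequence B is a nonlinear function of A plus noise.
n = 600
seq_a = rng.normal(size=(n, 4))
seq_b = np.tanh(seq_a @ rng.normal(size=(4, 3))) + 0.1 * rng.normal(size=(n, 3))
labels = (seq_b.sum(axis=1) > 0).astype(int)

# Split into training and test samples, then mark sequence B as missing
# for roughly half of the training samples.
train, test = np.arange(0, 400), np.arange(400, n)
missing = rng.random(train.size) < 0.5
complete, incomplete = train[~missing], train[missing]

# Synthesis model: predict sequence B from sequence A, fitted on complete samples only.
synth = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
synth.fit(seq_a[complete], seq_b[complete])

# Replace the missing sequence B values with synthetic ones.
seq_b_filled = seq_b.copy()
seq_b_filled[incomplete] = synth.predict(seq_a[incomplete])


def features(idx, b):
    """Concatenate sequence A and (real or synthetic) sequence B features."""
    return np.hstack([seq_a[idx], b[idx]])


# Classifier trained on complete samples only.
clf_complete = LinearSVC(max_iter=5000).fit(features(complete, seq_b), labels[complete])
# Classifier trained on all training samples, with synthetic data filling the gaps.
clf_filled = LinearSVC(max_iter=5000).fit(features(train, seq_b_filled), labels[train])

# Evaluate both on test samples that have the real sequence B.
acc_complete = clf_complete.score(features(test, seq_b), labels[test])
acc_filled = clf_filled.score(features(test, seq_b), labels[test])
print(f"complete only: {acc_complete:.3f}, with synthetic completion: {acc_filled:.3f}")
```

Whether the completed training set helps depends on the data and the classifier; the sketch only shows the mechanics of fitting a synthesis model on complete samples and using it to enlarge the training set of a linear classifier.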
