Deep Learning for Medical Image Segmentation

This report provides an overview of current state-of-the-art deep learning architectures and optimisation techniques, and uses the ADNI hippocampus MRI dataset as an example to compare the effectiveness and efficiency of different convolutional architectures on the task of patch-based 3D hippocampal segmentation, which is important in the diagnosis of Alzheimer's disease. We found that a slightly unconventional "stacked 2D" approach provides much better classification performance than simple 2D patches without requiring significantly more computational power. We also examined the popular "tri-planar" approach used in several recently published studies and found that it yields much better results than the 2D approaches, though with a moderate increase in computational cost. Finally, we evaluated a full 3D convolutional architecture and found that it provides marginally better results than the tri-planar approach, but at the cost of a very significant increase in computational requirements.
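To make the four input representations concrete, the following minimal NumPy sketch extracts each patch type around a candidate voxel. The function name, patch size, and slice count are illustrative assumptions rather than the settings used in the experiments, and boundary handling is omitted for brevity.

```python
import numpy as np

def extract_patches(volume, center, size=32, stack=5):
    """Illustrative sketch: extract the four patch types compared in this
    report, centred on the voxel at `center` = (x, y, z) of a 3D MRI
    volume (a NumPy array). `size` and `stack` are assumed values, not
    the report's settings; patches near the volume edge are not handled."""
    x, y, z = center
    h = size // 2

    # Simple 2D: a single in-plane patch through the voxel.
    p2d = volume[x - h:x + h, y - h:y + h, z]

    # Stacked 2D: `stack` adjacent slices treated as input channels,
    # giving the network local through-plane context at little extra cost.
    s = stack // 2
    stacked = volume[x - h:x + h, y - h:y + h, z - s:z + s + 1]

    # Tri-planar: one 2D patch from each orthogonal plane (axial,
    # coronal, sagittal), fed to the network as three channels or
    # three parallel 2D pathways.
    axial    = volume[x - h:x + h, y - h:y + h, z]
    coronal  = volume[x - h:x + h, y, z - h:z + h]
    sagittal = volume[x, y - h:y + h, z - h:z + h]
    triplanar = np.stack([axial, coronal, sagittal], axis=-1)

    # Full 3D: a cubic patch processed with 3D convolutions; kernel cost
    # grows with the third dimension, hence the much higher compute.
    p3d = volume[x - h:x + h, y - h:y + h, z - h:z + h]

    return p2d, stacked, triplanar, p3d
```

In this framing, the stacked-2D and tri-planar inputs both expose out-of-plane context to the network, but only the full 3D patch is convolved volumetrically, which is where the large additional computational cost arises.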
