Efficient Medical Image Parsing

Abstract: Fast and robust detection, segmentation, and tracking of anatomical structures or pathologies support the entire clinical workflow, enabling real-time guidance, quantification, and processing in the operating room. Most state-of-the-art solutions for parsing medical images are based on machine learning methods. While this enables the effective use of large annotated image databases, such techniques typically suffer from inherent limitations related to the efficiency of scanning high-dimensional parametric spaces and to the learning of representative features for modeling the object appearance. In this context, we present Marginal Space Deep Learning, a novel framework for volumetric image parsing that exploits both the strengths of efficient object parametrization in hierarchical marginal spaces and the representational power of state-of-the-art deep learning architectures. The system learns classifiers in clustered, high-probability regions of the parameter space, capturing the appearance of the object under the considered pose transformations and shape variations. It gradually increases the dimensionality of the exploration space from translation (3D), through translation–orientation (6D), to also incorporating anisotropic scaling (9D) and shape variability (ND). At runtime, the system uses the learned classifiers to exhaustively scan these spaces and select the most probable transformation parameters. As this implies a significant computational effort, on the order of billions of scanning hypotheses, we propose cascaded sparse adaptive neural networks, which learn to focus the data sampling patterns of the networks on sparse, context-rich parts of the input, thereby considerably reducing the runtime and increasing the robustness of the system.
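The staged search through marginal spaces can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the one-dimensional position, angle, and scale grids, the hand-written `score` function standing in for a learned classifier, and the rule of keeping the top `k` candidates between stages. It is not the system described above; it only shows why scanning low-dimensional spaces first reduces the number of hypotheses.

```python
# Toy marginal space search: position -> position+angle -> position+angle+scale.
# All grids, the scoring function, and the top-k rule are hypothetical.

TRUE = (12, 40, 1.5)                          # hidden object parameters

POSITIONS = range(0, 32)                      # 32 candidates
ANGLES = range(0, 180, 10)                    # 18 candidates
SCALES = [0.5 + 0.25 * i for i in range(9)]   # 9 candidates


def score(pos, angle=None, scale=None):
    """Stand-in for a learned classifier: higher means more probable.

    Parameters left as None are not yet part of the hypothesis.
    """
    s = -abs(pos - TRUE[0])
    if angle is not None:
        s -= abs(angle - TRUE[1]) / 10
    if scale is not None:
        s -= abs(scale - TRUE[2]) * 4
    return s


def top_k(candidates, k):
    """Keep the k highest-scoring hypotheses."""
    return sorted(candidates, key=lambda c: score(*c), reverse=True)[:k]


def marginal_space_search(k=3):
    evaluated = 0

    # Stage 1: scan positions only.
    stage1 = [(p,) for p in POSITIONS]
    evaluated += len(stage1)
    best = top_k(stage1, k)

    # Stage 2: extend the surviving positions with every angle.
    stage2 = [c + (a,) for c in best for a in ANGLES]
    evaluated += len(stage2)
    best = top_k(stage2, k)

    # Stage 3: extend the surviving (position, angle) pairs with every scale.
    stage3 = [c + (s,) for c in best for s in SCALES]
    evaluated += len(stage3)

    return top_k(stage3, 1)[0], evaluated


result, evaluated = marginal_space_search()
exhaustive = len(POSITIONS) * len(ANGLES) * len(SCALES)
print(result, evaluated, exhaustive)
```

In this toy setting the staged search scores 32 + 3·18 + 3·9 hypotheses, whereas a joint scan of the full grid would score 32·18·9.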
While we show that this method significantly improves on the state of the art, we highlight its main limitation: the learning of the appearance model and the parameter scanning are completely decoupled as independent algorithmic steps. To address this, we take a step toward human-like intelligent parsing, presenting an extension of the system that models the object appearance and the parameter search as a unified behavioral task for an artificial agent. As opposed to exhaustively scanning the parameter space, the system uses reinforcement learning to discover optimal navigation paths that guide the search to the optimal location. We show the initial performance of this approach on the detection of arbitrary landmarks in ultrasound, magnetic resonance, and computed tomography data, with considerable improvement over the state of the art. Our future work is focused on extending this framework to generic image parsing.
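The idea of learning a navigation path instead of scanning every location can be sketched with tabular Q-learning on a toy grid. This is an illustrative sketch under stated assumptions, not the agent described above: the 8x8 grid, the landmark position, the four-move action set, the reward defined as the reduction in distance to the landmark, and the lookup table in place of a neural network are all hypothetical. To keep the example deterministic, the Q-learning update is applied in systematic sweeps over every state-action pair rather than through random exploration.

```python
# Tabular Q-learning for landmark navigation on a toy grid.
# Grid size, target, actions, and reward are hypothetical choices.

GRID = 8                                       # 8x8 "image"
TARGET = (5, 2)                                # landmark; only the reward sees it
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]   # one-pixel moves
GAMMA = 0.9                                    # discount factor
ALPHA = 1.0                                    # full-step update; moves are deterministic


def dist(p):
    """Manhattan distance from p to the landmark."""
    return abs(p[0] - TARGET[0]) + abs(p[1] - TARGET[1])


def step(state, action):
    """Move one pixel, clamped to the grid; reward is the distance reduction."""
    nxt = (min(max(state[0] + action[0], 0), GRID - 1),
           min(max(state[1] + action[1], 0), GRID - 1))
    return nxt, dist(state) - dist(nxt)


def train(sweeps=20):
    states = [(x, y) for x in range(GRID) for y in range(GRID)]
    q = {s: [0.0] * len(ACTIONS) for s in states}
    for _ in range(sweeps):
        for s in states:
            if s == TARGET:                    # terminal state: values stay at zero
                continue
            for a, action in enumerate(ACTIONS):
                nxt, reward = step(s, action)
                # Q-learning target: immediate reward plus discounted best next value.
                target = reward + GAMMA * max(q[nxt])
                q[s][a] += ALPHA * (target - q[s][a])
    return q


def navigate(q, start, max_steps=50):
    """Follow the greedy policy from start until the landmark is reached."""
    path = [start]
    while path[-1] != TARGET and len(path) <= max_steps:
        s = path[-1]
        best = max(range(len(ACTIONS)), key=lambda a: q[s][a])
        path.append(step(s, ACTIONS[best])[0])
    return path


q_table = train()
path = navigate(q_table, (0, 7))
print(path)
```

Starting from the corner `(0, 7)`, the greedy policy visits only the positions along one shortest path to the landmark, in contrast to a scan that would evaluate all 64 grid positions.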
