Enhancing interpretability of automatically extracted machine learning features: application to a RBM‐Random Forest system on brain lesion segmentation

Highlights

- We propose methodologies to enhance the interpretability of a machine learning system.
- The approach yields two levels of interpretability (global and local), allowing us to assess both how the system learned task-specific relations and how it makes individual predictions.
- Validation on brain tumor segmentation and on penumbra estimation in acute stroke.
- In the evaluated clinical scenarios, the proposed approach confirms that the machine learning system learns relations coherent with expert knowledge and annotation protocols.

Abstract

Machine learning systems achieve better performance at the cost of becoming increasingly complex. As a consequence, they become less interpretable, which may cause distrust in the end user. This is especially important as such systems are increasingly introduced into critical domains, such as medicine. Representation learning techniques are general methods for automatic feature computation; nevertheless, they are often regarded as uninterpretable "black boxes". In this paper, we propose a methodology to enhance the interpretability of automatically extracted machine learning features. The proposed system is composed of a Restricted Boltzmann Machine (RBM) for unsupervised feature learning and a Random Forest classifier, which are combined to jointly consider the correlations between imaging data, features, and target variables. We define two levels of interpretation: global and local. The former is devoted to understanding whether the system correctly learned the relevant relations in the data, while the latter focuses on predictions made at the voxel and patient level. In addition, we propose a novel feature importance strategy that considers both imaging data and target variables, and we demonstrate how the approach leverages the interpretability of the learned representation for the task at hand. We evaluated the proposed methodology on brain tumor segmentation and on penumbra estimation in ischemic stroke lesions. We show that the methodology unveils relationships between imaging modalities and the extracted features, as well as their usefulness for the task at hand. In both clinical scenarios, we demonstrate that it enhances the interpretability of automatically learned features, highlighting specific learning patterns that resemble how an expert extracts relevant information from medical images.
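To make the described pipeline concrete, the following is a minimal sketch in Python with scikit-learn, assuming BernoulliRBM as the unsupervised feature learner and RandomForestClassifier as the supervised stage. The patch shapes, hyper-parameters, and placeholder data are illustrative assumptions and do not reproduce the authors' actual implementation or their novel feature importance strategy.

```python
# Minimal sketch (not the authors' implementation): unsupervised feature
# learning with a Restricted Boltzmann Machine, followed by a Random Forest
# classifier, plus a global look at which learned features the forest uses.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Placeholder data: each row is a flattened multi-modal patch around a voxel,
# scaled to [0, 1]; each label marks the central voxel as lesion or background.
X = rng.random((5000, 4 * 5 * 5))          # e.g. 4 MRI modalities, 5x5 patches
y = (rng.random(5000) > 0.7).astype(int)

# Unsupervised stage: the RBM's hidden-unit activations become the
# automatically extracted features.
rbm = BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=20, random_state=0)
H = rbm.fit_transform(X)

# Supervised stage: a Random Forest trained on the learned features.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(H, y)

# Global level: rank learned features by the forest's impurity-based
# importances; rbm.components_[k] maps feature k back to the input patch,
# i.e. to the imaging modalities and locations it responds to.
ranking = np.argsort(forest.feature_importances_)[::-1]
for k in ranking[:5]:
    print(f"feature {k}: importance = {forest.feature_importances_[k]:.3f}")

# Local level: for a single voxel/patch, inspect the class probabilities
# produced from its learned-feature representation.
print(forest.predict_proba(H[:1]))
```

Reading a feature's rbm.components_ row as a weight pattern over the input patch is what would link the forest's importance ranking back to the imaging modalities, which is the kind of global-level interpretation the abstract refers to.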

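The abstract also mentions a feature importance strategy that jointly considers imaging data and target variables. Its exact formulation is not given in this excerpt; as a loose, generic analogue only, one could score each learned feature by its mutual information with the labels and with per-modality intensity summaries, for example via scikit-learn's mutual_info_classif and mutual_info_regression. All names, data, and the equal weighting below are illustrative assumptions, not the paper's actual strategy.

```python
# Generic analogue (assumption): score each learned feature by its mutual
# information with the target labels and with summaries of the imaging data.
import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

rng = np.random.default_rng(0)
H = rng.random((1000, 64))                       # learned (e.g. RBM) features
y = (rng.random(1000) > 0.7).astype(int)         # voxel labels
channels = {f"modality_{i}": rng.random(1000)    # per-voxel intensity summaries
            for i in range(4)}

# Relevance to the target variable.
mi_target = mutual_info_classif(H, y, random_state=0)

# Relevance to the imaging data: average MI of each learned feature with
# each modality's intensity summary.
mi_imaging = np.mean(
    [mutual_info_regression(H, v, random_state=0) for v in channels.values()],
    axis=0)

# Combined score (equal weighting is an arbitrary illustrative choice).
score = 0.5 * mi_target + 0.5 * mi_imaging
print(np.argsort(score)[::-1][:5])   # five features ranked highest by this analogue
```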