On the Understanding and Interpretation of Machine Learning Predictions in Clinical Gait Analysis Using Explainable Artificial Intelligence

Horst and Slijepcevic et al., arXiv:1912.07737v1 [cs.LG], 16 Dec 2019

Systems incorporating artificial intelligence (AI) and machine learning (ML) techniques are increasingly used to guide decision-making in the healthcare sector. While AI-based systems provide powerful and promising results with regard to their classification and prediction accuracy (e.g., in differentiating between different disorders in human gait), most share a central limitation, namely their black-box character. It is difficult, and often impossible, to understand which features classification models learn, whether those features are meaningful, and consequently whether the models' decisions are trustworthy. This severely hampers their applicability as decision-support systems in clinical practice. There is a strong need for AI-based systems to provide transparency and justification of predictions, which are also necessary for ethical and legal compliance. As a consequence, the field of explainable AI (XAI) has gained increasing importance in recent years. XAI focuses on the development of methods that enhance the transparency and interpretability of complex ML models, such as Deep (Convolutional) Neural Networks. The primary aim of this article is to investigate whether XAI methods can enhance the transparency, explainability and interpretability of predictions in automated clinical gait classification. We utilize a dataset comprising bilateral three-dimensional ground reaction force measurements from 132 patients with different lower-body gait disorders and 62 healthy controls. In our experiments, we included several gait classification tasks and employed a representative set of classification methods together with a well-established XAI method, Layer-wise Relevance Propagation (LRP), to explain decisions at the signal (input) level.
The classification results are analyzed, compared and interpreted in terms of classification accuracy and the relevance of input values for specific decisions. The decomposed input relevance information is evaluated from a statistical viewpoint (using Statistical Parametric Mapping) and a clinical viewpoint (by an expert). Our comparison spans three dimensions: (i) different classification tasks, (ii) different classification methods, and (iii) data normalization. The presented approach exemplifies how XAI can be used to understand and interpret state-of-the-art ML models trained for gait classification tasks, and shows that the features considered relevant by machine learning models can be attributed to meaningful and clinically relevant biomechanical gait characteristics.
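To make the notion of "relevance of input values" concrete, the following is a minimal sketch of the LRP epsilon rule on a toy fully connected ReLU network. It is an illustration under stated assumptions, not the authors' implementation: the network size, the random weights, the 101-sample input and the function name `lrp_epsilon` are all hypothetical, and the abstract does not state which LRP rule variant or architecture was used.

```python
import numpy as np

def lrp_epsilon(weights, biases, x, target, eps=1e-6):
    """Propagate the target output score back to the input (LRP epsilon rule).

    weights[i] has shape (n_in, n_out); hidden layers use ReLU, the last layer is linear.
    Returns (input relevance, output scores).
    """
    # Forward pass, storing the input to every layer.
    activations = [x]
    a = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = a @ W + b
        a = z if i == len(weights) - 1 else np.maximum(z, 0.0)
        activations.append(a)

    # Initialise relevance with the score of the class to be explained.
    R = np.zeros_like(activations[-1])
    R[target] = activations[-1][target]

    # Backward pass: redistribute relevance in proportion to each contribution a_i * w_ij.
    for W, b, a in zip(reversed(weights), reversed(biases), reversed(activations[:-1])):
        z = a @ W + b
        z = z + eps * np.where(z >= 0, 1.0, -1.0)  # stabiliser avoids division by zero
        s = R / z
        R = a * (W @ s)
    return R, activations[-1]

# Hypothetical toy setup: one 101-sample signal, 16 hidden units, 2 classes.
rng = np.random.default_rng(0)
weights = [0.1 * rng.standard_normal((101, 16)), 0.1 * rng.standard_normal((16, 2))]
biases = [np.zeros(16), np.zeros(2)]  # zero biases so relevance is conserved
x = rng.standard_normal(101)
target = 0

relevance, scores = lrp_epsilon(weights, biases, x, target)
```

With zero biases and a small `eps`, the input relevances sum (up to the stabiliser) to the explained output score, which is the conservation property that lets each input value be read as a signed share of the decision. In a gait setting, each entry of `relevance` would correspond to one sample of the input signal.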
