An Overview of Computational Approaches for Analyzing Interpretation

It is said that beauty is in the eye of the beholder. But how exactly can we characterize such discrepancies in interpretation? For example, are there specific features of an image that make person A regard it as beautiful while person B finds the same image displeasing? Such questions ultimately aim at explaining our individual ways of interpretation, an aim that has been of fundamental importance to the social sciences from the beginning. More recently, advances in computer science have raised two related questions. First, can computational tools be adopted for analyzing ways of interpretation? Second, what if the "beholder" is a computer model, i.e., how can we explain a computer model's point of view? Numerous efforts have been made on both questions, but many existing approaches focus on particular aspects and remain largely separate from one another. To connect these approaches, this paper introduces a theoretical framework for analyzing interpretation that applies to both human beings and computer models. We give an overview of relevant computational approaches from various fields and discuss the most common and promising application areas. The paper focuses on the interpretation of text and image data, although many of the presented approaches are applicable to other types of data as well.
