Unsupervised Expressive Rules Provide Explainability and Assist Human Experts Grasping New Domains

Approaching new data can be quite daunting: you do not know how your categories of interest are realized in it, there is typically no labeled data at hand, and the performance of domain adaptation methods is unsatisfactory. Aiming to assist domain experts in their first steps into a new task over a new corpus, we present an unsupervised approach that reveals complex rules which cluster the unexplored corpus by its prominent categories (or facets). These rules are human-readable, thus providing an important ingredient that has lately been in short supply: explainability. Each rule provides an explanation for the commonality of all the texts it clusters together. We present an extensive evaluation of the usefulness of these rules in identifying target categories, as well as a user study which assesses their interpretability.
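As a toy illustration of what a human-readable clustering rule might look like, the sketch below groups texts with hand-written conjunctive lexical patterns, where the rule itself serves as the explanation for each cluster. The rule names, the patterns, and the first-match clustering policy are illustrative assumptions; they are not the paper's actual rule language or its unsupervised rule-induction procedure.

```python
# Minimal sketch: a "rule" is assumed to be a conjunction of surface patterns
# that must all match a text (hypothetical rules, not those learned by the paper).
import re
from typing import Dict, List

RULES: Dict[str, List[str]] = {
    "praise":    [r"\b(great|excellent|loved)\b", r"\brecommend\b"],
    "complaint": [r"\b(broken|refund|disappointed)\b"],
}

def matches(rule_patterns: List[str], text: str) -> bool:
    """A text satisfies a rule only if every pattern in the rule matches."""
    return all(re.search(p, text, re.IGNORECASE) for p in rule_patterns)

def cluster(texts: List[str]) -> Dict[str, List[str]]:
    """Group texts by the first rule they satisfy; the matched rule explains
    why the texts in a cluster belong together."""
    clusters: Dict[str, List[str]] = {name: [] for name in RULES}
    for t in texts:
        for name, patterns in RULES.items():
            if matches(patterns, t):
                clusters[name].append(t)
                break
    return clusters

if __name__ == "__main__":
    reviews = [
        "Great phone, excellent battery, I recommend it.",
        "Arrived broken, I want a refund.",
    ]
    for name, members in cluster(reviews).items():
        print(name, "->", members)
```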
