Minimally-supervised learning of domain-specific causal relations using an open-domain corpus as knowledge base

We propose a novel framework for extracting causal relations from domain-specific texts. Our technique is minimally-supervised, which alleviates the need for expensive, manually-annotated training data. As our main contribution, we show that open-domain corpora can be exploited as knowledge bases to overcome the data sparsity that hampers domain-specific relation extraction, and that doing so yields substantial performance gains. We also address longstanding challenges of existing minimally-supervised approaches. To suppress the negative impact of semantic drift, we propose a technique based on the Latent Relational Hypothesis. In addition, our approach discovers both explicit (e.g. "to cause") and implicit (e.g. "to destroy") causal patterns and relations. Unlike existing minimally-supervised techniques, we adopt a principled seed selection strategy, which enables us to discover a more diverse set of causal patterns and relations. Our experiments show that our approach outperforms a state-of-the-art baseline in discovering causal relations from a real-life, domain-specific corpus.
