Incremental cue phrase learning and bootstrapping method for causality extraction using cue phrase and word pair probabilities

This work aims to extract causal relations that hold between noun phrases. Some causal relations are signaled by lexical patterns such as causal verbs and their subcategorization frames. We use these lexical patterns as a filter to find causality candidates, and we recast causality extraction as a binary classification problem over those candidates. To solve it, we introduce probabilities for word pairs and concept pairs that may occur in causal noun phrase pairs, together with a cue phrase probability that estimates how likely a cue phrase is to be a causality pattern. These probabilities are learned from a raw corpus in an unsupervised manner. With this probabilistic model, both precision and recall increase. Our causality extraction achieves an F-score of 77.37%, an improvement of 21.14 percentage points over the baseline model. Long-distance causal relations are extracted using binary-tree-style cue phrases. We also propose an incremental cue phrase learning method based on a cue phrase confidence score that is measured after each learning step of the causal classifier. Recall improves by 15.37 percentage points after cue phrase learning.
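The abstract does not give the exact form of the probabilistic model, so the following is only a minimal sketch of how cue phrase and word pair probabilities could be combined for binary classification, assuming a Naive Bayes-style log-odds score. The function names, the probability tables, the `floor` fallback, and all numeric values are hypothetical and are not taken from the paper.

```python
import math
from itertools import product


def causal_log_odds(cue, cause_words, effect_words,
                    cue_prob, pair_prob, prior=0.5, floor=1e-3):
    """Naive Bayes-style log-odds that a candidate is causal (illustrative only).

    cue_prob[cue] = (P(cue | causal), P(cue | non-causal))
    pair_prob[(w_cause, w_effect)] = (P(pair | causal), P(pair | non-causal))
    Unseen entries fall back to `floor` for both classes, so they add no evidence.
    """
    # Start from the prior odds of the causal class.
    score = math.log(prior) - math.log(1.0 - prior)

    # Evidence from the cue phrase that links the two noun phrases.
    p_causal, p_non = cue_prob.get(cue, (floor, floor))
    score += math.log(p_causal) - math.log(p_non)

    # Evidence from every (cause word, effect word) pair.
    for pair in product(cause_words, effect_words):
        p_causal, p_non = pair_prob.get(pair, (floor, floor))
        score += math.log(p_causal) - math.log(p_non)
    return score


def is_causal(cue, cause_words, effect_words, cue_prob, pair_prob):
    """Binary decision: causal when the log-odds are strictly positive."""
    return causal_log_odds(cue, cause_words, effect_words,
                           cue_prob, pair_prob) > 0.0


# Hypothetical probability tables; in the paper these are learned from a raw corpus.
CUE_PROB = {"cause": (0.30, 0.05), "follow": (0.05, 0.20)}
PAIR_PROB = {("smoking", "cancer"): (0.02, 0.001),
             ("meeting", "lunch"): (0.001, 0.01)}
```

With these made-up tables, `is_causal("cause", ["smoking"], ["cancer"], CUE_PROB, PAIR_PROB)` accepts the candidate, while the "follow" candidate built from "meeting" and "lunch" is rejected. A candidate with an unseen cue phrase and unseen word pairs contributes no evidence and stays at the prior.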
