Multi level causal relation identification using extended features

Extracting causal relations from natural language text is an important problem in knowledge discovery. Most previous studies of causal relation extraction focus on simple cases, such as causal relations between two noun phrases indicated by fixed verbs or prepositions. For more complicated causal relations, such as those between clauses, previously developed algorithms may not work. To address this problem, this paper develops a system that extracts causal relations from multi-level language expressions, such as words, phrases, and clauses, without relying on fixed relators. The information extraction system is composed of a multi-level relation extractor and an ensemble-based relation classifier. It can extract more subtypes of causal relations than previous work because the extraction domain is expanded in terms of both syntactic expressions and semantic meanings. In addition, the proposed method outperforms previously developed methods because it exploits extended features based on lexical semantic resources. Experiments show that our system achieves an accuracy of 88.69% and an F-score of 0.6637 on a dataset of 300 sentences.
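The two reported metrics measure different things, which is why an accuracy near 0.89 can coexist with an F-score near 0.66. The following is a minimal sketch of how both are computed from a confusion matrix. The counts are hypothetical values chosen only for illustration and are not the paper's data; the sketch assumes the F-score is the balanced F1 computed on the causal (positive) class, which the abstract does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for a binary causal / non-causal decision."""
    tp: int  # causal pairs correctly labelled causal
    fp: int  # non-causal pairs wrongly labelled causal
    fn: int  # causal pairs that were missed
    tn: int  # non-causal pairs correctly rejected


def accuracy(c: Confusion) -> float:
    """Fraction of all decisions that are correct."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)


def precision(c: Confusion) -> float:
    """Fraction of predicted causal pairs that are truly causal."""
    predicted = c.tp + c.fp
    return c.tp / predicted if predicted else 0.0


def recall(c: Confusion) -> float:
    """Fraction of truly causal pairs that were found."""
    actual = c.tp + c.fn
    return c.tp / actual if actual else 0.0


def f_score(c: Confusion) -> float:
    """Balanced F1: harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts (NOT from the paper): 170 causal pairs among 1000 candidates.
counts = Confusion(tp=100, fp=30, fn=70, tn=800)

print(f"accuracy = {accuracy(counts):.4f}")
print(f"F-score  = {f_score(counts):.4f}")
```

Because true negatives count toward accuracy but not toward the F-score, a dataset dominated by non-causal candidates tends to yield a high accuracy even when many causal pairs are missed.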
