Embedding Semantic Similarity in Tree Kernels for Domain Adaptation of Relation Extraction

Relation Extraction (RE) is the task of extracting semantic relationships between entities in text. Most recent work on relation extraction is supervised. The clear drawback of supervised methods is the need for training data: labeled data is expensive to obtain, and the training data often differs from the data the system will be applied to. This mismatch is the problem of domain adaptation. In this paper, we propose to combine (i) term generalization approaches such as word clustering and latent semantic analysis (LSA) and (ii) structured kernels to improve the adaptability of relation extractors to new text genres and domains. The empirical evaluation on the ACE 2005 domains shows that a suitable combination of syntax and lexical generalization is very promising for domain adaptation.
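To make the idea of combining lexical generalization with a structured kernel concrete, the following is a minimal sketch and not the paper's implementation. It assumes a simplified subset-tree kernel over constituency trees in which the usual exact word match at the leaves is replaced by a word-similarity function. The tree encoding (nested tuples), the names `tree_kernel` and `cluster_similarity`, the decay parameter `lam`, and the toy cluster table are all hypothetical and chosen only for illustration; an LSA-based variant would swap in a cosine similarity between word vectors.

```python
def nodes(tree):
    """Return all internal nodes of a tree given as nested tuples (label, child, ...)."""
    if isinstance(tree, str):
        return []
    found = [tree]
    for child in tree[1:]:
        found.extend(nodes(child))
    return found


def is_preterminal(node):
    """A preterminal is a POS node with exactly one word as its child."""
    return len(node) == 2 and isinstance(node[1], str)


def production(node):
    """The grammar production at a node: its label plus the labels of its children."""
    return (node[0],) + tuple(c if isinstance(c, str) else c[0] for c in node[1:])


def delta(n1, n2, sim, lam):
    """Weighted count of matching fragments rooted at n1 and n2, with soft word matching."""
    if is_preterminal(n1) and is_preterminal(n2):
        # Same POS tag required; the words are compared with the similarity function.
        return lam * sim(n1[1], n2[1]) if n1[0] == n2[0] else 0.0
    if is_preterminal(n1) or is_preterminal(n2):
        return 0.0
    if production(n1) != production(n2):
        return 0.0
    score = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        score *= 1.0 + delta(c1, c2, sim, lam)
    return score


def tree_kernel(t1, t2, sim, lam=1.0):
    """Sum delta over all pairs of internal nodes of the two trees."""
    return sum(delta(a, b, sim, lam) for a in nodes(t1) for b in nodes(t2))


def exact_match(w1, w2):
    """Baseline: words match only if they are identical."""
    return 1.0 if w1 == w2 else 0.0


def cluster_similarity(clusters):
    """Build a similarity that also matches words sharing a (hypothetical) cluster id."""
    def sim(w1, w2):
        if w1 == w2:
            return 1.0
        c1, c2 = clusters.get(w1), clusters.get(w2)
        return 1.0 if c1 is not None and c1 == c2 else 0.0
    return sim


# Toy data: two noun phrases that differ in one word, and an assumed cluster table.
source_tree = ("NP", ("DT", "the"), ("NN", "company"))
target_tree = ("NP", ("DT", "the"), ("NN", "firm"))
toy_clusters = {"company": "0110", "firm": "0110"}

hard_score = tree_kernel(source_tree, target_tree, exact_match)
soft_score = tree_kernel(source_tree, target_tree, cluster_similarity(toy_clusters))
```

In this toy setting the cluster-based similarity lets "company" and "firm" contribute to the kernel even though the strings differ, so `soft_score` exceeds `hard_score`. This is the kind of effect that lexical generalization is meant to provide when the target domain uses vocabulary not seen in training.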
