In-domain Relation Discovery with Meta-constraints via Posterior Regularization

We present a novel approach to discovering relations and their instantiations from a collection of documents in a single domain. Our approach learns relation types by exploiting meta-constraints that characterize the general qualities of a good relation in any domain. These constraints state that instances of a single relation should exhibit regularities at multiple levels of linguistic structure, including lexicography, syntax, and document-level context. We capture these regularities via the structure of our probabilistic model as well as a set of declaratively specified constraints enforced during posterior inference. Across two domains, our approach successfully recovers hidden relation structure, comparable to or outperforming previous state-of-the-art approaches. Furthermore, we find that a small set of constraints is applicable across the domains, and that using domain-specific constraints can further improve performance.

Regina Barzilay | Tahira Naseem | Harr Chen | Edward Benson
