CoRE: A Context-Aware Relation Extraction Method for Relation Completion

We identify relation completion (RC) as one recurring problem that is central to the success of novel big data applications such as Entity Reconstruction and Data Enrichment. Given a semantic relation ℜ, RC attempts at linking entity pairs between two entity lists under the relation ℜ. To accomplish the RC goals, we propose to formulate search queries for each query entity α based on some auxiliary information, so that to detect its target entity β from the set of retrieved documents. For instance, a pattern-based method (PaRE) uses extracted patterns as the auxiliary information in formulating search queries. However, high-quality patterns may decrease the probability of finding suitable target entities. As an alternative, we propose CoRE method that uses context terms learned surrounding the expression of a relation as the auxiliary information in formulating queries. The experimental results based on several real-world web data collections demonstrate that CoRE reaches a much higher accuracy than PaRE for the purpose of RC.

[1]  Surajit Chaudhuri,et al.  What next?: a half-dozen data management research goals for big data and the cloud , 2012, PODS.

[2]  Marta Indulska,et al.  WebPut: Efficient Web-Based Data Imputation , 2012, WISE.

[3]  Philip S. Yu,et al.  Truth Discovery with Multiple Conflicting Information Providers on the Web , 2007, IEEE Transactions on Knowledge and Data Engineering.

[4]  Subbarao Kambhampati,et al.  SMARTINT: using mined attribute dependencies to integrate fragmented web databases , 2011, Journal of Intelligent Information Systems.

[5]  ChengXiang Zhai,et al.  Positional relevance model for pseudo-relevance feedback , 2010, SIGIR.

[6]  Nanda Kambhatla,et al.  Combining Lexical, Syntactic, and Semantic Features with Maximum Entropy Models for Information Extraction , 2004, ACL.

[7]  Satoshi Sekine,et al.  Preemptive Information Extraction using Unrestricted Relation Discovery , 2006, NAACL.

[8]  Hans-Peter Kriegel,et al.  A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise , 1996, KDD.

[9]  Vincent Kanade,et al.  Clustering Algorithms , 2021, Wireless RF Energy Transfer in the Massive IoT Era.

[10]  Luis Gravano,et al.  Snowball: extracting relations from large plain-text collections , 2000, DL '00.

[11]  Xiaoyong Du,et al.  AML: Efficient Approximate Membership Localization within a Web-Based Join Framework , 2013, IEEE Transactions on Knowledge and Data Engineering.

[12]  Laurianne Sitbon,et al.  Learning-based relevance feedback for web-based relation completion , 2011, CIKM '11.

[13]  Stephen E. Robertson,et al.  Okapi at TREC-4 , 1995, TREC.

[14]  Christopher D. Manning,et al.  Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling , 2005, ACL.

[15]  Clement T. Yu,et al.  T-verifier: Verifying truthfulness of fact statements , 2011, 2011 IEEE 27th International Conference on Data Engineering.

[16]  Daniel Jurafsky,et al.  Distant supervision for relation extraction without labeled data , 2009, ACL.

[17]  Eric Crestan,et al.  Web-Scale Distributional Similarity and Entity Set Expansion , 2009, EMNLP.

[18]  Stephen E. Robertson,et al.  Okapi at TREC-3 , 1994, TREC.

[19]  John D. Lafferty,et al.  A study of smoothing methods for language models applied to Ad Hoc information retrieval , 2001, SIGIR '01.

[20]  Oren Etzioni,et al.  Open Information Extraction from the Web , 2007, CACM.

[21]  Estevam R. Hruschka,et al.  Toward an Architecture for Never-Ending Language Learning , 2010, AAAI.

[22]  Alessandro Moschitti,et al.  End-to-End Relation Extraction Using Distant Supervision from External Semantic Repositories , 2011, ACL.

[23]  Ralph Grishman,et al.  Extracting Relations with Integrated Information Using Kernel Methods , 2005, ACL.

[24]  Marc Moens,et al.  Named Entity Recognition without Gazetteers , 1999, EACL.

[25]  Jaana Kekäläinen,et al.  Cumulated gain-based evaluation of IR techniques , 2002, TOIS.

[26]  William W. Cohen,et al.  Automatic Set Expansion for List Question Answering , 2008, EMNLP.

[27]  William W. Cohen,et al.  Iterative Set Expansion of Named Entities Using the Web , 2008, 2008 Eighth IEEE International Conference on Data Mining.