Finding additional semantic entity information for search engines

Entity-oriented search has become an essential component of modern search engines. It focuses on retrieving a list of entities, or information about specific entities, instead of documents. In this paper, we study the problem of finding entity-related information, referred to as attribute-value pairs, which plays a significant role in searching for target entities. We propose a novel decomposition framework that combines reduced relations with a discriminative model, the Conditional Random Field (CRF), to automatically find entity-related attribute-value pairs in free-text documents. This decomposition framework allows us to locate potential text fragments and identify their hidden semantics, in the form of attribute-value pairs, for user queries. Empirical analysis shows that the decomposition framework outperforms pattern-based approaches because it effectively integrates syntactic and semantic features.
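To make the sequence-labeling idea in the abstract concrete, the following is a minimal sketch of how a linear-chain model of the CRF kind can tag tokens as attribute or value and then read off attribute-value pairs. It is not the paper's framework: the label set (`O`, `ATTR`, `VALUE`), the features, the lexicons, and every weight are hypothetical and hand-set for illustration, whereas a real CRF would learn its weights from annotated data. The reduced-relation stage that locates candidate text fragments is not modeled here.

```python
# Illustrative sketch only: a hand-weighted linear-chain scorer decoded with
# Viterbi. Labels, features, lexicons, and weights are assumptions and are
# not taken from the paper.

LABELS = ["O", "ATTR", "VALUE"]

# Hypothetical emission weights keyed by (feature, label).
EMISSION = {
    ("is_number", "VALUE"): 2.0,
    ("is_unit", "VALUE"): 1.5,
    ("in_attr_lexicon", "ATTR"): 2.0,
    ("is_other", "O"): 1.0,
}

# Hypothetical transition weights keyed by (previous label, current label).
TRANSITION = {
    ("ATTR", "VALUE"): 0.5,
    ("VALUE", "VALUE"): 1.0,
    ("O", "O"): 0.5,
    ("ATTR", "O"): 0.5,
}

# Hypothetical lexicons standing in for learned lexical features.
ATTR_LEXICON = {"height", "population", "weight", "price"}
UNITS = {"metres", "meters", "kg", "dollars"}


def features(token):
    """Return the names of the features that fire for a token."""
    t = token.lower()
    feats = []
    if t.replace(",", "").replace(".", "").isdigit():
        feats.append("is_number")
    if t in UNITS:
        feats.append("is_unit")
    if t in ATTR_LEXICON:
        feats.append("in_attr_lexicon")
    if not feats:
        feats.append("is_other")
    return feats


def emission_score(token, label):
    """Sum the weights of all features that fire for this token and label."""
    return sum(EMISSION.get((f, label), 0.0) for f in features(token))


def viterbi(tokens):
    """Return the highest-scoring label sequence for the tokens."""
    if not tokens:
        return []
    best = [{lab: emission_score(tokens[0], lab) for lab in LABELS}]
    back = []
    for tok in tokens[1:]:
        scores, pointers = {}, {}
        for lab in LABELS:
            # Best previous label for reaching `lab` at this position.
            prev = max(
                LABELS,
                key=lambda p: best[-1][p] + TRANSITION.get((p, lab), 0.0),
            )
            scores[lab] = (
                best[-1][prev]
                + TRANSITION.get((prev, lab), 0.0)
                + emission_score(tok, lab)
            )
            pointers[lab] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(LABELS, key=lambda lab: best[-1][lab])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def extract_pairs(tokens):
    """Pair each ATTR token with the VALUE tokens that follow it."""
    pairs, attr, value = [], None, []
    for tok, lab in zip(tokens, viterbi(tokens)):
        if lab == "ATTR":
            if attr and value:
                pairs.append((attr, " ".join(value)))
            attr, value = tok, []
        elif lab == "VALUE" and attr:
            value.append(tok)
    if attr and value:
        pairs.append((attr, " ".join(value)))
    return pairs


tokens = "The height of the tower is 324 metres".split()
print(extract_pairs(tokens))
```

With these hand-set weights, the example sentence yields the single pair `("height", "324 metres")`. The transition weights are what let the model keep "324" and "metres" together as one value, which is the kind of context a purely pattern-based matcher has to encode by hand.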

Richi Nayak | Jinglan Zhang | Jun Hou
