Paper information - Entity-Oriented Search

Entity-Oriented Search

Abstract         dbo:abstract               The first lines of the Wikipedia article
Categories       dc:subject                 Wikipedia categories assigned to the article
Disambiguation   dbo:wikiPageDisambiguates  Disambiguation links
External links   dbo:wikiPageExternalLink   Links to external web pages
Geo-coordinates  georss:point               Geographical coordinates
Homepage         foaf:homepage              Link to the official homepage of an instance
Image            foaf:depiction             Link to the first image on the Wikipedia page
Label            rdfs:label                 The page title of the Wikipedia article
Page links       dbo:wikiPageWikiLink       Links to other Wikipedia articles
Redirect         dbo:wikiPageRedirects      Wikipedia page to redirect to

See Table 2.4 for the URI prefixes.

[...] them deviate further from the regular extractors in that they aggregate data from all Wikipedia pages as opposed to operating on a single article. The resulting datasets include grammatical gender (for entities of type person), lexicalizations (alternative names for entities and concepts), topic signatures (strongest related terms), and thematic concepts (the main subject entities/concepts for Wikipedia categories).

2.3.2.3 Datasets and Resources

The output of each DBpedia extractor, for each language, is made available as a separate dataset. All datasets are provided in two serializations: as Turtle (N-triples) and as Turtle quads (N-Quads, which include context). The datasets can be divided into the following categories:

• DBpedia Ontology: The latest version of the ontology that was used while extracting all datasets.
• Core datasets: All infobox-based and specific feature extractors (including the ones listed in Table 2.3) belong here.
• Links to other datasets: DBpedia is interlinked with a large number of knowledge bases. The datasets in this group provide links to external resources both on the instance level (owl:sameAs), e.g., to Freebase and YAGO, and on the schema level (owl:equivalentClass and owl:equivalentProperty), most notably to schema.org.
• NLP datasets: This last group corresponds to the output of the statistical extractors.

Namespaces and Internationalization

The generic DBpedia URI namespaces are listed in the upper block of Table 2.4. As part of the internationalization efforts, some datasets are available both in localized and in canonicalized version.
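To make the difference between the two serializations concrete, here is a minimal Python sketch, not DBpedia's own tooling. It splits one statement into subject, predicate, object, and an optional fourth context term (present only in N-Quads), and then abbreviates the predicate with a namespace prefix. The names `parse_statement`, `shorten`, and `PREFIXES` are hypothetical, and the two statements about Oslo are made up for illustration. Table 2.4 is not reproduced here, so the prefix URIs below are assumed to be the commonly used ones. The regular expression covers only the common term forms, not the full grammar.

```python
import re

# Assumed prefix expansions (Table 2.4 is not available here).
PREFIXES = {
    "dbo": "http://dbpedia.org/ontology/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

# One RDF term: <IRI>, _:blank node, or "literal" with optional @lang / ^^<datatype>.
TERM = re.compile(
    r'<[^>]*>|_:\S+|"(?:[^"\\]|\\.)*"(?:@[A-Za-z0-9-]+|\^\^<[^>]*>)?'
)


def parse_statement(line):
    """Split an N-Triples or N-Quads line into (subject, predicate, object, context).

    The context is None for a triple and the fourth term for a quad.
    """
    body = line.strip()
    if not body.endswith("."):
        raise ValueError("statement must end with '.'")
    terms = TERM.findall(body[:-1])
    if len(terms) == 3:
        return (terms[0], terms[1], terms[2], None)
    if len(terms) == 4:
        return tuple(terms)
    raise ValueError(f"expected 3 or 4 terms, found {len(terms)}")


def shorten(iri_term):
    """Rewrite <full IRI> as prefix:local when a known prefix matches."""
    iri = iri_term.strip("<>")
    for prefix, namespace in PREFIXES.items():
        if iri.startswith(namespace):
            return f"{prefix}:{iri[len(namespace):]}"
    return iri_term  # unknown namespace: leave the term untouched


# Illustrative statements (made up for this sketch).
TRIPLE = (
    '<http://dbpedia.org/resource/Oslo> '
    '<http://www.w3.org/2000/01/rdf-schema#label> "Oslo"@en .'
)
QUAD = (
    '<http://dbpedia.org/resource/Oslo> '
    '<http://dbpedia.org/ontology/wikiPageWikiLink> '
    '<http://dbpedia.org/resource/Norway> '
    '<http://en.wikipedia.org/wiki/Oslo> .'
)

if __name__ == "__main__":
    for statement in (TRIPLE, QUAD):
        s, p, o, context = parse_statement(statement)
        print(shorten(p), o, context)
```

For the triple, the context comes back as `None`; for the quad, it is the fourth term. For real dataset dumps, a dedicated RDF parser is a safer choice than a regular expression.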

Krisztian Balog | K. Balog
