FRanCo - A Ground Truth Corpus for Fact Ranking Evaluation

The vast amount of information on the Web makes it challenging to identify the most important facts. Many fact ranking algorithms have emerged; however, there is so far no general-domain, objective gold standard that could serve as an evaluation benchmark for comparing such systems. We present FRanCo, a ground truth for fact ranking acquired using crowdsourcing. The corpus is built on a representative DBpedia sample of 541 entities and is made freely available. We have published both the aggregated and the raw data collected, including identified nonsense statements that contribute to improving data
