Domain-specific hierarchical subgraph extraction: A recommendation use case

Hierarchical relationships play a key role in knowledge graphs. In particular, large and well-known knowledge graphs such as DBpedia contain a significant number of facts expressed with hierarchical relationships compared with other types of relationships. These hierarchical relationships are extensively harnessed by applications such as personalization, question answering, and recommendation systems. However, the large number of facts with hierarchical relationships makes these applications computationally intensive. Additionally, an application can be domain-specific and may require only the hierarchical facts that are specific to its domain rather than all those available. In this paper, we present an approach to extract a domain-specific hierarchical subgraph from a large knowledge graph by identifying the domain-specificity of the categories in the hierarchy. Given a domain, the domain-specificity of each category is determined by combining different types of evidence using a probabilistic framework. We show the effectiveness of our approach with a recommendation use case for the movie and book domains. Our evaluation demonstrates that the domain-specific hierarchical subgraphs extracted by our approach can reduce the baseline subgraph by 40% to 50% without compromising the accuracy of the recommendations. Furthermore, the presented approach outperforms the recommendation results obtained with a state-of-the-art domain-specific subgraph extraction technique that uses supervised learning.
