Querying Knowledge Graphs by Example Entity Tuples

We witness an unprecedented proliferation of knowledge graphs that record millions of entities and their relationships. While knowledge graphs are structure-flexible and content-rich, they are difficult to use. The challenge lies in the gap between their overwhelming complexity and the limited database knowledge of non-professional users. If writing structured queries over "simple" tables is difficult, complex graphs are even harder to query. As an initial step toward improving the usability of knowledge graphs, we propose to query such data by example entity tuples, without requiring users to form complex graph queries. Our system, Graph Query By Example (GQBE), automatically discovers a weighted hidden maximum query graph based on input query tuples, to capture a user's query intent. It then efficiently finds and ranks the top approximate matching answer graphs and answer tuples. We conducted experiments and user studies on the large Freebase and DBpedia datasets and observed appealing accuracy and efficiency. Our system provides a complementary approach to the existing keyword-based methods, facilitating user-friendly graph querying. To the best of our knowledge, no such approach had previously been proposed in the context of graphs.
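To make the query-by-example idea concrete, the following is a minimal sketch on a toy graph. It is not the GQBE algorithm: it does not build a weighted maximum query graph, and it substitutes a simple count of shared edges for the system's ranking. The triples, the function name `query_by_example`, and the scoring rule are all assumptions made for illustration.

```python
from collections import defaultdict

# Toy knowledge graph as (subject, predicate, object) triples.
# All data below is illustrative, not taken from Freebase or DBpedia.
TRIPLES = [
    ("Jerry Yang", "founded", "Yahoo!"),
    ("Jerry Yang", "education", "Stanford"),
    ("Yahoo!", "headquartered_in", "Sunnyvale"),
    ("Sergey Brin", "founded", "Google"),
    ("Sergey Brin", "education", "Stanford"),
    ("Google", "headquartered_in", "Mountain View"),
    ("Bill Gates", "founded", "Microsoft"),
    ("Microsoft", "headquartered_in", "Redmond"),
    ("Tim Cook", "employed_by", "Apple"),
]


def query_by_example(triples, example):
    """Rank entity pairs that resemble the example pair (hypothetical scoring)."""
    a, b = example
    out_edges = defaultdict(set)   # entity -> {(predicate, object)}
    pair_preds = defaultdict(set)  # (subject, object) -> {predicate}
    for s, p, o in triples:
        out_edges[s].add((p, o))
        pair_preds[(s, o)].add(p)

    # "Query graph" stand-in: edges linking the example entities, plus context.
    direct = pair_preds.get((a, b), set())
    a_context = {(p, o) for p, o in out_edges.get(a, set()) if o != b}
    b_context = {p for p, _ in out_edges.get(b, set())}

    ranked = []
    for (x, y), preds in pair_preds.items():
        if (x, y) == (a, b):
            continue  # do not return the example itself
        shared = direct & preds
        if not shared:
            continue  # a candidate must share a direct relationship
        score = len(shared)
        # Approximate matching: context edges add to the score when present.
        score += len(a_context & out_edges.get(x, set()))
        score += len(b_context & {p for p, _ in out_edges.get(y, set())})
        ranked.append(((x, y), score))

    # Highest score first; ties broken alphabetically for determinism.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


if __name__ == "__main__":
    for pair, score in query_by_example(TRIPLES, ("Jerry Yang", "Yahoo!")):
        print(pair, score)
```

Given the example tuple (Jerry Yang, Yahoo!), the sketch ranks (Sergey Brin, Google) above (Bill Gates, Microsoft), because the former also shares the Stanford education edge. The user supplies only an example tuple and never writes a structured graph query, which is the usability point the abstract makes.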
