The use of a graph‐based system to improve bibliographic information retrieval: System design, implementation, and evaluation

In this article, we propose GIBIR, a graph-based interactive bibliographic information retrieval system that provides an effective way to retrieve bibliographic information. The system represents bibliographic information as networks and provides a form-based query interface. Users develop their queries interactively by referring to the graph queries that the system generates. The system can effectively answer complex queries such as "papers on information retrieval that were cited by John's papers presented at SIGIR." To evaluate GIBIR, we developed a second bibliographic information retrieval system, built on a relational database, with the same interface and functions. Experimental results show that GIBIR executes the same queries much faster than the relational database-based system: on average, it reduces execution time by 72% for 3-node queries, 89% for 4-node queries, and 99% for 5-node queries.
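To make the example query concrete, the following minimal Python sketch shows how a query of this shape could be answered by following labeled edges in a small in-memory network. It is an illustration only, not the GIBIR implementation: the edge labels (`written_by`, `presented_at`, `cites`, `has_topic`), the paper identifiers, the toy data, and the function `cited_topic_papers` are all assumptions introduced here.

```python
from collections import defaultdict

# Hypothetical toy network: (source, edge label, target).
# All identifiers and labels are illustrative assumptions.
edges = [
    ("p1", "written_by", "John"),
    ("p1", "presented_at", "SIGIR"),
    ("p1", "cites", "p2"),
    ("p1", "cites", "p3"),
    ("p2", "has_topic", "information retrieval"),
    ("p3", "has_topic", "databases"),
    ("p4", "written_by", "John"),
    ("p4", "presented_at", "KDD"),
    ("p4", "cites", "p5"),
    ("p5", "has_topic", "information retrieval"),
]

# Adjacency indexes in both directions, keyed by (node, edge label).
out = defaultdict(set)  # (source, label) -> set of targets
inc = defaultdict(set)  # (target, label) -> set of sources
for source, label, target in edges:
    out[(source, label)].add(target)
    inc[(target, label)].add(source)


def cited_topic_papers(author, venue, topic):
    """Papers on `topic` cited by `author`'s papers presented at `venue`."""
    # Step 1: papers written by the author and presented at the venue.
    citing = inc[(author, "written_by")] & inc[(venue, "presented_at")]
    # Step 2: follow "cites" edges from those papers.
    cited = set()
    for paper in citing:
        cited |= out[(paper, "cites")]
    # Step 3: keep only the cited papers that carry the requested topic.
    return sorted(p for p in cited if topic in out[(p, "has_topic")])


print(cited_topic_papers("John", "SIGIR", "information retrieval"))  # ['p2']
```

Each step follows edges from nodes that have already been matched, so no table-wide join is needed. In a relational schema, the same query would typically require joining paper, authorship, venue, citation, and topic tables.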
