Semplore: A scalable IR approach to search the Web of Data

The Web of Data keeps growing rapidly. However, fully exploiting this large amount of structured data faces numerous challenges, including usability, scalability, imprecise information needs, and data change. We present Semplore, an IR-based system that aims to address these issues. Semplore supports intuitive faceted search and complex queries over both text and structured data. It combines imprecise keyword search and precise structured queries in a unified ranking scheme. Scalable query processing is supported by leveraging the inverted indexes traditionally used in IR systems. These are combined with a novel block-based index structure that supports efficient index updates when data changes. The experimental results show that Semplore is an efficient and effective system for searching the Web of Data and can serve as a basic infrastructure for Web-scale Semantic Web search engines.
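The general idea of answering hybrid queries with inverted indexes can be illustrated with a minimal sketch. It is not Semplore's actual index layout: the class `ToyHybridIndex`, its methods, and the sample entities are hypothetical. It assumes that keywords and (predicate, value) pairs are both treated as index terms with posting lists of entity ids, so that a hybrid query reduces to a posting-list intersection. Ranking and the block-based update structure are not modeled.

```python
from collections import defaultdict


class ToyHybridIndex:
    """Toy inverted index over text and structured attributes (illustrative only)."""

    def __init__(self):
        # Index term -> set of entity ids (the posting list).
        self.postings = defaultdict(set)
        # Entity id -> {predicate: value}, kept only to compute facet counts.
        self.attributes = {}

    def add(self, entity_id, text, attributes):
        # Keywords and structured (predicate, value) pairs share one term space.
        for word in text.lower().split():
            self.postings[("kw", word)].add(entity_id)
        for predicate, value in attributes.items():
            self.postings[("attr", predicate, value)].add(entity_id)
        self.attributes[entity_id] = dict(attributes)

    def search(self, keywords, constraints):
        # Every keyword and every constraint must match (conjunctive query).
        keys = [("kw", w.lower()) for w in keywords]
        keys += [("attr", p, v) for p, v in constraints.items()]
        if not keys:
            return set()
        # Intersect the shortest posting lists first to keep intermediates small.
        lists = sorted((self.postings.get(k, set()) for k in keys), key=len)
        result = set(lists[0])
        for posting in lists[1:]:
            result &= posting
        return result

    def facet_counts(self, entity_ids, predicate):
        # Count the values of one predicate among the current result set.
        counts = defaultdict(int)
        for entity_id in entity_ids:
            value = self.attributes.get(entity_id, {}).get(predicate)
            if value is not None:
                counts[value] += 1
        return dict(counts)


index = ToyHybridIndex()
index.add("e1", "Semantic search engine", {"type": "Paper", "year": "2003"})
index.add("e2", "Faceted search for images", {"type": "Paper", "year": "2003"})
index.add("e3", "Search engine company", {"type": "Organization"})

hits = index.search(["search"], {"type": "Paper"})
year_facet = index.facet_counts(hits, "year")
```

Here `hits` contains the entities that match the keyword "search" and have type "Paper", and `year_facet` gives the counts a faceted interface could display for refining the result by year.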
