A Query Formulation Language for the Data Web

We present a query formulation language, called MashQL, for easily querying and fusing structured data on the web. The main novelty of MashQL is that it allows people with limited IT skills to explore and query one or more data sources without prior knowledge of the schema, structure, vocabulary, or any technical details of these sources. More importantly, to be robust and cover most cases in practice, we do not assume that a data source has a schema, whether offline or inline. This poses several language-design and performance complexities that we fundamentally tackle. To illustrate the query formulation power of MashQL, and without loss of generality, we chose the Data Web scenario. We also chose to query RDF, as it is the most primitive data model; hence, MashQL can be similarly used for querying relational databases and XML. We present two implementations of MashQL: an online mashup editor and a Firefox add-on. The former illustrates how MashQL can be used to query and mash up the Data Web as simply as filtering and piping web feeds; the latter illustrates using the browser as a web composer rather than only a navigator. Finally, we evaluate MashQL on querying two datasets, DBLP and DBpedia, and show that our indexing techniques allow instant user interaction.
