Towards a hybrid relational and XML benchmark for loosely-coupled distributed data sources

Designed heterogeneous data sources for a hybrid version of the TPC-H enterprise. Developed hybrid LINQ queries over the relational and XML data sources. Evaluated the hybrid benchmark for loosely-coupled distributed data sources. Assessed query performance for two database products with various options.

Established benchmarks exist for evaluating the performance of relational and XML databases. However, there is increasing demand for database applications that access heterogeneous, loosely-coupled distributed data sources. This paper presents a hybrid benchmark based on TPC-H in which the data sources are heterogeneous. Specifically, the paper describes the design of the relational and XML data sources and the redesign of the queries in LINQ, a query language that supports queries over heterogeneous data sources. The results of a performance evaluation of the hybrid benchmark over two database products are included for untyped and typed XML, with and without clearing the database cache.
