STBenchmark: towards a benchmark for mapping systems

A fundamental problem in information integration is to precisely specify the relationships, called mappings, between schemas. Designing mappings is a time-consuming process. To alleviate this problem, many mapping systems have been developed to assist in the design of mappings. However, no benchmark for comparing and evaluating these systems has yet been developed. We present STBenchmark, a step towards a much-needed benchmark for mapping systems. We first describe the challenges that are unique to the development of benchmarks for mapping systems. We then describe the three components of STBenchmark: (1) a basic suite of mapping scenarios that we believe represents a minimum set of transformations that any mapping system should readily support, (2) a mapping scenario generator and an instance generator, which produce, respectively, complex mapping scenarios and instances of varying sizes for a given schema, and (3) a simple usability model that can serve as a first-cut measure of the ease of use of a mapping system. We use STBenchmark to evaluate four mapping systems, report our results, and describe some interesting observations.
