A test framework for large scale declarative queries: preliminary results

This paper presents preliminary results of experiments conducted on the PT1.11 and PT1.22 data sets to compare the performance of centralized and distributed database management systems. For the centralized systems, we deployed three DBMSs: MySQL, PostgreSQL, and DBMS-X (a commercial relational database). For the distributed systems, we deployed HadoopDB and Hive. The goal of these experiments is to report on how well these systems support large-scale declarative queries.

[1]  Zheng Shao, et al. Hive - a petabyte scale data warehouse using Hadoop, 2010, 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010).

[2]  Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.

[3]  Abraham Silberschatz, et al. HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads, 2009, Proc. VLDB Endow.