Toward lightweight transparent data middleware in support of document stores

With the rapid growth of data volumes in legacy applications, there is an urgent need for efficient and flexible data middleware that supports SQL-to-MapReduce transformation. In this paper, we propose a data middleware that translates SQL statements into operations that a NoSQL database can execute, using a MapReduce framework of our own design. The middleware transforms both read and write SQL statements into MapReduce jobs, and we discuss the set of transformation rules in detail. The middleware can significantly reduce redundant computation and improve I/O performance.
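To make the idea of SQL-to-MapReduce transformation concrete, the following minimal sketch shows how one read statement, SELECT city, COUNT(*) FROM users WHERE age >= 18 GROUP BY city, might be expressed as a map function and a reduce function over a collection of documents. This is only an illustration under stated assumptions, not the transformation rules of the paper: the in-memory runner `run_map_reduce`, the helper `translate_count_group_by`, and the sample `users` collection are all hypothetical names introduced here.

```python
from collections import defaultdict


def run_map_reduce(documents, map_fn, reduce_fn):
    """Tiny in-memory MapReduce runner (hypothetical, for illustration only).

    Applies map_fn to every document, groups the emitted values by key,
    then applies reduce_fn to each key and its list of values.
    """
    groups = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def translate_count_group_by(group_field, predicate):
    """Assumed rule for: SELECT g, COUNT(*) FROM t WHERE p GROUP BY g.

    The WHERE clause becomes a filter inside the map step, the GROUP BY
    column becomes the emitted key, and COUNT(*) becomes a sum of ones.
    """
    def map_fn(doc):
        if predicate(doc):
            yield doc[group_field], 1

    def reduce_fn(key, values):
        return sum(values)

    return map_fn, reduce_fn


# Sample document collection standing in for a "users" table (made-up data).
users = [
    {"name": "Ann", "city": "Jinan", "age": 31},
    {"name": "Bob", "city": "Jinan", "age": 17},
    {"name": "Cho", "city": "Beijing", "age": 25},
    {"name": "Dee", "city": "Jinan", "age": 40},
]

map_fn, reduce_fn = translate_count_group_by("city", lambda d: d["age"] >= 18)
result = run_map_reduce(users, map_fn, reduce_fn)
print(result)
```

Pushing the WHERE predicate into the map step means filtered-out documents never reach the grouping phase, which is one simple way a translator could avoid redundant computation; whether the proposed middleware does exactly this is not stated in the abstract.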
