Asynchronous data translation framework for converting relational tables to document stores

Although Not Only SQL (NoSQL) technologies offer new features such as query optimization and caching in the era of big data, they are not yet mature enough to replace the traditional Relational Database Management System (RDBMS). For a long time to come, hybrid solutions that combine RDBMS and NoSQL will remain the mainstream technology. In this paper, we present an asynchronous data translation framework that translates relational tables into schema-free document stores. The challenge is to ensure that the translation is transparent and pervasive to the application while having minimal impact on RDBMS performance. To describe the translation process, we present an intermediate cell state model (CSM) that captures versioned incremental state. We address this challenge by translating the binary log of the RDBMS into a CSM repository; the repository is asynchronously reset to the NoSQL store when a schema adjustment is encountered. Finally, we use the CSM rectification algorithm and the NoSQL synchronization algorithm to implement the whole framework. The experimental results show that this translation has less impact on RDBMS performance and a lower cost than other solutions.
