Scalable transactions in cloud data stores

Cloud computing is a successful paradigm for deploying scalable and highly available web applications at low cost. In real-life scenarios, applications are expected to be both scalable and consistent. Data partitioning is a commonly used technique for improving scalability, but traditional horizontal partitioning techniques cannot track the data access patterns of web applications. Improving scalability therefore requires novel, scalable workload-driven data partitioning. This paper proposes a workload-aware approach, scalable workload-driven data partitioning, which partitions data based on the data access patterns of web applications for transaction processing. It is specifically designed to scale out using NoSQL data stores. In contrast to existing static approaches, it offers higher throughput, lower response time, and fewer distributed transactions. The scheme is implemented and validated experimentally on cloud data stores such as Hadoop HBase and Amazon SimpleDB, and evaluated using the industry-standard TPC-C benchmark. Analytical and experimental results show that scalable workload-driven data partitioning outperforms schema-level and graph partitioning in terms of throughput, response time, and number of distributed transactions.
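As a rough illustration of why access patterns matter for the distributed-transaction metric, the sketch below compares a workload-oblivious placement with a simple greedy placement that co-locates keys frequently accessed together. It is not the paper's algorithm: the function names, the `max_group_size` parameter, and the toy workload are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def count_distributed(transactions, assignment):
    """Count transactions whose keys live on more than one partition."""
    return sum(1 for txn in transactions
               if len({assignment[k] for k in txn}) > 1)


def static_partition(keys, n_partitions):
    """Workload-oblivious placement: deal sorted keys round-robin."""
    return {k: i % n_partitions for i, k in enumerate(sorted(keys))}


def workload_partition(transactions, keys, n_partitions, max_group_size):
    """Hypothetical greedy placement driven by co-access counts.

    Keys that appear together in many transactions are merged into one
    group (up to max_group_size keys), and each group is then placed on
    the currently least-loaded partition.
    """
    # How often is each pair of keys touched by the same transaction?
    co_access = Counter()
    for txn in transactions:
        for a, b in combinations(sorted(set(txn)), 2):
            co_access[(a, b)] += 1

    # Every key starts in its own group; merge the hottest pairs first.
    group = {k: {k} for k in keys}
    for (a, b), _ in co_access.most_common():
        ga, gb = group[a], group[b]
        if ga is not gb and len(ga) + len(gb) <= max_group_size:
            ga |= gb
            for k in gb:
                group[k] = ga

    # Collect the distinct groups in a deterministic order.
    groups = []
    for k in sorted(keys):
        if not any(group[k] is g for g in groups):
            groups.append(group[k])

    # Place the largest groups first on the least-loaded partition.
    loads = [0] * n_partitions
    assignment = {}
    for g in sorted(groups, key=len, reverse=True):
        p = loads.index(min(loads))
        for k in g:
            assignment[k] = p
        loads[p] += len(g)
    return assignment


# Toy workload (made up for illustration): a/b and c/d are accessed together.
transactions = [("a", "b"), ("a", "b"), ("c", "d"), ("c", "d"), ("a", "c")]
keys = ["a", "b", "c", "d"]

static = static_partition(keys, n_partitions=2)
aware = workload_partition(transactions, keys, n_partitions=2, max_group_size=2)
print("static:", count_distributed(transactions, static))
print("workload-aware:", count_distributed(transactions, aware))
```

On this toy workload the round-robin placement separates each frequently co-accessed pair, while the greedy placement keeps them together, so only the single cross-group transaction remains distributed.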
