Benchmarking and Analysis of NoSQL Technologies

The social web generates terabytes of unstructured, user-generated data spread across thousands of commodity servers. The changing face of web-based applications has forced the invention of new approaches to data management. NoSQL databases were created as a means of offering high performance (in terms of both speed and size) and high availability. They share the goal of massive scaling "on demand" (elasticity), which makes NoSQL a well-suited solution to the big data needs posed by evolving web applications. With newer NoSQL technologies appearing each day, developers face a serious problem in selecting the best-suited datastore for their application, as each application places very different demands on its datastore. Moreover, the claims made for NoSQL databases have not been tested. In this paper we discuss the benchmarking results of the four most popular NoSQL technologies, namely Cassandra, HBase, MongoDB and CouchDB, and we analyze the performance of each datastore on various parameters. We use the YCSB (Yahoo! Cloud Serving Benchmark) tool to compare Cassandra vs. HBase (key-value stores) and MongoDB vs. CouchDB (document-oriented databases). We benchmark the NoSQL datastores on three tiers: Performance, Scalability and Availability.
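To make the kind of measurement concrete, the following is a minimal sketch of a YCSB-style run, not the YCSB tool itself and not the setup used in this paper. It loads a set of records, issues a mixed read/update workload, and reports throughput and latency percentiles. The `InMemoryStore` class, the `run_workload` function, and all parameter names are hypothetical; a real benchmark would replace the in-memory stand-in with a client for the datastore under test. Keys are chosen uniformly here as a simplifying assumption, and the default 50/50 read/update mix mirrors YCSB's core workload A.

```python
import random
import time
from dataclasses import dataclass


class InMemoryStore:
    """Hypothetical stand-in for a datastore client (not a real driver)."""

    def __init__(self):
        self._data = {}

    def read(self, key):
        return self._data.get(key)

    def update(self, key, value):
        self._data[key] = value


@dataclass
class BenchmarkResult:
    operations: int
    reads: int
    updates: int
    throughput_ops_per_sec: float
    p50_latency_us: float
    p95_latency_us: float


def percentile(sorted_values, fraction):
    """Return the value at the given fraction of a sorted, non-empty list."""
    index = min(len(sorted_values) - 1, int(fraction * len(sorted_values)))
    return sorted_values[index]


def run_workload(store, record_count=1000, operation_count=5000,
                 read_proportion=0.5, seed=42):
    """Load records, then run a mixed read/update workload and time it."""
    rng = random.Random(seed)

    # Load phase: insert the initial records (not timed).
    for i in range(record_count):
        store.update(f"user{i}", {"field0": "x" * 100})

    # Transaction phase: time each operation individually.
    latencies_us = []
    reads = 0
    updates = 0
    start = time.perf_counter()
    for _ in range(operation_count):
        key = f"user{rng.randrange(record_count)}"  # uniform key choice (assumption)
        op_start = time.perf_counter()
        if rng.random() < read_proportion:
            store.read(key)
            reads += 1
        else:
            store.update(key, {"field0": "y" * 100})
            updates += 1
        latencies_us.append((time.perf_counter() - op_start) * 1e6)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against zero

    latencies_us.sort()
    return BenchmarkResult(
        operations=operation_count,
        reads=reads,
        updates=updates,
        throughput_ops_per_sec=operation_count / elapsed,
        p50_latency_us=percentile(latencies_us, 0.50),
        p95_latency_us=percentile(latencies_us, 0.95),
    )


if __name__ == "__main__":
    print(run_workload(InMemoryStore()))
```

Running the same workload against each datastore, with the same record count, operation count and read/update mix, is what makes the resulting throughput and latency figures comparable across systems.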

[1] Adam Silberstein, et al. Benchmarking cloud serving systems with YCSB, 2010, SoCC '10.

[2] Cristian Bucur, et al. A comparison between several NoSQL databases with comments and notes, 2011, 2011 RoEduNet International Conference 10th Edition: Networking in Education and Research.

[3] Kristina Chodorow, et al. MongoDB: The Definitive Guide, 2010.

[4] Neal Leavitt, et al. Will NoSQL Databases Live Up to Their Promise?, 2010, Computer.

[5] Lars George, et al. HBase: The Definitive Guide, 2011.

[6] Jeff Carpenter, et al. Cassandra: The Definitive Guide, 2010.

[7] J. Chris Anderson, et al. CouchDB: The Definitive Guide, 2010.