Capturing Anomalies of Cassandra Performance with Increase in Data Volume: A NoSQL Analytical Approach

NoSQL database technology has been in circulation since the early 1990s, but it was the exponential growth of the internet and the rise of web applications that led to a surge in the popularity of NoSQL databases. The BigTable research by Google (2006) and the Dynamo research by Amazon (2007) paved the way for databases that could be developed with agility and operated at any scale. Cassandra and MongoDB have emerged as the two most widely used NoSQL databases, and one or the other is preferred depending on the data problem the user is attempting to solve. This paper describes the underlying principles of both databases as well as the differences between them. We focus on showing the anomaly in Cassandra's performance as the data volume increases, and we compare its performance with that of MongoDB over the same range. We establish how important a factor data volume is in choosing between the two databases for an application. Extensive experiments have been carried out to measure performance in terms of anomaly similarities, and directions for future work are identified.
