Use of NoSQL Database for Handling Semi Structured Data: An Empirical Study of News RSS Feeds

The evolution of Web 2.0 has rapidly increased the volume and variety of data. Its many sources generate semi-structured and unstructured data, which have no consistent format and are therefore difficult to handle. Because semi-structured data arrives in varying formats, it calls for a DBMS that is less restrictive about the structure of the stored data. This paper discusses the features, data models and query models of NoSQL databases that are suited to handling semi-structured data. The document-oriented NoSQL database MongoDB is compared with the relational database MySQL in terms of query response time. The comparison is presented as a case study on a news dataset. News items are collected from various news channels as RSS feeds, which arrive in varying formats and are therefore semi-structured. Handling RSS feeds with a relational database requires defining a schema and preprocessing the feeds. In contrast, data generated by such heterogeneous sources can be handled efficiently by a NoSQL database without any preprocessing. The comparison of MongoDB with MySQL shows that NoSQL databases are better suited than relational databases to semi-structured data, both in the ease of constructing the database structure and in query response time.
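The contrast between schema-first and schema-free handling of RSS items can be illustrated with a minimal sketch. It is not the setup used in the study: plain Python dictionaries stand in for MongoDB documents, an in-memory SQLite table stands in for MySQL, and the two feeds, their URLs and the `news` table are hypothetical. It shows only the structural difference and says nothing about query response time.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Two hypothetical feeds whose items carry different fields.
FEED_A = """<rss><channel>
<item><title>Markets rally</title><link>http://a.example/1</link>
<pubDate>Thu, 01 Jan 2015 10:00:00 GMT</pubDate></item>
</channel></rss>"""

FEED_B = """<rss><channel>
<item><title>Storm warning</title><link>http://b.example/7</link>
<category>Weather</category><author>desk@b.example</author></item>
</channel></rss>"""


def items_as_documents(feed_xml):
    """Turn each <item> into a dict holding whatever child elements it has."""
    root = ET.fromstring(feed_xml)
    return [
        {child.tag: (child.text or "") for child in item}
        for item in root.iter("item")
    ]


# Document-style handling: items are kept as they arrive, no schema declared.
documents = items_as_documents(FEED_A) + items_as_documents(FEED_B)

# Relational-style handling: a schema is fixed first, then every item is
# mapped onto it. Fields outside the schema are dropped, missing ones are NULL.
COLUMNS = ("title", "link", "pubDate")
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE news (title TEXT, link TEXT, pubDate TEXT)")
for doc in documents:
    conn.execute(
        "INSERT INTO news VALUES (?, ?, ?)",
        tuple(doc.get(column) for column in COLUMNS),
    )
rows = conn.execute("SELECT title, link, pubDate FROM news").fetchall()
```

In this sketch the second item keeps its `category` and `author` fields as a document, whereas the relational table discards them and stores a NULL `pubDate` unless the schema is extended beforehand.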
