Data Modeling Guidelines for NoSQL Document-Store Databases

Good database design is key to high data availability and consistency in traditional databases, and numerous techniques exist to assist designers in modeling schemas appropriately. These schemas are strictly enforced by traditional database engines. However, with the emergence of schema-free (NoSQL) databases, coupled with voluminous and highly diversified datasets (big data), such aid becomes even more important, because schemas in NoSQL are enforced by application developers, which requires a high level of competence. In particular, the existing modeling techniques and guides used for traditional databases are insufficient for big-data storage settings. To address this gap, we propose new modeling guidelines for NoSQL document-store databases. These guidelines cut across both the logical and physical stages of database design. Each guideline is grounded in solid empirical insights, yet is written to be intuitive to developers and practitioners. To realize this goal, we employ an exploratory approach that combines an investigation of existing techniques, empirical methods, and expert consultations. We analyze how industry experts prioritize requirements, and we examine the relationships between datasets on the one hand and error prospects and awareness on the other. A few proprietary guidelines were also extracted from a heuristic evaluation of five NoSQL databases. The proposed guidelines therefore have great potential to serve as an important instrument of knowledge transfer from academia to NoSQL database modeling practice.
