Domain-Driven Design of Big Data Systems based on a Reference Architecture

In general, different application domains may require different big data systems. To enhance the understanding of big data systems and support architects in designing big data architectures, we propose a domain-driven design approach for deriving application architectures. The approach relies on domain engineering, in which a family feature model, a reference architecture, and corresponding design rules are identified. The family feature model is derived from a domain analysis of big data systems and represents their common and variant features. The reference architecture represents a generic structure for various application architectures of big data systems. Finally, the design rules define reusable design heuristics for deriving an application architecture from the features selected in the family feature model and from the reference architecture. We illustrate our approach by deriving the big data architectures of several well-known big data systems.
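
The interplay of the three artifacts can be sketched in code. The following is a minimal illustration only: the feature names, component names, and rules are hypothetical placeholders chosen for the example, not the feature model, reference architecture, or design rules defined in the paper.

```python
# Hypothetical reference architecture: the generic set of components
# from which any application architecture is assembled.
REFERENCE_ARCHITECTURE = {
    "DataSource", "Ingestion", "Storage",
    "BatchProcessing", "StreamProcessing", "Serving",
}

# Hypothetical family feature model: common (mandatory) and variant
# (optional) features of the system family.
MANDATORY_FEATURES = {"data_storage"}
OPTIONAL_FEATURES = {"batch_processing", "stream_processing", "query_serving"}

# Hypothetical design rules: each selected feature requires certain
# components of the reference architecture.
DESIGN_RULES = {
    "data_storage": {"DataSource", "Ingestion", "Storage"},
    "batch_processing": {"BatchProcessing"},
    "stream_processing": {"StreamProcessing"},
    "query_serving": {"Serving"},
}


def derive_architecture(selected_features):
    """Derive an application architecture from a feature selection."""
    selected = set(selected_features)

    # The selection must be a valid configuration of the feature model.
    unknown = selected - (MANDATORY_FEATURES | OPTIONAL_FEATURES)
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    missing = MANDATORY_FEATURES - selected
    if missing:
        raise ValueError(f"missing mandatory features: {sorted(missing)}")

    # Apply the design rules: collect the components each feature requires.
    components = set()
    for feature in selected:
        components |= DESIGN_RULES[feature]

    # Every derived component must exist in the reference architecture.
    if not components <= REFERENCE_ARCHITECTURE:
        raise ValueError("design rules refer to components outside the reference architecture")
    return components


# Example: a stream-oriented system without batch processing or serving.
streaming_system = derive_architecture({"data_storage", "stream_processing"})
```

In this sketch an application architecture is simply a subset of the reference architecture's components; a fuller treatment would also carry over connections between components and constraints among features.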
