Defining architecture components of the Big Data Ecosystem

Big Data is becoming a new technology focus in both science and industry, motivating a technology shift to data-centric architecture and operational models. There is a vital need to define the basic information/semantic models, architecture components and operational models that together comprise a so-called Big Data Ecosystem. This paper discusses the nature of Big Data, which may originate from different scientific, industry and social activity domains, and proposes an improved Big Data definition that includes the following parts: Big Data properties (also called the Big Data 5V: Volume, Velocity, Variety, Value and Veracity), data models and structures, data analytics, infrastructure and security. The paper discusses the paradigm change from traditional host- or service-based to data-centric architecture and operational models in Big Data. The Big Data Architecture Framework (BDAF) is proposed to address all aspects of the Big Data Ecosystem and includes the following components: Big Data Infrastructure, Big Data Analytics, Data Structures and Models, Big Data Lifecycle Management, and Big Data Security. The paper analyses the requirements for these components and suggests how they can address the main Big Data challenges. The presented work intends to provide a consolidated view of the Big Data phenomenon and the related challenges to modern technologies, and to initiate a wide discussion.
