Towards Specification of a Software Architecture for Cross-Sectoral Big Data Applications

The proliferation of Big Data applications puts pressure on improving and optimizing the handling of diverse datasets across different domains. Among several challenges, major difficulties arise in data-sensitive domains such as banking and telecommunications, where strict regulations make it very difficult to upload and experiment with real data on external cloud resources. In addition, most Big Data research and development efforts aim to address the needs of IT experts, while Big Data analytics tools remain largely unavailable to non-expert users. In this paper, we report on work in progress carried out in the context of the H2020 project I-BiDaaS (Industrial-Driven Big Data as a Self-service Solution), which aims to address the above challenges. The project will design and develop a novel architecture stack that can be easily configured and adjusted to address cross-sectoral needs, that helps to resolve data privacy barriers in sensitive domains, and that is at the same time usable by non-experts. This paper discusses and motivates the need for Big Data as a self-service, reviews the relevant literature, and identifies gaps with respect to the challenges described above. We then present the I-BiDaaS paradigm for Big Data as a self-service, position it in the context of existing references, and report on initial work towards the conceptual specification of the I-BiDaaS software architecture.
