Extending reference architecture of big data systems towards machine learning in edge computing environments

Background: Augmented reality, computer vision, and other use cases (e.g. network functions, Internet-of-Things (IoT)) can be realised in edge computing environments with machine learning (ML) techniques. Realising these use cases requires an understanding of how data is collected, stored, processed, analysed, and visualised in big data systems. To provide low-latency services to end users, the utilisation of ML techniques often has to be optimised. Software and service developers also have to understand how to develop and deploy ML models in edge computing environments. Therefore, designing the architecture of big data systems for edge computing environments may be challenging.

Findings: The contribution of this paper is a reference architecture (RA) design of a big data system utilising ML techniques in edge computing environments. An earlier version of the RA has been extended based on 16 realised implementation architectures, which have been developed for edge/distributed computing environments. The deployment of architectural elements in different environments is also described. Finally, a system view of the software engineering aspects of ML model development and deployment is provided.

Conclusions: The presented RA may facilitate concrete architecture design of use cases in edge computing environments. The value of RAs lies in reducing the development and maintenance costs of systems, reducing risks, and facilitating communication between different stakeholders.
