On the use of big data frameworks for big service composition

Abstract: Over the last few years, big data has emerged as a new paradigm for processing and analyzing massive volumes of data. Big data processing has been combined with service and cloud computing, leading to a new class of services called “Big Services”. In this model, services can be seen as an abstraction layer that hides the complexity of the processed big data. To meet users' complex and heterogeneous needs in the era of big data, service reuse is a natural and efficient means of orchestrating the operations of available services to provide on-demand big services to customers. However, unlike traditional Web service composition, big service composition involves reusing not only existing high-quality services but also high-quality data sources, while taking into account their security constraints (e.g., data provenance, threat level, and data leakage). Moreover, apart from security risks, composing heterogeneous and large-scale data-centric services faces several challenges, such as the high execution time of big services and the incompatibility between providers' policies across multiple domains and clouds. To address these issues, we propose a scalable approach for big service composition that considers not only the quality of the reused services (QoS) but also the quality of the data sources they consume (QoD). Since a correct representation of big service requirements is the first step towards an effective composition, we first propose a quality model for big services and quantify data breaches using L-Severity metrics. Then, to facilitate processing and mining information related to big services during composition, we exploit the strong mathematical foundation of fuzzy Relational Concept Analysis (fuzzy RCA) to build the big services' repository as a lattice family. We also use fuzzy RCA to cluster services and data sources based on various criteria, including their quality levels, their domains, and the relationships between them. Finally, we define algorithms that parse the lattice family to select and compose high-quality and secure big services in a parallel fashion. The proposed method, which is implemented on top of the Spark big data framework, is compared with two existing approaches, and experimental studies demonstrate the effectiveness of our big service composition approach in terms of QoD-aware composition, scalability, and security breaches.
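
To give a rough intuition for how concept analysis can group services by quality level, the sketch below derives formal concepts from a small fuzzy context after thresholding it with an alpha-cut. It is a simplified illustration, not the method of the paper: it uses plain Formal Concept Analysis on a single context, whereas fuzzy RCA additionally handles relations between several contexts (for example, services and the data sources they consume). The service names, attribute names, membership degrees, and the threshold `alpha` are all hypothetical.

```python
from itertools import combinations

# Hypothetical fuzzy context: degree to which each service satisfies
# each quality attribute (values are made up for illustration).
CONTEXT = {
    "s1": {"high_availability": 0.9, "low_latency": 0.8, "fresh_data": 0.3},
    "s2": {"high_availability": 0.7, "low_latency": 0.9, "fresh_data": 0.8},
    "s3": {"high_availability": 0.2, "low_latency": 0.6, "fresh_data": 0.9},
}
ATTRIBUTES = frozenset({"high_availability", "low_latency", "fresh_data"})


def alpha_cut(context, alpha):
    """Turn the fuzzy context into a crisp one: keep attributes with degree >= alpha."""
    return {
        obj: frozenset(attr for attr, degree in row.items() if degree >= alpha)
        for obj, row in context.items()
    }


def common_attributes(objs, crisp):
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    shared = ATTRIBUTES
    for obj in objs:
        shared = shared & crisp[obj]
    return shared


def objects_having(attrs, crisp):
    """Objects that possess every attribute in attrs."""
    return frozenset(obj for obj, owned in crisp.items() if attrs <= owned)


def formal_concepts(crisp):
    """Enumerate all (extent, intent) pairs by closing every subset of objects.

    Brute force, exponential in the number of objects; fine for a toy example only.
    """
    found = set()
    objs = list(crisp)
    for size in range(len(objs) + 1):
        for subset in combinations(objs, size):
            intent = common_attributes(subset, crisp)
            extent = objects_having(intent, crisp)
            found.add((extent, intent))
    return found


if __name__ == "__main__":
    crisp = alpha_cut(CONTEXT, alpha=0.6)
    for extent, intent in sorted(formal_concepts(crisp), key=lambda c: len(c[0])):
        print(sorted(extent), "share", sorted(intent))
```

Each resulting concept pairs a set of services (the extent) with the quality attributes they all satisfy at the chosen threshold (the intent). Ordering these concepts by extent inclusion yields a lattice, which is the kind of structure a composition algorithm could traverse to find candidate services with the required quality levels.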
