Vs-Driven Big Data Process Development

Big Data solutions aim to cope with the overwhelming amount of data generated in domains such as social networks and the Internet of Things, thereby enabling a new generation of data-intensive applications (DIAs) and services. At the same time, proper techniques and tools are required to facilitate DIA design and development and to address (Big) data management requirements. To this end, this paper proposes an approach that takes into account the established Big Data V-attributes (i.e., Volume, Velocity, and Variety) to model and predict computational demands at design time. The approach relies on annotating Big Data process workflows (and their individual elements) with relevant V-attribute values, which are then mapped into resource requirements and used in a performance model.
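As a rough illustration of the idea of annotating workflow elements with V-attribute values and mapping them to resource requirements, the following minimal sketch may help. It is not the paper's method: the `Task` fields, the linear mapping in `service_demand`, and the `seconds_per_mb` calibration constant are all hypothetical assumptions made for this example. The only standard ingredient is the Utilization Law from queueing analysis (utilization = throughput × service demand).

```python
from dataclasses import dataclass


@dataclass
class Task:
    """A workflow element annotated with hypothetical V-attribute values."""
    name: str
    volume_mb: float       # Volume: data per request, in MB (assumed unit)
    velocity_rps: float    # Velocity: arrival rate, in requests per second
    variety_factor: float  # Variety: assumed multiplier >= 1 for heterogeneous formats


def service_demand(task: Task, seconds_per_mb: float) -> float:
    """Assumed linear mapping from Volume and Variety to seconds of work per request."""
    return task.volume_mb * task.variety_factor * seconds_per_mb


def utilization(task: Task, seconds_per_mb: float) -> float:
    """Utilization Law: U = X * D, with Velocity taken as the throughput X."""
    return task.velocity_rps * service_demand(task, seconds_per_mb)


# Hypothetical two-step workflow; the numbers are illustrative, not measured.
workflow = [
    Task("ingest", volume_mb=2.0, velocity_rps=10.0, variety_factor=1.0),
    Task("transform", volume_mb=2.0, velocity_rps=10.0, variety_factor=1.5),
]

if __name__ == "__main__":
    for task in workflow:
        u = utilization(task, seconds_per_mb=0.01)
        print(f"{task.name}: utilization of one resource = {u:.2f}")
```

A utilization of 1.0 or more for a single resource would suggest, under these assumptions, that the task needs additional capacity; a real performance model would replace the linear mapping with calibrated demands.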
