Quantitative Analysis of Apache Storm Applications: The NewsAsset Case Study

Information systems development today must contend with Big Data: large volumes of information need to be processed in real time, for example for Facebook or Twitter analysis. This paper addresses the redesign of NewsAsset, a commercial product that helps journalists by providing services that analyze millions of media items from social networks in real time. Technologies like Apache Storm can help enormously in this context. We quantitatively analyzed the new design of NewsAsset to assess whether introducing Apache Storm can meet the demanding performance requirements of this media product. Our assessment approach, guided by the Unified Modeling Language (UML), reuses for performance analysis the software designs already produced for development. In addition, we converted UML into a domain-specific modeling language (DSML) for Apache Storm by creating a UML profile for Storm. We then transformed the DSML models into a language suitable for performance evaluation, specifically stochastic Petri nets. The assessment ended with a successful software design that met the scalability requirements of NewsAsset.
