Scalable and distributed architecture based on Apache Spark Streaming and PROM6 for processing RoRo terminals logs

As a growing number of companies embrace Big Data, the main challenge is to relate enormous quantities of event data to operational business processes that are highly dynamic. To unlock the value of event data, events need to be tightly linked to the monitoring and control of operational processes. However, Big Data technologies focus essentially on storage and processing, and rarely on improving processes. We therefore advocate integrating Big Data and process analysis technologies, namely Apache Kafka, Spark Streaming and PROM6. In this work, we design a scalable and distributed architecture for real-time monitoring of the operational business processes of a RoRo port terminal. The proposed solution makes it possible to apply process mining techniques to large event logs of several hundred gigabytes.