An Evaluation of Data Stream Processing Systems for Data Driven Applications

Real-time data stream processing technologies play an important role in enabling time-critical decision making in many applications. This paper evaluates the performance of platforms capable of processing streaming data. Candidate technologies include Storm, Samza, and Spark Streaming. To support a recommendation, a prototype pipeline is designed and implemented on each platform using data collected from sensors that monitor heavy-haul railway systems. Each candidate platform is tested and evaluated using both quantitative and qualitative metrics, and Storm is found to be the most appropriate candidate.