Fault-Tolerant Distributed Stream Processing System

Real-time data processing systems are increasingly popular. Data warehouses not only collect terabytes of data, they also process endless data streams. To support this, data extraction must itself become a continuous process. This raises the problem of fault tolerance: it is important to process data on time, but even more important not to lose any data when a failure occurs. We achieve this by applying redundant distributed stream processing. In this paper, we present a fault-tolerant system designed for processing data streams originating from geographically distributed sources.