Streaming Linear Regression on Spark MLlib and MOA

In recent years, the analysis of data streams has attracted considerable attention in several fields of computer science. In this paper, two frameworks, MOA and Spark MLlib, are examined for linear regression on streaming data. The focus is on determining how well the linear regression techniques implemented in these frameworks can model data streams. We also examine the challenges posed by massive data streams and how MOA and Spark Streaming address them. Our experiments show that although MOA is easier to use than Spark MLlib, Spark MLlib's linear regression performs better on streaming data.