Abstract
Distributed systems have become essential for processing and analysing the large volumes of data now being generated. Spatio-temporal data in particular has grown exponentially in recent years, which can be explained by the proliferation of indoor and outdoor tracking devices and by the value of the knowledge that can be extracted from such data. These data describe the behaviour of a moving object in space and time; once the recorded coordinates are assembled, they form a raw trajectory. The knowledge extracted by analysing a set of trajectories is very valuable, and trajectories can also be mapped to other contextual data to add further value. Preprocessing is an essential step in the mining process: it filters out unnecessary records and cleans noise. Reactive systems, in turn, make it possible to process massive data in a non-blocking, resilient, responsive and elastic manner. Despite this interest, to the best of our knowledge no work has proposed a reactive system for preprocessing massive trajectories. The aim of this study is to fill this gap by proposing a reactive system based on distributed actors to manage big trajectory data. Our system is deployed with the Play Framework, Akka, MongoDB, AngularJS and D3.js. Initially, the system can load batches of trajectories stored in HDFS in a distributed manner. The scope of our study is to provide an overview of the system, to study the impact of increasing computing resources on scalability, and to identify the optimal node configuration. The evaluation, conducted on the Geolife project's GPS trajectory dataset, indicates higher scalability of the system.