A Distributed Execution Pipeline for Clustering Trajectories Based on a Fuzzy Similarity Relation

The proliferation of indoor and outdoor tracking devices has produced a vast amount of spatial data. Each object can be described by several trajectories that, once analysed, can yield significant knowledge. In particular, pattern analysis by clustering generic trajectories can give insight into objects sharing the same patterns. However, sequential clustering approaches fail to handle large volumes of data, hence the need for distributed systems that can infer knowledge within a short time interval. In this paper, we detail an efficient, scalable and distributed execution pipeline for clustering raw trajectories. The clustering is achieved via a fuzzy similarity relation obtained by the transitive closure of a proximity relation. The pipeline is integrated in Spark, implemented in Scala, and leverages the Core and GraphX libraries, making use of Resilient Distributed Datasets (RDDs) and graph processing. Furthermore, a new, simple but very efficient partitioning logic has been deployed in Spark and integrated into the execution process. The objective of this logic is to distribute the load equally among all executors by taking the complexity of the data into account. In particular, resolving the load-balancing issue has significantly reduced the conventional execution time. The performance of the whole distributed process has been evaluated on the Geolife project’s GPS trajectory dataset.
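
The clustering idea stated above (a fuzzy similarity relation obtained as the transitive closure of a proximity relation) can be illustrated with a small single-machine sketch. It is not the Spark/GraphX implementation described in the paper: the max-min composition, the alpha-cut step, the function names and the toy proximity matrix are all assumptions chosen for illustration, and the abstract does not specify how proximity between trajectories is measured.

```python
def max_min_compose(r, s):
    """Max-min composition of two square fuzzy relations given as nested lists."""
    n = len(r)
    return [
        [max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(proximity):
    """Iterate R <- R union (R o R) until a fixed point is reached.

    Assumes `proximity` is reflexive and symmetric with values in [0, 1].
    Only max and min are applied, so no new values are created and the
    exact equality test below is safe.
    """
    closure = proximity
    while True:
        composed = max_min_compose(closure, closure)
        merged = [
            [max(a, b) for a, b in zip(row_old, row_new)]
            for row_old, row_new in zip(closure, composed)
        ]
        if merged == closure:
            return closure
        closure = merged


def alpha_cut_clusters(similarity, alpha):
    """Group indices whose similarity is at least `alpha`.

    Because `similarity` is max-min transitive, the alpha-cut is an
    equivalence relation, so each row yields a complete cluster.
    """
    clusters, assigned = [], set()
    for i in range(len(similarity)):
        if i in assigned:
            continue
        members = [j for j in range(len(similarity)) if similarity[i][j] >= alpha]
        clusters.append(members)
        assigned.update(members)
    return clusters


# Hypothetical proximity values between four trajectories (illustrative only).
proximity = [
    [1.0, 0.8, 0.2, 0.1],
    [0.8, 1.0, 0.6, 0.1],
    [0.2, 0.6, 1.0, 0.3],
    [0.1, 0.1, 0.3, 1.0],
]

similarity = transitive_closure(proximity)
print(alpha_cut_clusters(similarity, 0.7))  # [[0, 1], [2], [3]]
print(alpha_cut_clusters(similarity, 0.5))  # [[0, 1, 2], [3]]
```

Lowering the threshold merges clusters, so a single closure yields a hierarchy of partitions. Each composition step costs O(n^3) on one machine, and that cost is what motivates distributing the computation.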
