Splitter: Mining Fine-Grained Sequential Patterns in Semantic Trajectories

Driven by advances in positioning technology and the popularity of location-sharing services, semantically enriched trajectory data have become unprecedentedly available. The sequential patterns hidden in such data, when properly defined and extracted, can greatly benefit tasks like targeted advertising and urban planning. Unfortunately, classic sequential pattern mining algorithms developed for transactional data cannot effectively mine patterns in semantic trajectories, mainly because places in continuous space cannot be regarded as independent "items". Instead, similar places need to be grouped to collaboratively form frequent sequential patterns. That said, it remains a challenging task to mine what we call fine-grained sequential patterns, which must satisfy spatial compactness, semantic consistency and temporal continuity simultaneously. We propose Splitter to effectively mine such fine-grained sequential patterns in two steps. In the first step, it retrieves a set of spatially coarse patterns, each associated with a set of trajectory snippets that precisely record the pattern's occurrences in the database. In the second step, Splitter breaks each coarse pattern into fine-grained ones in a top-down manner, by progressively detecting dense and compact clusters in a higher-dimensional space spanned by the snippets. Splitter uses an effective algorithm called weighted snippet shift to detect such clusters, and leverages a divide-and-conquer strategy to speed up the top-down pattern splitting process. Our experiments on both real and synthetic data sets demonstrate the effectiveness and efficiency of Splitter.
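
The cluster-detection idea in the second step can be illustrated with a minimal sketch. The abstract does not specify how weighted snippet shift works, so the code below is not that algorithm: it is a plain weighted mean shift with a flat kernel, used only to show how dense groups of snippets might be found when each snippet is treated as a weighted point in a higher-dimensional space. The function names, the bandwidth, the weights, and the toy data (snippets of a length-2 pattern flattened to 4-D points) are all assumptions made for illustration.

```python
import math

def weighted_mean_shift(points, weights, bandwidth, max_iter=100, tol=1e-6):
    """Move a window from each point to the weighted mean of the points it covers.

    Hypothetical helper: a flat kernel of radius `bandwidth` is assumed.
    Returns one converged window center (mode) per input point.
    """
    modes = []
    for start in points:
        center = list(start)
        for _ in range(max_iter):
            total = 0.0
            acc = [0.0] * len(center)
            for p, w in zip(points, weights):
                if math.dist(p, center) <= bandwidth:
                    total += w
                    for i, v in enumerate(p):
                        acc[i] += w * v
            if total == 0.0:  # no weighted point in the window; stop shifting
                break
            new_center = [a / total for a in acc]
            moved = math.dist(new_center, center)
            center = new_center
            if moved < tol:
                break
        modes.append(center)
    return modes

def group_by_mode(modes, merge_radius):
    """Give the same label to points whose modes lie within `merge_radius`."""
    labels, representatives = [], []
    for m in modes:
        for idx, r in enumerate(representatives):
            if math.dist(m, r) <= merge_radius:
                labels.append(idx)
                break
        else:
            representatives.append(m)
            labels.append(len(representatives) - 1)
    return labels

# Toy snippets of a length-2 pattern: (x1, y1, x2, y2) per snippet (made-up data).
points = [
    (0.0, 0.0, 10.0, 10.0),
    (0.2, 0.1, 10.1, 9.9),
    (0.1, 0.2, 9.9, 10.0),
    (5.0, 5.0, 10.0, 10.0),
    (5.1, 4.9, 10.1, 10.1),
]
weights = [1.0, 1.0, 2.0, 1.0, 1.0]  # assumed per-snippet weights

modes = weighted_mean_shift(points, weights, bandwidth=1.0)
labels = group_by_mode(modes, merge_radius=0.5)
print(labels)
```

In this toy input the first three snippets share a start place and the last two share a different one, so a coarse pattern covering all five would be split into two spatially tighter groups. A real implementation would also need a rule for choosing the bandwidth and a support threshold for deciding which groups count as frequent, neither of which is given in the abstract.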
