Mintra: Mining anonymized trajectories with annotations

Time series of geo-tagged data are routinely generated by GPS-enabled devices, satellites, and other motion-capturing instruments. Such data can be viewed as sequences of locations in which every location is associated with additional text annotations. Mining such a database for important sequences (also known as trajectory mining) is essential for extracting information from it. However, the current trend of anonymizing data to avoid privacy breaches makes it difficult to identify correlations in the data, and thus makes it even harder, if not impossible, to look for actual trajectories. Noting this difficulty, we define our goal as mining trajectory-patterns, a generalization of trajectories. We first design a pattern-growth based algorithm for this objective. Then, having identified a limitation of state-of-the-art sequential pattern-growth algorithms in growing trajectory-patterns, we propose a new pattern-growth algorithm, Mintra. Experiments demonstrate the efficiency and effectiveness of Mintra. We thereby show that important patterns can be mined from anonymized data without compromising user privacy.
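
To make the notion of pattern growth over annotated location sequences concrete, the following is a minimal sketch of a generic prefix-projection miner in the style of classic sequential pattern growth. It is not Mintra and does not implement trajectory-patterns as defined in the paper. The function name `mine_sequences`, the toy `database`, and the encoding of each item as a `(location, annotation)` pair are all assumptions made for illustration.

```python
from collections import defaultdict


def mine_sequences(database, min_support):
    """Return {pattern: support} for every frequent subsequence.

    `database` is a list of sequences; each item is a hashable
    (location, annotation) pair. This encoding is hypothetical.
    A pattern is a tuple of items that occur in order (gaps allowed).
    """
    results = {}

    def grow(prefix, projected):
        # `projected` holds one (sequence index, start offset) pair per
        # sequence that contains `prefix`; only the suffix is scanned.
        supporters = defaultdict(set)
        for seq_id, start in projected:
            for item in database[seq_id][start:]:
                supporters[item].add(seq_id)

        for item, seq_ids in supporters.items():
            if len(seq_ids) < min_support:
                continue  # infrequent extension: prune this branch
            pattern = prefix + (item,)
            results[pattern] = len(seq_ids)

            # Project each sequence past the first occurrence of `item`.
            next_projected = []
            for seq_id, start in projected:
                seq = database[seq_id]
                for pos in range(start, len(seq)):
                    if seq[pos] == item:
                        next_projected.append((seq_id, pos + 1))
                        break
            grow(pattern, next_projected)

    grow((), [(i, 0) for i in range(len(database))])
    return results


# Toy database of three anonymized sequences (illustrative data only).
database = [
    [("cafe", "coffee"), ("park", "run"), ("office", "work")],
    [("cafe", "coffee"), ("office", "work")],
    [("park", "run"), ("cafe", "coffee"), ("office", "work")],
]
patterns = mine_sequences(database, min_support=2)
for pattern, support in sorted(patterns.items(), key=lambda kv: (len(kv[0]), kv[0])):
    print(support, pattern)
```

The sketch counts support per sequence rather than per occurrence, and it grows a pattern only through extensions that are themselves frequent, which is the pruning idea that pattern-growth methods rely on.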
