Detecting Significant Locations from Raw GPS Data Using Random Space Partitioning

We present a fast algorithm that probabilistically extracts significant locations from raw GPS data based on data point density. Extracting significant locations from raw GPS data is the essential first step of algorithms designed for location-aware applications. Most current algorithms extract significant locations by comparing spatial and temporal variables against fixed thresholds. However, appropriate threshold values are not known a priori, and algorithms with fixed thresholds are inherently error-prone, especially under high noise levels. Moreover, they often do not scale as system size increases, because they require direct distance computation. Our algorithm selectively samples data points around significant locations using density information, which it obtains by constructing random histograms with locality-sensitive hashing. Theoretical analysis and evaluations show that significant locations are detected accurately with a loose parameter setting, even under high noise levels.
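The idea of estimating density with random histograms can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes planar 2-D coordinates, Gaussian random projections (one p-stable family used for locality-sensitive hashing), and a simple retention rule in which a point is kept with probability proportional to its score. The names `bucket_key`, `density_scores`, and `sample_dense_points`, and all parameter defaults, are hypothetical.

```python
import math
import random
from collections import Counter


def bucket_key(point, projections, width):
    """Hash a 2-D point to a bucket: floor((a . p + b) / width) per projection."""
    return tuple(
        math.floor((ax * point[0] + ay * point[1] + b) / width)
        for ax, ay, b in projections
    )


def density_scores(points, num_histograms=20, num_projections=2,
                   width=1.0, seed=0):
    """Average, over several random histograms, the size of each point's bucket.

    No pairwise distances are computed: each histogram costs one pass
    over the points, so the work grows linearly with the number of points.
    """
    rng = random.Random(seed)
    scores = [0.0] * len(points)
    for _ in range(num_histograms):
        # Assumed hash family: Gaussian direction plus a uniform offset.
        projections = [
            (rng.gauss(0, 1), rng.gauss(0, 1), rng.uniform(0, width))
            for _ in range(num_projections)
        ]
        keys = [bucket_key(p, projections, width) for p in points]
        counts = Counter(keys)
        for i, key in enumerate(keys):
            scores[i] += counts[key] / num_histograms
    return scores


def sample_dense_points(points, scores, rng):
    """Keep each point with probability proportional to its density score."""
    top = max(scores)
    return [p for p, s in zip(points, scores) if rng.random() < s / top]
```

Points that repeatedly share buckets with many other points receive high scores, so sampling concentrates around dense regions without a hard distance threshold. The bucket `width` here plays the role of a loosely set parameter, and the units of the coordinates are left to the caller.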
