Uncovering individual's mobility patterns from GPS dataset

Human mobility patterns, such as locations of significance, modes of transport, trajectory patterns, and location-based activities, are of great importance to a wide range of research areas and location-related applications. Based on the patterns uncovered, mobility models may be proposed to predict an individual’s future whereabouts or to evaluate protocols for wireless communications, among other applications. In this thesis, I present a study of individuals’ mobility patterns based on GPS records. The study covers inferring the modes of transport, analyzing the predictability of an individual’s mobility, constructing a mobility model, and predicting future locations.

Modes of transport, such as walking, biking, driving, or taking a bus, are a basic pattern of an individual’s mobility. Current studies on inferring the modes of transport apply supervised methods, which require a tedious training process. In this thesis, I present an unsupervised method for inferring the modes of transport, which eliminates the need for manual labeling and training while attaining accuracy equal to or greater than that of the best known supervised method. The unsupervised method relies on the Kolmogorov-Smirnov test, which offers a theoretical assurance when comparing segments of records.

Various probabilistic models and algorithms, such as Markov models, Bayesian models, and pattern mining methods, have been proposed to predict an individual’s next moves. Prediction accuracy has improved greatly because of these efforts. However, little is known about whether prediction accuracy is already approaching its limit, in which case further research efforts may yield diminishing returns. Moreover, prediction accuracy is apparently affected by the spatial scale of the places visited and the time interval concerned. In this thesis, I present a study of the predictability of individuals’ mobility sequences.
Predictability quantifies the potential to foresee the next moves of an individual based on his or her historical records. Using high-resolution GPS data, I investigate the effects of scale on predictability. Given specified spatio-temporal scales, recorded trajectories are encoded into long strings of distinct locations, and several information-theoretic measures of predictability are derived. I show that high predictability is still present at rather high spatial and temporal resolutions. The predictability is found to be independent of the overall mobility area covered, which suggests highly regular mobility behaviors. Moreover, by varying the scales over a wide range, an invariance between prediction accuracy and spatial resolution is observed, which suggests that certain trade-offs between the two are unavoidable.
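
As a rough illustration of the encoding step and of one possible information-theoretic measure, the sketch below maps timestamped GPS records onto a grid at a chosen spatial and temporal scale, computes the Shannon entropy of the resulting location string, and converts that entropy into the accuracy bound implied by Fano's inequality. The abstract does not say which measures the thesis derives, so the grid encoding, the order-free entropy, and all names used here (`encode`, `cell_deg`, `step_s`, `shannon_entropy`, `max_predictability`) are assumptions made for illustration only.

```python
import math
from collections import Counter


def encode(points, cell_deg, step_s):
    """Turn (t_seconds, lat, lon) records into one grid-cell symbol per time slot.

    cell_deg and step_s are hypothetical scale parameters: the grid cell size
    in degrees and the time step in seconds.
    """
    first_in_slot = {}
    for t, lat, lon in points:
        slot = int(t // step_s)
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        first_in_slot.setdefault(slot, cell)  # keep the first record of each slot
    return [first_in_slot[k] for k in sorted(first_in_slot)]


def shannon_entropy(symbols):
    """Entropy (bits) of the visit frequencies; ignores the order of visits."""
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in Counter(symbols).values())


def max_predictability(entropy, n_locations):
    """Solve entropy = H(p) + (1 - p) * log2(N - 1) for p by bisection.

    This is the equality case of Fano's inequality; the right-hand side is
    decreasing in p on [1/N, 1], so bisection on that interval is sufficient.
    """
    if n_locations < 2:
        return 1.0

    def fano(p):
        h = 0.0
        if 0.0 < p < 1.0:
            h = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
        return h + (1 - p) * math.log2(n_locations - 1)

    lo, hi = 1.0 / n_locations, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if fano(mid) > entropy:
            lo = mid  # entropy is lower than fano(mid): a larger p is feasible
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    # Hypothetical records: (seconds, latitude, longitude)
    records = [(0, 1.0005, 2.0005), (30, 1.0006, 2.0006), (60, 1.0105, 2.0005)]
    seq = encode(records, cell_deg=0.001, step_s=60)
    s = shannon_entropy(seq)
    print(seq, s, max_predictability(s, len(set(seq))))
```

Re-running `encode` with different `cell_deg` and `step_s` values is one simple way to see how the bound responds to the spatial and temporal scale. An entropy estimator that accounts for the order of visits would generally give a tighter bound than the frequency-based one used here.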
