Distributed and privacy preserving algorithms for mobility information processing

Smartphones, wearables, and mobile devices in general are the sensors of our modern world. Their sensing capabilities offer the means to analyze and interpret our behaviour and surroundings. When it comes to human behaviour, perhaps the most informative features are our location and mobility habits. Insights from human mobility are useful in a number of everyday practical applications, such as improving transportation and road network infrastructure, ride-sharing services, activity recognition, mobile data pre-fetching, and analysis of human social behaviour. In this dissertation, we develop algorithms for processing mobility data. The analysis of mobility data is a nontrivial task, as it involves managing large quantities of location information, usually spread out spatially and temporally across many tracking sensors. An additional challenge is to publish the data and the results of its analysis without jeopardizing either the privacy of the individuals involved or the quality of the data. We look into a series of problems on processing mobility data from individuals and from populations. Our goal is to design algorithms with provable properties that allow for the fast and reliable extraction of insights. We present solutions that are efficient in terms of storage and computation requirements, with a focus on distributed computation, online processing, and privacy preservation.
