Differentially Private Analysis of Transportation Data

To optimize the planning and operation of transportation systems, engineers analyze large amounts of data about individual travelers, obtained through a growing number and variety of sensors and data sources. For example, location traces collected from personal smartphones or from smart cards in public transit systems can now cost-effectively complement or replace traditional data collection mechanisms such as phone surveys or vehicle detectors on highways, significantly increasing sensor coverage as well as the spatial and temporal resolution of the collected data. This trend enables more accurate statistical estimates of the state and evolution of a transportation system, and improved responsiveness. At the same time, it raises privacy concerns, because the data can be used to infer the history of locations visited and activities performed by individual citizens. This chapter presents some of the issues related to the privacy-preserving analysis of transportation data. We first illustrate the well-known difficulty of publishing location microdata (i.e., individual location traces) with privacy guarantees, through a case study based on the “MTL Trajet” dataset, a smartphone-based travel survey carried out in recent years in the city of Montreal. In contrast, the publication of aggregate statistics can be protected using state-of-the-art tools such as differential privacy, a formal notion of privacy that prevents certain types of inferences by adversaries with arbitrary side information. To illustrate the application of differential privacy to transportation data, the chapter presents a methodology for estimating the dynamic macroscopic traffic state (density, velocity) along a highway segment in real time from single-loop detector and floating car data, while providing privacy guarantees for the individual driver trajectories. Enforcing privacy constraints degrades estimation performance to an extent that depends on the desired privacy level, but the effect is mitigated here by using a nonlinear model of the traffic dynamics, fused with the sensor measurements through data assimilation methods such as nonlinear Kalman filters.
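As a rough illustration of how aggregate statistics can be released with differential privacy, the sketch below applies the standard Laplace mechanism to per-cell vehicle counts. It is not the chapter's estimation method, which relies on a traffic model and Kalman filtering. The function name `laplace_counts`, the example counts, and the sensitivity of 1 are assumptions: they correspond to a single snapshot in which each vehicle is counted in at most one highway cell, with neighboring datasets differing by the addition or removal of one vehicle.

```python
import numpy as np


def laplace_counts(counts, epsilon, sensitivity=1.0, rng=None):
    """Return noisy counts using the Laplace mechanism (hypothetical helper).

    Assumption: adding or removing one vehicle changes the count vector
    by at most `sensitivity` in L1 norm.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(counts, dtype=float)
    # Noise scale grows as the privacy parameter epsilon shrinks.
    scale = sensitivity / epsilon
    return counts + rng.laplace(loc=0.0, scale=scale, size=counts.shape)


# Made-up vehicle counts for four highway cells at one time step.
true_counts = [12, 30, 7, 0]
noisy_counts = laplace_counts(true_counts, epsilon=0.5,
                              rng=np.random.default_rng(0))
print(noisy_counts)
```

A smaller `epsilon` gives a stronger privacy guarantee but noisier counts, which is the privacy and accuracy trade-off mentioned above. Releasing counts repeatedly over time consumes additional privacy budget, which this single-snapshot sketch does not account for.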
