Fighting Statistical Re-Identification in Human Trajectory Publication

The maturation of mobile devices and systems provides an unprecedented opportunity to collect large amounts of real-world human motion data at all scales. While the rich knowledge contained in these data sets is valuable in many fields, various types of personally sensitive information can easily be learned from such trajectory data. Of greatest concern are frequent locations, frequent co-locations, and trajectory re-identification through spatio-temporal data points. In this work we analyze privacy protection and data utility when trajectory IDs are randomly mixed during co-location events, for either data collection or publication. We demonstrate through both analysis and simulation that the global geometric shape of each individual trajectory is altered enough that re-identification via frequent locations, co-location pairs, or spatio-temporal data points fails with high probability. Meanwhile, many local geometric features of the trajectory data set are preserved, including the density distribution and local traffic flow.
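To make the idea of mixing trajectory IDs at co-location events concrete, the following is a minimal sketch rather than the scheme analyzed in this work. It assumes that all trajectories are sampled at the same time steps, that two individuals count as co-located when they fall in the same square grid cell, and that IDs within a co-located group are permuted uniformly at random. The function name `mix_ids` and the parameters `cell_size` and `seed` are hypothetical.

```python
import random
from collections import defaultdict


def mix_ids(trajectories, cell_size=1.0, seed=None):
    """Randomly mix published IDs whenever individuals are co-located.

    trajectories: dict mapping a real ID to a list of (x, y) positions,
                  one per time step; all lists have the same length.
    Returns a dict mapping each published ID to its mixed trajectory.
    """
    rng = random.Random(seed)
    ids = sorted(trajectories)
    steps = len(trajectories[ids[0]])

    # label[i] is the published ID currently carried by real individual i.
    label = {i: i for i in ids}
    published = {i: [] for i in ids}

    for t in range(steps):
        # Assumption: sharing a grid cell is our stand-in for "co-location".
        cells = defaultdict(list)
        for i in ids:
            x, y = trajectories[i][t]
            cells[(int(x // cell_size), int(y // cell_size))].append(i)

        # Permute the published IDs inside every co-located group.
        for group in cells.values():
            if len(group) > 1:
                labels = [label[i] for i in group]
                rng.shuffle(labels)
                for i, new_label in zip(group, labels):
                    label[i] = new_label

        # Each position is emitted exactly once, under its current label.
        for i in ids:
            published[label[i]].append(trajectories[i][t])

    return published
```

Because every position is still published exactly once per time step, the set of occupied locations at each step is unchanged, which is the intuition behind the preserved density distribution. Only the assignment of positions to IDs changes, so a published trajectory may stitch together segments from several individuals.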