A Language of Life: Characterizing People Using Cell Phone Tracks

Mobile devices can produce continuous streams of data which are often specific to the person carrying them. We show that cell phone tracks from the MIT Reality dataset can be used to reliably characterize individual people. This is done by treating each person's data as a separate language and building a standard n-gram language model for each ``author.'' We then compute the perplexity of an unlabelled sample under each person's language model. The sample is assigned to the user yielding the lowest perplexity score. This technique achieves 85\% accuracy and can also be used for clustering. We also show how language models can be used for predicting movement and propose metrics to measure the accuracy of the predictions. Finally, we develop an alternative method for identifying individuals by counting the subsequences in a sample which are unique to their authors. This is done by building a generalized suffix tree of the training set and counting each subsequence from a sample which is unique to some person as evidence towards identifying that person as the author. We present the identification and prediction as a part of a {\sc humble} human behavior modeling framework, outline general modeling goals, and show how our methods help. Our results suggest that people's medium-scale movement behavioral patterns, at the granularity of cell tower footprints, can be used to characterize individuals.
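
The perplexity-based identification step can be illustrated with a minimal sketch. It is not the implementation behind the reported results: the abstract specifies only a ``standard n-gram language model,'' so the bigram order, the add-one smoothing, the function names (train_bigram, perplexity, identify), and the toy tower sequences are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_bigram(seq):
    """Count bigrams and their left contexts in one person's tower sequence."""
    bigrams = Counter(zip(seq, seq[1:]))
    contexts = Counter(seq[:-1])
    return bigrams, contexts


def perplexity(model, sample, vocab_size):
    """Perplexity of `sample` under a bigram model.

    Add-one (Laplace) smoothing is an assumption of this sketch; it keeps
    unseen transitions from receiving zero probability.
    """
    if len(sample) < 2:
        raise ValueError("sample must contain at least two tower IDs")
    bigrams, contexts = model
    log_prob = 0.0
    transitions = 0
    for prev, cur in zip(sample, sample[1:]):
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
        log_prob += math.log(p)
        transitions += 1
    return math.exp(-log_prob / transitions)


def identify(models, sample, vocab_size):
    """Assign the sample to the person whose model yields the lowest perplexity."""
    return min(models, key=lambda name: perplexity(models[name], sample, vocab_size))


# Hypothetical toy data: each letter stands for a cell tower ID.
tracks = {
    "alice": ["A", "B", "C", "A", "B", "C", "A", "B", "C"],
    "bob": ["D", "E", "D", "E", "F", "D", "E", "D", "E"],
}
vocab_size = len({tower for seq in tracks.values() for tower in seq})
models = {name: train_bigram(seq) for name, seq in tracks.items()}

sample = ["B", "C", "A", "B"]
predicted = identify(models, sample, vocab_size)
print(predicted)
```

In this toy setting the sample contains only transitions seen in the first track, so that person's model assigns it the lower perplexity. A higher-order model or a different smoothing scheme would slot into the same structure by changing train_bigram and the probability estimate in perplexity.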