Are call detail records biased for sampling human mobility?

Call detail records (CDRs) have recently been used to study different aspects of human mobility. While CDRs provide a means of sampling user locations at large population scales, voice calls are sparse in time and space, so CDRs may not sample all locations in proportion to how often a user visits them, thereby introducing a bias. Also, because the sampling rate inherently depends on an individual's calling frequency, studies often select users with high voice-call activity in order to obtain meaningful results. Such a selection process can inadvertently lead to a biased view, as high-frequency callers may not be representative of the entire population. With the advent of 3G technology and the wide adoption of smartphones, cellular devices have become versatile end-hosts. As the data accessed on these devices does not always require human initiation, it affords us an unprecedented opportunity to validate the utility of CDRs for studying human mobility. In this work, we investigate various human mobility metrics studied in the literature for over a million cellular users in the San Francisco Bay Area, observed for over a month. Our findings reveal that although the voice-call process samples significant locations, such as home and work, well, it may in some cases incur biases in capturing the overall spatio-temporal characteristics of individual human mobility. Additionally, we motivate an "artificially" imposed sampling process that has the same average intensity as the voice-call process. We observe that in many cases such an imposed sampling process performs better on metrics commonly used in the literature, such as entropies and marginal distributions.
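To make the comparison concrete, the following minimal sketch contrasts a bursty, call-like sampling schedule with an evenly spaced one of the same average intensity. It scores each by the Shannon entropy of the sampled location distribution and by its divergence from the full trajectory. Everything in it is an illustrative assumption, not the method or data of this work: the toy 24-hour trajectory, the two sampling schedules, the helper names, and the choice of Jensen-Shannon divergence as the distance between marginal distributions.

```python
import math
from collections import Counter


def visit_distribution(locations):
    """Empirical marginal distribution over visited locations."""
    counts = Counter(locations)
    total = sum(counts.values())
    return {loc: n / total for loc, n in counts.items()}


def shannon_entropy(dist):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence (in bits) between two distributions."""
    support = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2 over the joint support.
    m = {x: 0.5 * (p.get(x, 0.0) + q.get(x, 0.0)) for x in support}

    def kl_to_mixture(a):
        return sum(a[x] * math.log2(a[x] / m[x]) for x in a if a[x] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


def evaluate(truth, sample_hours):
    """Return (entropy of sampled distribution, JS divergence from truth)."""
    sampled = visit_distribution([truth[h] for h in sample_hours])
    return shannon_entropy(sampled), js_divergence(visit_distribution(truth), sampled)


# Hypothetical ground truth: one location label per hour of a single day.
truth = ["home"] * 8 + ["work"] * 8 + ["gym"] * 2 + ["cafe"] * 2 + ["home"] * 4

# Two hypothetical sampling schedules with the same intensity (6 samples/day).
bursty_hours = [9, 10, 11, 12, 21, 22]   # calls clustered in time
even_hours = [0, 4, 8, 12, 16, 20]       # imposed, evenly spaced sampling

true_entropy = shannon_entropy(visit_distribution(truth))
bursty_entropy, bursty_jsd = evaluate(truth, bursty_hours)
even_entropy, even_jsd = evaluate(truth, even_hours)

print(f"true entropy:   {true_entropy:.3f} bits")
print(f"bursty samples: entropy={bursty_entropy:.3f}, JSD={bursty_jsd:.3f}")
print(f"even samples:   entropy={even_entropy:.3f}, JSD={even_jsd:.3f}")
```

In this constructed example both schedules capture home and work, but the clustered schedule misses the less frequently visited locations. The example only illustrates how such metrics can be computed; it says nothing about the magnitude of any effect in real CDR data.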
