Adaptive non-parametric identification of dense areas using cell phone records for urban analysis

Pervasive large-scale infrastructures (such as GPS, WLAN networks, or cell phone networks) generate large datasets containing information about human behavior. One application that can benefit from this data is the study of urban environments. In this context, a central problem is the detection of dense areas, i.e., areas with a high density of individuals within a specific geographical region and time period. The techniques used so far, however, face an important limitation: the definition of a dense area is not adaptive. The areas identified depend on a fixed threshold applied to the density of individuals, which usually means that dense areas are found mainly in downtowns. In this paper, we propose a novel technique, called AdaptiveDAD, that detects dense areas by adaptively defining the concept of density using the infrastructure provided by a cell phone network. We evaluate and validate our approach with a real dataset containing the Call Detail Records (CDR) of fifteen million individuals.
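
To make the limitation described above concrete, the following minimal sketch contrasts a single global threshold with a locally adaptive criterion on a toy example. It is not the AdaptiveDAD algorithm, whose details are not given here. The function names, the one-dimensional row of cells, the `window` and `ratio` parameters, and the counts themselves are all hypothetical and chosen only for illustration.

```python
from statistics import median


def fixed_threshold_dense(counts, threshold):
    """Return indices of cells whose raw count exceeds one global threshold."""
    return [i for i, c in enumerate(counts) if c > threshold]


def locally_adaptive_dense(counts, window=2, ratio=1.5):
    """Return indices of cells whose count exceeds `ratio` times the median
    count of their neighbours within `window` cells on either side.

    This is an illustrative stand-in for an adaptive notion of density,
    not the method proposed in the paper.
    """
    dense = []
    for i, c in enumerate(counts):
        lo = max(0, i - window)
        hi = min(len(counts), i + window + 1)
        neighbours = counts[lo:i] + counts[i + 1:hi]  # exclude the cell itself
        if neighbours and c > ratio * median(neighbours):
            dense.append(i)
    return dense


# Hypothetical individuals-per-cell along a line: suburb, downtown, suburb.
counts = [10, 12, 40, 11, 9, 300, 900, 320, 12, 10]

print(fixed_threshold_dense(counts, threshold=100))  # [5, 6, 7]
print(locally_adaptive_dense(counts))                # [2, 5, 6, 7]
```

With these made-up counts, the global threshold flags only the three downtown cells, while the local criterion also flags cell 2, a suburban cell that is busy relative to its surroundings even though its absolute count is low.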
