Integrating multi-source big data to infer building functions

ABSTRACT Information about the functions of urban buildings helps not only to build a better understanding of how cities work, but also to give policy makers a basis for evaluating and improving the effectiveness of urban planning. Despite these advantages, and perhaps simply because of a lack of available data, few academic studies to date have succeeded in integrating multi-source ‘big data’ to examine urban land use at the building level. To address this gap, this study integrated multi-source big data (WeChat users’ real-time location records, taxi GPS trajectory data, Points of Interest (POI) data, and building footprint data from high-resolution QuickBird images) and applied the proposed density-based method to infer the functions of urban buildings in Tianhe District, Guangzhou, China. The method achieved an overall detection rate of 72.22%. When the results were verified against ground-truth investigation data, the accuracy rate remained above 65%. Two important conclusions can be drawn from our analysis: (1) WeChat data deliver better inference results than taxi data for identifying residential buildings, offices, and urban villages, whereas shopping centers, hotels, and hospitals are more easily identified using taxi data. (2) Integrated multi-source big data are more effective than single-source big data in revealing the relation between human dynamics and urban complexes at the building scale.
