Cross-Modal Learning of Housing Quality in Amsterdam

We evaluate data sources and models for recognising housing quality in the city of Amsterdam from ground-level and aerial imagery. For ground-level images, we compare Google StreetView (GSV) with Flickr images. Our results show that GSV yields the most accurate building quality scores, approximately 30% better than those obtained from aerial images alone. However, with careful filtering and a suitable pre-trained model, Flickr image features combined with aerial image features halve the performance gap to GSV features, from 30% to 15%. These results indicate that viable alternatives to GSV exist for predicting liveability factors, which is encouraging because GSV images are more difficult to acquire and are not always available.
