Unsupervised Representation Learning of Spatial Data via Multimodal Embedding

Increasing urbanization across the globe has coincided with greater access to urban data, giving researchers and city administrators better tools to understand urban dynamics such as crime, traffic, and living standards. In this paper, we study the Learning an Embedding Space for Regions (LESR) problem, wherein we aim to produce vector representations of discrete regions. Recent studies have shown that embedding geospatial regions in a latent vector space can be useful in a variety of urban computing tasks. However, previous studies do not consider regions across multiple modalities in an end-to-end framework. We argue that doing so facilitates the learning of richer semantic relationships among regions. We propose a novel method, RegionEncoder, that jointly learns region representations from satellite image, point-of-interest, human mobility, and spatial graph data. We demonstrate that these region embeddings are useful as features in two regression tasks and across two distinct urban environments. Additionally, we perform an ablation study that evaluates each major architectural component. Finally, we qualitatively explore the learned embedding space and show that semantic relationships are discovered across modalities.
