Subspace Optimal Transport for Spatial Bias Correction of Social Media Data: A Case Study of the 2013 Boulder Flood Event

Social media data generated by individuals provide a unique opportunity to gain insight into information flow, especially for emergency response. However, the inherent limitations of these data, particularly their spatial bias, restrict their precise application. Existing research on spatial bias correction of social media data mainly faces two issues: 1) the geographic extent of the target domain may be underestimated, and 2) source elements may be transported over inappropriate distances. In this paper, we take the 2013 Boulder, Colorado flood event as a case study and present a new method called subspace optimal transport (SOT). SOT aims to transport biased tweets from dry areas to actually flooded areas over relatively short distances. We compare SOT with traditional optimal transport (OT) and geographic optimal transport (GOT). Experimental results demonstrate that SOT is able to correct spatially biased geo-referenced tweets with high precision and strong computational performance.