Estimating Online User Location Distribution without GPS Location

We address the problem of estimating offline user locations from online information, particularly for the application of TV segment advertising. Unlike previous work, the proposed method does not assume GPS information; it works with loosely structured information such as English location descriptions. We propose using a neural language model to capture the semantic similarity among location descriptions. The language model helps reduce otherwise expensive geolocation service lookups by internally resolving similar areas, neighborhoods, etc., onto the same description. We also propose a metric for comparing geodemographic histograms that accounts for the demographic gap between the online world and the offline world. In the experiments, we demonstrate the recall and accuracy of our language-based, GPS-free estimation of user location distribution. In addition, we illustrate the effectiveness of the proposed distribution estimation metric.
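The idea of resolving similar descriptions onto one entry before calling a geolocation service can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the paper: `embed` uses character-trigram counts as a stand-in for the neural language model, and `CachedResolver`, the `geocode` callback, and the `threshold` value are all hypothetical names and settings chosen for the example.

```python
import math
from collections import Counter


def embed(description):
    """Stand-in embedding: character-trigram counts.

    Assumption: a real system would use vectors from a neural language
    model here; trigrams only keep the sketch self-contained.
    """
    text = " ".join(description.lower().split())
    padded = f"  {text}  "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class CachedResolver:
    """Hypothetical helper: reuse an earlier geocoding result when a new
    description is similar enough to one that was already resolved."""

    def __init__(self, geocode, threshold=0.8):
        self.geocode = geocode      # expensive external lookup (assumed)
        self.threshold = threshold  # similarity cutoff (assumed value)
        self.known = []             # list of (embedding, geocoding result)
        self.lookups = 0            # number of external lookups performed

    def resolve(self, description):
        vec = embed(description)
        # Find the most similar description resolved so far, if any.
        best_score, best_result = 0.0, None
        for known_vec, known_result in self.known:
            score = cosine(vec, known_vec)
            if score > best_score:
                best_score, best_result = score, known_result
        if best_result is not None and best_score >= self.threshold:
            return best_result
        # No close match: pay for one external lookup and remember it.
        self.lookups += 1
        result = self.geocode(description)
        self.known.append((vec, result))
        return result
```

With a resolver like this, near-duplicate descriptions such as "San Francisco, CA" and "san francisco ca" would share one lookup, while an unrelated description would trigger a new one. The threshold trades saved lookups against the risk of merging distinct places, so it would need tuning on real data.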
