Multi-Task Learning for Spatio-Temporal Event Forecasting

Spatial event forecasting from social media is an important problem, but it faces critical challenges such as dynamic patterns of features (keywords) and geographic heterogeneity (e.g., spatial correlations, imbalanced samples, and different populations in different locations). Most existing approaches (e.g., LASSO regression, dynamic query expansion, and burst detection) address only some of these challenges. This paper proposes a novel multi-task learning framework that aims to address all of them concurrently. Specifically, given a collection of locations (e.g., cities), we build forecasting models for all locations simultaneously by extracting and utilizing shared information that effectively increases the sample size for each location, thus improving forecasting performance. We combine static features, derived from a vocabulary predefined by domain experts, with dynamic features generated by dynamic query expansion in a multi-task feature learning framework, and we investigate different strategies to balance homogeneity and diversity between static and dynamic terms. We develop algorithms based on Iterative Group Hard Thresholding for efficient and effective model training and prediction. Extensive experimental evaluations on Twitter data from four different countries in Latin America demonstrate the effectiveness of the proposed approach.
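To make the idea of shared feature selection across locations concrete, the following is a minimal sketch of iterative group hard thresholding for a multi-task least-squares problem. It is not the model described in this paper: the squared loss, the row-sparsity constraint (at most k keywords shared by all locations), the step-size rule, and the names group_hard_threshold and multitask_ight are all assumptions made for illustration, and the sketch does not distinguish static from dynamic features.

```python
import numpy as np


def group_hard_threshold(W, k):
    """Keep the k rows of W with the largest L2 norm and zero out the rest.

    Each row holds one feature's weights across all tasks (locations), so a
    kept row is a feature shared by every task.
    """
    row_norms = np.linalg.norm(W, axis=1)
    keep = np.argsort(row_norms)[-k:]
    out = np.zeros_like(W)
    out[keep] = W[keep]
    return out


def multitask_ight(Xs, ys, k, n_iter=200):
    """Projected gradient descent with a row-sparsity projection.

    Xs: list of (n_t, d) design matrices, one per task (assumed layout).
    ys: list of (n_t,) targets, one per task.
    k:  maximum number of features (rows) allowed to be non-zero.
    """
    d = Xs[0].shape[1]
    W = np.zeros((d, len(Xs)))
    # Conservative step size: inverse of the largest per-task Lipschitz constant.
    lipschitz = max(np.linalg.norm(X, 2) ** 2 / X.shape[0] for X in Xs)
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        # Gradient of the averaged squared loss, computed task by task.
        grad = np.column_stack([
            X.T @ (X @ W[:, t] - y) / X.shape[0]
            for t, (X, y) in enumerate(zip(Xs, ys))
        ])
        W = group_hard_threshold(W - step * grad, k)
    return W


# Synthetic demo: 4 tasks with different sample sizes sharing 3 of 20 features.
rng = np.random.default_rng(0)
d, k = 20, 3
W_true = np.zeros((d, 4))
W_true[[2, 7, 11]] = rng.normal(size=(3, 4))
Xs = [rng.normal(size=(n, d)) for n in (30, 40, 50, 60)]
ys = [X @ W_true[:, t] for t, X in enumerate(Xs)]

W_hat = multitask_ight(Xs, ys, k)
selected = np.flatnonzero(np.linalg.norm(W_hat, axis=1))
print("selected feature rows:", selected)
```

Because the threshold acts on whole rows, every task contributes evidence for or against each feature, which is one way to read the abstract's claim that shared information effectively increases the sample size per location.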
