Probabilistic Models of Topics and Social Events

Structured probabilistic inference has proven useful for modeling complex latent structures in data. One successful application is the discovery of latent topical structure in text, usually referred to as topic modeling. With the recent popularity of mobile devices and social networking, we can now easily acquire text data with attached metadata, such as geospatial coordinates and time stamps. This metadata can provide rich and accurate information that helps answer many research questions related to spatial and temporal reasoning. However, such data must be treated differently from text data. For example, spatial data is usually organized in terms of a two-dimensional region, while temporal information can exhibit periodicities. Although some existing work in the topic modeling community uses this metadata, these models largely focus on incorporating metadata into text analysis rather than making full use of the joint distribution of metadata and text. In this thesis, I propose the event detection problem, which is a multidimensional latent clustering problem on spatial, temporal, and topical data. I start with a simple parametric model that discovers independent events using geo-tagged Twitter data. The model is then improved in two directions. First, I augment the model with the Recurrent Chinese Restaurant Process (RCRP) to discover events that are dynamic in nature. Second, I study a model that can detect events using data from multiple media sources, and I examine the characteristics of different media in terms of reported event times and linguistic patterns. The approaches studied in this thesis are largely based on Bayesian nonparametric methods, which can handle streaming data and an unpredictable number of clusters. The research not only serves the event detection problem itself but also sheds light on the more general problem of structured clustering in spatial, temporal, and textual data.
