Data Mining and Knowledge Discovery

Our physical world is being projected into online cyberspace at an unprecedented rate. As people visit different places, they leave behind digital traces numbering in the millions, such as tweets, check-ins, Yelp reviews, and Uber trajectories. Such data are the result of social sensing: people act as human sensors that probe different places in the physical world and share their activities online. The availability of massive social-sensing data provides a unique opportunity to understand urban space in a data-driven manner and to improve many urban computing applications, ranging from urban planning and traffic scheduling to disaster control and trip planning. In this chapter, we present recent developments in data-mining techniques for urban activity modeling, a fundamental task for extracting useful urban knowledge from social-sensing data. We first describe traditional approaches to urban activity modeling, including pattern discovery methods and statistical models. We then present the latest developments in multimodal embedding techniques for this task, which learn vector representations for different modalities to model people's spatiotemporal activities. Finally, we study the empirical performance of these methods and demonstrate how data-mining techniques can be applied to social-sensing data to extract actionable knowledge and facilitate downstream applications.
