MixedTrails: Bayesian hypothesis comparison on heterogeneous sequential data

Sequential traces of user data are frequently observed online and offline, e.g., as sequences of visited websites or as sequences of locations captured by GPS. However, understanding the factors that explain the production of sequence data is a challenging task, especially since the data generation is often not homogeneous. For example, navigation behavior might change across different phases of browsing a website, or movement behavior may vary between groups of users. In this work, we tackle this task and propose MixedTrails, a Bayesian approach for comparing the plausibility of hypotheses regarding the generative processes of heterogeneous sequence data. Each hypothesis is derived from existing literature, theory, or intuition and represents a belief about transition probabilities between a set of states that can vary between groups of observed transitions. For example, when trying to understand human movement in a city, and given some data, a hypothesis assuming that tourists are more likely than locals to move towards points of interest can be shown to be more plausible than a hypothesis assuming the opposite. Our approach incorporates such hypotheses as Bayesian priors in a generative mixed transition Markov chain model and compares their plausibility using Bayes factors. We discuss analytical and approximate inference methods for calculating the marginal likelihoods for Bayes factors, give guidance on interpreting the results, and illustrate our approach with several experiments on synthetic data and on empirical data from Wikipedia and Flickr. Thus, this work enables a novel kind of analysis for studying sequential data in many application areas.
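
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the MixedTrails model itself: it covers only the simpler homogeneous case (a single first-order Markov chain with one group of transitions), where the marginal likelihood under row-wise Dirichlet priors has the standard closed form of the Dirichlet-multinomial model. The function name, the transition counts, and the two hypothesis matrices are made up for illustration.

```python
from math import lgamma

def log_marginal_likelihood(counts, alpha):
    """Log evidence of observed transition counts for a first-order Markov
    chain with an independent Dirichlet prior on each row.

    counts[i][j]: number of observed transitions from state i to state j.
    alpha[i][j]:  Dirichlet pseudo-count encoding the hypothesis (must be > 0).
    """
    total = 0.0
    for n_row, a_row in zip(counts, alpha):
        # Normalizing terms of the Dirichlet prior and posterior for this row.
        total += lgamma(sum(a_row)) - lgamma(sum(a_row) + sum(n_row))
        # Per-transition terms.
        for n, a in zip(n_row, a_row):
            total += lgamma(a + n) - lgamma(a)
    return total

# Hypothetical data: two states, mostly self-transitions.
counts = [[8, 2],
          [3, 7]]

# Hypothesis A (assumed): the walker tends to stay in its current state.
alpha_stay = [[5.0, 1.0],
              [1.0, 5.0]]
# Hypothesis B (assumed): the walker tends to switch states.
alpha_switch = [[1.0, 5.0],
                [5.0, 1.0]]

# Log Bayes factor of A over B; a positive value favors hypothesis A.
log_bf = (log_marginal_likelihood(counts, alpha_stay)
          - log_marginal_likelihood(counts, alpha_switch))
print("log Bayes factor (stay vs. switch):", log_bf)
```

Working in log space avoids overflow in the gamma functions for realistic count sizes. In the heterogeneous setting the abstract addresses, transitions would additionally be assigned to groups with their own hypotheses, which is where the paper's analytical and approximate inference methods come in.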
