A Dirichlet Mixture Model of Hawkes Processes for Event Sequence Clustering

We propose an effective method for event sequence clustering based on a novel Dirichlet mixture model of a special but significant type of point process, the Hawkes process. In this model, each event sequence belonging to a cluster is generated by the same Hawkes process with cluster-specific parameters, and different clusters correspond to different Hawkes processes. The prior distribution of the Hawkes processes is controlled via a Dirichlet distribution. We learn the model via a maximum likelihood estimator (MLE) and propose an effective variational Bayesian inference algorithm. We specifically analyze the resulting EM-type algorithm in the context of inner-outer iterations and discuss several inner iteration allocation strategies. The identifiability of our model, the convergence of our learning method, and its sample complexity are analyzed both theoretically and empirically, and the results demonstrate the superiority of our method over competing methods. The proposed method learns the number of clusters automatically and is robust to model misspecification. Experiments on both synthetic and real-world data show that our method can learn diverse triggering patterns hidden in asynchronous event sequences and achieves encouraging performance on clustering purity and consistency.
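The generative story described above (mixture weights drawn from a Dirichlet distribution, one Hawkes process per cluster, each sequence generated by the process of its cluster) can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the abstract: each Hawkes process is univariate with an exponential triggering kernel, the Dirichlet prior is symmetric, and the names `simulate_hawkes`, `sample_mixture`, `mu`, `alpha`, `beta`, and `concentration` are hypothetical. It shows only sampling from such a model, not the learning algorithm.

```python
import numpy as np


def simulate_hawkes(mu, alpha, beta, horizon, rng):
    """Simulate one event sequence on [0, horizon) by Ogata's thinning.

    Assumed intensity (illustrative, not the paper's parameterization):
        lambda(t) = mu + alpha * sum_{t_i < t} exp(-beta * (t - t_i))
    """
    events = []
    t = 0.0
    excitation = 0.0  # excitation term evaluated just after time t
    while True:
        # The intensity only decays until the next event, so its
        # current value is a valid upper bound for thinning.
        lam_bar = mu + excitation
        wait = rng.exponential(1.0 / lam_bar)
        if t + wait >= horizon:
            break
        t += wait
        excitation *= np.exp(-beta * wait)
        # Accept the candidate with probability lambda(t) / lam_bar.
        if rng.uniform() * lam_bar <= mu + excitation:
            events.append(t)
            excitation += alpha
    return np.array(events)


def sample_mixture(n_sequences, cluster_params, concentration, horizon, seed=0):
    """Draw mixture weights from a symmetric Dirichlet, assign each
    sequence to a cluster, and simulate it from that cluster's process."""
    rng = np.random.default_rng(seed)
    k = len(cluster_params)
    weights = rng.dirichlet(np.full(k, concentration))
    labels = rng.choice(k, size=n_sequences, p=weights)
    sequences = [
        simulate_hawkes(*cluster_params[z], horizon, rng) for z in labels
    ]
    return weights, labels, sequences


# Two hypothetical clusters given as (mu, alpha, beta); alpha / beta < 1
# keeps each process stationary.
params = [(0.5, 0.2, 1.0), (1.0, 0.8, 1.0)]
weights, labels, seqs = sample_mixture(20, params, concentration=1.0, horizon=50.0)
```

Sequences assigned to the second cluster tend to be denser and burstier because of its larger base rate and excitation. A clustering method in this setting would have to recover `labels` from `seqs` alone.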
