State-Sharing Sparse Hidden Markov Models for Personalized Sequences

The hidden Markov model (HMM) is a powerful tool that has been widely adopted in sequence modeling tasks such as mobility analysis, healthcare informatics, and online recommendation. However, using HMMs to model personalized sequences remains challenging: training a single unified HMM on all the sequences often fails to uncover interesting personalized patterns, yet training one HMM per individual inevitably suffers from data scarcity. We address this challenge by proposing a state-sharing sparse hidden Markov model (S3HMM) that can uncover personalized sequential patterns without suffering from data scarcity. This is achieved through two design principles: (1) all the HMMs in the ensemble share the same set of latent states; and (2) each HMM has its own transition matrix to model the personalized transitions. The resulting optimization problem for S3HMM is nontrivial because of its two-layer hidden state design and the non-convexity of parameter estimation. We design a new Expectation-Maximization-based algorithm that uses difference-of-convex programming as a sub-solver to optimize the non-convex function in the M-step, with a convergence guarantee. Our experimental results show that S3HMM can successfully uncover personalized sequential patterns in various applications and significantly outperforms baselines in downstream prediction tasks.
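
The two design principles can be illustrated with a minimal sketch. It is not the authors' S3HMM: it omits the sparsity, the two-layer hidden state design, and the EM training with a difference-of-convex sub-solver, and it assumes discrete emissions. All names (`forward_loglik`, `shared_emission`, `transition_by_user`) and all parameter values are hypothetical and chosen only for illustration. The sketch shows one set of latent states, with one emission matrix, being reused by every individual while each individual keeps a separate transition matrix.

```python
import numpy as np


def forward_loglik(obs, initial, transition, emission):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the scaled forward algorithm."""
    alpha = initial * emission[:, obs[0]]
    scale = alpha.sum()
    loglik = np.log(scale)
    alpha = alpha / scale
    for symbol in obs[1:]:
        # Propagate through the transition matrix, then weight by emission.
        alpha = (alpha @ transition) * emission[:, symbol]
        scale = alpha.sum()
        loglik += np.log(scale)
        alpha = alpha / scale
    return loglik


# Principle (1): every individual shares the same latent states, so a single
# emission matrix (rows: states, columns: symbols) serves all of them.
shared_emission = np.array([[0.9, 0.1],
                            [0.1, 0.9]])
shared_initial = np.array([0.5, 0.5])

# Principle (2): each individual has a personal transition matrix.
# The values are made up for illustration.
transition_by_user = {
    "user_a": np.array([[0.9, 0.1],
                        [0.1, 0.9]]),  # tends to stay in the same state
    "user_b": np.array([[0.1, 0.9],
                        [0.9, 0.1]]),  # tends to alternate between states
}

sequence = [0, 1, 0, 1, 0, 1]
scores = {
    user: forward_loglik(sequence, shared_initial, transition, shared_emission)
    for user, transition in transition_by_user.items()
}
```

Because the alternating sequence matches the second individual's transition habits, `scores["user_b"]` comes out larger than `scores["user_a"]`, even though both models use the same states and emissions.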
