SMF: Drift-Aware Matrix Factorization with Seasonal Patterns

Consider a stream of time-stamped events, such as taxi rides, where we record the start and end locations of each ride. How can we learn a matrix factorization model that accounts for seasonal patterns (for example, rides toward office areas occur more frequently in the morning) and use it to forecast tomorrow's taxi rides? How can we also model drift (such as population growth) and detect sudden changes, or anomalies? Existing matrix factorization algorithms do not take seasonal patterns into account. We propose SMF (Seasonal Matrix Factorization), a matrix factorization model for seasonal data, together with a streaming algorithm for fitting it. SMF is (a) accurate in forecasting: it outperforms baselines by 13% to 60% in RMSE; (b) online: it requires fixed memory even as more data is received over time, and scales linearly; (c) effective: it provides interpretable results. In addition, we propose SMF-A, an algorithm that detects anomalies in a computationally feasible way, without forecasting every observation in the matrix.
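
To make the idea of a seasonal factorization concrete, here is a minimal sketch in plain Python. It rests on assumptions the abstract does not spell out: each observed matrix A_t is approximated as U diag(w_p) V^T, where U and V are shared factors and w_p is a weight vector for the seasonal phase p = t mod period. The function names (`reconstruct`, `squared_error`, `sgd_step`, `fit_stream`, `forecast`) are hypothetical, and the plain gradient step is a stand-in for illustration, not the authors' streaming algorithm.

```python
def reconstruct(U, V, w):
    """Return U * diag(w) * V^T as a list of rows."""
    k = len(w)
    return [[sum(U[i][r] * w[r] * V[j][r] for r in range(k))
             for j in range(len(V))]
            for i in range(len(U))]


def squared_error(A, U, V, w):
    """Sum of squared differences between A and its reconstruction."""
    R = reconstruct(U, V, w)
    return sum((A[i][j] - R[i][j]) ** 2
               for i in range(len(A)) for j in range(len(A[0])))


def sgd_step(A, U, V, w, lr):
    """One gradient step on the squared error, updating U, V, w in place."""
    n, m, k = len(U), len(V), len(w)
    R = reconstruct(U, V, w)
    E = [[A[i][j] - R[i][j] for j in range(m)] for i in range(n)]
    # Descent directions (the constant factor 2 is folded into lr).
    # All three are computed from the old values before any update.
    dU = [[w[r] * sum(E[i][j] * V[j][r] for j in range(m))
           for r in range(k)] for i in range(n)]
    dV = [[w[r] * sum(E[i][j] * U[i][r] for i in range(n))
           for r in range(k)] for j in range(m)]
    dw = [sum(E[i][j] * U[i][r] * V[j][r]
              for i in range(n) for j in range(m)) for r in range(k)]
    for i in range(n):
        for r in range(k):
            U[i][r] += lr * dU[i][r]
    for j in range(m):
        for r in range(k):
            V[j][r] += lr * dV[j][r]
    for r in range(k):
        w[r] += lr * dw[r]


def fit_stream(stream, U, V, W, lr=0.05):
    """Consume (t, A_t) pairs one at a time; only U, V, W are kept in memory."""
    period = len(W)  # assumption: W holds one weight vector per seasonal phase
    for t, A in stream:
        sgd_step(A, U, V, W[t % period], lr)


def forecast(U, V, W, t):
    """Predict the matrix at time t from the weights of its seasonal phase."""
    return reconstruct(U, V, W[t % len(W)])
```

In this sketch, a forecast for a future time step simply reuses the weights learned for the matching phase, so with one weight vector per hour of the day, `forecast(U, V, W, t)` for tomorrow morning would draw on what was learned from earlier mornings. The state kept between updates is only U, V, and W, which is one simple way to keep memory fixed as the stream grows.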
