Streaming Probabilistic Deep Tensor Factorization

Despite the success of existing tensor factorization methods, most conduct a multilinear decomposition and rarely exploit powerful modeling frameworks, such as deep neural networks, to capture the variety of complicated interactions in data. More importantly, for highly expressive, deep factorization, we lack an effective approach to handle streaming data, which are ubiquitous in real-world applications. To address these issues, we propose SPIDER, a Streaming ProbabilistIc Deep tEnsoR factorization method. We first use Bayesian neural networks (NNs) to construct a deep tensor factorization model, and assign a spike-and-slab prior over the NN weights to encourage sparsity and prevent overfitting. We then use Taylor expansions and moment matching to approximate the posterior of the NN output and to calculate the running model evidence, based on which we develop an efficient streaming posterior inference algorithm in the assumed-density-filtering and expectation-propagation framework. Our algorithm provides responsive, incremental updates for the posterior of the latent factors and NN weights upon receiving new tensor entries, and meanwhile selects and inhibits redundant or useless weights. We demonstrate the advantages of our approach in four real-world applications.
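The core of the streaming inference described above is assumed-density filtering: the posterior is kept in a fixed family (here, Gaussian), and each arriving observation is absorbed by moment matching. The following is a minimal, self-contained illustration of that idea for a single scalar latent variable with a Gaussian likelihood, where the moment-matched update has an exact closed form. It is a sketch of the ADF principle only, not the paper's actual algorithm (which additionally uses Taylor expansions to handle the nonlinear NN output and a spike-and-slab prior over weights); the function name and toy data are illustrative.

```python
def adf_update(mu, var, y, noise_var):
    """One assumed-density-filtering step.

    Absorb an observation y ~ N(theta, noise_var) into the current Gaussian
    posterior N(mu, var) over theta by matching the first two moments.
    For a Gaussian prior and Gaussian likelihood this moment matching
    coincides with the exact conjugate update (precision-weighted average).
    """
    new_var = 1.0 / (1.0 / var + 1.0 / noise_var)
    new_mu = new_var * (mu / var + y / noise_var)
    return new_mu, new_var


# Process observations one at a time, analogous to how SPIDER streams
# tensor entries: each update touches only the running posterior, never
# the past data.
mu, var = 0.0, 10.0  # broad Gaussian prior over the latent quantity
for y in [1.2, 0.8, 1.1, 0.9]:
    mu, var = adf_update(mu, var, y, noise_var=0.5)
```

Because Gaussian conjugate updates commute, the streamed posterior here equals the batch posterior; in the nonlinear NN setting the moment matching is approximate, which is why the paper pairs ADF with expectation-propagation-style refinements.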
