NeuralCP: Bayesian Multiway Data Analysis with Neural Tensor Decomposition

Multiway data are widely observed in neuroscience, health informatics, food science, and other fields. Tensor decomposition is an important technique for capturing the high-order interactions in such data. Classical tensor decomposition methods, such as the Tucker decomposition and CANDECOMP/PARAFAC (CP), assume that the interactions among objects are multilinear and are thus insufficient to represent nonlinear relationships in data. To model the complex nonlinear relationships in a tensor, we design a model that joins neural networks with Bayesian tensor decomposition, in which the high-order interactions are captured by neural networks. To take advantage of both the nonlinear modeling provided by neural networks and the uncertainty modeling provided by Bayesian models, we replace the multilinear product in traditional Bayesian tensor decomposition with a more flexible neural function (i.e., a multi-layer perceptron) whose parameters are learned from data. The model can be optimized efficiently with stochastic gradient descent and is therefore scalable to large real-world tensors. We conducted experiments on both synthetic data and real-world chemometrics tensor data. The results show that the proposed nonlinear tensor decomposition method, NeuralCP, achieves significantly higher prediction performance than state-of-the-art tensor decomposition approaches, with promising prediction results on a range of multiway data.
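
As an illustration of the central idea, the sketch below contrasts the classical CP multilinear product with a small multi-layer perceptron applied to the same latent factors. It is a minimal sketch and not the authors' implementation: the abstract does not say how the factors are fed to the network, so concatenating the factor rows is an assumption, as are the names `cp_entry` and `mlp_entry`, the single tanh hidden layer, and the layer sizes. The Bayesian part (priors over the factors and their inference) is omitted, and the factors are treated as point values.

```python
import math
import random


def cp_entry(factors, index):
    """Classical CP: sum over rank r of the product of one factor entry per mode."""
    rank = len(factors[0][0])
    return sum(
        math.prod(U[i][r] for U, i in zip(factors, index))
        for r in range(rank)
    )


def mlp_entry(factors, index, W1, b1, w2, b2):
    """Assumed neural replacement: a one-hidden-layer MLP on concatenated factor rows."""
    # Concatenate the latent row selected in each mode (an assumption of this sketch).
    x = [v for U, i in zip(factors, index) for v in U[i]]
    hidden = [
        math.tanh(sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


rng = random.Random(0)
shape, rank, hidden_units = (4, 5, 6), 3, 8  # hypothetical sizes

# One latent factor matrix per mode, each of size n_k x rank.
factors = [
    [[rng.gauss(0, 1) for _ in range(rank)] for _ in range(n)]
    for n in shape
]

# MLP parameters; in a trained model these would be learned from data.
W1 = [
    [rng.gauss(0, 0.1) for _ in range(rank * len(shape))]
    for _ in range(hidden_units)
]
b1 = [0.0] * hidden_units
w2 = [rng.gauss(0, 0.1) for _ in range(hidden_units)]
b2 = 0.0

index = (1, 2, 3)
cp_value = cp_entry(factors, index)
mlp_value = mlp_entry(factors, index, W1, b1, w2, b2)
print(cp_value, mlp_value)
```

Both functions map the same tuple of latent rows to a scalar tensor entry; only the second has parameters beyond the factors themselves, which is what allows it to represent interactions that are not multilinear.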
