An Analysis of Tensor Models for Learning on Structured Data

While tensor factorizations have become increasingly popular for learning on various forms of structured data, very few theoretical results exist on the generalization abilities of these methods. Here, we discuss the tensor product as a principled way to represent structured data in vector spaces for machine learning tasks. By extending known bounds for matrix factorizations, we derive generalization error bounds for the tensor case. Furthermore, we analyze, both analytically and experimentally, how tensor factorization behaves when applied to over- and understructured representations, for instance, when two-way tensor factorization, i.e., matrix factorization, is applied to three-way tensor data.
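To make the last point concrete, the following is a minimal sketch of what applying matrix factorization to three-way data can look like. It is an illustration under stated assumptions, not the setup used in the paper: the tensor sizes, the synthetic rank-2 data, the mode-1 unfolding, and the helper names `unfold_mode1` and `truncated_svd` are all hypothetical choices made for this example.

```python
import numpy as np


def unfold_mode1(tensor):
    """Flatten a three-way array of shape (I, J, K) into an (I, J*K) matrix."""
    return tensor.reshape(tensor.shape[0], -1)


def truncated_svd(matrix, rank):
    """Best rank-`rank` approximation of `matrix` in the Frobenius norm."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


# Assumed toy data: a three-way tensor built from R rank-one terms.
rng = np.random.default_rng(0)
I, J, K, R = 6, 5, 4, 2
A = rng.standard_normal((I, R))
B = rng.standard_normal((J, R))
C = rng.standard_normal((K, R))
X = np.einsum("ir,jr,kr->ijk", A, B, C)

# Two-way view: modes 2 and 3 are merged into a single column index.
X1 = unfold_mode1(X)
approx = truncated_svd(X1, R)
rel_err = np.linalg.norm(X1 - approx) / np.linalg.norm(X1)

# Free parameters of each rank-R model (ignoring scaling redundancies).
params_matrix = R * (I + J * K)   # factors of the unfolded matrix
params_tensor = R * (I + J + K)   # factors A, B, C of the three-way model

print(f"unfolded shape: {X1.shape}, relative error: {rel_err:.2e}")
print(f"parameters: matrix model {params_matrix}, tensor model {params_tensor}")
```

In this sketch the unfolded matrix has rank at most R, so the matrix model can still fit the data, but it needs more parameters than the three-way model because it ignores the structure shared between the merged modes.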
