Implicit Regularization in Deep Tensor Factorization

Attempts to study the implicit regularization associated with gradient descent (GD) have identified matrix completion as a suitable test-bed. Recent findings suggest that this phenomenon cannot be phrased as a norm-minimization problem, implying that a paradigm shift is required and that the optimization dynamics must be taken into account. In the present work we address the more general setup of tensor completion by leveraging two popular tensor factorizations, namely Tucker and Tensor Train (TT). We track relevant quantities such as the tensor nuclear norm, effective rank, and generalized singular values, and we introduce deep, unconstrained Tucker and TT factorizations to handle the completion task. Experiments on both synthetic and real data show that gradient descent promotes low-rank solutions and support the conjecture that the phenomenon must be understood from a dynamical perspective.
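
To make the setup concrete, below is a minimal sketch of the kind of experiment the abstract describes: plain gradient descent on an overparameterized Tensor Train factorization of a partially observed 3-way tensor, with the effective rank of an unfolding tracked during training. This is an illustrative sketch assuming PyTorch; the tensor sizes, ranks, observation ratio, learning rate, and step count are hypothetical choices, not the paper's actual experimental settings.

```python
# Sketch: tensor completion via GD on an overparameterized TT factorization.
# All hyperparameters (n, r, R, mask ratio, lr, steps) are illustrative.
import torch

torch.manual_seed(0)

def effective_rank(mat, eps=1e-12):
    # Roy & Vetterli (2007): exponential of the entropy of the
    # normalized singular value distribution.
    s = torch.linalg.svdvals(mat)
    p = s / (s.sum() + eps)
    return torch.exp(-(p * torch.log(p + eps)).sum())

n, r = 10, 2                                 # mode size, ground-truth TT rank
A = torch.randn(n, r)
B = torch.randn(r, n, r)
C = torch.randn(r, n)
T = torch.einsum('ia,ajb,bk->ijk', A, B, C)  # low-TT-rank ground truth

mask = torch.rand_like(T) < 0.3              # ~30% of entries observed

# Overparameterized TT cores with small initialization; small init is what
# the implicit-regularization literature associates with the low-rank bias.
R = 8                                        # internal TT rank >> true rank
G1 = (1e-3 * torch.randn(n, R)).requires_grad_()
G2 = (1e-3 * torch.randn(R, n, R)).requires_grad_()
G3 = (1e-3 * torch.randn(R, n)).requires_grad_()

opt = torch.optim.SGD([G1, G2, G3], lr=0.1)  # lr/steps may need tuning
for step in range(10001):
    opt.zero_grad()
    X = torch.einsum('ia,ajb,bk->ijk', G1, G2, G3)
    loss = ((X - T)[mask] ** 2).mean()       # fit observed entries only
    loss.backward()
    opt.step()
    if step % 2000 == 0:
        er = effective_rank(X.detach().reshape(n, -1))  # mode-1 unfolding
        print(f'step {step:5d}  loss {loss.item():.3e}  eff. rank {er:.2f}')
```

If the implicit low-rank bias holds, the effective rank of the recovered tensor's unfoldings should stay close to the ground-truth rank even though the internal rank R allows a much richer solution; a deep variant replaces each core with a product of several factors, which is the regime where the dynamical effects the abstract refers to become most pronounced.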
