Tensor Regression Networks

Convolutional neural networks typically consist of many convolutional layers followed by one or more fully connected layers. While convolutional layers map between high-order activation tensors, the fully connected layers operate on flattened activation vectors. Despite its empirical success, this approach has notable drawbacks: flattening followed by fully connected layers discards the multilinear structure in the activations and requires many parameters. We address these problems by incorporating tensor algebraic operations that preserve multilinear structure at every layer. First, we introduce Tensor Contraction Layers (TCLs), which use tensor contraction to reduce the dimensionality of their input while preserving its multilinear structure. Next, we introduce Tensor Regression Layers (TRLs), which express outputs through a low-rank multilinear mapping from a high-order activation tensor to an output tensor of arbitrary order. We learn the contraction and regression factors end-to-end and obtain accurate networks with fewer parameters. Additionally, our layers regularize networks by imposing low-rank constraints on the activations (TCL) and regression weights (TRL). Experiments on ImageNet show that, applied to VGG and ResNet architectures, TCLs and TRLs reduce the number of parameters compared to fully connected layers by more than 65% while maintaining or increasing accuracy. In addition to the space savings, our approach's ability to leverage topological structure can be crucial for structured data such as MRI. In particular, we demonstrate significant performance improvements over comparable architectures on three tasks associated with the UK Biobank dataset.
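The two layers described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the activation size (16 x 7 x 7), the ranks, the 10-way output, and the choice of a Tucker-style factorization for the regression weights are all assumptions made for illustration, and the factors are random here rather than learned end-to-end.

```python
import numpy as np

rng = np.random.default_rng(0)

def tensor_contraction_layer(x, factors):
    """Contract each non-batch mode of x with a factor matrix.

    x: (batch, I1, I2, I3); factors[k]: (R_k, I_k).
    Returns a smaller tensor of shape (batch, R1, R2, R3).
    """
    v1, v2, v3 = factors
    return np.einsum("bijk,pi,qj,rk->bpqr", x, v1, v2, v3)

def tensor_regression_layer(x, core, factors, bias):
    """Low-rank regression from an activation tensor to an output vector.

    The weight tensor (I1, I2, I3, O) is never stored as free parameters;
    it is rebuilt from a small core and one factor matrix per mode
    (a Tucker-style form, assumed here for illustration).
    """
    u1, u2, u3, u_out = factors
    weight = np.einsum("pqrs,ip,jq,kr,os->ijko", core, u1, u2, u3, u_out)
    return np.einsum("bijk,ijko->bo", x, weight) + bias

# Hypothetical sizes: activations of shape (16, 7, 7), 10 outputs.
batch, in_shape, n_out = 2, (16, 7, 7), 10
tcl_ranks = (8, 5, 5)        # size of the contracted activation tensor
trl_ranks = (4, 3, 3, 5)     # ranks of the regression weights

x = rng.standard_normal((batch, *in_shape))

# TCL: one (R_k, I_k) factor per mode.
tcl_factors = [rng.standard_normal((r, i)) for r, i in zip(tcl_ranks, in_shape)]
h = tensor_contraction_layer(x, tcl_factors)

# TRL applied to the contracted activations.
core = rng.standard_normal(trl_ranks)
trl_factors = [rng.standard_normal((i, r)) for i, r in zip(tcl_ranks, trl_ranks[:3])]
trl_factors.append(rng.standard_normal((n_out, trl_ranks[3])))
bias = np.zeros(n_out)
y = tensor_regression_layer(h, core, trl_factors, bias)

# Parameter counts: flatten + fully connected vs. TCL + TRL.
fc_params = int(np.prod(in_shape)) * n_out + n_out
low_rank_params = (
    sum(f.size for f in tcl_factors)
    + core.size
    + sum(f.size for f in trl_factors)
    + bias.size
)
print(h.shape, y.shape, fc_params, low_rank_params)
```

In this sketch no activation tensor is ever flattened: the contraction shrinks each mode separately, and the regression weights keep one factor per mode. The parameter saving it prints depends entirely on the assumed ranks and should not be read as a reproduction of the reported ImageNet figures.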
