Generalization in multitask deep neural classifiers: a statistical physics approach
[1] Stefano Soatto, et al. Entropy-SGD: biasing gradient descent into wide valleys, 2016, ICLR.
[2] Quoc V. Le, et al. Multi-task Sequence to Sequence Learning, 2015, ICLR.
[3] Elliot Meyerson, et al. Beyond Shared Hierarchies: Deep Multitask Learning through Soft Layer Ordering, 2017, ICLR.
[4] Raj Rao Nadakuditi, et al. The singular values and vectors of low rank perturbations of large rectangular random matrices, 2011, J. Multivar. Anal.
[5] Shinji Watanabe, et al. Joint CTC-attention based end-to-end speech recognition using multi-task learning, 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Florent Krzakala, et al. Generalisation dynamics of online learning in over-parameterised neural networks, 2019, ArXiv.
[7] Andrea Vedaldi, et al. Universal representations: The missing link between faces, text, planktons, and cat breeds, 2017, ArXiv.
[8] Yann LeCun, et al. Comparing dynamics: deep neural networks versus glassy systems, 2018, ICML.
[9] Lukasz Kaiser, et al. One Model To Learn Them All, 2017, ArXiv.
[10] Vladlen Koltun, et al. Multi-Task Learning as Multi-Objective Optimization, 2018, NeurIPS.
[11] Sebastian Ruder, et al. Fine-tuned Language Models for Text Classification, 2018, ArXiv.
[12] Martial Hebert, et al. Cross-Stitch Networks for Multi-task Learning, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Kai Yu, et al. Multi-task joint-learning of deep neural networks for robust speech recognition, 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[14] Ruosong Wang, et al. On Exact Computation with an Infinitely Wide Neural Net, 2019, NeurIPS.
[15] Jaehoon Lee, et al. Wide neural networks of any depth evolve as linear models under gradient descent, 2019, NeurIPS.
[16] Fred A. Hamprecht, et al. Essentially No Barriers in Neural Network Energy Landscape, 2018, ICML.
[17] Stéphane Mallat, et al. Understanding deep convolutional networks, 2016, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences.
[18] Roberto Cipolla, et al. Multi-task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[19] Opper, et al. Generalization ability of perceptrons with continuous outputs, 1993, Physical Review E.
[20] Andrew Zisserman, et al. Multi-task Self-Supervised Visual Learning, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[21] Rich Caruana. Multitask Learning, 1997, Machine Learning.
[22] Surya Ganguli, et al. An analytic theory of generalization dynamics and transfer learning in deep linear networks, 2018, ICLR.
[23] R. Jackson. Inequalities, 2007, Algebra for Parents.
[24] Surya Ganguli, et al. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks, 2013, ICLR.
[25] Razvan Pascanu, et al. Adapting Auxiliary Losses Using Gradient Similarity, 2018, ArXiv.
[26] B. Derrida. Random-energy model: An exactly solvable model of disordered systems, 1981.
[27] Max Tegmark, et al. Why Does Deep and Cheap Learning Work So Well?, 2016, Journal of Statistical Physics.
[28] Xiaodong Liu, et al. Multi-Task Deep Neural Networks for Natural Language Understanding, 2019, ACL.
[29] Andrew M. Saxe, et al. High-dimensional dynamics of generalization error in neural networks, 2017, Neural Networks.