Sideways: Depth-Parallel Training of Video Models