Tor Lattimore | Joel Veness | Marcus Hutter | David Budden | Agnieszka Grabska-Barwinska | Peter Toth | Christopher Mattern | Avishkar Bhoopchand | Simon Schmitt
[1] Quanshi Zhang,et al. Visual interpretability for deep learning: a survey , 2018, Frontiers of Information Technology & Electronic Engineering.
[2] Timothy C. Bell,et al. A corpus for the evaluation of lossless compression algorithms , 1997, Proceedings DCC '97. Data Compression Conference.
[3] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[4] Sindy Löwe,et al. Putting An End to End-to-End: Gradient-Isolated Learning of Representations , 2019, NeurIPS.
[5] Rich Caruana,et al. Multitask Learning , 1998, Encyclopedia of Machine Learning and Data Mining.
[6] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[7] Alex Graves,et al. Decoupled Neural Interfaces using Synthetic Gradients , 2016, ICML.
[8] Surya Ganguli,et al. Continual Learning Through Synaptic Intelligence , 2017, ICML.
[9] Koray Kavukcuoglu,et al. Pixel Recurrent Neural Networks , 2016, ICML.
[10] Michael McCloskey,et al. Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem , 1989.
[11] Razvan Pascanu,et al. Overcoming catastrophic forgetting in neural networks , 2016, Proceedings of the National Academy of Sciences.
[12] Elad Hazan,et al. Introduction to Online Convex Optimization , 2016, Found. Trends Optim.
[13] Stephen Grossberg,et al. The ART of adaptive pattern recognition by a self-organizing neural network , 1987, Computer.
[14] Michael Eickenberg,et al. Decoupled Greedy Learning of CNNs , 2019, ICML.
[15] Martin Zinkevich,et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent , 2003, ICML.
[16] Joel Veness,et al. A Combinatorial Perspective on Transfer Learning , 2020, NeurIPS.
[17] Christopher Mattern. Statistical Data Compression , 2008, Encyclopedia of Algorithms.
[18] Yoshua Bengio,et al. An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks , 2013, ICLR.
[19] Kurt Hornik,et al. Approximation capabilities of multilayer feedforward networks , 1991, Neural Networks.
[20] Kurt Hornik,et al. Neural networks and principal component analysis: Learning from examples without local minima , 1989, Neural Networks.
[21] Surya Ganguli,et al. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks , 2013, ICLR.
[22] Sarah Eichmann,et al. The Radon Transform And Some Of Its Applications , 2016.
[23] Marvin Minsky,et al. Perceptrons: An Introduction to Computational Geometry , 1969.
[24] Trevor Darrell,et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[25] Geoffrey E. Hinton,et al. Adaptive Mixtures of Local Experts , 1991, Neural Computation.
[26] Maxim Smirnov,et al. Data Compression Explained , 2010.
[27] Paolo Ferragina,et al. Text Compression , 2009, Encyclopedia of Database Systems.
[28] Ohad Shamir,et al. Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes , 2012, ICML.
[29] Yohan Jin,et al. 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops , 2022.
[30] Andrew Zisserman,et al. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps , 2013, ICLR.
[31] Martin J. Wainwright,et al. Convexified Convolutional Neural Networks , 2016, ICML.
[32] Anthony V. Robins,et al. Catastrophic Forgetting, Rehearsal and Pseudorehearsal , 1995, Connect. Sci.
[33] Stefan Carlsson,et al. CNN Features Off-the-Shelf: An Astounding Baseline for Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.
[34] Sébastien Bubeck,et al. Convex Optimization: Algorithms and Complexity , 2014, Found. Trends Mach. Learn.
[35] Arild Nøkland,et al. Training Neural Networks with Local Error Signals , 2019, ICML.
[36] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[37] Elad Hazan,et al. Logarithmic regret algorithms for online convex optimization , 2006, Machine Learning.
[38] Christopher Mattern,et al. Linear and Geometric Mixtures - Analysis , 2013, 2013 Data Compression Conference.
[39] Hugo Larochelle,et al. The Neural Autoregressive Distribution Estimator , 2011, AISTATS.
[40] Nando de Freitas,et al. A Machine Learning Perspective on Predictive Coding with PAQ8 , 2011, 2012 Data Compression Conference.
[41] Trevor Darrell,et al. DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.
[42] Moses Charikar,et al. Similarity estimation techniques from rounding algorithms , 2002, STOC '02.
[43] Yee Whye Teh,et al. Progress & Compress: A scalable framework for continual learning , 2018, ICML.
[44] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.
[45] Geoffrey E. Hinton. Training Products of Experts by Minimizing Contrastive Divergence , 2002, Neural Computation.
[46] Matthew V. Mahoney,et al. Fast Text Compression with Neural Networks , 2000, FLAIRS Conference.
[47] Hod Lipson,et al. Understanding Neural Networks Through Deep Visualization , 2015, ArXiv.
[48] Ashok Cutkosky,et al. Anytime Online-to-Batch, Optimism and Acceleration , 2019, ICML.
[49] Christopher Mattern. Mixing Strategies in Data Compression , 2012, 2012 Data Compression Conference.
[50] Tor Lattimore,et al. Online Learning with Gated Linear Networks , 2017, ArXiv.
[51] Joel Veness,et al. Online Learning in Contextual Bandits using Gated Linear Networks , 2020, NeurIPS.
[52] Matthew V. Mahoney,et al. Adaptive weighing of context models for lossless data compression , 2005.
[53] Yee Whye Teh,et al. Meta-learning of Sequential Strategies , 2019, ArXiv.
[54] Michael Eickenberg,et al. Greedy Layerwise Learning Can Scale to ImageNet , 2018, ICML.
[55] Jürgen Schmidhuber,et al. Sequential neural text compression , 1996, IEEE Trans. Neural Networks.
[56] Christoph H. Lampert,et al. iCaRL: Incremental Classifier and Representation Learning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).