[1] Stefano Soatto, et al. Entropy-SGD: Biasing Gradient Descent into Wide Valleys, 2016, ICLR.
[2] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[3] Rui Peng, et al. Network Trimming: A Data-Driven Neuron Pruning Approach towards Efficient Deep Architectures, 2016, ArXiv.
[4] L. Deng, et al. The MNIST Database of Handwritten Digit Images for Machine Learning Research [Best of the Web], 2012, IEEE Signal Processing Magazine.
[5] M. Yuan, et al. Model Selection and Estimation in Regression with Grouped Variables, 2006.
[6] Zoubin Ghahramani, et al. A Theoretically Grounded Application of Dropout in Recurrent Neural Networks, 2015, NIPS.
[7] Hakan Inan, et al. Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling, 2016, ICLR.
[8] Lior Wolf, et al. Using the Output Embedding to Improve Language Models, 2016, EACL.
[9] Oleksandr Makeyev, et al. Neural Network with Ensembles, 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).
[10] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[11] Mark W. Schmidt, et al. Group Sparse Priors for Covariance Estimation, 2009, UAI.
[12] Mark J. van der Laan, et al. The Relative Performance of Ensemble Methods with Deep Convolutional Neural Networks for Image Classification, 2017, Journal of Applied Statistics.
[13] Beatrice Santorini, et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.
[14] Yee Whye Teh, et al. Bayesian Learning via Stochastic Gradient Langevin Dynamics, 2011, ICML.
[15] Yiran Chen, et al. Learning Structured Sparsity in Deep Neural Networks, 2016, NIPS.
[16] Song Han, et al. Learning both Weights and Connections for Efficient Neural Network, 2015, NIPS.
[17] Frank Hutter, et al. SGDR: Stochastic Gradient Descent with Warm Restarts, 2016, ICLR.
[18] Danilo Comminiello, et al. Group Sparse Regularization for Deep Neural Networks, 2016, Neurocomputing.
[19] Richard Socher, et al. Regularizing and Optimizing LSTM Language Models, 2017, ICLR.
[20] Wojciech Zaremba, et al. Recurrent Neural Network Regularization, 2014, ArXiv.
[21] Lawrence Carin, et al. Preconditioned Stochastic Gradient Langevin Dynamics for Deep Neural Networks, 2015, AAAI.
[22] Song Han, et al. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding, 2015, ICLR.
[23] Frank Hutter, et al. SGDR: Stochastic Gradient Descent with Restarts, 2016, ArXiv.
[24] Kilian Q. Weinberger, et al. Snapshot Ensembles: Train 1, Get M for Free, 2017, ICLR.
[25] Song Han, et al. ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA, 2016, FPGA.
[26] Vivek Rathod, et al. Bayesian Dark Knowledge, 2015, NIPS.
[27] Shinichi Nakajima, et al. Bayesian Group-Sparse Modeling and Variational Inference, 2014, IEEE Transactions on Signal Processing.
[28] Hiroshi Nakagawa, et al. Approximation Analysis of Stochastic Gradient Langevin Dynamics by Using Fokker-Planck Equation and Ito Process, 2014, ICML.
[29] Tianqi Chen, et al. Stochastic Gradient Hamiltonian Monte Carlo, 2014, ICML.
[30] Yee Whye Teh, et al. Consistency and Fluctuations for Stochastic Gradient Langevin Dynamics, 2014, J. Mach. Learn. Res.
[31] Erich Elsen, et al. Exploring Sparsity in Recurrent Neural Networks, 2017, ICLR.
[32] Masashi Sugiyama, et al. Bayesian Dark Knowledge, 2015.
[33] Phil Blunsom, et al. Optimizing Performance of Recurrent Neural Networks on GPUs, 2016, ArXiv.
[34] Ruslan Salakhutdinov, et al. Breaking the Softmax Bottleneck: A High-Rank RNN Language Model, 2017, ICLR.
[35] Fang Liu, et al. Learning Intrinsic Sparse Structures within Long Short-Term Memory, 2017, ICLR.
[36] Mathieu Salzmann, et al. Learning the Number of Neurons in Deep Networks, 2016, NIPS.
[37] Zhe Gan, et al. Scalable Bayesian Learning of Recurrent Neural Networks for Language Modeling, 2016, ACL.