Reintroducing Straight-Through Estimators as Principled Methods for Stochastic Binary Networks
[1] Niao He, et al. On the Convergence Rate of Stochastic Mirror Descent for Nonsmooth Nonconvex Optimization, 2018, arXiv:1806.04781.
[2] David Duvenaud, et al. Backpropagation through the Void: Optimizing control variates for black-box gradient estimation, 2017, ICLR.
[3] Mohammad Emtiyaz Khan, et al. Training Binary Neural Networks using the Bayesian Learning Rule, 2020, ICML.
[4] Kai Yu, et al. Binary Deep Neural Networks for Speech Recognition, 2017, INTERSPEECH.
[5] Yoshua Bengio, et al. BinaryConnect: Training Deep Neural Networks with binary weights during propagations, 2015, NIPS.
[6] Shuchang Zhou, et al. DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients, 2016, arXiv.
[7] Stefano Lodi, et al. Self-Supervised Bernoulli Autoencoders for Semi-Supervised Hashing, 2020, CIARP.
[8] Efstratios Gavves, et al. Low Bias Low Variance Gradient Estimates for Boolean Stochastic Networks, 2020, ICML.
[9] Gang Hua, et al. How to Train a Compact Binary Neural Network with High Accuracy?, 2017, AAAI.
[10] Miguel Lázaro-Gredilla, et al. Local Expectation Gradients for Black Box Variational Inference, 2015, NIPS.
[11] Ruslan Salakhutdinov, et al. Learning Stochastic Feedforward Neural Networks, 2013, NIPS.
[12] Boris Flach, et al. Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks, 2020, NeurIPS.
[13] Ali Farhadi, et al. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks, 2016, ECCV.
[14] Chang Liu, et al. Straight-Through Estimator as Projected Wasserstein Gradient Flow, 2019, arXiv.
[15] Weights Having Stable Signs Are Important: Finding Primary Subnetworks, 2020.
[16] Mingyuan Zhou, et al. ARM: Augment-REINFORCE-Merge Gradient for Stochastic Binary Networks, 2018, ICLR.
[17] Christoph Meinel, et al. Back to Simplicity: How to Train Accurate BNNs from Scratch?, 2019, arXiv.
[18] Alexander Shekhovtsov, et al. Bias-Variance Tradeoffs in Single-Sample Binary Gradient Estimators, 2021, GCPR.
[19] Mark Horowitz. 1.1 Computing's energy problem (and what we can do about it), 2014, IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC).
[20] Natalia Gimelshein, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.
[21] Jascha Sohl-Dickstein, et al. REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models, 2017, NIPS.
[22] A. S. Nemirovsky, et al. Problem Complexity and Method Efficiency in Optimization, 1983.
[23] Max Welling, et al. Probabilistic Binary Neural Networks, 2018, arXiv.
[24] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[25] Xianglong Liu, et al. Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks, 2019, IEEE/CVF International Conference on Computer Vision (ICCV).
[26] Yu Bai, et al. ProxQuant: Quantized Neural Networks via Proximal Operators, 2018, ICLR.
[27] Holger Fröning, et al. Training Discrete-Valued Neural Networks with Sign Activations Using Weight Distributions, 2019, ECML/PKDD.
[28] Philip H. S. Torr, et al. Mirror Descent View for Neural Network Quantization, 2019, AISTATS.
[29] Jian Sun, et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, 2015, IEEE International Conference on Computer Vision (ICCV).
[30] Ethan Fetaya, et al. Learning Discrete Weights Using the Local Reparameterization Trick, 2017, ICLR.
[31] Babak Hassibi, et al. Stochastic Mirror Descent on Overparameterized Nonlinear Models, 2019, IEEE Transactions on Neural Networks and Learning Systems.
[32] Kwang-Ting Cheng, et al. Latent Weights Do Not Exist: Rethinking Binarized Neural Network Optimization, 2019, NeurIPS.
[33] Maja Pantic, et al. Improved training of binary networks for human pose estimation and image recognition, 2019, arXiv.
[34] Yee Whye Teh, et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables, 2016, ICLR.
[35] Georgios Tzimiropoulos, et al. High-Capacity Expert Binary Networks, 2020, ICLR.
[36] Yoshua Bengio, et al. Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation, 2013, arXiv.
[37] Xiaoning Qian, et al. Pairwise Supervised Hashing with Bernoulli Variational Auto-Encoder and Self-Control Gradient Estimator, 2020, UAI.
[38] Andrew S. Cassidy, et al. Convolutional networks for fast, energy-efficient neuromorphic computing, 2016, Proceedings of the National Academy of Sciences.
[39] Nicholas D. Lane, et al. An Empirical study of Binary Neural Networks' Optimisation, 2018, ICLR.
[40] Alex Graves. Practical Variational Inference for Neural Networks, 2011, NIPS.
[41] M. E. Khan. Learning Algorithms from Bayesian Principles, 2019.
[42] Ben Poole, et al. Categorical Reparameterization with Gumbel-Softmax, 2016, ICLR.
[43] Georgios Tzimiropoulos, et al. BATS: Binary ArchitecTure Search, 2020, ECCV.
[44] Alexander Shekhovtsov, et al. Initialization and Transfer Learning of Stochastic Binary Networks from Real-Valued Ones, 2021, IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[45] Lawrence Carin, et al. GO Gradient for Expectation-Based Objectives, 2019, ICLR.
[46] Geoffrey E. Hinton, et al. Using very deep autoencoders for content-based image retrieval, 2011, ESANN.
[47] Jack Xin, et al. Understanding Straight-Through Estimator in Training Activation Quantized Neural Nets, 2019, ICLR.
[48] Le Song, et al. Stochastic Generative Hashing, 2017, ICML.
[49] Ran El-Yaniv, et al. Binarized Neural Networks, 2016, arXiv.
[50] Georgios Tzimiropoulos, et al. Training Binary Neural Networks with Real-to-Binary Convolutions, 2020, ICLR.
[51] Wei Pan, et al. Towards Accurate Binary Convolutional Neural Network, 2017, NIPS.
[52] Georgios Tzimiropoulos, et al. Binarized Convolutional Landmark Localizers for Human Pose Estimation and Face Alignment with Limited Resources, 2017, IEEE International Conference on Computer Vision (ICCV).
[53] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[54] Yi Fang, et al. Variational Deep Semantic Hashing for Text Documents, 2017, SIGIR.
[55] Radford M. Neal. Connectionist Learning of Belief Networks, 1992, Artif. Intell.
[56] Issei Sato, et al. Evaluating the Variance of Likelihood-Ratio Gradient Estimators, 2017, ICML.
[57] Ronald J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[58] Mark W. Schmidt, et al. Fast and Simple Natural-Gradient Variational Inference with Mixture of Exponential-family Approximations, 2019, ICML.
[59] Tapani Raiko, et al. Techniques for Learning Binary Stochastic Feedforward Neural Networks, 2014, ICLR.
[60] Sahin Lale, et al. A Study of Generalization of Stochastic Mirror Descent Algorithms on Overparameterized Nonlinear Models, 2020, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[61] Daan Wierstra, et al. Deep AutoRegressive Networks, 2013, ICML.
[62] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[63] Guoyin Wang, et al. NASH: Toward End-to-End Neural Architecture for Generative Semantic Hashing, 2018, ACL.
[64] Wei Liu, et al. Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm, 2018, ECCV.