Mad Max: Affine Spline Insights Into Deep Learning
[1] Serge J. Belongie, et al. Residual Networks Behave Like Ensembles of Relatively Shallow Networks, 2016, NIPS.
[2] Stephen P. Boyd, et al. Convex piecewise-linear fitting, 2009.
[3] Leo Breiman, et al. Hinging hyperplanes for regression, classification, and function approximation, 1993, IEEE Trans. Inf. Theory.
[4] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[5] Richard G. Baraniuk, et al. A Spline Theory of Deep Networks, 2018, ICML.
[6] Stefano Soatto, et al. Visual Representations: Defining Properties and Deep Approximations, 2016, ICLR.
[7] Elad Hoffer, et al. Exponentially vanishing sub-optimal local minima in multilayer neural networks, 2017, ICLR.
[8] Sankar K. Pal, et al. Multilayer perceptron, fuzzy sets, and classification, 1992, IEEE Trans. Neural Networks.
[9] Rob Fergus, et al. Visualizing and Understanding Convolutional Networks, 2013, ECCV.
[10] Stefano Soatto, et al. Emergence of Invariance and Disentanglement in Deep Representations, 2018, Information Theory and Applications Workshop (ITA).
[11] Razvan Pascanu, et al. Advances in optimizing recurrent networks, 2013, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Huu Le, et al. DeepVQ: A Deep Network Architecture for Vector Quantization, 2018, CVPR Workshops.
[13] Anil K. Jain. Data clustering: 50 years beyond K-means, 2008, Pattern Recognit. Lett.
[14] Sergey Ioffe, et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, 2016, AAAI.
[15] Hiroaki Nishikawa, et al. Accurate Piecewise Linear Continuous Approximations to One-Dimensional Curves: Error Estimates and Algorithms, 2010.
[16] Joan Bruna, et al. Intriguing properties of neural networks, 2013, ICLR.
[17] David B. Dunson, et al. Multivariate convex regression with adaptive partitioning, 2011, J. Mach. Learn. Res.
[18] C. K. Yuen, et al. Theory and Application of Digital Signal Processing, 1978, IEEE Transactions on Systems, Man, and Cybernetics.
[19] Richard Baraniuk, et al. Max-Affine Spline Insights into Deep Generative Networks, 2020, ArXiv.
[20] Robert Hecht-Nielsen, et al. Theory of the backpropagation neural network, 1989, International Joint Conference on Neural Networks (IJCNN).
[21] Fei Wen, et al. An improved vector quantization method using deep neural network, 2017.
[22] Joon Hee Han, et al. Local Decorrelation For Improved Pedestrian Detection, 2014, NIPS.
[23] Nadav Cohen, et al. On the Expressive Power of Deep Learning: A Tensor Analysis, 2016, COLT.
[24] Aditya Bhaskara, et al. Provable Bounds for Learning Some Deep Representations, 2013, ICML.
[25] Richard G. Baraniuk, et al. A Probabilistic Framework for Deep Learning, 2016, NIPS.
[26] Jürgen Schmidhuber, et al. Flat Minima, 1997, Neural Computation.
[27] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[28] Lei Xu, et al. Input Convex Neural Networks, 2017, ICML.
[29] Yoshua Bengio, et al. Extracting and composing robust features with denoising autoencoders, 2008, ICML.
[30] Nasser M. Nasrabadi, et al. Image coding using vector quantization: a review, 1988, IEEE Trans. Commun.
[31] Heekuck Oh, et al. Neural Networks for Pattern Recognition, 1993, Adv. Comput.
[32] Behnaam Aazhang, et al. The Geometry of Deep Networks: Power Diagram Subdivision, 2019, NeurIPS.
[33] Richard G. Baraniuk, et al. Optimal tree approximation with wavelets, 1999, Optics & Photonics.
[34] Ronald J. Williams, et al. Learning representations by backpropagating errors, 2004.
[35] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2016, CVPR.
[36] Geoffrey E. Hinton, et al. Autoencoders, Minimum Description Length and Helmholtz Free Energy, 1993, NIPS.
[37] Wotao Yin, et al. A Block Coordinate Descent Method for Regularized Multiconvex Optimization with Applications to Nonnegative Tensor Factorization and Completion, 2013, SIAM J. Imaging Sci.
[38] Pascal Vincent, et al. An Exploration of Softmax Alternatives Belonging to the Spherical Loss Family, 2015, ICLR.
[39] E. Beckenbach. Convex Functions, 2007.
[40] Jian Sun, et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, 2015, ICCV.
[41] J. Schmidhuber, et al. Framewise phoneme classification with bidirectional LSTM networks, 2005, IEEE International Joint Conference on Neural Networks (IJCNN).
[42] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[43] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method, 2012, ArXiv.
[44] Diogo Almeida, et al. Resnet in Resnet: Generalizing Residual Architectures, 2016, ArXiv.
[45] M. E. Botkin, et al. Structural shape optimization with geometric description and adaptive mesh refinement, 1985.
[46] Andrew Zisserman, et al. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps, 2013, ICLR.
[47] Harris Drucker, et al. Learning algorithms for classification: A comparison on handwritten digit recognition, 1995.
[48] Bahman Gharesifard, et al. Universal Approximation Power of Deep Neural Networks via Nonlinear Control Theory, 2020, ArXiv.
[49] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res.
[50] Michael I. Jordan, et al. Factorial Hidden Markov Models, 1995, Machine Learning.
[51] Zoubin Ghahramani, et al. Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, 2015, ICML.
[52] Geoffrey E. Hinton, et al. Learning representations by back-propagating errors, 1986, Nature.
[53] J. M. Tarela, et al. Region configurations for realizability of lattice piecewise-linear models, 1999.
[54] Michael Elad, et al. Convolutional Neural Networks Analyzed via Convolutional Sparse Coding, 2016, J. Mach. Learn. Res.
[55] Zichao Wang, et al. A Max-Affine Spline Perspective of Recurrent Neural Networks, 2019, ICLR.
[56] Samy Bengio, et al. Adversarial examples in the physical world, 2016, ICLR.
[57] Blaine Rister, et al. Piecewise convexity of artificial neural networks, 2016, Neural Networks.
[58] Jürgen Schmidhuber, et al. Training Very Deep Networks, 2015, NIPS.
[59] Naftali Tishby, et al. Deep learning and the information bottleneck principle, 2015, IEEE Information Theory Workshop (ITW).
[60] M. Powell, et al. Approximation theory and methods, 1984.
[61] Yoshua Bengio, et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, 2014, ArXiv.
[62] Yann LeCun, et al. The MNIST database of handwritten digits, 2005.
[63] R. Zemel. A minimum description length framework for unsupervised learning, 1994.
[64] Raman Arora, et al. Understanding Deep Neural Networks with Rectified Linear Units, 2016, Electron. Colloquium Comput. Complex.
[65] Shuning Wang, et al. Generalization of hinging hyperplanes, 2005, IEEE Trans. Inf. Theory.
[66] Richard G. Baraniuk, et al. From Hard to Soft: Understanding Deep Network Nonlinearities via Vector Quantization and Statistical Inference, 2018, ICLR.
[67] Andrej Risteski, et al. Representational Power of ReLU Networks and Polynomial Kernels: Beyond Worst-Case Analysis, 2018, ArXiv.
[68] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.
[69] Razvan Pascanu, et al. On the Number of Linear Regions of Deep Neural Networks, 2014, NIPS.
[70] Alex Graves, et al. Generating Sequences With Recurrent Neural Networks, 2013, ArXiv.
[71] George Cybenko, et al. Approximation by superpositions of a sigmoidal function, 1989, Math. Control Signals Syst.
[72] Michael Unser, et al. A representer theorem for deep neural networks, 2018, J. Mach. Learn. Res.
[73] Stefanie Jegelka, et al. ResNet with one-neuron hidden layers is a Universal Approximator, 2018, NeurIPS.
[74] Luca Benini, et al. Soft-to-Hard Vector Quantization for End-to-End Learning Compressible Representations, 2017, NIPS.
[75] Aaron C. Courville, et al. Deep Learning Vector Quantization, 2016, ESANN.
[76] Marc Levoy, et al. Fast texture synthesis using tree-structured vector quantization, 2000, SIGGRAPH.
[77] Stéphane Mallat, et al. Invariant Scattering Convolution Networks, 2012, IEEE Trans. Pattern Anal. Mach. Intell.
[78] Haihao Lu, et al. Depth Creates No Bad Local Minima, 2017, ArXiv.
[79] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[80] Ming Yang, et al. Compressing Deep Convolutional Networks using Vector Quantization, 2014, ArXiv.
[81] G. Petrova, et al. Nonlinear Approximation and (Deep) ReLU Networks, 2019, Constructive Approximation.
[82] Yonina C. Eldar, et al. Orthogonal matched filter detection, 2001, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).
[83] Justin K. Romberg, et al. Bayesian tree-structured image modeling using wavelet-domain hidden Markov models, 2001, IEEE Trans. Image Process.
[84] S. Mallat. A wavelet tour of signal processing, 1998.
[85] Richard Baraniuk, et al. Implicit Rugosity Regularization via Data Augmentation, 2019.
[86] Yoshua Bengio, et al. Maxout Networks, 2013, ICML.