A Priori Estimates of the Generalization Error for Autoencoders

An autoencoder is a machine learning model that performs dimensionality reduction by reconstructing its input through a bottleneck of lower dimension than the input. It is among the most popular models used in unsupervised and semi-supervised learning. In this paper, we develop a theoretical understanding of autoencoders. Specifically, assuming the existence of an underlying ground-truth encoder and decoder, we establish a priori estimates of the generalization error for autoencoders when an appropriately chosen regularization term is applied. The estimate is a priori in the sense that it depends only on certain norms of the ground-truth encoder and decoder, not on the model parameters. The bound achieves nearly optimal rates with respect to the number of samples and the number of parameters. To our knowledge, this is the first attempt to establish a priori estimates for unsupervised learning models. Numerical experiments demonstrate the tightness of the bounds.
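To make the setting concrete, the sketch below shows the kind of model and training objective the abstract refers to: an encoder-decoder pair with a low-dimensional bottleneck, trained on a reconstruction loss plus a norm-based regularization term. This is a minimal illustration only; the architecture, the l1 weight-norm surrogate `path_norm`, and the regularization strength `lam` are assumptions for the sketch, not the paper's exact construction.

```python
# Minimal autoencoder with a norm-based regularizer (illustrative sketch).
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim=784, bottleneck_dim=32, hidden_dim=256):
        super().__init__()
        # Encoder maps the input down to a low-dimensional bottleneck.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, bottleneck_dim),
        )
        # Decoder reconstructs the input from the bottleneck code.
        self.decoder = nn.Sequential(
            nn.Linear(bottleneck_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, input_dim),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def path_norm(model):
    # Assumed surrogate for the norm appearing in the a priori bound:
    # sum of l1 norms of all weight matrices (biases excluded).
    return sum(p.abs().sum() for p in model.parameters() if p.dim() > 1)

model = Autoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(128, 784)  # stand-in batch; real data would be normalized inputs
lam = 1e-4                 # regularization strength (hypothetical value)

# One training step: reconstruction loss plus the regularization term.
opt.zero_grad()
loss = nn.functional.mse_loss(model(x), x) + lam * path_norm(model)
loss.backward()
opt.step()
```

In this regularized setup, the a priori bound is meaningful because the penalized norm is controlled by the ground-truth encoder and decoder rather than by whatever parameters training happens to find.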
