Regularization of Deep Neural Networks with Spectral Dropout

The breakthrough performance on the 2012 ImageNet challenge was partly due to the 'Dropout' technique used to avoid overfitting. Here, we introduce a new regularization approach, 'Spectral Dropout', to improve the generalization ability of deep neural networks. We cast the proposed approach in the form of regular Convolutional Neural Network (CNN) weight layers by using a decorrelation transform with fixed basis functions. Our spectral dropout method prevents overfitting by eliminating weak and 'noisy' Fourier-domain coefficients of the neural network activations, leading to remarkably better results than current regularization methods. Furthermore, the proposed method is very efficient due to the fixed basis functions used for the spectral transformation. In particular, compared to Dropout and DropConnect, our method significantly speeds up network convergence during training (roughly ×2), with considerably higher neuron pruning rates (an increase of ∼30%). We demonstrate that spectral dropout can also be used in conjunction with other regularization approaches, resulting in additional performance gains.
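
The abstract only outlines the mechanism, so the snippet below is a minimal NumPy sketch of the core idea under stated assumptions: a fixed-basis 2-D FFT over the spatial dimensions of the activations, a per-map magnitude threshold controlled by a hypothetical keep_ratio parameter that zeroes out the weakest coefficients, and an optional Bernoulli mask on the surviving coefficients. It is an illustration, not the authors' exact formulation or training-time implementation.

```python
import numpy as np

def spectral_dropout(activations, keep_ratio=0.9, bernoulli_keep=0.95, rng=None):
    """Sketch of spectral dropout on a batch of feature maps.

    activations    : array of shape (N, C, H, W).
    keep_ratio     : fraction of spectral coefficients retained per map
                     (hypothetical hyper-parameter); the weakest
                     (1 - keep_ratio) coefficients are zeroed out.
    bernoulli_keep : probability of keeping each surviving coefficient,
                     analogous to standard dropout but in the spectral domain.
    """
    if rng is None:
        rng = np.random.default_rng()

    # Fixed-basis decorrelation transform: 2-D FFT over the spatial axes.
    spectrum = np.fft.fft2(activations, axes=(-2, -1))
    magnitude = np.abs(spectrum)

    # Per-map magnitude threshold: keep only the strongest coefficients.
    thresh = np.quantile(magnitude, 1.0 - keep_ratio,
                         axis=(-2, -1), keepdims=True)
    mask = magnitude >= thresh

    # Randomly drop a subset of the surviving coefficients as well.
    mask &= rng.random(mask.shape) < bernoulli_keep

    # Inverse transform back to the spatial domain; the small imaginary
    # part left over is numerical noise and is discarded.
    return np.fft.ifft2(spectrum * mask, axes=(-2, -1)).real

# Usage on a dummy batch of feature maps
x = np.random.randn(8, 16, 32, 32).astype(np.float32)
y = spectral_dropout(x, keep_ratio=0.9)
print(x.shape, y.shape)
```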
