Regularization methods for neural networks and related models
Neural networks have become very popular in the last few years. They have demonstrated the best results in areas such as image classification, image segmentation, speech recognition, and text processing. The major breakthrough happened in the early 2010s, when it became feasible to train deep neural networks (DNNs) on a GPU, which made the training process several hundred times faster. At the same time, large labeled datasets with millions of objects, such as ImageNet [16], became available. A GPU implementation of a convolutional DNN with over 10 layers and millions of parameters could process the ImageNet dataset in just a few days. As a result, such networks could decrease the classification error in the image classification competition LSVRC-2010 [54] by 40% compared with hand-crafted feature algorithms.

Deep neural networks are able to demonstrate excellent results on tasks with a complex classification function and a sufficient amount of training data. However, since DNN models have a huge number of parameters, they can also be easily overfitted when the amount of training data is not large enough. Thus, regularization techniques for neural networks are crucially important to make them applicable to a wide range of problems. In this thesis we provide a comprehensive overview of existing regularization techniques for neural networks, together with their theoretical explanation.

Training of neural networks is performed using the backpropagation algorithm (BP). Standard BP has two passes: forward and backward. It computes the predictions for the current input and the loss function in the forward pass, and the derivatives of the loss function with respect to the input and weights in the backward pass. The nature of the data usually assumes that two very close data points have the same label. This means that the predictions of a classifier
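The two BP passes and the effect of a simple regularizer can be sketched in a few lines. The following is a minimal, assumed example (not code from the thesis): a one-hidden-layer network trained by backpropagation with an L2 weight penalty, one of the standard regularization techniques the abstract refers to. All names (`W1`, `W2`, `lam`, `lr`) and the toy data are illustrative.

```python
import numpy as np

# Toy data: 8 samples, 4 features, binary labels (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))
y = (rng.random((8, 1)) < 0.5).astype(float)

# One hidden layer of 5 tanh units, sigmoid output.
W1 = rng.normal(scale=0.1, size=(4, 5))
W2 = rng.normal(scale=0.1, size=(5, 1))
lam, lr = 1e-3, 0.1  # L2 penalty strength and learning rate (assumed values)

losses = []
for _ in range(50):
    # Forward pass: predictions and regularized loss.
    h = np.tanh(X @ W1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2)))
    ce = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    losses.append(ce + lam * (np.sum(W1**2) + np.sum(W2**2)))

    # Backward pass: gradients of the loss w.r.t. each weight matrix.
    # The L2 penalty contributes 2*lam*W to each gradient (weight decay).
    dlogits = (p - y) / len(X)                 # dL/d(pre-sigmoid output)
    dW2 = h.T @ dlogits + 2 * lam * W2
    dh = dlogits @ W2.T
    dW1 = X.T @ (dh * (1 - h**2)) + 2 * lam * W1  # tanh' = 1 - h^2

    W1 -= lr * dW1
    W2 -= lr * dW2
```

The only change the L2 regularizer makes to plain backpropagation is the extra `2 * lam * W` term in each gradient, which shrinks the weights toward zero at every step and so discourages overfitting on small datasets.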