Deep neural networks have demonstrated high performance on image classification tasks, but deeper networks are more difficult to train: because of their complexity and the vanishing gradient problem, they typically require more time and computational power to converge. Deep residual networks (ResNets) speed up training and attain higher accuracy than their plain convolutional counterparts. ResNets achieve this improvement by adding a simple skip connection in parallel with the layers of a convolutional neural network. In this project we first design a ResNet model that performs image classification on the Tiny ImageNet dataset with high accuracy, and we then compare its performance with that of an equivalent plain convolutional network (ConvNet). Our findings show that, despite their higher accuracy, ResNets are more prone to overfitting. Several methods to prevent overfitting, such as adding dropout layers and stochastically augmenting the training dataset, are studied in this work.
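To make the skip connection concrete, the sketch below shows a minimal residual block in PyTorch. The layer configuration, names, and the optional dropout placement are illustrative assumptions, not the exact architecture used in this project.

```python
# Minimal sketch of a residual block, assuming a PyTorch-style API.
# Channel counts, layer order, and the dropout hook are illustrative.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels: int, dropout: float = 0.0):
        super().__init__()
        # Two 3x3 convolutions form the "plain" convolutional path.
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)
        # Optional dropout, one of the regularization methods studied here
        # (placement inside the block is an assumption for illustration).
        self.drop = nn.Dropout2d(dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.drop(out)
        out = self.bn2(self.conv2(out))
        # The skip connection adds the block's input directly to the
        # convolutional path's output: y = F(x) + x.
        return self.relu(out + x)
```

The equivalent plain ConvNet block would omit the `out + x` addition; everything else can be kept identical, which is what makes the residual/plain comparison well controlled.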