Improve Image Classification Using Data Augmentation and Neural Networks

In this paper, we show how data augmentation and convolutional neural networks can improve image classification. Model overfitting and poor performance are common problems when applying neural network techniques. Approaches that reduce intra-class differences while retaining sensitivity to inter-class variations are important for maximizing model accuracy and minimizing the loss function. Using the public CIFAR-10 image dataset, we monitored the effects of model overfitting across different model architectures in combination with data augmentation and hyper-parameter tuning. Model performance was evaluated using train and test accuracy and loss, characteristics derived from the confusion matrices, and visualizations of the outputs of different models produced with the non-linear mapping algorithm t-Distributed Stochastic Neighbor Embedding (t-SNE). VGG16, a macro-architecture with 16 weight layers, was used for large-scale image classification. With image data augmentation, the overall VGG16 train accuracy is 96%, the test accuracy stabilizes at 92%, and both the train and test losses are below 0.5. The overall image classification error rate drops to 8%, and the per-class misclassification rates are below 7.5% in eight of the ten image classes. Model architecture, hyper-parameter tuning, and data augmentation are all essential for reducing model overfitting and building a more reliable convolutional neural network model.
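
To make the confusion-matrix quantities mentioned above concrete, the following minimal Python sketch shows one common way to derive the overall error rate and the per-class misclassification rates from a confusion matrix. The function names and the 3-class matrix are hypothetical and purely illustrative; they are not taken from the experiments, and the sketch assumes rows index the true class and columns the predicted class.

```python
import numpy as np


def per_class_error_rates(cm):
    """Misclassification rate of each true class.

    Assumes cm[i, j] counts samples of true class i predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    correct = np.diag(cm)        # correctly classified samples per class
    totals = cm.sum(axis=1)      # number of samples in each true class
    return 1.0 - correct / totals


def overall_error_rate(cm):
    """Fraction of all samples that were misclassified (1 - accuracy)."""
    cm = np.asarray(cm, dtype=float)
    return 1.0 - np.trace(cm) / cm.sum()


# Hypothetical 3-class confusion matrix (illustrative values only).
cm = np.array([
    [8, 1, 1],
    [2, 7, 1],
    [0, 1, 9],
])

per_class = per_class_error_rates(cm)
overall = overall_error_rate(cm)
print("per-class error rates:", per_class)
print("overall error rate:", overall)
```

Under these definitions, an overall test accuracy of 92% corresponds to an overall error rate of 8%, while the per-class rates indicate which classes contribute most of the errors.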