In this paper, we show how data augmentation and convolutional neural networks can be combined to improve image classification. Overfitting and poor performance are common problems when applying neural network techniques. Approaches that reduce intra-class differences while retaining sensitivity to inter-class variations are important for maximizing model accuracy and minimizing the loss function. Using the public CIFAR-10 image dataset, we monitored the effects of overfitting across different model architectures in combination with data augmentation and hyper-parameter tuning. Model performance was evaluated using training and test accuracy and loss, characteristics derived from the confusion matrices, and visualizations of the different model outputs produced with the non-linear mapping algorithm t-Distributed Stochastic Neighbor Embedding (t-SNE). VGG16, a macro-architecture with 16 weight layers, is used for large-scale image classification. With image data augmentation, the overall VGG16 training accuracy is 96%, the test accuracy stabilizes at 92%, and both the training and test losses are below 0.5. The overall image classification error rate drops to 8%, and the single-class misclassification rates are below 7.5% in eight of the ten image classes. Model architecture, hyper-parameter tuning, and data augmentation are all essential for reducing overfitting and building a more reliable convolutional neural network model.
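As an illustration of how the overall error rate and the single-class misclassification rates reported above can be derived from a confusion matrix, the following is a minimal sketch. It assumes rows hold the true classes and columns the predicted classes; the function name `error_rates` and the toy counts are hypothetical and are not results from this paper.

```python
import numpy as np


def error_rates(confusion):
    """Return (overall_error, per_class_error) for a square confusion matrix.

    Assumed convention: rows are true classes, columns are predicted classes.
    """
    confusion = np.asarray(confusion, dtype=float)
    # Correct predictions lie on the diagonal.
    correct = np.diag(confusion)
    # Overall error: share of all samples that fall off the diagonal.
    overall_error = 1.0 - correct.sum() / confusion.sum()
    # Per-class error: share of each true class that was misclassified.
    per_class_error = 1.0 - correct / confusion.sum(axis=1)
    return overall_error, per_class_error


# Toy 3-class matrix with made-up counts (illustration only).
toy = [[90, 5, 5],
       [10, 80, 10],
       [0, 5, 95]]
overall, per_class = error_rates(toy)
print(f"overall error: {overall:.3f}")
print("per-class error:", np.round(per_class, 3))
```

For a ten-class problem such as CIFAR-10, the same function would take a 10 x 10 matrix and return ten per-class rates, which could then be compared against a threshold such as 7.5%.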