The Regression of MNIST Dataset Based on Convolutional Neural Network

The MNIST dataset of handwritten digits has been widely used to validate the effectiveness and efficiency of machine learning methods. Although the dataset was designed for classification, and very high accuracies (99.3%+) have been achieved on it, it cannot be used directly for regression, which limits its usefulness and hinders the development of regression methods for this type of data. In this paper, to make MNIST usable for regression, we first perturb its class labels with a normal distribution, thereby converting the original discrete class numbers into floating-point targets. A modified convolutional neural network (CNN) is then applied to build a regression model. Multiple experiments were conducted to select optimal parameters and layer settings for this task. The results suggest that the best mean absolute error (MAE) is obtained when the ReLU activation is used in the first layer and the softplus activation in the remaining layers. In the proposed approach, two indicators, MAE and Log-Cosh loss, are used to optimize the parameters and score the predictions. Experiments with 10-fold cross-validation show that low MAE and Log-Cosh values of 0.202 and 0.079, respectively, can be achieved. Furthermore, several standard deviations of the normal distribution were tested to verify applicability when the label values follow different distributions. The results indicate a positive correlation between the adopted standard deviation and the loss value; that is, a higher concentration of the data yields a lower MAE.
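The label-conversion and scoring steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names (`soften_labels`, `mae`, `log_cosh`) and the choice of sigma are assumptions for demonstration, and the exact way the paper draws float labels from the normal distribution may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def soften_labels(labels, sigma=0.5):
    """Turn integer class labels into float regression targets by
    perturbing each label with Gaussian noise N(0, sigma^2), in the
    spirit of the paper's label-conversion step (details assumed)."""
    return labels + rng.normal(0.0, sigma, size=labels.shape)

def mae(y_true, y_pred):
    """Mean absolute error, one of the two indicators used in the paper."""
    return np.mean(np.abs(y_true - y_pred))

def log_cosh(y_true, y_pred):
    """Log-Cosh loss, the second indicator: smooth near zero like MSE,
    roughly linear (MAE-like) for large errors."""
    return np.mean(np.log(np.cosh(y_pred - y_true)))

# Example: discrete MNIST-style labels become float targets.
labels = np.array([3, 1, 4, 1, 5])
float_targets = soften_labels(labels, sigma=0.5)
```

A larger sigma spreads the float targets further from the original class numbers, which matches the paper's observation that a less concentrated label distribution leads to a higher loss.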
