The article deals with the overfitting problem in deep neural networks. Finding a model whose number of parameters matches the process being modeled can be a difficult task. There is a range of recommendations on how to choose the number of neurons in hidden layers, but most of them do not work reliably in practice. As a result, neural networks often operate in an underfitting or overfitting regime. Therefore, in practice a complex model is usually chosen and regularization strategies are applied. In this paper, the main regularization techniques for multilayer perceptrons, including early stopping and dropout, are discussed. A representation of regularization using the metagraph approach is described. In the creation mode, the metagraph representation of the neural network is built by metagraph agents. In the training mode, the training metagraph is created, so that different regularization strategies can be embedded into the training algorithm. A special metagraph agent for the dropout strategy is developed. Different regularization techniques are compared on the CoverType dataset, the results of the experiments are analyzed, and the advantages of the early stopping and dropout regularization strategies are discussed.
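As a minimal illustration of the two regularization strategies named above, the following NumPy sketch trains a one-hidden-layer perceptron with inverted dropout and stops training early when the validation loss no longer improves. It is a generic sketch under assumed hyperparameters (layer sizes, learning rate, dropout rate, patience) and synthetic data, not the metagraph-agent implementation described in the article.

```python
# Generic sketch of dropout + early stopping for a small MLP (not the paper's
# metagraph-based implementation); all hyperparameters are assumed values.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data standing in for a real dataset such as CoverType.
X = rng.normal(size=(600, 10))
y = X[:, :1] ** 2 + 0.1 * rng.normal(size=(600, 1))
X_tr, y_tr, X_va, y_va = X[:500], y[:500], X[500:], y[500:]

hidden, lr, drop_p, patience = 32, 0.01, 0.5, 10
W1 = rng.normal(scale=0.1, size=(10, hidden))
W2 = rng.normal(scale=0.1, size=(hidden, 1))

def forward(x, train):
    """ReLU hidden layer with inverted dropout, linear output."""
    h_pre = x @ W1
    h = np.maximum(0.0, h_pre)
    mask = np.ones_like(h)
    if train:
        # Drop each hidden unit with probability drop_p and rescale the rest,
        # so no rescaling is needed at prediction time.
        mask = (rng.random(h.shape) >= drop_p) / (1.0 - drop_p)
    h_drop = h * mask
    return h_pre, h_drop, mask, h_drop @ W2

best_loss, best_W, wait = np.inf, (W1.copy(), W2.copy()), 0
for epoch in range(500):
    # One full-batch gradient step on the training set (dropout active).
    h_pre, h_drop, mask, out = forward(X_tr, train=True)
    grad_out = 2.0 * (out - y_tr) / len(X_tr)       # d(MSE)/d(output)
    gW2 = h_drop.T @ grad_out
    grad_h = (grad_out @ W2.T) * mask * (h_pre > 0)  # backprop through dropout and ReLU
    gW1 = X_tr.T @ grad_h
    W1 -= lr * gW1
    W2 -= lr * gW2

    # Early stopping: monitor validation loss with dropout switched off.
    val_loss = np.mean((forward(X_va, train=False)[3] - y_va) ** 2)
    if val_loss < best_loss:
        best_loss, best_W, wait = val_loss, (W1.copy(), W2.copy()), 0
    else:
        wait += 1
        if wait >= patience:
            break  # stop training and keep the best weights seen so far

W1, W2 = best_W
print(f"stopped at epoch {epoch}, best validation MSE {best_loss:.4f}")
```

The same two ingredients appear in the experiments described in the paper: dropout perturbs the hidden representation during training, while early stopping halts optimization once the held-out error stops decreasing and restores the best weights seen so far.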