Comparing Deep Recurrent Networks Based on the MAE Random Sampling, a First Approach

Recurrent neural networks have proven to be good at tackling prediction problems; however, due to their high sensitivity to hyper-parameter configuration, finding an appropriate network is a tough task. Automatic hyper-parameter optimization methods have emerged to find the most suitable configuration for a given problem, but they are not widely adopted because of their high computational cost. Therefore, in this study we extend MAE random sampling, a low-cost method for comparing single-hidden-layer architectures, to multiple-hidden-layer ones. We empirically validate our proposal and show that it is possible to predict and compare the expected performance of a hyper-parameter configuration at low cost.
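As a rough illustration of the idea, the sketch below estimates the error distribution of a stacked LSTM architecture by repeatedly evaluating it with randomly initialized, untrained weights and recording the MAE of each sample; cheaper architectures-to-error comparisons can then be made from these distributions without any training. The TensorFlow/Keras usage, the function name mae_random_sampling, and the hyper-parameter values are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of MAE random sampling for a multi-hidden-layer LSTM.
# Assumes TensorFlow/Keras; names and hyper-parameters are illustrative.
import numpy as np
import tensorflow as tf

def mae_random_sampling(layer_sizes, look_back, x, y, num_samples=100):
    """Approximate the MAE distribution of an architecture by evaluating it
    num_samples times with freshly drawn random (untrained) weights."""
    maes = []
    for _ in range(num_samples):
        model = tf.keras.Sequential()
        model.add(tf.keras.layers.Input(shape=(look_back, 1)))
        # Stack the hidden LSTM layers defined by layer_sizes.
        for i, units in enumerate(layer_sizes):
            model.add(tf.keras.layers.LSTM(
                units, return_sequences=(i < len(layer_sizes) - 1)))
        model.add(tf.keras.layers.Dense(1))
        # No training step: the weights keep their random initialization.
        preds = model.predict(x, verbose=0)
        maes.append(np.mean(np.abs(preds.ravel() - y.ravel())))
    return np.array(maes)

# Toy comparison of a one-layer and a two-layer architecture on random data.
x = np.random.rand(64, 10, 1).astype("float32")
y = np.random.rand(64).astype("float32")
print(mae_random_sampling([20], look_back=10, x=x, y=y, num_samples=10).mean())
print(mae_random_sampling([20, 20], look_back=10, x=x, y=y, num_samples=10).mean())
```

In this sketch, an architecture whose random-weight MAE distribution is lower (or tighter) would be considered more promising before committing to full training.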
