Comparing Deep Recurrent Networks Based on the MAE Random Sampling, a First Approach

Recurrent neural networks have proven to be good at tackling prediction problems; however, due to their high sensitivity to hyper-parameter configuration, finding an appropriate network is a tough task. Automatic hyper-parameter optimization methods have emerged to find the most suitable configuration for a given problem, but they are not widely adopted because of their high computational cost. Therefore, in this study we extend MAE random sampling, a low-cost method for comparing single-hidden-layer architectures, to multiple-hidden-layer ones. We empirically validate our proposal and show that it is possible to predict and compare the expected performance of a hyper-parameter configuration at low cost.
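As a rough illustration of the idea, the sketch below estimates the error distribution of a stacked LSTM architecture by repeatedly evaluating it with randomly initialized, untrained weights and recording the MAE of each sample; cheaper architectures-to-error comparisons can then be made from these distributions without any training. The TensorFlow/Keras usage, the function name mae_random_sampling, and the hyper-parameter values are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of MAE random sampling for a multi-hidden-layer LSTM.
# Assumes TensorFlow/Keras; names and hyper-parameters are illustrative.
import numpy as np
import tensorflow as tf

def mae_random_sampling(layer_sizes, look_back, x, y, num_samples=100):
    """Approximate the MAE distribution of an architecture by evaluating it
    num_samples times with freshly drawn random (untrained) weights."""
    maes = []
    for _ in range(num_samples):
        model = tf.keras.Sequential()
        model.add(tf.keras.layers.Input(shape=(look_back, 1)))
        # Stack the hidden LSTM layers defined by layer_sizes.
        for i, units in enumerate(layer_sizes):
            model.add(tf.keras.layers.LSTM(
                units, return_sequences=(i < len(layer_sizes) - 1)))
        model.add(tf.keras.layers.Dense(1))
        # No training step: the weights keep their random initialization.
        preds = model.predict(x, verbose=0)
        maes.append(np.mean(np.abs(preds.ravel() - y.ravel())))
    return np.array(maes)

# Toy comparison of a one-layer and a two-layer architecture on random data.
x = np.random.rand(64, 10, 1).astype("float32")
y = np.random.rand(64).astype("float32")
print(mae_random_sampling([20], look_back=10, x=x, y=y, num_samples=10).mean())
print(mae_random_sampling([20, 20], look_back=10, x=x, y=y, num_samples=10).mean())
```

In this sketch, an architecture whose random-weight MAE distribution is lower (or tighter) would be considered more promising before committing to full training.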
