Random error sampling-based recurrent neural network architecture optimization

Recurrent neural networks are well suited to prediction problems. However, finding a network that fits a given problem is hard because performance depends strongly on the architecture configuration. Automatic architecture optimization methods help to find the most suitable design, but they are not widely adopted because of their high computational cost. In this work, we introduce Random Error Sampling-based Neuroevolution (RESN), an evolutionary algorithm that optimizes the architecture of a network using mean absolute error (MAE) random sampling, a training-free approach to predicting the expected performance of an artificial neural network. We empirically validate our proposal on three prediction problems, comparing our technique to training-based architecture optimization techniques and to neuroevolutionary approaches. Our findings show that RESN achieves state-of-the-art error performance while halving the time needed to perform the optimization.
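The core idea behind MAE random sampling can be sketched as follows: instead of training each candidate architecture, repeatedly assign it random weights, measure the mean absolute error of each random instantiation on the data, and use the resulting error distribution as a cheap proxy for how promising the architecture is. The sketch below is illustrative, not the authors' implementation: the toy Elman network, the window size, the weight distributions, and the use of the empirical fraction of samples below a threshold as the estimator are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def elman_forward(x, W_in, W_rec, W_out):
    """Run a minimal Elman RNN over a 1-D input window; return a scalar prediction."""
    h = np.zeros(W_rec.shape[0])
    for x_t in x:
        h = np.tanh(W_in * x_t + W_rec @ h)  # recurrent state update
    return W_out @ h

def mae_random_sampling(series, hidden, n_samples=100, threshold=0.2, window=5):
    """Training-free estimate of an architecture's promise: the fraction of
    randomly weighted instantiations whose MAE on the series is already below
    `threshold`. Higher values suggest the architecture is easier to fit."""
    # Build (input window, next value) pairs from the series.
    X = [series[i:i + window] for i in range(len(series) - window)]
    y = series[window:]
    maes = []
    for _ in range(n_samples):
        # Fresh random weights for each sample; no training is performed.
        W_in = rng.normal(0.0, 1.0, hidden)
        W_rec = rng.normal(0.0, 1.0, (hidden, hidden)) / np.sqrt(hidden)
        W_out = rng.normal(0.0, 1.0, hidden)
        preds = np.array([elman_forward(x, W_in, W_rec, W_out) for x in X])
        maes.append(np.mean(np.abs(preds - y)))
    return float(np.mean(np.array(maes) < threshold))

# Usage: score a few hidden-layer sizes on a toy signal (no training involved).
series = 0.5 * np.sin(np.linspace(0, 8 * np.pi, 200))
for h in (2, 8, 32):
    print(h, mae_random_sampling(series, h))
```

An evolutionary algorithm such as RESN can then use a score of this kind as a fitness surrogate, reserving full training for the best architectures found, which is what makes the overall optimization much cheaper than training every candidate.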
