Evolving Long Short-Term Memory Networks

Machine learning techniques have been widely employed in recent years across a wide variety of applications, especially those based on deep learning, which have obtained state-of-the-art results in several research fields. Despite this success, such techniques still suffer from some shortcomings, such as sensitivity to their hyperparameters, whose proper selection is context-dependent, i.e., the set of hyperparameters that performs best may differ from one dataset to another. Therefore, we propose an approach based on evolutionary optimization techniques for fine-tuning the hyperparameters of Long Short-Term Memory networks. Experiments were conducted over three public word-processing datasets for part-of-speech tagging. The results showed the robustness of the proposed approach for this task.
