Immunocomputing-Based Approach for Optimizing the Topologies of LSTM Networks

This paper aims to automatically design optimal LSTM topologies using the clonal selection algorithm (CSA) for text classification tasks such as sentiment analysis and SMS spam classification. Designing an optimal topology means finding the hyperparameter configuration that gives the best performance. Current state-of-the-art LSTM topologies are often designed by trial and error, which is time-consuming and requires domain expertise. Our proposed method, referred to as CSA-LSTM, is evaluated on the Large Movie Review Dataset (IMDB). To verify the robustness of the hyperparameters discovered by CSA on IMDB, we also apply them to two other datasets: the Twitter US Airline Sentiment dataset and the SMS Spam Collection. Additionally, the discovered LSTM hyperparameters are combined with pre-determined convolutional neural network (CNN) layers to achieve the same or better results with faster training and fewer trainable parameters. To further evaluate the generalization ability and effectiveness of the proposed approach, we compare it with four machine learning algorithms widely used for text classification: (1) random forest (RF), (2) logistic regression (LR), (3) support vector machine (SVM), and (4) multinomial naive Bayes (NB). Our experiments show that the LSTM topologies automatically designed by our CSA method are less expensive, are reusable, and outperform both these machine learning algorithms and other models in the literature evaluated on the same three datasets. With the proposed method, the best LSTM topology can be determined without human intervention, making CSA-based algorithms a promising approach to automatically designing LSTM topologies that perform well on a given task.
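As an illustration only, the generic clonal selection loop (select the best candidates, clone them, mutate the clones, and inject random newcomers) can be sketched over a small hyperparameter grid. The search space, the parameter names, the cloning and mutation schedules, and the toy affinity function below are all assumptions made for this sketch; they are not taken from the paper. In a setting like CSA-LSTM, the affinity of a candidate would presumably be a score such as the validation accuracy of an LSTM trained with those hyperparameters.

```python
import random

# Hypothetical search space; the abstract does not list the actual hyperparameters.
SEARCH_SPACE = {
    "units": [32, 64, 128, 256],
    "dropout": [0.0, 0.2, 0.4, 0.5],
    "embedding_dim": [50, 100, 200],
    "batch_size": [32, 64, 128],
}


def random_antibody(rng):
    """Draw one candidate configuration uniformly from the grid."""
    return {name: rng.choice(choices) for name, choices in SEARCH_SPACE.items()}


def mutate(antibody, rate, rng):
    """Resample each hyperparameter independently with probability `rate`."""
    child = dict(antibody)
    for name, choices in SEARCH_SPACE.items():
        if rng.random() < rate:
            child[name] = rng.choice(choices)
    return child


def clonal_selection(affinity, pop_size=10, n_select=5, clone_factor=2.0,
                     n_random=2, generations=20, seed=0):
    """Generic clonal selection loop that maximizes `affinity`."""
    rng = random.Random(seed)
    population = [random_antibody(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the n_select candidates with the highest affinity.
        selected = sorted(population, key=affinity, reverse=True)[:n_select]
        clones = []
        for rank, antibody in enumerate(selected, start=1):
            # Better-ranked candidates get more clones and a lower mutation rate.
            n_clones = max(1, round(clone_factor * pop_size / rank))
            rate = rank / n_select
            clones.extend(mutate(antibody, rate, rng) for _ in range(n_clones))
        # Parents stay in the pool, so the best candidate is never lost.
        pool = sorted(selected + clones, key=affinity, reverse=True)
        newcomers = [random_antibody(rng) for _ in range(n_random)]
        population = pool[:pop_size - n_random] + newcomers
    return max(population, key=affinity)


def toy_affinity(h):
    """Stand-in score; a real run would train and validate a model here."""
    return (-abs(h["units"] - 128) / 128
            - abs(h["dropout"] - 0.2)
            - abs(h["embedding_dim"] - 100) / 100
            - abs(h["batch_size"] - 64) / 64)


best = clonal_selection(toy_affinity)
print(best, toy_affinity(best))
```

Because each affinity evaluation would involve training a network in practice, a real implementation would likely cache scores per configuration rather than recompute them on every sort.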
