A Reservoir Computing Approach to Word Sense Disambiguation

Reservoir computing (RC) has emerged as an alternative approach to building fast, trainable recurrent neural networks (RNNs). It is considered biologically plausible because randomly generated artificial reservoir structures resemble cortical structures in the brain. This paper continues our previous research on applying a member of the RC family, the echo state network (ESN), to the natural language processing (NLP) task of Word Sense Disambiguation (WSD). We propose a novel deep bi-directional ESN (DBiESN) structure, together with a novel approach to exploiting the reservoirs' steady states. The models also make use of ESN-enhanced word embeddings. We demonstrate that the DBiESN approach is a good alternative to the previously tested BiESN models on the WSD task while using a smaller number of trainable parameters. Although our DBiESN-based model achieves accuracy comparable to that of other popular RNN architectures, it does not outperform the state of the art. However, because reservoir models have far fewer trainable parameters than fully trainable RNNs, they can be expected to generalize better and to retain more room for accuracy gains, which justifies further exploration of such architectures.
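
To make the bi-directional reservoir idea concrete, the following Python sketch is illustrative only and not the authors' implementation: it assumes a standard leaky-integrator ESN state update, and the names (ESNLayer, bidirectional_states, the leak and spectral_radius parameters) are hypothetical. It shows a reservoir run forward and backward over a sentence's word embeddings, with the two state sequences concatenated per token.

```python
import numpy as np

class ESNLayer:
    """Illustrative single reservoir with a leaky-integrator update (hypothetical names)."""

    def __init__(self, n_in, n_res, spectral_radius=0.9, leak=0.3, seed=0):
        rng = np.random.default_rng(seed)
        # Random, untrained input and recurrent weights
        self.W_in = rng.uniform(-0.1, 0.1, (n_res, n_in))
        W = rng.uniform(-0.5, 0.5, (n_res, n_res))
        # Rescale the recurrent matrix to the target spectral radius
        W *= spectral_radius / max(abs(np.linalg.eigvals(W)))
        self.W, self.leak = W, leak

    def run(self, inputs):
        """Collect reservoir states for a sequence of input vectors (one per token)."""
        x = np.zeros(self.W.shape[0])
        states = []
        for u in inputs:
            pre = self.W_in @ u + self.W @ x
            x = (1 - self.leak) * x + self.leak * np.tanh(pre)
            states.append(x.copy())
        return np.stack(states)


def bidirectional_states(layer, embeddings):
    """Concatenate per-token states from a forward and a backward pass."""
    fwd = layer.run(embeddings)
    bwd = layer.run(embeddings[::-1])[::-1]
    return np.concatenate([fwd, bwd], axis=1)
```

In a deep variant along these lines, the concatenated per-token states of one bi-directional layer would feed the next untrained reservoir layer, and only a final linear readout over the stacked states (for example, a ridge-regression classifier over the sense inventory) would be trained, which is what keeps the number of trainable parameters small compared to fully trainable RNNs.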
