Improvement of Neural Reverse Dictionary by Using Cascade Forward Neural Network

A reverse dictionary maps a description to the word specified by that description. The neural reverse dictionary (NRD) uses neural networks to learn a mapping from the word embeddings of an input definition to the embedding of the word defined by that definition. Such a function encodes phrasal semantics and bridges the gap between phrasal and lexical semantics. However, the previous NRD has limited accuracy due to its insufficient capacity. To solve this problem, we used novel combinations of neural networks with sufficient capacities, drawing on architectures that are effective in neural machine translation and image processing. We found that adjusting the LSTM output with a multi-layer fully connected network with bypass structures (a cascade forward neural network, CFNN) was more effective for reverse dictionary tasks than using a more complicated LSTM. BiLSTM + CFNN was comparable to the commercial OneLook Reverse Dictionary system on some metrics, and a noised biLSTM + CFNN, which we tuned with a noising data augmentation, outperformed OneLook Reverse Dictionary on almost all metrics. We also examined the reasons for the success of biLSTM + CFNN and found that the bypass structure of the CFNN and the balance between the capacities of the LSTM and the CFNN contribute to the improved performance of the NRD.
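
As a rough illustration of the architecture described above, the following PyTorch code sketches a biLSTM encoder whose output is adjusted by a cascade-forward network in which every layer also receives the original input and all earlier layer outputs (one plausible reading of the "bypass structures"). This is a minimal sketch, not the authors' implementation: the class names, layer sizes, activation, and exact bypass wiring are all assumptions.

import torch
import torch.nn as nn

class CFNN(nn.Module):
    """Cascade-forward network: each layer receives the original
    input plus all earlier layer outputs, so early representations
    bypass later layers (the structure the abstract credits for
    the improvement)."""
    def __init__(self, dim, n_layers=3):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Linear(dim * (i + 1), dim) for i in range(n_layers)])
        self.act = nn.Tanh()

    def forward(self, x):
        feats = [x]
        for i, layer in enumerate(self.layers):
            h = layer(torch.cat(feats, dim=-1))
            if i < len(self.layers) - 1:  # keep the output layer linear
                h = self.act(h)
            feats.append(h)
        return feats[-1]

class BiLSTMCFNN(nn.Module):
    """Hypothetical biLSTM + CFNN reverse-dictionary model: encodes
    a definition's word embeddings and adjusts the encoder output
    toward the embedding of the word being defined."""
    def __init__(self, emb_dim=300, hidden=256, cfnn_layers=3):
        super().__init__()
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True,
                               bidirectional=True)
        self.proj = nn.Linear(2 * hidden, emb_dim)
        self.cfnn = CFNN(emb_dim, cfnn_layers)

    def forward(self, definition_embs):
        # definition_embs: (batch, seq_len, emb_dim) word vectors
        _, (h_n, _) = self.encoder(definition_embs)
        h = torch.cat([h_n[0], h_n[1]], dim=-1)  # fwd + bwd final states
        return self.cfnn(self.proj(h))

model = BiLSTMCFNN()
defs = torch.randn(2, 5, 300)  # two 5-word definitions (toy data)
pred = model(defs)             # (2, 300) predicted word embeddings

Training would minimize a distance (e.g., cosine or mean-squared) between the predicted vector and the defined word's embedding; the noised variant in the abstract additionally perturbs the training definitions, which is not shown here.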
