Neural Monkey: An Open-source Tool for Sequence Learning

Abstract: In this paper, we announce the development of Neural Monkey, an open-source neural machine translation (NMT) and general sequence-to-sequence learning system built on top of the TensorFlow machine learning library. The system provides a high-level API tailored for fast prototyping of complex architectures with multiple sequence encoders and decoders. The overall architecture of a model is specified in easy-to-read configuration files. The long-term goal of the Neural Monkey project is to create and maintain a growing collection of implementations of recently proposed components and methods; the system is therefore designed to be easily extensible. Trained models can be deployed either for batch data processing or as a web service. We describe the design of the system and introduce the reader to running experiments with Neural Monkey.
