Learning to Generate Textual Data

To learn text-understanding models with millions of parameters, one needs massive amounts of data. In this work, we argue that generating data can compensate for this need. While defining generic data generators is difficult, we propose to allow generators to be “weakly” specified, in the sense that a set of parameters controls how the data is generated. Consider, for example, generators whose example templates, grammar, and/or vocabulary are determined by this set of parameters. Instead of manually tuning these parameters, we learn them from the limited training data at our disposal. To achieve this, we derive an efficient algorithm called GENERE that jointly estimates the parameters of the model and the undetermined generation parameters. We illustrate its benefits by learning to solve math exam questions with a highly parametrized sequence-to-sequence neural network.
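
The abstract leaves GENERE's update rule unspecified, so the sketch below only illustrates the surrounding idea: a weakly specified generator whose free parameter is fit to limited real data. Here a hypothetical template-based generator for one-step arithmetic word problems exposes a single generation parameter, the probability of producing an addition rather than a subtraction question; it is tuned by repeatedly training a tiny bag-of-words classifier on generated data and nudging the parameter toward higher accuracy on a handful of real problems. The templates, the classifier, and the finite-difference update are all illustrative assumptions, not the paper's method.

```python
# Minimal sketch (NOT the paper's GENERE algorithm): a template-based generator
# with one "generation parameter" (probability of emitting an addition rather
# than a subtraction problem), tuned against a few real problems by training a
# tiny bag-of-words classifier on generated data and nudging the parameter
# toward higher accuracy on the real set. All names here are illustrative.
import math
import random

ADD_TEMPLATES = ["Tom has {a} apples and gets {b} more. How many apples now?"]
SUB_TEMPLATES = ["Tom has {a} apples and eats {b}. How many apples remain?"]

def featurize(text):
    return set(text.lower().replace("?", "").replace(".", "").split())

def generate(p_add, n):
    """Sample n (question, label) pairs; label 1 = addition, 0 = subtraction."""
    data = []
    for _ in range(n):
        a, b = random.randint(2, 9), random.randint(1, 2)
        if random.random() < p_add:
            pool, label = ADD_TEMPLATES, 1
        else:
            pool, label = SUB_TEMPLATES, 0
        data.append((random.choice(pool).format(a=a, b=b), label))
    return data

class BowClassifier:
    """Perceptron over bag-of-words features, standing in for the seq2seq model."""
    def __init__(self):
        self.w = {}  # token -> weight
    def predict(self, text):
        return 1 if sum(self.w.get(tok, 0.0) for tok in featurize(text)) > 0 else 0
    def train(self, data, epochs=5, lr=0.1):
        for _ in range(epochs):
            for text, y in data:
                err = y - self.predict(text)
                for tok in featurize(text):
                    self.w[tok] = self.w.get(tok, 0.0) + lr * err

def accuracy(model, data):
    return sum(model.predict(x) == y for x, y in data) / len(data)

# A handful of "real" problems standing in for the limited training set.
REAL = [
    ("Sara has 3 pens and buys 4 more. How many pens now?", 1),
    ("A shop gets 5 boxes and then 6 more. How many boxes now?", 1),
    ("Sara has 7 pens and loses 2. How many pens remain?", 0),
    ("A shop has 9 boxes and sells 3. How many boxes remain?", 0),
]

def reward(p_add):
    """Train on generated data only, then score on the real problems."""
    model = BowClassifier()
    model.train(generate(p_add, 40))
    return accuracy(model, REAL)

theta = 0.0  # logit of p_add, the generation parameter being learned
for step in range(30):
    p_add = 1.0 / (1.0 + math.exp(-theta))
    # Two-point finite-difference estimate of how the reward changes with
    # p_add; nudge the logit in the direction that improves it.
    r_up = reward(min(0.99, p_add + 0.1))
    r_down = reward(max(0.01, p_add - 0.1))
    theta += 0.5 * (r_up - r_down)

print("learned p(addition) ~", round(1.0 / (1.0 + math.exp(-theta)), 2))
```

In the paper's setting the generated questions would instead train the sequence-to-sequence solver, and GENERE would estimate the model and generation parameters jointly; the sketch keeps only the outer loop of fitting generation parameters to scarce real data.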
