Does the Order of Training Samples Matter? Improving Neural Data-to-Text Generation with Curriculum Learning

Recent advances in data-to-text generation have largely taken the form of neural end-to-end systems, and considerable effort has been devoted to improving such systems by reordering their training samples, a process known as curriculum learning. Past research on sequence-to-sequence learning showed that curriculum learning improves both performance and convergence speed. In this work, we apply the same idea to training samples consisting of structured data and text pairs: at each update, the curriculum framework selects training samples based on the model's competence. Specifically, we experiment with various difficulty metrics and put forward a soft edit distance metric for ranking training samples. On our benchmarks, the approach converges faster, reducing training time by 38.7%, and boosts performance by 4.84 BLEU.
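
A soft edit distance can be read as a relaxation of Levenshtein distance in which the substitution cost reflects token similarity rather than a hard 0/1 match. The sketch below is one plausible instantiation in Python; the function name `soft_edit_distance`, the cosine-similarity substitution cost, the `embed` callable, and the length normalization are illustrative assumptions, not necessarily the paper's exact formulation.

```python
import numpy as np

def soft_edit_distance(src_tokens, tgt_tokens, embed, ins_cost=1.0, del_cost=1.0):
    """Classic edit-distance dynamic program, but substitution costs
    1 - cosine_similarity(embeddings) instead of a fixed 0/1 penalty.
    NOTE: illustrative assumption, not the paper's exact metric."""
    m, n = len(src_tokens), len(tgt_tokens)
    D = np.zeros((m + 1, n + 1))
    D[:, 0] = np.arange(m + 1) * del_cost   # delete all of src
    D[0, :] = np.arange(n + 1) * ins_cost   # insert all of tgt
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            u, v = embed(src_tokens[i - 1]), embed(tgt_tokens[j - 1])
            sub = 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9)
            D[i, j] = min(D[i - 1, j] + del_cost,   # deletion
                          D[i, j - 1] + ins_cost,   # insertion
                          D[i - 1, j - 1] + sub)    # soft substitution
    # Normalize by length so longer pairs are not penalized merely for length.
    return D[m, n] / max(m, n, 1)
```

Competence-based sample selection in the style of Platanios et al. (2019) also admits a compact sketch: at step t the model's competence grows according to a schedule, and a batch is drawn only from the fraction of the difficulty-sorted training set whose rank falls below that competence. The square-root schedule, the default c0, and the helper names below are assumptions based on the standard formulation, not a verbatim description of this paper's framework.

```python
import math
import random

def competence(t, T, c0=0.01):
    """Square-root competence schedule (Platanios et al., 2019):
    starts at c0 and reaches 1.0 after T warm-up steps."""
    return min(1.0, math.sqrt(t * (1.0 - c0 ** 2) / T + c0 ** 2))

def sample_batch(examples_sorted_by_difficulty, t, T, batch_size):
    """Draw a batch uniformly from the easiest competence(t) fraction
    of the training set (pre-sorted by the difficulty metric)."""
    cutoff = max(batch_size, int(competence(t, T) * len(examples_sorted_by_difficulty)))
    return random.sample(examples_sorted_by_difficulty[:cutoff], batch_size)
```

Under such a scheme, data-text pairs ranked easy by the difficulty metric surface early in training, while harder pairs enter the pool only as competence grows.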
