Building Hierarchically Disentangled Language Models for Text Generation with Named Entities

Named entities pose a unique challenge to traditional language modeling methods. Several domains are characterised by a high proportion of named entities, yet the occurrence of specific entities varies widely. Cooking recipes, for example, contain many named entities: ingredients, cooking techniques (also called processes), and utensils. However, a few ingredients occur frequently within the instructions while most occur rarely. In this paper, we build on previous work on language models for text with named entities by introducing a Hierarchically Disentangled Model. Training is divided into multiple branches, with each branch producing a model over an overlapping subset of the vocabulary. Since we found existing datasets insufficient to accurately judge the performance of the model, we curated 158,473 cooking recipes from several publicly available online sources. To reliably derive the entities within this corpus, we employ a combination of Named Entity Recognition (NER) and an unsupervised interpretation method based on dependency parsing and POS tagging, followed by further cleaning of the dataset. The unsupervised interpretation models instructions as action graphs and is specific to the corpus of cooking recipes, unlike NER, which is a general method applicable to all corpora. To demonstrate the utility of our language model, we apply it to tasks such as graph-to-text generation and ingredients-to-recipe generation, comparing it against previous state-of-the-art baselines. We make our dataset, including annotations and processed action graphs, available for use, given its potential applications in language modeling and text generation research.
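To make the branching idea concrete, the sketch below shows one way a hierarchically disentangled output layer could factor next-token prediction into a branch choice followed by a token choice within that branch's vocabulary. This is a minimal illustration under our own assumptions: the PyTorch framing, the module names, and the specific branch set are ours, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class HierarchicalLMHead(nn.Module):
    """Two-stage output head: first pick a branch (e.g. ordinary word,
    ingredient, process, utensil), then pick a token from that branch's
    possibly overlapping vocabulary. Illustrative sketch only."""

    def __init__(self, hidden_size, branch_vocab_sizes):
        super().__init__()
        # P(branch | context)
        self.branch_head = nn.Linear(hidden_size, len(branch_vocab_sizes))
        # P(token | branch, context), one softmax per branch
        self.token_heads = nn.ModuleList(
            nn.Linear(hidden_size, v) for v in branch_vocab_sizes
        )

    def forward(self, hidden):
        branch_logp = torch.log_softmax(self.branch_head(hidden), dim=-1)
        token_logp = [torch.log_softmax(h(hidden), dim=-1)
                      for h in self.token_heads]
        # The joint log-probability of emitting token t from branch b
        # factorises as branch_logp[..., b] + token_logp[b][..., t].
        return branch_logp, token_logp


# Usage with a hidden state from any encoder (sizes are made up):
head = HierarchicalLMHead(hidden_size=256,
                          branch_vocab_sizes=[5000, 800, 200, 100])
branch_logp, token_logp = head(torch.randn(1, 256))
```

Under this factorisation, a rare entity competes only against the other tokens in its own branch's vocabulary rather than against the full vocabulary, which is the intuition behind disentangling entity types from ordinary words.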
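Similarly, the abstract names dependency parsing and POS tagging as the basis of the unsupervised action-graph interpretation, but not a specific toolkit. The sketch below shows the general kind of extraction such a step could build on, here using spaCy; the function, the dependency labels checked, and the pairing heuristic are illustrative assumptions, not the paper's pipeline.

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # small English pipeline with tagger + parser

def extract_action_edges(instruction):
    """Pair each verb with its direct objects, yielding candidate
    (action, argument) edges for an action graph."""
    doc = nlp(instruction)
    edges = []
    for token in doc:
        if token.pos_ == "VERB":
            for child in token.children:
                if child.dep_ in ("dobj", "obj"):  # direct-object labels
                    edges.append((token.lemma_, child.lemma_))
    return edges

# e.g. extract_action_edges("Chop the onions and saute them in butter.")
# might yield pairs such as ('chop', 'onion'), depending on the parse.
```

Linking such (action, argument) pairs across consecutive instructions, with actions and ingredients as nodes, gives the kind of action graph the abstract describes.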
