NABU - Multilingual Graph-based Neural RDF Verbalizer

The RDF-to-text task has recently gained substantial attention due to the continuous growth of Linked Data. In contrast to traditional pipeline models, recent studies have focused on neural models, which can now convert a set of RDF triples into text end to end with promising results. However, English is the only language that has been widely targeted. We address this research gap by presenting NABU, a multilingual graph-based neural model that verbalizes RDF data into German, Russian, and English. NABU is based on an encoder-decoder architecture: its encoder is inspired by Graph Attention Networks, and its decoder is a Transformer. Our approach relies on the fact that knowledge graphs are language-agnostic and can hence be used to generate multilingual text. We evaluate NABU in monolingual and multilingual settings on the standard WebNLG benchmark datasets. Our results show that NABU outperforms state-of-the-art approaches on English with 66.21 BLEU, and achieves consistent results across all languages in the multilingual scenario with 56.04 BLEU.
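To make the idea of a graph-based encoder over RDF input more concrete, the following is a minimal sketch, not NABU's actual implementation. It turns a set of triples into a small graph and runs one attention pass in which every node becomes a weighted average of its neighbors. The function names (`triples_to_graph`, `attend`), the choice to treat predicates as nodes, and the use of plain dot-product scores in place of a learned scoring function are all assumptions made for illustration.

```python
import math


def triples_to_graph(triples):
    """Map (subject, predicate, object) triples to node ids and neighbor sets.

    Assumption for this sketch: predicates are treated as nodes, edges are
    undirected, and every node has a self-loop.
    """
    index = {}
    for triple in triples:
        for term in triple:
            index.setdefault(term, len(index))
    neighbors = {i: {i} for i in index.values()}
    for s, p, o in triples:
        si, pi, oi = index[s], index[p], index[o]
        neighbors[si].add(pi)
        neighbors[pi].update((si, oi))
        neighbors[oi].add(pi)
    return index, neighbors


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(x - peak) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features, neighbors):
    """One attention pass: each node becomes a softmax-weighted average of
    its neighbors' feature vectors.

    Simplification: scores are raw dot products; a Graph Attention Network
    would use learned projections and a learned scoring function.
    """
    updated = {}
    for node, nbrs in neighbors.items():
        order = sorted(nbrs)
        scores = [
            sum(a * b for a, b in zip(features[node], features[n]))
            for n in order
        ]
        weights = softmax(scores)
        dim = len(features[node])
        updated[node] = [
            sum(w * features[n][d] for w, n in zip(weights, order))
            for d in range(dim)
        ]
    return updated


# Toy input: one triple with made-up two-dimensional node features.
triples = [("Albert_Einstein", "birthPlace", "Ulm")]
index, neighbors = triples_to_graph(triples)
features = {0: [1.0, 0.0], 1: [0.5, 0.5], 2: [0.0, 1.0]}
encoded = attend(features, neighbors)
```

Nothing in this input depends on the output language, which is the property the abstract relies on: the same encoded graph could in principle be handed to a decoder for German, Russian, or English.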
