Capturing Entity Hierarchy in Data-to-Text Generative Models

We aim to generate summaries from structured data (e.g., tables or entity-relation triplets). Most previous approaches rely on an encoder-decoder architecture in which the data are linearized into a sequence of elements. In contrast, we propose a hierarchical model that takes into account the entities forming the data structure. Moreover, we introduce the Transformer encoder in data-to-text models to ensure that each element and each entity is robustly encoded in comparison to all others, regardless of its initial position. Our model is evaluated on the RotoWire benchmark (statistical tables of NBA basketball games). This paper has been accepted at ECIR 2020.
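As an illustration only, and not the implementation evaluated in the paper, the following minimal sketch shows the two ideas named above: a two-level encoding (records within an entity, then entities within the data structure) and self-attention without positional encodings, so that the result does not depend on input order. The function names, the identity projections, and the mean pooling step are simplifying assumptions made for this sketch.

```python
import math


def self_attention(vectors):
    """Single-head dot-product self-attention.

    Assumptions for this sketch: identity query/key/value projections
    and no positional encoding, so outputs follow any input reordering.
    """
    dim = len(vectors[0])
    scale = math.sqrt(dim)
    outputs = []
    for query in vectors:
        # Scaled dot product of this query with every key.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in vectors]
        # Numerically stable softmax over the scores.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        # Attention-weighted average of the value vectors.
        outputs.append([
            sum(w / total * value[i] for w, value in zip(weights, vectors))
            for i in range(dim)
        ])
    return outputs


def mean_pool(vectors):
    """Average a list of vectors (an assumed, simplified pooling step)."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def encode_hierarchically(entities):
    """Encode a list of entities, each given as a list of record vectors.

    Low level: records attend to the other records of the same entity,
    then are pooled into one entity vector.
    High level: entity vectors attend to each other.
    """
    entity_vectors = [mean_pool(self_attention(records))
                      for records in entities]
    return self_attention(entity_vectors)


# Hypothetical toy input: two entities with 2 and 3 records of dimension 2.
entities = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [1.0, 1.0], [0.0, 2.0]],
]
encoded = encode_hierarchically(entities)
```

In this sketch, reversing the order of the entities only reorders the rows of `encoded`, and shuffling the records inside an entity leaves `encoded` unchanged up to floating-point rounding, which is the order-independence property that a linearized sequence encoder does not provide.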
