GenNI: Human-AI Collaboration for Data-Backed Text Generation

Table2Text systems use machine learning to generate text from structured data. These systems are essential for fluent natural language interfaces in tools such as virtual assistants; however, left to generate freely, they often produce misleading or unexpected outputs. GenNI (Generation Negotiation Interface) is an interactive visual system for high-level human-AI collaboration in producing descriptive text. The tool uses a deep learning model designed with explicit control states. These controls allow users to constrain model generations globally without sacrificing the representational power of the deep learning model. The visual interface lets users interact with AI systems following a Refine-Forecast paradigm to ensure that the generation system acts in a manner human users find suitable. We report multiple use cases across two experiments in which the approach improves over uncontrolled generation while also providing fine-grained control. A demo and source code are available at https://genni.vizhub.ai.

[58]  Luke S. Zettlemoyer,et al.  Adversarial Example Generation with Syntactically Controlled Paraphrase Networks , 2018, NAACL.