Paper Information - Learning to Generate Compositional Color Descriptions

Learning to Generate Compositional Color Descriptions

The production of color language is essential for grounded language generation. Color descriptions have many challenging properties: they can be vague, compositionally complex, and denotationally rich. We present an effective approach to generating color descriptions using recurrent neural networks and a Fourier-transformed color representation. Our model outperforms previous work on a conditional language modeling task over a large corpus of naturalistic color descriptions. In addition, probing the model's output reveals that it can accurately produce not only basic color terms but also descriptors with non-convex denotations ("greenish"), bare modifiers ("bright", "dull"), and compositional phrases ("faded teal") not seen in training.
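The abstract names a Fourier-transformed color representation without giving its details. As a rough illustration only, the sketch below assumes the color is an HSV triple with each component scaled to [0, 1], and that the features are the real and imaginary parts of complex exponentials over a small grid of integer frequencies. The function name `fourier_color_features` and the parameter `max_freq` are hypothetical, chosen for this example rather than taken from the paper.

```python
import itertools

import numpy as np


def fourier_color_features(hsv, max_freq=3):
    """Map an HSV color (components in [0, 1]) to a real-valued feature vector.

    Assumption for illustration: one complex exponential per integer
    frequency triple (j, k, l) with 0 <= j, k, l < max_freq.
    """
    hsv = np.asarray(hsv, dtype=float)
    # Every frequency triple (j, k, l); max_freq=3 gives 27 triples.
    freqs = np.array(list(itertools.product(range(max_freq), repeat=3)))
    # Phase of exp(-2*pi*i * (j*h + k*s + l*v)) for each triple.
    phase = -2.0 * np.pi * (freqs @ hsv)
    # Real parts followed by imaginary parts: 2 * max_freq**3 features.
    return np.concatenate([np.cos(phase), np.sin(phase)])


features = fourier_color_features([0.5, 1.0, 1.0])
print(features.shape)
```

Because the frequencies are integers, a hue of 0 and a hue of 1 map to the same vector, which matches the circular nature of hue. A vector like this could then be fed as the conditioning input to a recurrent language model that emits the description token by token.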

Christopher Potts | Noah D. Goodman | Will Monroe
