Learning to Organize a Bag of Words into Sentences with Neural Networks: An Empirical Study

Sequential information, i.e., word order, is generally assumed to be essential for processing a sequence with recurrent or convolutional neural network encoders. However, is it possible to encode natural language without order information? Given a bag of words from a shuffled sentence, humans may still understand what those words mean by reordering or reconstructing them. Inspired by this intuition, in this paper we investigate how order information affects natural language learning. Through comprehensive experiments, we quantitatively compare the ability of several representative neural models to organize a bag of words into a sentence under three typical scenarios, and we summarize empirical findings and challenges that can shed light on future research in this line of work.
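
To make the task concrete, the toy sketch below frames word ordering as search over permutations scored by a language model. It is our own illustration, not the paper's implementation: the tiny corpus, the add-one-smoothed bigram scorer, and the function names are all assumptions standing in for the neural decoders the study actually compares.

```python
# Hypothetical sketch of the bag-of-words-to-sentence task: recover a fluent
# sentence by scoring candidate permutations of the bag with a language model.
# A toy bigram model stands in for the neural scorers compared in the paper.
import itertools
import math
from collections import Counter

# Toy training corpus for the stand-in bigram language model (illustrative).
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat chased the dog",
]

# Count unigrams and bigrams, padding each sentence with boundary markers.
unigrams, bigrams = Counter(), Counter()
for sent in corpus:
    toks = ["<s>"] + sent.split() + ["</s>"]
    unigrams.update(toks)
    bigrams.update(zip(toks, toks[1:]))

def log_prob(sequence):
    """Add-one-smoothed bigram log-probability of a token sequence."""
    toks = ["<s>"] + list(sequence) + ["</s>"]
    vocab = len(unigrams)
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
        for a, b in zip(toks, toks[1:])
    )

def order_bag(bag):
    """Return the highest-scoring permutation of the bag.

    Exhaustive search is only feasible for tiny bags; the neural models
    studied in the paper instead decode an order directly.
    """
    return max(itertools.permutations(bag), key=log_prob)

if __name__ == "__main__":
    bag = ["mat", "the", "sat", "on", "cat", "the"]
    print(" ".join(order_bag(bag)))  # expected: "the cat sat on the mat"
```

The brute-force search here grows factorially with bag size, which is precisely why the paper's question of how well neural encoders and decoders can organize an unordered input is non-trivial.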
