Adaptive Convolutional Filter Generation for Natural Language Understanding

Convolutional neural networks (CNNs) have recently emerged as a popular building block for natural language processing (NLP). Despite their success, most existing CNN models employed in NLP are not expressive enough, in the sense that all input sentences share the same learned (and static) set of filters. Motivated by this problem, we propose an adaptive convolutional filter generation framework for natural language understanding, by leveraging a meta network to generate input-aware filters. We further generalize our framework to model question-answer sentence pairs and propose an adaptive question answering (AdaQA) model; a novel two-way feature abstraction mechanism is introduced to encapsulate co-dependent sentence representations. We investigate the effectiveness of our framework on document categorization and answer sentence-selection tasks, achieving state-of-the-art performance on several benchmark datasets.
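The core idea above, a meta network that emits filters conditioned on the input sentence, can be illustrated with a minimal NumPy sketch. This is not the paper's architecture, which this section does not describe. The mean-pooling sentence summary, the single linear map `w_meta`, and the names `generate_filters` and `adaptive_conv` are all assumptions made for illustration.

```python
import numpy as np

def generate_filters(sentence, w_meta, num_filters, width):
    """Hypothetical meta network: summarize the sentence, then map the
    summary to one set of convolutional filters specific to that sentence."""
    summary = sentence.mean(axis=0)                 # (embed_dim,) assumed encoder
    flat = np.tanh(summary @ w_meta)                # (num_filters * width * embed_dim,)
    return flat.reshape(num_filters, width, sentence.shape[1])

def adaptive_conv(sentence, filters):
    """Convolve the sentence with its own generated filters, apply ReLU,
    then max-pool over time to get a fixed-length feature vector."""
    num_filters, width, _ = filters.shape
    steps = sentence.shape[0] - width + 1
    # Sliding windows of `width` consecutive word embeddings.
    windows = np.stack([sentence[t:t + width] for t in range(steps)])
    feature_maps = np.einsum("twd,fwd->tf", windows, filters)  # (steps, num_filters)
    return np.maximum(feature_maps, 0.0).max(axis=0)           # (num_filters,)

rng = np.random.default_rng(0)
embed_dim, num_filters, width = 8, 4, 3

# Only the meta-network weights are shared (and learned); the filters are not.
w_meta = rng.normal(scale=0.1, size=(embed_dim, num_filters * width * embed_dim))

sent_a = rng.normal(size=(6, embed_dim))   # toy sentence of 6 word embeddings
sent_b = rng.normal(size=(9, embed_dim))   # toy sentence of 9 word embeddings

filters_a = generate_filters(sent_a, w_meta, num_filters, width)
filters_b = generate_filters(sent_b, w_meta, num_filters, width)
features_a = adaptive_conv(sent_a, filters_a)
features_b = adaptive_conv(sent_b, filters_b)
```

In contrast to a standard CNN, where one static filter bank is applied to every input, `filters_a` and `filters_b` differ because each is computed from its own sentence. A question-answer variant in the spirit of AdaQA would presumably generate the filters for one sentence from the other, but that pairing is not specified here.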