Describing Multimedia Content Using Attention-Based Encoder-Decoder Networks
[1] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.
[2] Geoffrey E. Hinton,et al. Learning and relearning in Boltzmann machines , 1986 .
[3] Jonathan G. Fiscus,et al. DARPA TIMIT: acoustic-phonetic continuous speech corpus CD-ROM, NIST speech disc 1-1.1 , 1993 .
[4] Yoshua Bengio,et al. Learning long-term dependencies with gradient descent is difficult , 1994, IEEE Trans. Neural Networks.
[5] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[6] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[7] S. C. Kremer,et al. Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies , 2001 .
[8] Yoshua Bengio,et al. Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies , 2001 .
[9] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[10] Chin-Yew Lin,et al. ROUGE: A Package for Automatic Evaluation of Summaries , 2004, ACL 2004.
[11] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[12] Geoffrey E. Hinton,et al. Three new graphical models for statistical language modelling , 2007, ICML '07.
[13] Geoffrey E. Hinton,et al. Learning to combine foveal glimpses with a third-order Boltzmann machine , 2010, NIPS.
[14] Yann LeCun,et al. Convolutional Learning of Spatio-temporal Features , 2010, ECCV.
[15] Lukás Burget,et al. Extensions of recurrent neural network language model , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] William B. Dolan,et al. Collecting Highly Parallel Data for Paraphrase Evaluation , 2011, ACL.
[17] Tara N. Sainath,et al. Fundamental Technologies in Modern Speech Recognition (Digital Object Identifier 10.1109/MSP.2012.2205597) , 2012 .
[18] Misha Denil,et al. Learning Where to Attend with Deep Architectures for Image Tracking , 2011, Neural Computation.
[19] Alex Graves,et al. Sequence Transduction with Recurrent Neural Networks , 2012, ArXiv.
[20] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[21] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[22] Ruslan Salakhutdinov,et al. Learning Stochastic Feedforward Neural Networks , 2013, NIPS.
[23] Yejin Choi,et al. Baby talk: Understanding and generating simple image descriptions , 2011, CVPR 2011.
[24] Yoshua Bengio,et al. Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation , 2013, ArXiv.
[25] Alex Graves,et al. Generating Sequences With Recurrent Neural Networks , 2013, ArXiv.
[26] Phil Blunsom,et al. Recurrent Continuous Translation Models , 2013, EMNLP.
[27] Peter Young,et al. Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics , 2013, J. Artif. Intell. Res.
[28] Yoshua Bengio,et al. Audio Chord Recognition with Recurrent Neural Networks , 2013, ISMIR.
[29] Karol Gregor,et al. Neural Variational Inference and Learning in Belief Networks , 2014, ICML.
[30] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.
[31] Navdeep Jaitly,et al. Towards End-To-End Speech Recognition with Recurrent Neural Networks , 2014, ICML.
[32] Xiang Zhang,et al. OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks , 2013, ICLR.
[33] Rob Fergus,et al. Visualizing and Understanding Convolutional Networks , 2013, ECCV.
[34] Yoshua Bengio,et al. On the Properties of Neural Machine Translation: Encoder–Decoder Approaches , 2014, SSST@EMNLP.
[35] Brian Murphy,et al. Simultaneously Uncovering the Patterns of Brain Regions Involved in Different Story Reading Subprocesses , 2014, PloS one.
[36] Erich Elsen,et al. Deep Speech: Scaling up end-to-end speech recognition , 2014, ArXiv.
[37] Alon Lavie,et al. Meteor Universal: Language Specific Translation Evaluation for Any Target Language , 2014, WMT@ACL.
[38] Ruslan Salakhutdinov,et al. Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models , 2014, ArXiv.
[39] Hugo Larochelle,et al. A Neural Autoregressive Approach to Attention-based Recognition , 2015, International Journal of Computer Vision.
[40] Peter Young,et al. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions , 2014, TACL.
[41] Razvan Pascanu,et al. How to Construct Deep Recurrent Neural Networks , 2013, ICLR.
[42] Stefan Carlsson,et al. CNN Features Off-the-Shelf: An Astounding Baseline for Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.
[43] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[44] Wei Xu,et al. Explain Images with Multimodal Recurrent Neural Networks , 2014, ArXiv.
[45] Alex Graves,et al. Recurrent Models of Visual Attention , 2014, NIPS.
[46] László Tóth,et al. Combining time- and frequency-domain convolution in convolutional neural network-based phone recognition , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[47] Nadir Durrani,et al. Edinburgh’s Phrase-based Machine Translation Systems for WMT-14 , 2014, WMT@ACL.
[48] Matthieu Cord,et al. Sequentially Generated Instance-Dependent Image Representations for Classification , 2014, ICLR.
[49] Yoshua Bengio,et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.
[50] Alex Graves,et al. Neural Turing Machines , 2014, ArXiv.
[51] Misha Denil,et al. Extraction of Salient Sentences from Labelled Documents , 2014, ArXiv.
[52] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[53] Thomas Brox,et al. Striving for Simplicity: The All Convolutional Net , 2014, ICLR.
[54] Geoffrey Zweig,et al. From captions to visual concepts and back , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[55] Yoshua Bengio,et al. On Using Very Large Target Vocabulary for Neural Machine Translation , 2014, ACL.
[56] C. Lawrence Zitnick,et al. CIDEr: Consensus-based image description evaluation , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[57] Geoffrey E. Hinton,et al. Grammar as a Foreign Language , 2014, NIPS.
[58] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[59] Jason Weston,et al. End-To-End Memory Networks , 2015, NIPS.
[60] Wei Xu,et al. Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN) , 2014, ICLR.
[61] Christopher Joseph Pal,et al. Describing Videos by Exploiting Temporal Structure , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[62] Yoshua Bengio,et al. On Using Monolingual Corpora in Neural Machine Translation , 2015, ArXiv.
[63] Jason Weston,et al. Memory Networks , 2014, ICLR.
[64] Koray Kavukcuoglu,et al. Multiple Object Recognition with Visual Attention , 2014, ICLR.
[65] Navdeep Jaitly,et al. Pointer Networks , 2015, NIPS.
[66] Alex Graves,et al. DRAW: A Recurrent Neural Network For Image Generation , 2015, ICML.
[67] Jason Weston,et al. Weakly Supervised Memory Networks , 2015, ArXiv.
[68] Christopher Joseph Pal,et al. Using Descriptive Video Services to Create a Large Data Source for Video Annotation Research , 2015, ArXiv.
[69] Yoshua Bengio,et al. Attention-Based Models for Speech Recognition , 2015, NIPS.
[70] Tapani Raiko,et al. Techniques for Learning Binary Stochastic Feedforward Neural Networks , 2014, ICLR.
[71] Samy Bengio,et al. Show and tell: A neural image caption generator , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[72] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[73] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[74] Trevor Darrell,et al. Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[75] Geoffrey Zweig,et al. Language Models for Image Captioning: The Quirks and What Works , 2015, ACL.
[76] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[77] U. Austin,et al. Translating Videos to Natural Language Using Deep Recurrent Neural Networks , 2017 .
[78] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).