暂无分享,去创建一个
[1] Myle Ott,et al. fairseq: A Fast, Extensible Toolkit for Sequence Modeling , 2019, NAACL.
[2] Yee Whye Teh,et al. Set Transformer , 2018, ArXiv.
[3] Richard Socher,et al. Pointer Sentinel Mixture Models , 2016, ICLR.
[4] Koji Tsuda,et al. Support vector classifier with asymetric kernel function , 1999, The European Symposium on Artificial Neural Networks.
[5] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[6] Ilya Sutskever,et al. Generating Long Sequences with Sparse Transformers , 2019, ArXiv.
[7] Yiming Yang,et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context , 2019, ACL.
[8] Andrew Gordon Wilson,et al. Deep Kernel Learning , 2015, AISTATS.
[9] Yiming Yang,et al. MMD GAN: Towards Deeper Understanding of Moment Matching Network , 2017, NIPS.
[10] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[11] Douglas Eck,et al. An Improved Relative Self-Attention Mechanism for Transformer with Application to Music Generation , 2018, ArXiv.
[12] Jean-Michel Morel,et al. A non-local algorithm for image denoising , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[13] Marc'Aurelio Ranzato,et al. Classical Structured Prediction Losses for Sequence to Sequence Learning , 2017, NAACL.
[14] Douglas Eck,et al. Music Transformer , 2018, 1809.04281.
[15] Pablo Barceló,et al. On the Turing Completeness of Modern Neural Network Architectures , 2019, ICLR.
[16] Alper Yilmaz,et al. Object Tracking by Asymmetric Kernel Mean Shift with Automatic Scale and Orientation Selection , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[17] Ruslan Salakhutdinov,et al. Multimodal Transformer for Unaligned Multimodal Language Sequences , 2019, ACL.
[18] Yee Whye Teh,et al. Set Transformer , 2018, ICML.
[19] Ali Farhadi,et al. Video Relationship Reasoning Using Gated Spatio-Temporal Energy Graph , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[21] Trevor Darrell,et al. What you saw is not what you get: Domain adaptation using asymmetric kernel transforms , 2011, CVPR 2011.
[22] L. Wasserman. All of Nonparametric Statistics , 2005 .
[23] Abhinav Gupta,et al. Non-local Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[24] Andrew M. Dai,et al. Music Transformer: Generating Music with Long-Term Structure , 2018, ICLR.
[25] Ashish Vaswani,et al. Self-Attention with Relative Position Representations , 2018, NAACL.
[26] A. Atiya,et al. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond , 2005, IEEE Transactions on Neural Networks.
[27] Dustin Tran,et al. Image Transformer , 2018, ICML.
[28] Vladlen Koltun,et al. An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling , 2018, ArXiv.