Semi-Supervised Sequence Modeling with Cross-View Training
Kevin Clark | Minh-Thang Luong | Christopher D. Manning | Quoc V. Le
[1] Sanja Fidler, et al. Skip-Thought Vectors, 2015, NIPS.
[2] Quoc V. Le, et al. Semi-supervised Sequence Learning, 2015, NIPS.
[3] Jan Niehues, et al. The IWSLT 2015 Evaluation Campaign, 2015, IWSLT.
[4] Mark Steedman, et al. CCGbank: A Corpus of CCG Derivations and Dependency Structures Extracted from the Penn Treebank, 2007, CL.
[5] Erik F. Tjong Kim Sang, et al. Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition, 2003, CoNLL.
[6] Shin Ishii, et al. Virtual Adversarial Training: A Regularization Method for Supervised and Semi-Supervised Learning, 2017, IEEE TPAMI.
[7] Tolga Tasdizen, et al. Regularization With Stochastic Transformations and Perturbations for Deep Semi-Supervised Learning, 2016, NIPS.
[8] Sebastian Ruder, et al. Universal Language Model Fine-tuning for Text Classification, 2018, ACL.
[9] Sabine Buchholz, et al. Introduction to the CoNLL-2000 Shared Task: Chunking, 2000, CoNLL/LLL.
[10] Beatrice Santorini, et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.
[11] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[12] Andrew M. Dai, et al. Adversarial Training Methods for Semi-Supervised Text Classification, 2016, ICLR.
[13] Mitchell P. Marcus, et al. OntoNotes: The 90% Solution, 2006, NAACL.
[14] Nitish Srivastava, et al. Improving neural networks by preventing co-adaptation of feature detectors, 2012, ArXiv.
[15] Christopher D. Manning, et al. Stanford Neural Machine Translation Systems for Spoken Language Domains, 2015, IWSLT.
[16] Ioannis Mitliagkas, et al. Manifold Mixup: Encouraging Meaningful On-Manifold Interpolation as a Regularizer, 2018, ArXiv.
[17] Geoffrey E. Hinton, et al. Regularizing Neural Networks by Penalizing Confident Output Distributions, 2017, ICLR.
[18] Qiang Yang, et al. An Overview of Multi-task Learning, 2018.
[19] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[20] Eugene Charniak, et al. Parsing as Language Modeling, 2016, EMNLP.
[21] Shin Ishii, et al. Distributional Smoothing with Virtual Adversarial Training, 2016, ICLR.
[22] Eugene Charniak, et al. Effective Self-Training for Parsing, 2006, NAACL.
[23] Il-Chul Moon, et al. Adversarial Dropout for Supervised and Semi-supervised Learning, 2017, AAAI.
[24] Xiang Wei, et al. Improving the Improved Training of Wasserstein GANs: A Consistency Term and Its Dual Effect, 2018, ICLR.
[25] Geoffrey E. Hinton, et al. On the importance of initialization and momentum in deep learning, 2013, ICML.
[26] Anders Søgaard, et al. Deep multi-task learning with low level tasks supervised at lower layers, 2016, ACL.
[27] Sebastian Ruder, et al. An Overview of Multi-Task Learning in Deep Neural Networks, 2017, ArXiv.
[28] Richard Socher, et al. Learned in Translation: Contextualized Word Vectors, 2017, NIPS.
[29] Timo Aila, et al. Temporal Ensembling for Semi-Supervised Learning, 2016, ICLR.
[30] Wojciech Zaremba, et al. Improved Techniques for Training GANs, 2016, NIPS.
[31] Guillaume Lample, et al. Neural Architectures for Named Entity Recognition, 2016, NAACL.
[32] Mikhail Belkin, et al. A Co-Regularization Approach to Semi-supervised Learning with Multiple Views, 2005.
[33] Quoc V. Le, et al. Multi-task Sequence to Sequence Learning, 2015, ICLR.
[34] Sergey Ioffe, et al. Rethinking the Inception Architecture for Computer Vision, 2016, CVPR.
[35] Alec Radford, et al. Improving Language Understanding by Generative Pre-Training, 2018.
[36] Boris Polyak. Some methods of speeding up the convergence of iteration methods, 1964.
[37] Yuan Zhang, et al. Stack-propagation: Improved Representation Learning for Syntax, 2016, ACL.
[38] Jason Weston, et al. Natural Language Processing (Almost) from Scratch, 2011, JMLR.
[39] Razvan Pascanu, et al. Overcoming catastrophic forgetting in neural networks, 2016, PNAS.
[40] Jeffrey Dean, et al. Distributed Representations of Words and Phrases and their Compositionality, 2013, NIPS.
[41] Jason Weston, et al. A unified architecture for natural language processing: deep neural networks with multitask learning, 2008, ICML.
[42] Quoc V. Le, et al. Unsupervised Pretraining for Sequence to Sequence Learning, 2016, EMNLP.
[43] Xiang Ren, et al. Empower Sequence Labeling with Task-Aware Neural Language Model, 2017, AAAI.
[44] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[45] Rich Caruana. Multitask Learning, 1997, Machine Learning.
[46] Andrew McCallum, et al. Fast and Accurate Sequence Labeling with Iterated Dilated Convolutions, 2017, ArXiv.
[47] Hongyi Zhang, et al. mixup: Beyond Empirical Risk Minimization, 2017, ICLR.
[48] Derek Hoiem, et al. Learning without Forgetting, 2016, IEEE TPAMI.
[49] Thorsten Brants, et al. One billion word benchmark for measuring progress in statistical language modeling, 2013, INTERSPEECH.
[50] Rico Sennrich, et al. Improving Neural Machine Translation Models with Monolingual Data, 2015, ACL.
[51] Luke S. Zettlemoyer, et al. LSTM CCG Parsing, 2016, NAACL.
[52] Christopher Joseph Pal, et al. Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning, 2018, ICLR.
[53] Iryna Gurevych, et al. Reporting Score Distributions Makes a Difference: Performance Study of LSTM-networks for Sequence Tagging, 2017, EMNLP.
[54] Jiajun Zhang, et al. Shortcut Sequence Tagging, 2017, NLPCC.
[55] Felix Hill, et al. Learning Distributed Representations of Sentences from Unlabelled Data, 2016, NAACL.
[56] Yue Zhang, et al. In-Order Transition-based Constituent Parsing, 2017, TACL.
[57] Yann LeCun, et al. Convolutional networks and applications in vision, 2010, ISCAS.
[58] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.
[59] Barbara Plank, et al. Strong Baselines for Neural Semi-Supervised Learning under Domain Shift, 2018, ACL.
[60] Timothy Dozat, et al. Deep Biaffine Attention for Neural Dependency Parsing, 2016, ICLR.
[61] Eduard H. Hovy, et al. End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF, 2016, ACL.
[62] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.
[63] Luke S. Zettlemoyer, et al. Deep Contextualized Word Representations, 2018, NAACL.
[64] Dacheng Tao, et al. A Survey on Multi-view Learning, 2013, ArXiv.
[65] David Yarowsky, et al. Unsupervised Word Sense Disambiguation Rivaling Supervised Methods, 1995, ACL.
[66] Andrew Zisserman, et al. Multi-task Self-Supervised Visual Learning, 2017, ICCV.
[67] Jürgen Schmidhuber, et al. Framewise phoneme classification with bidirectional LSTM and other neural network architectures, 2005, Neural Networks.
[68] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[69] Yoshimasa Tsuruoka, et al. A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks, 2016, EMNLP.
[70] Avrim Blum, et al. Combining Labeled and Unlabeled Data with Co-Training, 1998, COLT.
[71] Marek Rei. Semi-supervised Multitask Learning for Sequence Labeling, 2017, ACL.
[72] Noah A. Smith, et al. Deep Multitask Learning for Semantic Dependency Parsing, 2017, ACL.
[73] Zhi-Hua Zhou, et al. Tri-training: exploiting unlabeled data using three classifiers, 2005, IEEE TKDE.
[74] Yann LeCun, et al. What is the best multi-stage architecture for object recognition?, 2009, ICCV.
[75] Philip Bachman, et al. Learning with Pseudo-Ensembles, 2014, NIPS.
[76] Eduard H. Hovy, et al. Neural Probabilistic Model for Non-projective MST Parsing, 2017, IJCNLP.
[77] Eric Nichols, et al. Named Entity Recognition with Bidirectional LSTM-CNNs, 2015, TACL.
[78] Holger Schwenk, et al. Supervised Learning of Universal Sentence Representations from Natural Language Inference Data, 2017, EMNLP.
[79] Zachary Chase Lipton, et al. Born Again Neural Networks, 2018, ICML.
[80] Chandra Bhagavatula, et al. Semi-supervised sequence tagging with bidirectional language models, 2017, ACL.
[81] H. J. Scudder. Probability of error of some adaptive pattern-recognition machines, 1965, IEEE Trans. Inf. Theory.
[82] Noah A. Smith, et al. What Do Recurrent Neural Network Grammars Learn About Syntax?, 2016, EACL.
[83] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.
[84] Christopher D. Manning, et al. Effective Approaches to Attention-based Neural Machine Translation, 2015, EMNLP.
[85] Jingzhou Liu, et al. Stack-Pointer Networks for Dependency Parsing, 2018, ACL.
[86] Fan Yang, et al. Good Semi-supervised Learning That Requires a Bad GAN, 2017, NIPS.