Language Modelling as a Multi-Task Problem
Lucas Weber | Jaap Jumelet | Elia Bruni | Dieuwke Hupkes
[1] Rich Caruana,et al. Multitask Learning , 1998, Encyclopedia of Machine Learning and Data Mining.
[2] Samuel R. Bowman,et al. Can neural networks acquire a structural bias from raw linguistic data? , 2020, CogSci.
[3] Tal Linzen,et al. Targeted Syntactic Evaluation of Language Models , 2018, EMNLP.
[4] Xiaoou Tang,et al. Facial Landmark Detection by Deep Multi-task Learning , 2014, ECCV.
[5] J. Mestre. Transfer of learning from a modern multidisciplinary perspective , 2005 .
[6] Dirk Hovy,et al. Multitask Learning for Mental Health Conditions with Limited Social Media Data , 2017, EACL.
[7] Jonathan Baxter,et al. A Model of Inductive Bias Learning , 2000, J. Artif. Intell. Res..
[8] Sebastian Thrun,et al. Discovering Structure in Multiple Learning Tasks: The TC Algorithm , 1996, ICML.
[9] Tong Zhang,et al. A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data , 2005, J. Mach. Learn. Res..
[10] Jaime G. Carbonell,et al. Characterizing and Avoiding Negative Transfer , 2019, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Paul Portner,et al. Semantics: An International Handbook of Natural Language Meaning , 2011 .
[12] Mark Johnson,et al. An Improved Non-monotonic Transition System for Dependency Parsing , 2015, EMNLP.
[13] M. Cole,et al. Cognitive Development: Its Cultural and Social Foundations , 1976 .
[14] Chris Barker,et al. Negative polarity as scope marking , 2018 .
[15] Rich Caruana,et al. Multitask Learning: A Knowledge-Based Source of Inductive Bias , 1993, ICML.
[16] Lukasz Kaiser,et al. One Model To Learn Them All , 2017, ArXiv.
[17] Luke S. Zettlemoyer,et al. Dissecting Contextual Word Embeddings: Architecture and Representation , 2018, EMNLP.
[18] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[19] Qiang Yang,et al. A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.
[20] Andreas Maurer,et al. Bounds for Linear Multi-Task Learning , 2006, J. Mach. Learn. Res..
[21] Edouard Grave,et al. Colorless Green Recurrent Networks Dream Hierarchically , 2018, NAACL.
[22] Jacques Wainer,et al. Flexible Modeling of Latent Task Structures in Multitask Learning , 2012, ICML.
[23] A. Savitzky,et al. Smoothing and Differentiation of Data by Simplified Least Squares Procedures. , 1964 .
[24] Shikha Bordia,et al. Investigating BERT’s Knowledge of Language: Five Analysis Methods with NPIs , 2019, EMNLP.
[25] Qiang Yang,et al. A Survey on Multi-Task Learning , 2017, IEEE Transactions on Knowledge and Data Engineering.
[26] Thomas G. Dietterich,et al. To transfer or not to transfer , 2005, NIPS 2005.
[27] Jitendra Malik,et al. Which Tasks Should Be Learned Together in Multi-task Learning? , 2019, ICML.
[28] Trevor Darrell,et al. DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.
[29] Jacob Hoeksema,et al. On the natural history of negative polarity items , 2012 .
[30] S. Cormier,et al. Transfer of learning: Contemporary research and applications. , 1987 .
[31] Jaap Jumelet,et al. diagNNose: A Library for Neural Activation Analysis , 2020, BlackboxNLP@EMNLP.
[32] Charles A. Micchelli,et al. A Spectral Regularization Framework for Multi-Task Structure Learning , 2007, NIPS.
[33] Joachim Bingel,et al. Identifying beneficial task relations for multi-task learning in deep neural networks , 2017, EACL.
[34] Dieuwke Hupkes,et al. Do Language Models Understand Anything? On the Ability of LSTMs to Understand Negative Polarity Items , 2018, BlackboxNLP@EMNLP.
[35] Roger Levy,et al. Structural Supervision Improves Learning of Non-Local Grammatical Dependencies , 2019, NAACL.
[36] Gilles Fauconnier,et al. Polarity and the Scale Principle , 1975 .
[37] Anders Søgaard,et al. Deep multi-task learning with low level tasks supervised at lower layers , 2016, ACL.
[38] Quoc V. Le,et al. Multi-task Sequence to Sequence Learning , 2015, ICLR.
[39] A. Giannakidou,et al. Negative and Positive Polarity Items: Variation, Licensing, and Compositionality , 2008 .
[40] Jason Weston,et al. A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.
[41] Jason Weston,et al. Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..
[42] Dipanjan Das,et al. BERT Rediscovers the Classical NLP Pipeline , 2019, ACL.
[43] William A. Ladusaw. Polarity sensitivity as inherent scope relations , 1980 .
[44] Yonatan Belinkov,et al. Linguistic Knowledge and Transferability of Contextual Representations , 2019, NAACL.