[1] Carlo Lavalle, et al. Assessing the influence of climate model uncertainty on EU-wide climate change impact indicators, 2013, Climatic Change.
[2] Kevin Gimpel, et al. Gaussian Error Linear Units (GELUs), 2016.
[3] Jian Zhang, et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text, 2016, EMNLP.
[4] Margaret Mitchell, et al. Perturbation Sensitivity Analysis to Detect Unintended Model Biases, 2019, EMNLP.
[5] Kevin Gimpel, et al. A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks, 2016, ICLR.
[6] Sebastian Riedel, et al. Question and Answer Test-Train Overlap in Open-Domain Question Answering Datasets, 2020, EACL.
[7] Ronan Le Bras, et al. Adversarial Filters of Dataset Biases, 2020, ICML.
[8] Clint Heyer. Human-Robot Interaction and Future Industrial Robotics Applications, 2010.
[9] Aditi Raghunathan, et al. Certified Robustness to Adversarial Word Substitutions, 2019, EMNLP.
[10] Dawn Song, et al. Pretrained Transformers Improve Out-of-Distribution Robustness, 2020, ACL.
[11] Christopher Potts, et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, 2013, EMNLP.
[12] Oren Etzioni, et al. Green AI, 2019, Commun. ACM.
[13] Boris Beizer, et al. Black Box Testing: Techniques for Functional Testing of Software and Systems, 1996, IEEE Software.
[14] Nitika Mathur, et al. Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics, 2020, ACL.
[15] Christopher Potts, et al. Learning Word Vectors for Sentiment Analysis, 2011, ACL.
[16] Peter Henderson, et al. Towards the Systematic Reporting of the Energy and Carbon Footprints of Machine Learning, 2020, ArXiv.
[17] Zellig S. Harris, et al. Distributional Structure, 1954.
[18] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[19] Jeffrey Dean, et al. Distributed Representations of Words and Phrases and their Compositionality, 2013, NIPS.
[20] Jianqi Sun, et al. Projection and uncertainty analysis of global precipitation‐related extremes using CMIP5 models, 2014.
[21] Swaroop Mishra, et al. Do We Need to Create Big Datasets to Learn a Task?, 2020, SUSTAINLP.
[22] Iryna Gurevych, et al. Reporting Score Distributions Makes a Difference: Performance Study of LSTM-networks for Sequence Tagging, 2017, EMNLP.
[23] Alexei A. Efros, et al. Unbiased look at dataset bias, 2011, CVPR.
[24] Peter Szolovits, et al. Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment, 2020, AAAI.
[25] Yoshua Bengio, et al. Convolutional networks for images, speech, and time series, 1998.
[26] Luke S. Zettlemoyer, et al. Adversarial Example Generation with Syntactically Controlled Paraphrase Networks, 2018, NAACL.
[27] Yongdong Zhang, et al. Curriculum Learning for Natural Language Understanding, 2020, ACL.
[28] Thibault Sellam, et al. BLEURT: Learning Robust Metrics for Text Generation, 2020, ACL.
[29] Yejin Choi, et al. WINOGRANDE: An Adversarial Winograd Schema Challenge at Scale, 2020, AAAI.
[30] Chitta Baral, et al. DQI: Measuring Data Quality in NLP, 2020, ArXiv.
[31] Omer Levy, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019, ArXiv.
[32] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.
[33] Roy Schwartz, et al. Show Your Work: Improved Reporting of Experimental Results, 2019, EMNLP.
[34] Kyle Gorman, et al. We Need to Talk about Standard Splits, 2019, ACL.
[35] Maxime Peyrard, et al. Studying Summarization Evaluation Metrics in the Appropriate Scoring Range, 2019, ACL.
[36] Avrim Blum, et al. The Ladder: A Reliable Leaderboard for Machine Learning Competitions, 2015, ICML.
[37] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[38] Benjamin Recht, et al. Do ImageNet Classifiers Generalize to ImageNet?, 2019, ICML.
[39] Chitta Baral, et al. Our Evaluation Metric Needs an Update to Encourage Generalization, 2020, ArXiv.
[40] Michael S. Bernstein, et al. ImageNet Large Scale Visual Recognition Challenge, 2014, International Journal of Computer Vision.
[41] Yonatan Belinkov, et al. Synthetic and Natural Noise Both Break Neural Machine Translation, 2017, ICLR.
[42] Percy Liang, et al. Adversarial Examples for Evaluating Reading Comprehension Systems, 2017, EMNLP.
[43] Wai Lam, et al. Evaluation Challenges in Large-Scale Document Summarization, 2003, ACL.
[44] Christopher Potts, et al. A large annotated corpus for learning natural language inference, 2015, EMNLP.
[45] Sameer Singh, et al. Beyond Accuracy: Behavioral Testing of NLP Models with CheckList, 2020, ACL.
[46] Percy Liang, et al. Selective Question Answering under Domain Shift, 2020, ACL.