PROD: Progressive Distillation for Dense Retrieval
Anlei Dong, Rangan Majumder, Nan Duan, Daxin Jiang, Yeyun Gong, Hang Zhang, Xiao Liu, Jian Jiao, Jing Lu, Chen Lin, Zhenghao Lin
[1] Tao Shen, et al. LED: Lexicon-Enlightened Dense Retriever for Large-Scale Retrieval, 2022, WWW.
[2] Hua Wu, et al. ERNIE-Search: Bridging Cross-Encoder with Dual-Encoder via Self On-the-fly Distillation for Dense Passage Retrieval, 2022, ArXiv.
[3] Hamed Zamani, et al. Curriculum Learning for Dense Retrieval Distillation, 2022, SIGIR.
[4] Raffay Hamid, et al. Robust Cross-Modal Representation Learning with Progressive Self-Distillation, 2022, CVPR.
[5] Chuhan Wu, et al. Unified and Effective Ensemble Knowledge Distillation, 2022, ArXiv.
[6] Tim Salimans, et al. Progressive Distillation for Fast Sampling of Diffusion Models, 2022, ICLR.
[7] Reza Yazdani Aminabadi, et al. Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model, 2022, ArXiv.
[8] M. Zaharia, et al. ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction, 2021, NAACL.
[9] Ali Ghodsi, et al. Pro-KD: Progressive Distillation by Following the Footsteps of the Teacher, 2021, COLING.
[10] I. E. Yen, et al. Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm, 2021, ACL.
[11] Wayne Xin Zhao, et al. RocketQAv2: A Joint Training Method for Dense Passage Retrieval and Passage Re-ranking, 2021, EMNLP.
[12] Weizhu Chen, et al. Adversarial Retriever-Ranker for Dense Text Retrieval, 2021, ICLR.
[13] Benjamin Piwowarski, et al. SPLADE v2: Sparse Lexical and Expansion Model for Information Retrieval, 2021, ArXiv.
[14] Hua Wu, et al. PAIR: Leveraging Passage-Centric Similarity Relation for Improving Dense Passage Retrieval, 2021, FINDINGS.
[15] Wen-tau Yih, et al. Domain-matched Pre-training Tasks for Dense Retrieval, 2021, NAACL-HLT.
[16] Julian McAuley, et al. BERT Learns to Teach: Knowledge Distillation with Meta Learning, 2021, ACL.
[17] Hao Tian, et al. ERNIE-Tiny: A Progressive Distillation Framework for Pretrained Transformer Compression, 2021, ArXiv.
[18] Se-Young Yun, et al. Comparing Kullback-Leibler Divergence and Mean Squared Error Loss in Knowledge Distillation, 2021, IJCAI.
[19] Jiaya Jia, et al. Distilling Knowledge via Knowledge Review, 2021, CVPR.
[20] Jiafeng Guo, et al. Optimizing Dense Retrieval Model Training with Hard Negatives, 2021, SIGIR.
[21] Jimmy J. Lin, et al. Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling, 2021, SIGIR.
[22] William L. Hamilton, et al. End-to-End Training of Neural Retrievers for Open-Domain Question Answering, 2021, ACL.
[23] Jun Zhao, et al. Incremental Event Detection via Knowledge Consolidation Networks, 2020, EMNLP.
[24] Jimmy J. Lin, et al. Distilling Dense Representations for Ranking using Tightly-Coupled Teachers, 2020, ArXiv.
[25] Hua Wu, et al. RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering, 2020, NAACL.
[26] Allan Hanbury, et al. Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation, 2020, ArXiv.
[27] Tiancheng Zhao, et al. SPARTA: Efficient Open-Domain Question Answering via Sparse Transformer Matching Retrieval, 2020, NAACL.
[28] Minjoon Seo, et al. Is Retriever Merely an Approximator of Reader?, 2020, ArXiv.
[29] Yejin Choi, et al. Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics, 2020, EMNLP.
[30] Yelong Shen, et al. Generation-Augmented Retrieval for Open-Domain Question Answering, 2020, ACL.
[31] Paul N. Bennett, et al. Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval, 2020, ICLR.
[32] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[33] M. Zaharia, et al. ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT, 2020, SIGIR.
[34] Danqi Chen, et al. Dense Passage Retrieval for Open-Domain Question Answering, 2020, EMNLP.
[35] Ming Zhou, et al. ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training, 2020, FINDINGS.
[36] Graham Neubig, et al. Understanding Knowledge Distillation in Non-autoregressive Machine Translation, 2019, ICLR.
[37] Omer Levy, et al. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension, 2019, ACL.
[38] Thomas Wolf, et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter, 2019, ArXiv.
[39] Xin Jiang, et al. TinyBERT: Distilling BERT for Natural Language Understanding, 2019, FINDINGS.
[40] Ming-Wei Chang, et al. Natural Questions: A Benchmark for Question Answering Research, 2019, TACL.
[41] Jamie Callan, et al. Deeper Text Understanding for IR with Contextual Neural Language Modeling, 2019, SIGIR.
[42] Jimmy J. Lin, et al. Document Expansion by Query Prediction, 2019, ArXiv.
[43] Seyed Iman Mirzadeh, et al. Improved Knowledge Distillation via Teacher Assistant, 2019, AAAI.
[44] Frank Hutter, et al. Decoupled Weight Decay Regularization, 2017, ICLR.
[45] Bhuvana Ramabhadran, et al. Efficient Knowledge Distillation from an Ensemble of Teachers, 2017, INTERSPEECH.
[46] Jimmy J. Lin, et al. Anserini: Enabling the Use of Lucene for Information Retrieval Research, 2017, SIGIR.
[47] Jeff Johnson, et al. Billion-Scale Similarity Search with GPUs, 2017, IEEE Transactions on Big Data.
[48] Nikos Komodakis, et al. Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer, 2016, ICLR.
[49] Jianfeng Gao, et al. MS MARCO: A Human Generated MAchine Reading COmprehension Dataset, 2018.
[50] Derek Hoiem, et al. Learning without Forgetting, 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[51] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[52] Yoshua Bengio, et al. FitNets: Hints for Thin Deep Nets, 2014, ICLR.
[53] Yejin Choi, et al. Understanding Dataset Difficulty with V-Usable Information, 2022, ICML.
[54] Jimmy J. Lin, et al. In-Batch Negatives for Knowledge Distillation with Tightly-Coupled Teachers for Dense Retrieval, 2021, REPL4NLP.
[55] Nick Craswell, et al. Overview of the TREC 2019 Deep Learning Track, 2020.
[56] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.