Matryoshka Representation Learning
S. Kakade | Ali Farhadi | Prateek Jain | Matthew Wallingford | Aditya Kusupati | Kaifeng Chen | Gantavya Bhatt | Aniket Rege | Aditya Sinha | Vivek Ramanujan | William Howard-Snyder
[1] Charless C. Fowlkes,et al. Task Adaptive Parameter Sharing for Multi-Task Learning , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Yejin Choi,et al. MERLOT RESERVE: Neural Script Knowledge through Vision and Language and Sound , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Ross B. Girshick,et al. Masked Autoencoders Are Scalable Vision Learners , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Jong Wook Kim,et al. Robust fine-tuning of zero-shot models , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Inderjit S. Dhillon,et al. Extreme Multi-label Learning for Semantic Matching in Product Search , 2021, KDD.
[6] Ali Farhadi,et al. LLC: Accurate, Multi-purpose Learnt Low-dimensional Binary Codes , 2021, NeurIPS.
[7] Michael W. Mahoney,et al. A Survey of Quantization Methods for Efficient Neural Network Inference , 2021, Low-Power Computer Vision.
[8] Ilya Sutskever,et al. Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.
[9] Quoc V. Le,et al. Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision , 2021, ICML.
[10] Ali Farhadi,et al. Are We Overfitting to Experimental Setups in Recognition , 2020 .
[11] Elad Eban,et al. Multiple Networks are More Efficient than One: Fast and Accurate Models via Ensembles and Cascades , 2020, ArXiv.
[12] S. Gelly,et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale , 2020, ICLR.
[13] I. Dhillon,et al. PECOS: Prediction for Enormous and Correlated Output Spaces , 2020, J. Mach. Learn. Res.
[14] Emily Denton,et al. Characterising Bias in Compressed Models , 2020, ArXiv.
[15] D. Song,et al. The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization , 2020, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[16] Justin Johnson,et al. VirTex: Learning Visual Representations from Textual Annotations , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[18] Trevor Darrell,et al. Meta-Baseline: Exploring Simple Meta-Learning for Few-Shot Learning , 2020, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[19] Prateek Jain,et al. RNNPool: Efficient Non-linear Pooling for RAM Constrained Inference , 2020, NeurIPS.
[20] Geoffrey E. Hinton,et al. A Simple Framework for Contrastive Learning of Visual Representations , 2020, ICML.
[21] Wei-Cheng Chang,et al. Pre-training Tasks for Embedding-based Large-scale Retrieval , 2020, ICLR.
[22] S. Kakade,et al. Soft Threshold Weight Reparameterization for Learnable Sparsity , 2020, ICML.
[23] Manik Varma,et al. Extreme Regression for Dynamic Search Advertising , 2020, WSDM.
[24] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[25] Ross B. Girshick,et al. Momentum Contrast for Unsupervised Visual Representation Learning , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Aaron C. Courville,et al. What Do Compressed Deep Neural Networks Forget , 2019, arXiv:1911.05248.
[27] Anish Arora,et al. One Size Does Not Fit All: Multi-Scale, Cascaded RNNs for Radar Classification , 2019, BuildSys@SenSys.
[28] Chuang Gan,et al. Once for All: Train One Network and Specialize it for Efficient Deployment , 2019, ICLR.
[29] Dawn Song,et al. Natural Adversarial Examples , 2019, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Thomas Wolf,et al. Transfer Learning in Natural Language Processing , 2019, NAACL.
[31] Quoc V. Le,et al. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks , 2019, ICML.
[32] Eric P. Xing,et al. Learning Robust Global Representations by Penalizing Local Predictive Power , 2019, NeurIPS.
[33] Benjamin Recht,et al. Do ImageNet Classifiers Generalize to ImageNet? , 2019, ICML.
[34] Venkatesh Balasubramanian,et al. Slice: Scalable Linear Extreme Classifiers Trained on 100 Million Labels for Related Searches , 2019, WSDM.
[35] Ning Xu,et al. Slimmable Neural Networks , 2018, ICLR.
[36] Sebastian Fedden,et al. Extreme classification , 2018, Cognitive Linguistics.
[37] Prateek Jain,et al. FastGRNN: A Fast, Accurate, Stable and Tiny Kilobyte Sized Gated Recurrent Neural Network , 2018, NeurIPS.
[38] Noam Shazeer,et al. Adafactor: Adaptive Learning Rates with Sublinear Memory Cost , 2018, ICML.
[39] Luke S. Zettlemoyer,et al. Deep Contextualized Word Representations , 2018, NAACL.
[40] Sebastian Ruder,et al. Universal Language Model Fine-tuning for Text Classification , 2018, ACL.
[41] Frank Hutter,et al. Decoupled Weight Decay Regularization , 2017, ICLR.
[42] Nathan Srebro,et al. The Implicit Bias of Gradient Descent on Separable Data , 2017, J. Mach. Learn. Res.
[43] Debadeepta Dey,et al. Learning Anytime Predictions in Neural Networks via Adaptive Loss Balancing , 2017, AAAI.
[44] Chen Sun,et al. Revisiting Unreasonable Effectiveness of Data in Deep Learning Era , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[45] Bo Chen,et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.
[46] David A. Patterson,et al. In-datacenter performance analysis of a tensor processing unit , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[47] Jeff Johnson,et al. Billion-Scale Similarity Search with GPUs , 2017, IEEE Transactions on Big Data.
[48] Abhishek Das,et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[49] Kevin Gimpel,et al. A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks , 2016, ICLR.
[50] Michael Cogswell,et al. Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles , 2016, NIPS.
[51] Yury A. Malkov,et al. Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[52] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[53] Leslie N. Smith,et al. Cyclical Learning Rates for Training Neural Networks , 2015, 2017 IEEE Winter Conference on Applications of Computer Vision (WACV).
[54] Geoffrey E. Hinton,et al. Deep Learning , 2015, Nature.
[55] Yoshua Bengio,et al. How transferable are features in deep neural networks? , 2014, NIPS.
[56] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[57] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[58] Ali Farhadi,et al. Learning Everything about Anything: Webly-Supervised Visual Concept Learning , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[59] Ryan P. Adams,et al. Learning Ordered Representations with Nested Dropout , 2014, ICML.
[60] Geoffrey E. Hinton,et al. On the importance of initialization and momentum in deep learning , 2013, ICML.
[61] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[62] Yoshua Bengio,et al. Deep Learning of Representations for Unsupervised and Transfer Learning , 2011, ICML Unsupervised and Transfer Learning.
[63] Fei-Fei Li,et al. Hierarchical semantic indexing for large scale image retrieval , 2011, CVPR 2011.
[64] Jürgen Schmidhuber,et al. Stacked Convolutional Auto-Encoders for Hierarchical Feature Extraction , 2011, ICANN.
[65] Jason Weston,et al. Label Embedding Trees for Large Multi-Class Tasks , 2010, NIPS.
[66] Aapo Hyvärinen,et al. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models , 2010, AISTATS.
[67] Prateek Jain,et al. Fast Similarity Search for Learned Metrics , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[68] Geoffrey E. Hinton,et al. Semantic hashing , 2009, Int. J. Approx. Reason.
[69] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[70] Jeffrey Dean,et al. Challenges in building large-scale information retrieval systems: invited talk , 2009, WSDM '09.
[71] J. Hegdé. Time course of visual perception: Coarse-to-fine processing and beyond , 2008, Progress in Neurobiology.
[72] Geoffrey E. Hinton,et al. Learning a Nonlinear Embedding by Preserving Class Neighbourhood Structure , 2007, AISTATS.
[73] John Langford,et al. Cover trees for nearest neighbor , 2006, ICML.
[74] Nicole Immorlica,et al. Locality-sensitive hashing scheme based on p-stable distributions , 2004, SCG '04.
[75] C. A. Murthy,et al. Unsupervised Feature Selection Using Feature Similarity , 2002, IEEE Trans. Pattern Anal. Mach. Intell.
[76] Paul A. Viola,et al. Rapid object detection using a boosted cascade of simple features , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.
[77] Christos D Giachritsis,et al. Coarse-grained information dominates fine-grained information in judgments of time-to-contact from retinal flow , 2000, Vision Research.
[78] Piotr Indyk,et al. Approximate nearest neighbors: towards removing the curse of dimensionality , 1998, STOC '98.
[79] Sergey Brin,et al. The Anatomy of a Large-Scale Hypertextual Web Search Engine , 1998, Comput. Networks.
[80] Filiberto Pla,et al. On the use of neighbourhood-based non-parametric classifiers , 1997, Pattern Recognit. Lett.
[81] Thomas G. Dietterich,et al. Solving Multiclass Learning Problems via Error-Correcting Output Codes , 1994, J. Artif. Intell. Res.
[82] Jon Louis Bentley,et al. K-d trees for semidynamic point sets , 1990, SCG '90.
[83] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[84] Suhas Jayaram Subramanya,et al. DiskANN : Fast Accurate Billion-point Nearest Neighbor Search on a Single Node , 2019 .
[85] Boris Katz,et al. ObjectNet: A large-scale bias-controlled dataset for pushing the limits of object recognition models , 2019, NeurIPS.
[86] Alec Radford,et al. Improving Language Understanding by Generative Pre-Training , 2018 .
[87] Cordelia Schmid,et al. Product Quantization for Nearest Neighbor Search , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[88] W. B. Johnson,et al. Extensions of Lipschitz mappings into Hilbert space , 1984 .
[89] Robert M. Gray,et al. An Algorithm for Vector Quantizer Design , 1980, IEEE Trans. Commun.
[90] H. Hotelling. Analysis of a complex of statistical variables into principal components. , 1933 .