Paper Information - Transferability Metrics for Selecting Source Model Ensembles

We address the problem of ensemble selection in transfer learning: given a large pool of source models, we want to select an ensemble of models which, after fine-tuning on the target training set, yields the best performance on the target test set. Since fine-tuning all possible ensembles is computationally prohibitive, we aim to predict performance on the target dataset using a computationally efficient transferability metric. We propose several new transferability metrics designed for this task and evaluate them in a challenging and realistic transfer learning setup for semantic segmentation: we create a large and diverse pool of source models by considering 17 source datasets covering a wide variety of image domains, two different architectures, and two pre-training schemes. Given this pool, we then automatically select a subset to form an ensemble that performs well on a given target dataset. We compare the ensemble selected by our method to two baselines which select a single source model, either (1) from the same pool as our method; or (2) from a pool containing large source models, each with capacity similar to that of an ensemble. Averaged over 17 target datasets, we outperform these baselines by 6.0% and 2.5% relative mean IoU, respectively.

Contact: agostinelli@google.com
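The selection problem described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not define the proposed metrics, so `toy_metric`, the model names, and the per-model scores below are hypothetical placeholders. The sketch only shows the general pattern of scoring candidate ensembles with a cheap metric instead of fine-tuning each one, plus how a relative mean IoU gain is computed.

```python
from itertools import combinations
from typing import Callable, Sequence, Tuple


def select_ensemble(
    pool: Sequence[str],
    ensemble_size: int,
    metric: Callable[[Tuple[str, ...]], float],
) -> Tuple[Tuple[str, ...], float]:
    """Score every candidate ensemble of a fixed size and return the best one.

    `metric` stands in for a transferability metric: any cheap function that
    maps a candidate ensemble to a predicted target performance.
    """
    best = max(combinations(pool, ensemble_size), key=metric)
    return best, metric(best)


def relative_gain(ours: float, baseline: float) -> float:
    """Relative improvement over a baseline, in percent."""
    return 100.0 * (ours - baseline) / baseline


# Hypothetical per-model scores, used only to make the example runnable.
TOY_SCORES = {"model_a": 0.5, "model_b": 0.4, "model_c": 0.3}


def toy_metric(ensemble: Tuple[str, ...]) -> float:
    """Placeholder metric (assumption): sum of made-up per-model scores."""
    return sum(TOY_SCORES[name] for name in ensemble)


if __name__ == "__main__":
    chosen, score = select_ensemble(list(TOY_SCORES), 2, toy_metric)
    print(chosen, round(score, 3))
    # A mean IoU of 0.53 against a baseline of 0.50 is a 6% relative gain.
    print(round(relative_gain(0.53, 0.50), 1))
```

Enumerating all subsets is only feasible for small pools and ensemble sizes; it is used here for clarity, not as a claim about how the authors search the pool.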

Thomas Mensink | Jasper Uijlings | Vittorio Ferrari | Andrea Agostinelli
