Neural Data Server: A Large-Scale Search Engine for Transfer Learning Data

Transfer learning has proven to be a successful technique to train deep learning models in the domains where little training data is available. The dominant approach is to pretrain a model on a large generic dataset such as ImageNet and finetune its weights on the target domain. However, in the new era of an ever-increasing number of massive datasets, selecting the relevant data for pretraining is a critical issue. We introduce Neural Data Server (NDS), a large-scale search engine for finding the most useful transfer learning data to the target domain. NDS consists of a dataserver which indexes several large popular image datasets, and aims to recommend data to a client, an end-user with a target application with its own small labeled dataset. The dataserver represents large datasets with a much more compact mixture-of-experts model, and employs it to perform data search in a series of dataserver-client transactions at a low computational cost. We show the effectiveness of NDS in various transfer learning scenarios, demonstrating state-of-the-art performance on several target datasets and tasks such as image classification, object detection and instance segmentation. Neural Data Server is available as a web-service at http://aidemo s.cs.toronto.edu/nds/.

[1]  Geoffrey E. Hinton,et al.  Distilling the Knowledge in a Neural Network , 2015, ArXiv.

[2]  Geoffrey E. Hinton,et al.  Adaptive Mixtures of Local Experts , 1991, Neural Computation.

[3]  Iasonas Kokkinos,et al.  DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[6]  Quoc V. Le,et al.  Domain Adaptive Transfer Learning with Specialist Models , 2018, ArXiv.

[7]  Frank Hutter,et al.  A Downsampled Variant of ImageNet as an Alternative to the CIFAR datasets , 2017, ArXiv.

[8]  Nikos Komodakis,et al.  Unsupervised Representation Learning by Predicting Image Rotations , 2018, ICLR.

[9]  Koby Crammer,et al.  A theory of learning from different domains , 2010, Machine Learning.

[10]  Koby Crammer,et al.  Analysis of Representations for Domain Adaptation , 2006, NIPS.

[11]  Pietro Perona,et al.  The Caltech-UCSD Birds-200-2011 Dataset , 2011 .

[12]  Blaise Agüera y Arcas,et al.  Federated Learning of Deep Networks using Model Averaging , 2016, ArXiv.

[13]  Kaiming He,et al.  Exploring the Limits of Weakly Supervised Pretraining , 2018, ECCV.

[14]  Sarvar Patel,et al.  Practical Secure Aggregation for Privacy-Preserving Machine Learning , 2017, IACR Cryptol. ePrint Arch..

[15]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Fei-Fei Li,et al.  Novel Dataset for Fine-Grained Image Categorization : Stanford Dogs , 2012 .

[17]  Shawn D. Newsam,et al.  Improving Semantic Segmentation via Video Propagation and Label Relaxation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Qiang Yang,et al.  A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[19]  Jason Weston,et al.  Curriculum learning , 2009, ICML '09.

[20]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[21]  Leonidas J. Guibas,et al.  Taskonomy: Disentangling Task Transfer Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[22]  Sanja Fidler,et al.  Gated-SCNN: Gated Shape CNNs for Semantic Segmentation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[23]  C. V. Jawahar,et al.  Cats and dogs , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[24]  Sebastian Ramos,et al.  The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Larry S. Davis,et al.  An Analysis of Pre-Training on Object Detection , 2019, ArXiv.

[26]  François Laviolette,et al.  Domain-Adversarial Training of Neural Networks , 2015, J. Mach. Learn. Res..

[27]  Chen Sun,et al.  Revisiting Unreasonable Effectiveness of Data in Deep Learning Era , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[28]  Manmohan Krishna Chandraker,et al.  Learning To Simulate , 2018, ICLR.

[29]  Kaiming He,et al.  Rethinking ImageNet Pre-Training , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[30]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[31]  Yoshua Bengio,et al.  How transferable are features in deep neural networks? , 2014, NIPS.

[32]  Sanja Fidler,et al.  Devil Is in the Edges: Learning Semantic Boundaries From Noisy Annotations , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Marc'Aurelio Ranzato,et al.  Hard Mixtures of Experts for Large Scale Weakly Supervised Vision , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Qiang Xu,et al.  nuScenes: A Multimodal Dataset for Autonomous Driving , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Andrew Zisserman,et al.  Automated Flower Classification over a Large Number of Classes , 2008, 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing.

[36]  Burr Settles,et al.  Active Learning Literature Survey , 2009 .

[37]  James M. Rehg,et al.  Learning to Generate Synthetic Data via Compositing , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Robinson Piramuthu,et al.  ModaNet: A Large-scale Street Fashion Dataset with Polygon Annotations , 2018, ACM Multimedia.

[39]  Sanja Fidler,et al.  Efficient Interactive Annotation of Segmentation Datasets with Polygon-RNN++ , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[40]  Yang Song,et al.  Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[41]  Kaiming He,et al.  Momentum Contrast for Unsupervised Visual Representation Learning , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Sanja Fidler,et al.  Meta-Sim: Learning to Generate Synthetic Datasets , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[43]  Apostol Natsev,et al.  YouTube-8M: A Large-Scale Video Classification Benchmark , 2016, ArXiv.

[44]  Luc Van Gool,et al.  The 2018 DAVIS Challenge on Video Object Segmentation , 2018, ArXiv.

[45]  Varun Jampani,et al.  Training Deep Networks with Synthetic Data: Bridging the Reality Gap by Domain Randomization , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[46]  Andreas Geiger,et al.  Are we ready for autonomous driving? The KITTI vision benchmark suite , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[47]  Kilian Q. Weinberger,et al.  Marginalizing stacked linear denoising autoencoders , 2015, J. Mach. Learn. Res..

[48]  Subhransu Maji,et al.  Task2Vec: Task Embedding for Meta-Learning , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[49]  Christopher Joseph Pal,et al.  Active Domain Randomization , 2019, CoRL.

[50]  Jonathan Krause,et al.  3D Object Representations for Fine-Grained Categorization , 2013, 2013 IEEE International Conference on Computer Vision Workshops.

[51]  Gabriela Csurka,et al.  Domain Adaptation for Visual Applications: A Comprehensive Survey , 2017, ArXiv.