The Semi-Supervised iNaturalist-Aves Challenge at FGVC7 Workshop

This document describes the details and the motivation behind a new dataset we collected for the semisupervised recognition challenge [3] at the FGVC7 workshop at CVPR 2020. The dataset contains 1000 species of birds sampled from the iNat-2018 dataset for a total of nearly 150k images. From this collection, we sample a subset of classes and their labels, while adding the images from the remaining classes to the unlabeled set of images. The presence of out-of-domain data (novel classes), high class-imbalance, and fine-grained similarity between classes poses significant challenges for existing semi-supervised recognition techniques in the literature. The dataset is available here: https://github.com/ cvl-umass/semi-inat-2020

[1]  Colin Raffel,et al.  Realistic Evaluation of Deep Semi-Supervised Learning Algorithms , 2018, NeurIPS.

[2]  Yang Song,et al.  The iNaturalist Species Classification and Detection Dataset , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[3]  Subhransu Maji,et al.  A Realistic Evaluation of Semi-Supervised Learning for Fine-Grained Classification , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Quoc V. Le,et al.  Unsupervised Data Augmentation for Consistency Training , 2019, NeurIPS.

[5]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[6]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Quoc V. Le,et al.  Rethinking Pre-training and Self-training , 2020, NeurIPS.

[8]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[9]  David Berthelot,et al.  ReMixMatch: Semi-Supervised Learning with Distribution Alignment and Augmentation Anchoring , 2019, ArXiv.

[10]  Subhransu Maji,et al.  Adapting Models to Signal Degradation using Distillation , 2017, BMVC.

[11]  Zhongji Liu,et al.  Semi-Supervised Recognition under a Noisy and Fine-grained Dataset , 2020, ArXiv.

[12]  David Berthelot,et al.  MixMatch: A Holistic Approach to Semi-Supervised Learning , 2019, NeurIPS.

[13]  Quoc V. Le,et al.  Self-Training With Noisy Student Improves ImageNet Classification , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Geoffrey E. Hinton,et al.  Big Self-Supervised Models are Strong Semi-Supervised Learners , 2020, NeurIPS.

[15]  Honglak Lee,et al.  An Analysis of Single-Layer Networks in Unsupervised Feature Learning , 2011, AISTATS.

[16]  David Berthelot,et al.  FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence , 2020, NeurIPS.

[17]  Shin Ishii,et al.  Virtual Adversarial Training: A Regularization Method for Supervised and Semi-Supervised Learning , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18]  Yanjun Qi,et al.  Curriculum Labeling: Self-paced Pseudo-Labeling for Semi-Supervised Learning , 2020, ArXiv.

[19]  Subhransu Maji,et al.  Reasoning About Fine-Grained Attribute Phrases Using Reference Games , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[20]  Andrew Y. Ng,et al.  Reading Digits in Natural Images with Unsupervised Feature Learning , 2011 .