Extending Contrastive Learning to Unsupervised Coreset Selection

Self-supervised contrastive learning offers a means of learning informative features from a pool of unlabeled data. In this paper, we investigate another use of this paradigm and propose a coreset selection method that requires no labels at all. Among self-supervised methods, contrastive learning has consistently delivered the strongest performance since its introduction, which led us to adopt two leading frameworks: the simple framework for contrastive learning of visual representations (SimCLR) and the momentum contrast (MoCo) framework. During contrastive training, we calculate the cosine similarity of each example at every epoch and accumulate these values over the entire training run to obtain its coreset score, on the assumption that a sample with low accumulated similarity is likely to belong to the coreset. Compared with existing coreset selection methods that require labels, our approach eliminates the cost of human annotation. With a subset size of 30%, the proposed unsupervised method improved accuracy over a randomly selected subset by 1.25% on CIFAR10, 0.82% on SVHN, and 0.19% on QMNIST. Furthermore, our results are comparable to those of existing supervised coreset selection methods: at the same 30% subset size, the accuracy difference between our method and a supervised baseline (forgetting events) was 0.81% on CIFAR10, −2.08% on SVHN (i.e., our method outperformed the supervised one), and 0.01% on QMNIST. In addition, the proposed approach remains robust even when the coreset selection model and the target model are not identical (e.g., using ResNet18 as the selection model and ResNet101 as the target model). Finally, a coreset cross test provides more concrete evidence that the selected examples are highly informative by exposing the performance gap between coreset and non-coreset samples: with a 30% subset, (training on the coreset, testing on the non-coreset) versus (training on the non-coreset, testing on the coreset) yielded accuracy pairs of (94.27%, 67.39%) on CIFAR10, (98.24%, 83.30%) on SVHN, and (99.89%, 93.07%) on QMNIST.
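To make the scoring procedure concrete, the following is a minimal sketch, assuming (as in SimCLR/MoCo-style training) that each example's per-epoch similarity is the cosine similarity between embeddings of two augmented views of that example. The names `encoder` and `augment` are hypothetical stand-ins for the contrastive model and its augmentation pipeline; they are not from the paper's code.

```python
# Hypothetical sketch of accumulated-similarity coreset scoring.
import torch
import torch.nn.functional as F

def accumulate_scores(encoder, loader, scores, device="cuda"):
    """Add one epoch's per-example cosine similarities into `scores`."""
    encoder.eval()
    with torch.no_grad():
        for images, indices in loader:   # loader must also yield dataset indices
            # Two independently augmented views per example (assumption).
            z1 = F.normalize(encoder(augment(images).to(device)), dim=1)
            z2 = F.normalize(encoder(augment(images).to(device)), dim=1)
            sim = (z1 * z2).sum(dim=1)   # cosine similarity per example
            scores[indices] += sim.cpu() # accumulate across epochs
    return scores

def select_coreset(scores, fraction=0.30):
    """Return indices of the lowest-scoring (least similar) examples."""
    k = int(fraction * scores.numel())
    return torch.argsort(scores)[:k]     # ascending: low similarity first
```

In this sketch, `scores` would be initialized to `torch.zeros(len(dataset))` and `accumulate_scores` called once per training epoch; calling `select_coreset(scores, 0.30)` after training corresponds to the 30% subset setting used in the experiments.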

[1] Andreas Krause, et al. Training Gaussian Mixture Models at Scale via Coresets, 2017, J. Mach. Learn. Res.

[2] Rishabh K. Iyer, et al. Submodularity in Data Subset Selection and Active Learning, 2015, ICML.

[3] Kaiming He, et al. Momentum Contrast for Unsupervised Visual Representation Learning, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[4] Jeff A. Bilmes, et al. Submodular subset selection for large-scale speech training data, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[5] Ivor W. Tsang, et al. Core Vector Machines: Fast SVM Training on Very Large Data Sets, 2005, J. Mach. Learn. Res.

[6] Trevor Campbell, et al. Coresets for Scalable Bayesian Logistic Regression, 2016, NIPS.

[7] Yoshua Bengio, et al. An Empirical Study of Example Forgetting during Deep Neural Network Learning, 2018, ICLR.

[8] Paolo Favaro, et al. Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles, 2016, ECCV.

[9] Bin Ma, et al. Unsupervised data selection and word-morph mixed language model for Tamil low-resource keyword search, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[10] Radhika M. Pai, et al. Performance Analysis of Semantic Segmentation Algorithms for Finely Annotated New UAV Aerial Video Dataset (ManipalUAVid), 2019, IEEE Access.

[11] Kilian Q. Weinberger, et al. Densely Connected Convolutional Networks, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12] Trevor Campbell, et al. Bayesian Coreset Construction via Greedy Iterative Geodesic Ascent, 2018, ICML.

[13] Paolo Favaro, et al. Representation Learning by Learning to Count, 2017 IEEE International Conference on Computer Vision (ICCV).

[14] Yang You, et al. Large Batch Training of Convolutional Networks, 2017, arXiv:1708.03888.

[15] Silvio Savarese, et al. Active Learning for Convolutional Neural Networks: A Core-Set Approach, 2017, ICLR.

[16] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.

[17] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.

[19] Emiliano Ricciardi, et al. Unsupervised Data Selection for Supervised Learning, 2018, arXiv.

[20] Sariel Har-Peled, et al. On coresets for k-means and k-median clustering, 2004, STOC '04.

[21] Baharan Mirzasoleiman, et al. Selection Via Proxy: Efficient Data Selection For Deep Learning, 2019, ICLR.

[22] Kaiming He, et al. Improved Baselines with Momentum Contrastive Learning, 2020, arXiv.

[23] Zhuowen Tu, et al. Aggregated Residual Transformations for Deep Neural Networks, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24] Alexei A. Efros, et al. Unsupervised Visual Representation Learning by Context Prediction, 2015 IEEE International Conference on Computer Vision (ICCV).

[25] Shin-Jye Lee, et al. Image Classification Based on the Boost Convolutional Neural Network, 2018, IEEE Access.

[26] Rishabh K. Iyer, et al. Learning Mixtures of Submodular Functions for Image Collection Summarization, 2014, NIPS.

[27] Nikos Komodakis, et al. Unsupervised Representation Learning by Predicting Image Rotations, 2018, ICLR.

[28] Laurens van der Maaten, et al. Self-Supervised Learning of Pretext-Invariant Representations, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[29] Geoffrey E. Hinton, et al. A Simple Framework for Contrastive Learning of Visual Representations, 2020, ICML.

[30] Patrick J. Grother, et al. NIST Special Database 19: Handprinted Forms and Characters Database, 1995.

[31] Carlo Fischione, et al. Learning and Data Selection in Big Datasets, 2019, ICML.

[32] Léon Bottou, et al. Cold Case: The Lost MNIST Digits, 2019, NeurIPS.

[33] Nikos Komodakis, et al. Wide Residual Networks, 2016, BMVC.

[34] Andrew Y. Ng, et al. Reading Digits in Natural Images with Unsupervised Feature Learning, 2011.