Paper information - Classify and Generate Reciprocally: Simultaneous Positive-Unlabelled Learning and Conditional Generation with Extra Data

The scarcity of class-labeled data is a ubiquitous bottleneck in a wide range of machine learning problems. While abundant unlabeled data normally exist and provide a potential solution, it is extremely challenging to exploit them. In this paper, we address this problem by leveraging Positive-Unlabeled (PU) classification and conditional generation with extra unlabeled data simultaneously, both of which aim to make full use of agnostic unlabeled data to improve classification and generation performance. In particular, we present a novel training framework that jointly targets both PU classification and conditional generation when exposed to extra data, especially out-of-distribution unlabeled data, by exploring the interplay between them: 1) enhancing the performance of PU classifiers with the assistance of a novel Conditional Generative Adversarial Network (CGAN) that is robust to noisy labels, and 2) leveraging extra data with predicted labels from a PU classifier to help the generation. Our key contribution is a Classifier-Noise-Invariant Conditional GAN (CNI-CGAN) that can learn the clean data distribution from noisy labels predicted by a PU classifier. Theoretically, we prove the optimal condition of CNI-CGAN; experimentally, we conduct extensive evaluations on diverse datasets, verifying simultaneous improvements in both classification and generation.

Zhanxing Zhu | Bing Yu | He Wang | Ke Sun | Zhouchen Lin
