Predictive Adversarial Learning from Positive and Unlabeled Data

This paper studies learning from positive and unlabeled examples, known as PU learning. It proposes a novel PU learning method called Predictive Adversarial Networks (PAN), based on Generative Adversarial Networks (GAN). A GAN trains a generator to produce data (e.g., images) that fool a discriminator, which tries to determine whether the generated data belong to a (positive) training class. PU learning can be cast as trying to identify (rather than generate) likely positive instances from the unlabeled set in order to fool a discriminator that determines whether those identified instances are indeed positive. However, directly applying GAN is problematic because GAN focuses only on the positive data, so the resulting PU learning method has high precision but low recall. To address this, we propose a new objective function based on the KL-divergence. Evaluation on both image and text data shows that PAN outperforms state-of-the-art PU learning methods as well as a direct adaptation of GAN to PU learning.
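
The adversarial-selection idea described above can be illustrated with a short sketch. The following PyTorch code is a minimal illustration under stated assumptions, not the paper's implementation: the network shapes, the soft-selection weighting, and the exact Bernoulli KL losses (which reduce to the usual cross-entropy for hard 0/1 targets) are all choices made here for exposition.

```python
# Minimal sketch of the adversarial PU-selection idea (illustrative only).
# The networks, the soft-selection weighting, and the KL-based losses below
# are assumptions for exposition, not the paper's exact formulation.
import torch
import torch.nn as nn

def mlp(in_dim):
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, 1))

def bernoulli_kl(t, p, eps=1e-6):
    """Element-wise KL( Bern(t) || Bern(p) ); for hard targets t in {0, 1}
    this reduces to the cross-entropy -t*log(p) - (1-t)*log(1-p)."""
    t = t.clamp(eps, 1 - eps)
    p = p.clamp(eps, 1 - eps)
    return t * torch.log(t / p) + (1 - t) * torch.log((1 - t) / (1 - p))

in_dim = 20
classifier = mlp(in_dim)     # C: scores unlabeled instances as likely positive
discriminator = mlp(in_dim)  # D: judges whether an instance is truly positive
opt_c = torch.optim.Adam(classifier.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-3)

# Toy data: x_pos are the labeled positives, x_unl the unlabeled mixture.
x_pos = torch.randn(32, in_dim) + 1.0
x_unl = torch.randn(64, in_dim)

for step in range(200):
    # D step: accept labeled positives; reject unlabeled instances, but
    # down-weight those that C confidently selects as positive.
    with torch.no_grad():
        w = torch.sigmoid(classifier(x_unl))        # C's selection scores
    d_pos = torch.sigmoid(discriminator(x_pos))
    d_unl = torch.sigmoid(discriminator(x_unl))
    loss_d = bernoulli_kl(torch.ones_like(d_pos), d_pos).mean() + \
             ((1 - w) * bernoulli_kl(torch.zeros_like(d_unl), d_unl)).mean()
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # C step: concentrate selection mass on unlabeled instances that D
    # accepts, i.e., fool D with identified (not generated) positives.
    w = torch.sigmoid(classifier(x_unl))
    with torch.no_grad():
        d_unl = torch.sigmoid(discriminator(x_unl))
    cost = bernoulli_kl(torch.ones_like(d_unl), d_unl)  # per-instance rejection cost
    loss_c = (w * cost).sum() / (w.sum() + 1e-6)        # selection-weighted average
    opt_c.zero_grad(); loss_c.backward(); opt_c.step()
```

In this reading, the two players alternate until C's scores stabilize, and C itself can then serve as the final PU classifier on unseen data; the normalization of C's loss by its total selection mass is one simple way to keep the trivial "select nothing" solution from being optimal.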
