Online Positive and Unlabeled Learning

Positive and Unlabeled learning (PU learning) aims to build a binary classifier when only positive and unlabeled data are available for training. However, existing PU learning methods all operate in batch mode and therefore cannot handle online learning scenarios in which data arrive sequentially. This paper proposes a novel PU learning algorithm for the online setting, which trains a classifier solely on positive and unlabeled data arriving in sequential order. Specifically, we adopt an unbiased estimate of the loss induced by the positive or unlabeled example arriving at each time step. We then show that, for each newly arriving datum, the model can be updated independently and incrementally by a gradient-based online learning method. Furthermore, we extend our method to the case where more than one example is received at each time step. Theoretically, we show that the proposed online PU learning method achieves low regret even though it receives only sequential positive and unlabeled data. Empirically, we conduct extensive experiments on both benchmark and real-world datasets, and the results clearly demonstrate the effectiveness of the proposed method.
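
To make the per-example update concrete, the following is a minimal sketch, not the paper's exact algorithm. It assumes a linear model, the logistic loss, a known class prior `prior`, a step size of `eta0 / sqrt(t)`, and projection onto an L2 ball of radius `radius`; all of these names and choices are illustrative assumptions. Under the standard unbiased PU risk, a positive example contributes `prior * (l(z, +1) - l(z, -1))`, which equals `-prior * z` for the logistic loss, and an unlabeled example contributes `l(z, -1)`.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def pu_gradient(w, x, is_positive, prior):
    """Gradient of the assumed per-example PU loss for a linear score z = w.x."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    if is_positive:
        # prior * (l(z,+1) - l(z,-1)) = -prior * z for the logistic loss.
        scale = -prior
    else:
        # l(z,-1) = log(1 + exp(z)); its derivative in z is sigmoid(z).
        scale = sigmoid(z)
    return [scale * xi for xi in x]


def project(w, radius):
    # Project onto the L2 ball so the linear positive term cannot diverge.
    norm = math.sqrt(sum(wi * wi for wi in w))
    if norm > radius:
        return [wi * radius / norm for wi in w]
    return w


def online_pu_step(w, batch, prior, t, eta0=0.5, radius=10.0):
    """One online gradient step at round t (t >= 1).

    `batch` is a list of (x, is_positive) pairs; a single arriving
    example is a batch of size one, and larger batches are averaged.
    """
    grad = [0.0] * len(w)
    for x, is_positive in batch:
        g = pu_gradient(w, x, is_positive, prior)
        grad = [a + b / len(batch) for a, b in zip(grad, g)]
    eta = eta0 / math.sqrt(t)
    return project([wi - eta * gi for wi, gi in zip(w, grad)], radius)
```

A stream would be processed by calling `online_pu_step(w, [(x, is_positive)], prior, t)` once per round with `t = 1, 2, ...`. Passing several pairs in `batch` averages their gradients, which is one simple way to handle more than one example per round.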
