This paper introduces the minimum entropy regularizer for learning from partial labels. This learning problem encompasses the semi-supervised setting, where a decision rule is to be learned from labeled and unlabeled examples. The minimum entropy regularizer applies to diagnosis models, i.e., models of the posterior probabilities of classes. It is shown to include other approaches to the semi-supervised problem as particular or limiting cases. A series of experiments illustrates that the proposed criterion provides solutions that take advantage of unlabeled examples when the latter convey information. Even when the data are sampled from the distribution class spanned by a generative model, the proposed approach improves on the estimated generative model when the number of features is of the order of the sample size. Performance clearly favors minimum entropy when the generative model is slightly misspecified. Finally, the robustness of the learning scheme is demonstrated: in situations where unlabeled examples do not convey information, minimum entropy returns a solution that discards unlabeled examples and performs as well as supervised learning.
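The regularizer described above augments the supervised loss with the Shannon entropy of the model's posterior predictions on the unlabeled points, so that confident (low-entropy) predictions are rewarded and the decision boundary is pushed away from unlabeled data. The following is a minimal sketch for binary logistic regression; the function names, hyperparameters, and gradient-descent setup are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_min_entropy(Xl, yl, Xu, lam=0.5, lr=0.1, steps=2000):
    """Binary logistic regression with a minimum-entropy regularizer.

    Minimizes: labeled cross-entropy + lam * mean entropy H(p) of the
    predicted posteriors p on the unlabeled points Xu.
    (lam, lr, steps are illustrative hyperparameters.)
    """
    w = np.zeros(Xl.shape[1])
    for _ in range(steps):
        # Labeled part: standard cross-entropy gradient (p - y) x.
        pl = sigmoid(Xl @ w)
        g = Xl.T @ (pl - yl) / len(yl)
        # Unlabeled part: gradient of H(p) = -p log p - (1-p) log(1-p).
        # With z = w.x, dH/dz = -z * p * (1 - p).
        zu = Xu @ w
        pu = sigmoid(zu)
        g += lam * (Xu.T @ (-zu * pu * (1.0 - pu))) / len(zu)
        w -= lr * g
    return w
```

In this toy setup the entropy term acts only through the unlabeled points: when they carry no class-structure information its gradient is near zero, consistent with the robustness behavior described in the abstract.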