Semi-Supervised Learning with Partially Labeled Examples

Traditionally, the machine learning community has focused on supervised learning, in which the learner is trained on fully labeled examples that include both input features and the corresponding output labels. To reduce the cost of collecting fully labeled examples, semi-supervised learning typically combines a large number of unlabeled examples with a relatively small number of fully labeled examples to build better classifiers. Although many semi-supervised learning algorithms can take advantage of unlabeled examples, they still demand significant effort in designing good models, features, kernels, and similarity functions. In this dissertation, we focus on semi-supervised learning with partially labeled examples. Partially labeled data can be viewed as a trade-off between fully labeled data and unlabeled data: it provides more discriminative information than unlabeled data and requires less human effort to collect than fully labeled data. In our setting, the learning method is given a large number of partially labeled examples, usually augmented with a relatively small set of fully labeled examples. Our main goal is to integrate partially labeled examples into the conventional learning framework, that is, to build a more accurate classifier. The dissertation addresses four different semi-supervised learning problems in the presence of partially labeled examples. In addition, we summarize general principles for semi-supervised learning with partially labeled examples.
