Multi-view learning with dependent views

Multi-view algorithms, such as co-training and co-EM, utilize unlabeled data when the available attributes can be split into independent and compatible subsets. Experiments have shown that multi-view learning is sometimes beneficial even for problems in which the independence assumption is not satisfied. In practice, unfortunately, it is not possible to measure the dependency between two attribute sets directly; hence, there is no criterion that allows one to decide whether multi-view learning is applicable to a given problem. We conduct experiments on various text classification problems and investigate the effectiveness of the co-trained SVM and the co-EM SVM under a range of conditions, including violations of the independence assumption. We identify the error correlation coefficient of the initial classifiers as an informative indicator of the expected benefit of multi-view learning.
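The key diagnostic here is the error correlation coefficient of the two initial view classifiers: the Pearson correlation of their per-example error indicators on held-out labeled data. Low correlation suggests the views err nearly independently (favorable for co-training/co-EM); high correlation suggests redundant views. Below is a minimal sketch of how such a coefficient might be computed, not the paper's implementation; the synthetic two-view split and all names are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

def error_correlation(clf1, clf2, X1, X2, y):
    """Pearson correlation of the two classifiers' error indicators on (X1, X2, y)."""
    e1 = (clf1.predict(X1) != y).astype(float)  # 1 where the view-1 classifier errs
    e2 = (clf2.predict(X2) != y).astype(float)  # 1 where the view-2 classifier errs
    if e1.std() == 0 or e2.std() == 0:
        return 0.0  # degenerate: one classifier errs never (or always); correlation undefined
    return float(np.corrcoef(e1, e2)[0, 1])

# Synthetic stand-in for a two-view text problem: split the features in half.
X, y = make_classification(n_samples=600, n_features=40, n_informative=10, random_state=0)
X1, X2 = X[:, :20], X[:, 20:]  # the two attribute subsets ("views")
idx_tr, idx_te = train_test_split(np.arange(len(y)), test_size=0.5, random_state=0)

clf1 = LinearSVC().fit(X1[idx_tr], y[idx_tr])  # initial classifier on view 1
clf2 = LinearSVC().fit(X2[idx_tr], y[idx_tr])  # initial classifier on view 2

rho = error_correlation(clf1, clf2, X1[idx_te], X2[idx_te], y[idx_te])
print(f"error correlation coefficient: {rho:.3f}")
```

Under this reading, a coefficient near zero would signal that multi-view learning is likely to help, while a coefficient near one would signal little expected benefit.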
