Mining noisy tagging from multi-label space

In this paper we study the problem of mining noisy tagging. Most of the existing discriminative classification methods to this problem only consider one tag at a time as the classification target, and completely ignore the rest of the given tags at the same time. In this paper we argue that all the given multiple tags can be utilized simultaneously as an additional feature and the information contained in the multi-label space can be taken advantage of to improve the performance of the classification. We first propose a novel distance measure to compute the distance between instances in the multi-label space. Then we propose several novel methods to incorporate the information of the multi-label space into the discriminative classification methods in one view learning or in two views learning to solve a general multi-label classification problem and to mitigate the influence of the noise in the classification. We apply the proposed solutions to the problem with a more specific context - noisy image annotation, and evaluate the proposed methods on a standard dataset from the related literature. Experiments show that they are superior to the peer methods in the existing literature on solving the problem of mining noisy tagging.