Transductive video annotation via local learnable kernel classifier

One crucial problem in transductive video annotation is how to estimate the label of a sample from its neighboring samples. Existing methods, such as the graph-based Gaussian random field, consider only pairwise similarity and propagate labels on that basis. In this paper, we propose a new method from the perspective of local learning, which formulates the prediction of a sample's label from its neighbors as a learning problem. Our contributions are twofold: (1) we propose a new transductive video annotation method based on a local kernel classifier; (2) we propose a measure, local learnable, of how well a sample can be learned from its neighbors, and we incorporate this measure into the optimization objective. Experiments on the TRECVID 2005 dataset show that the proposed method is effective and that the local learning perspective is promising for video annotation.
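
To make the local learning idea concrete, the following is a minimal sketch of predicting a sample's label with a kernel classifier trained only on that sample's neighbors. It is an illustration under stated assumptions, not the formulation used in the paper: the RBF kernel, the ridge-regularized least-squares fit, the restriction to labeled neighbors, and the names `rbf_kernel`, `local_kernel_predict`, `annotate`, `gamma`, `lam`, and `k` are all hypothetical choices. The local learnable measure and the joint optimization over unlabeled samples are not modeled here.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix between the rows of A and the rows of B (assumed kernel)."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def local_kernel_predict(x, X_nb, y_nb, gamma=1.0, lam=0.1):
    """Fit a kernel ridge model on the neighbors only, then score x with it."""
    K = rbf_kernel(X_nb, X_nb, gamma)
    # Regularized least-squares coefficients for the local model.
    alpha = np.linalg.solve(K + lam * np.eye(len(X_nb)), y_nb)
    k_x = rbf_kernel(x[None, :], X_nb, gamma)[0]
    return float(k_x @ alpha)

def annotate(X, y, labeled, k=3, gamma=1.0, lam=0.1):
    """Score each unlabeled sample from its k nearest labeled neighbors.

    Simplifying assumption: neighbors are drawn from labeled samples only,
    so no labels are propagated between unlabeled samples.
    """
    scores = y.astype(float).copy()
    labeled_idx = np.flatnonzero(labeled)
    for i in np.flatnonzero(~labeled):
        dist = np.linalg.norm(X[labeled_idx] - X[i], axis=1)
        nb = labeled_idx[np.argsort(dist)[:k]]
        scores[i] = local_kernel_predict(X[i], X[nb], scores[nb], gamma, lam)
    return scores
```

In this sketch, a positive score suggests the concept is present and a negative score suggests it is absent; unlike pairwise-similarity propagation, each prediction comes from a small model fitted to the neighborhood.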