Temporally Consistent Gaussian Random Field for Video Semantic Analysis

As a major family of semi-supervised learning methods, graph-based semi-supervised learning has recently attracted considerable interest in the machine learning community as well as in many application areas. However, when applied to video semantic annotation, these methods consider only the relations among samples in the feature space and neglect an intrinsic property of video data: temporally adjacent video segments (e.g., shots) usually share similar semantic concepts. In this paper, we incorporate this temporal consistency property of video data into graph-based semi-supervised learning and propose a novel method, named temporally consistent Gaussian random field (TCGRF), to improve annotation results. Experiments conducted on the TRECVID data set demonstrate its effectiveness.
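For concreteness, one plausible way to realize this idea (a sketch under stated assumptions, not necessarily the paper's exact objective) is to augment the standard Gaussian random field energy over the feature-space affinity graph with a temporal smoothness term; the trade-off weight $\lambda$ and the temporal pair set $\mathcal{T}$ below are illustrative assumptions:

\[
E(f) \;=\; \frac{1}{2} \sum_{i,j} w_{ij} \, (f_i - f_j)^2 \;+\; \frac{\lambda}{2} \sum_{(s,t) \in \mathcal{T}} (f_s - f_t)^2 ,
\]

where $w_{ij}$ is the feature-space affinity between shots $i$ and $j$, $\mathcal{T}$ is the set of temporally adjacent shot pairs, $\lambda > 0$ balances feature-space against temporal smoothness, and $f$ is clamped to the given labels on the labeled shots. Because the temporal term simply adds edges to the graph, the minimizer remains harmonic with respect to the combined graph Laplacian and admits the same closed-form solution as the standard Gaussian random field.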