Colour-based object recognition for video annotation

We propose a colour-based object recognition method for video annotation. The semantic gap between image measurements and symbolic labelling is bridged by assuming the existence of objects whose appearance can be associated with the desired image categories (labels). Objects are represented by a colour-based descriptor, the multimodal neighbourhood signature (MNS). We propose an automatic method for learning the object representation from multiple images. A new MNS matching strategy is also introduced, which uses a K-class classifier operating on a binary feature vector computed from the object's MNS. In the experimental section, the proposed method is evaluated on the annotation of sport video keyframes, using raw broadcast video material provided by the BBC. Despite the poor quality of some of the images and a wide range of appearance variations (occlusion, illumination and viewpoint change, camera noise and cluttered background, to name a few), an average of 85% correct object recognition and sport classification was achieved for a set of four selected objects/sports.