Learning regional semantic concepts from incomplete annotation

For multimedia retrieval to be effective, the semantic gap between low-level visual features and high-level concepts needs to be bridged. Statistical learning techniques provide a robust framework for learning representations of semantic concepts from visual features. The bottleneck is the need to annotate a large number of training samples to construct robust models. We present a novel approach in which annotations may be entered at a coarser spatial granularity while the concept is still learnt at a finer granularity. This can speed up annotation significantly and provide bootstrapping. We show that representations of concepts occurring at the regional level can be learnt from several images whose annotations are provided only at the global level. The resulting ambiguity about which region carries the concept is handled by the multiple instance learning paradigm. We demonstrate this on the TREC 2001 corpus for the concept "sky".
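To make the multiple instance setting concrete, the following is a minimal sketch, not the method of this paper. It assumes each image is a "bag" of region feature vectors with one global label, and it scores candidate concept points with a noisy-OR diverse density measure in the spirit of the multiple-instance learning literature. The two-dimensional features, the toy values, the `scale` parameter, and all function names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical sketch: noisy-OR diverse density over bags of region features.
# Each image is a bag of regions; only the bag carries a label (sky / no sky).

def instance_prob(x, t, scale=1.0):
    """Probability that region x matches concept point t (Gaussian-like falloff)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, t))
    return math.exp(-scale * d2)

def bag_prob_positive(bag, t, scale=1.0):
    """Noisy-OR: the bag is positive if at least one region matches t."""
    none_match = 1.0
    for x in bag:
        none_match *= 1.0 - instance_prob(x, t, scale)
    return 1.0 - none_match

def diverse_density(t, positive_bags, negative_bags, scale=1.0):
    """High when t is close to some region in every positive bag
    and far from all regions in every negative bag."""
    dd = 1.0
    for bag in positive_bags:
        dd *= bag_prob_positive(bag, t, scale)
    for bag in negative_bags:
        dd *= 1.0 - bag_prob_positive(bag, t, scale)
    return dd

def best_concept_point(positive_bags, negative_bags, scale=1.0):
    """Search only over regions of positive bags (a common simplification)."""
    candidates = [x for bag in positive_bags for x in bag]
    return max(
        candidates,
        key=lambda t: diverse_density(t, positive_bags, negative_bags, scale),
    )

# Toy data (assumed): features are (blueness, vertical position) in [0, 1].
# Images labelled "sky" at the global level; which region is sky is unknown.
positive_bags = [
    [(0.90, 0.90), (0.20, 0.10)],
    [(0.85, 0.95), (0.50, 0.20)],
    [(0.92, 0.88), (0.10, 0.50)],
]
# Images labelled as not containing sky.
negative_bags = [
    [(0.20, 0.10), (0.50, 0.20)],
    [(0.10, 0.50), (0.30, 0.30)],
]

best = best_concept_point(positive_bags, negative_bags)
print("estimated regional concept point:", best)
```

In this toy setup the distractor regions also occur in negative images, so their diverse density collapses to zero and the selected point falls among the sky-like regions, even though no region was ever labelled individually.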
