Semantic image retrieval through human subject segmentation and characterization

Video databases can be searched for visual content by searching over automatically extracted key frames rather than the complete video sequence. Many video materials used in the humanities and social sciences contain a preponderance of shots of people. In this paper, we describe our work on semantic image retrieval of person-rich scenes (key frames) for video databases and libraries. We use an approach called retrieval through segmentation. A key-frame image is first segmented into human subjects and background. We developed a specialized segmentation technique that uses both human flesh-tone detection and contour analysis. Experimental results show that this technique segments images effectively and with low time complexity. Once the image has been segmented, we can extract features from, or pose queries about, both the people and the background. We propose a retrieval framework based on the segmentation results and the extracted features of people and background.
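To make the flesh-tone step of the segmentation concrete, the following is a minimal sketch of a per-pixel skin classifier that splits a key frame into candidate "person" and "background" pixels. It is not the technique developed in this paper: the RGB thresholds are a commonly used rule-of-thumb heuristic chosen here as an assumption, the names `is_flesh_tone`, `segment_people`, and `flesh_fraction` are hypothetical, and the contour-analysis stage is omitted entirely.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def is_flesh_tone(pixel: Pixel) -> bool:
    """Rule-of-thumb RGB skin test (an assumed heuristic, not the paper's)."""
    r, g, b = pixel
    return (
        r > 95 and g > 40 and b > 20          # bright enough in every channel
        and max(r, g, b) - min(r, g, b) > 15  # not a gray pixel
        and abs(r - g) > 15                   # red clearly separated from green
        and r > g and r > b                   # red is the dominant channel
    )


def segment_people(image: List[List[Pixel]]) -> List[List[bool]]:
    """Return a mask: True where a pixel looks like skin, False otherwise."""
    return [[is_flesh_tone(p) for p in row] for row in image]


def flesh_fraction(mask: List[List[bool]]) -> float:
    """Fraction of pixels flagged as skin; a crude 'person-rich' feature."""
    total = sum(len(row) for row in mask)
    flagged = sum(1 for row in mask for v in row if v)
    return flagged / total if total else 0.0


if __name__ == "__main__":
    skin, sky = (220, 170, 140), (90, 140, 220)
    frame = [[skin, sky], [sky, sky]]  # toy 2x2 key frame
    mask = segment_people(frame)
    print(mask, flesh_fraction(mask))
```

In a fuller pipeline, a mask like this would only seed the segmentation; a contour-analysis pass would then be needed to recover the full outline of each person, and features such as the flagged fraction could be indexed separately for people and background.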
