Tracking faces of arbitrary views for video annotation

We propose an omni-face tracking system for video annotation, designed to find faces from arbitrary views in complex scenes. The face detector first locates potential faces in the input by performing skin-tone detection. The subsequent processing consists of two largely independent components, the frontal face module and the side-view face module, responsible for finding frontal-view and side-view faces, respectively. The frontal face module uses a region-based approach in which regions of skin-tone pixels are analyzed for gross oval shape and the presence of facial features. In contrast, the side-view face module follows an edge-based approach, looking for curves similar to a side-view profile. To extract face trajectories, the system exploits the temporal continuity between consecutive frames within each video shot, which speeds up the tracking process. The main contribution of this work is the ability to find faces irrespective of their pose, whereas contemporary systems deal with frontal-view faces only. Information regarding human faces is encoded in XML format for semantic video content representation. The effectiveness of face information for video annotation is demonstrated in a TV program classification system that categorizes the input video clip into predefined types. The classification accuracy is shown to improve markedly when face information is used.
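As a rough illustration of the first stages described above (skin-tone detection, a gross shape check on the skin region, and XML encoding of the result), the following minimal Python sketch may help. It is not the authors' implementation: the RGB threshold rule, the height-to-width ratio limits, and the XML element and attribute names (`face`, `box`, `frame`, `view`) are all assumptions chosen for illustration, and the shape test is only a crude stand-in for the oval analysis and facial-feature search of the frontal face module.

```python
import xml.etree.ElementTree as ET


def is_skin(r, g, b):
    """Illustrative RGB skin-tone rule (an assumption, not the paper's classifier)."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)


def skin_bounding_box(image):
    """Return (left, top, right, bottom) enclosing all skin pixels, or None."""
    coords = [(x, y)
              for y, row in enumerate(image)
              for x, (r, g, b) in enumerate(row)
              if is_skin(r, g, b)]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def looks_oval(box, min_ratio=1.0, max_ratio=2.0):
    """Crude shape test: height/width ratio within hypothetical face-like limits."""
    left, top, right, bottom = box
    width = right - left + 1
    height = bottom - top + 1
    return min_ratio <= height / width <= max_ratio


def face_to_xml(frame_index, box, view):
    """Encode one detection as XML; the schema here is hypothetical."""
    left, top, right, bottom = box
    face = ET.Element("face", frame=str(frame_index), view=view)
    ET.SubElement(face, "box", left=str(left), top=str(top),
                  right=str(right), bottom=str(bottom))
    return ET.tostring(face, encoding="unicode")


# Tiny synthetic frame: 6 rows x 4 columns, with a 2 x 3 skin-colored patch.
SKIN = (200, 120, 100)
BACKGROUND = (30, 60, 200)
image = [[SKIN if 1 <= y <= 3 and 1 <= x <= 2 else BACKGROUND
          for x in range(4)]
         for y in range(6)]

box = skin_bounding_box(image)
record = face_to_xml(0, box, "frontal") if box and looks_oval(box) else None
print(record)
```

A real system would segment connected skin regions rather than take a single bounding box, and would reuse the box from the previous frame to narrow the search in the next one, in the spirit of the temporal continuity mentioned above.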
