Broadcast news parsing using visual cues: a robust face detection approach

Automatic content-based analysis and indexing of broadcast news recordings and digitized news archives is becoming an important tool for many interactive multimedia services, such as news summarization, browsing, retrieval and news-on-demand (NoD) applications. Existing approaches achieve high performance in such applications but rely heavily on textual cues such as closed-caption tokens and teletext transcripts. We present an efficient technique for temporal segmentation and parsing of news recordings based on visual cues. It can be employed either as a stand-alone application for broadcasts without closed captions or integrated with the audio and textual cues of existing systems. The technique performs robust face detection by means of color segmentation, skin color matching and shape processing, and is able to identify typical news instances such as anchor persons, reports and outdoor shots.
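To make the skin color matching and shape processing steps more concrete, the following is a minimal sketch and not the method of the paper. It assumes that skin pixels fall inside a fixed rectangle in the (Cb, Cr) chrominance plane and that a face candidate is a reasonably filled region that is somewhat taller than wide. The threshold constants `CB_RANGE` and `CR_RANGE`, the function names, and the `min_fill` and `aspect_range` parameters are all illustrative assumptions.

```python
def rgb_to_cbcr(r, g, b):
    """Convert 8-bit RGB to (Cb, Cr) chrominance (BT.601, full range)."""
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr


# Assumed chrominance box for skin tones; illustrative values only,
# not thresholds taken from the paper.
CB_RANGE = (77, 127)
CR_RANGE = (133, 173)


def is_skin(pixel):
    """Return True if an (r, g, b) pixel falls inside the assumed skin box."""
    cb, cr = rgb_to_cbcr(*pixel)
    return (CB_RANGE[0] <= cb <= CB_RANGE[1]
            and CR_RANGE[0] <= cr <= CR_RANGE[1])


def skin_mask(image):
    """Map an image (list of rows of (r, g, b) tuples) to a boolean mask."""
    return [[is_skin(p) for p in row] for row in image]


def bounding_box(mask):
    """Return (top, left, bottom, right) of all True cells, or None."""
    coords = [(y, x) for y, row in enumerate(mask)
              for x, v in enumerate(row) if v]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return min(ys), min(xs), max(ys), max(xs)


def looks_like_face(mask, min_fill=0.5, aspect_range=(1.0, 2.0)):
    """Crude shape test: the skin region must fill most of its bounding
    box and have a height/width ratio inside aspect_range."""
    box = bounding_box(mask)
    if box is None:
        return False
    top, left, bottom, right = box
    height = bottom - top + 1
    width = right - left + 1
    area = sum(1 for row in mask for v in row if v)
    fill = area / (height * width)
    aspect = height / width
    return fill >= min_fill and aspect_range[0] <= aspect <= aspect_range[1]


# Usage: a 3x2 skin-toned patch on a blue background.
SKIN, BLUE = (224, 172, 138), (0, 0, 255)
frame = [[BLUE] * 4 for _ in range(4)]
for y in range(3):
    for x in (1, 2):
        frame[y][x] = SKIN
print(looks_like_face(skin_mask(frame)))
```

A fixed chrominance box is the simplest possible skin model; a practical system would more likely fit the skin distribution from training data and treat each connected region separately rather than taking one bounding box over the whole mask.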
