Hybrid Rule-Based/Neural Approach for Segmentation of MPEG Compressed Video

An approach to segmenting video into shots and sub-shots that operates directly in the MPEG compressed domain is presented. It relies only on macroblock coding modes and motion vectors in P and B frames. The system follows a two-pass scheme and has a hybrid rule-based/neural structure. A rough scan over the P frames locates potential shot boundaries, and a precise scan over the B frames in the neighborhood of each candidate then refines the result. The “simpler” boundaries are recognized by the rule-based module, while decisions on the “complex” ones are refined by the neural module. The neural module is also used to distinguish dissolves from object and camera motion and to divide shots further into sub-shots. Experiments demonstrate high speed and classification accuracy without computationally expensive calculations or the need for many thresholds.
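The two-pass idea can be illustrated with a minimal sketch. It is not the authors' implementation: the `FrameStats` record, the function names, and the `intra_ratio` threshold are hypothetical, and the rules are an assumed simplification. The sketch supposes that a P frame dominated by intra-coded macroblocks signals a candidate boundary, and that among the B frames preceding it the first one dominated by backward prediction begins the new shot. Bidirectionally predicted macroblocks, motion vectors, the neural module, dissolves, and sub-shots are all left out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FrameStats:
    """Per-frame macroblock counts (hypothetical record, not from the paper)."""
    index: int      # display-order frame number
    kind: str       # "I", "P" or "B"
    intra: int      # intra-coded macroblocks
    forward: int    # forward-predicted macroblocks
    backward: int   # backward-predicted macroblocks (B frames only)

    @property
    def total(self) -> int:
        return self.intra + self.forward + self.backward


def rough_scan(frames: List[FrameStats], intra_ratio: float = 0.6) -> List[int]:
    """Pass 1: flag P frames in which most macroblocks are intra coded.

    The 0.6 threshold is an assumed illustrative value.
    """
    return [
        f.index
        for f in frames
        if f.kind == "P" and f.total > 0 and f.intra / f.total >= intra_ratio
    ]


def refine(frames: List[FrameStats], p_index: int) -> int:
    """Pass 2: inspect the B frames that precede the flagged P frame.

    B frames before the cut can still be predicted from the previous
    anchor (forward); B frames after it can only use the flagged P frame
    (backward). The first backward-dominated B frame is taken as the
    start of the new shot; otherwise the P frame itself is.
    """
    by_index: Dict[int, FrameStats] = {f.index: f for f in frames}
    neighborhood: List[FrameStats] = []
    i = p_index - 1
    while i in by_index and by_index[i].kind == "B":
        neighborhood.append(by_index[i])
        i -= 1
    for b in reversed(neighborhood):  # earliest B frame first
        if b.backward > b.forward:
            return b.index
    return p_index


def detect_cuts(frames: List[FrameStats]) -> List[int]:
    """Run the rough scan, then refine every candidate boundary."""
    return [refine(frames, p) for p in rough_scan(frames)]


# Toy sequence I B B P B B P with an abrupt cut before frame 5.
example = [
    FrameStats(0, "I", 100, 0, 0),
    FrameStats(1, "B", 2, 60, 38),
    FrameStats(2, "B", 3, 55, 42),
    FrameStats(3, "P", 5, 95, 0),
    FrameStats(4, "B", 4, 90, 6),
    FrameStats(5, "B", 5, 3, 92),
    FrameStats(6, "P", 80, 20, 0),
]
cuts = detect_cuts(example)
```

In this toy sequence only frame 6 passes the rough scan, and the refinement places the start of the new shot at frame 5. A hybrid system of the kind described would hand ambiguous candidates to a trained classifier rather than rely on a single fixed rule.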
