Time-space acoustical feature for fast video copy detection

We propose a new time-space acoustical feature for fast video copy detection, that is, for searching a large number of video streams for a given video segment in order to find illegal copies on Internet video sites and similar services. We extract a small number of feature vectors from acoustically peculiar points, which are the local maxima and minima in the time sequence of the acoustical power envelope of the video data. Because the volume of a video stream differs across recording environments, we use the relative values between feature points rather than absolute values; we call the result the time-space acoustical feature. The features can be extracted quickly compared with representative features such as MFCC, and matching takes little processing time because both the number of feature vectors and the dimension of each vector are small. The accuracy and computation time of the proposed method are evaluated using recorded TV movie programs as input data and 30-second to 3-minute segments from a DVD as reference data, assuming that the copyright holder of a movie searches video streams for illegal copies. We confirmed that the proposed method completes all processing within the computation time that the conventional feature needs for feature extraction alone, and achieves an F-measure of 93.2% when detecting 3-minute video segments.
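The idea of taking relative values at local maxima and minima of the power envelope can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the non-overlapping framing, the `frame_len` parameter, and the choice of (frame gap, dB difference) between consecutive extrema as the relative feature are all assumptions made for illustration only.

```python
import math


def power_envelope(samples, frame_len=4):
    """Mean power of each non-overlapping frame, in dB.

    frame_len is an assumed, illustrative parameter.
    """
    env = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # Small constant avoids log10(0) on silent frames.
        env.append(10.0 * math.log10(power + 1e-12))
    return env


def local_extrema(env):
    """Indices of strict local maxima and minima of the envelope."""
    points = []
    for i in range(1, len(env) - 1):
        if env[i] > env[i - 1] and env[i] > env[i + 1]:
            points.append((i, "max"))
        elif env[i] < env[i - 1] and env[i] < env[i + 1]:
            points.append((i, "min"))
    return points


def relative_features(env, points):
    """(frame gap, dB difference) for each pair of consecutive extrema.

    Assumed feature definition: both values are relative, so a constant
    gain applied to the whole signal leaves them essentially unchanged.
    """
    feats = []
    for (i, _), (j, _) in zip(points, points[1:]):
        feats.append((j - i, env[j] - env[i]))
    return feats


if __name__ == "__main__":
    # Toy signal: one amplitude per frame, two samples per frame.
    amps = [1, 3, 2, 5, 1, 4, 2]
    samples = [a for a in amps for _ in range(2)]
    env = power_envelope(samples, frame_len=2)
    print(relative_features(env, local_extrema(env)))
```

Because a uniform change in volume shifts every envelope value by the same number of decibels, the differences between extrema stay the same up to the small constant added inside the logarithm, which is the motivation the abstract gives for using relative values.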
