Multimedia edges: finding hierarchy in all dimensions

This paper describes a new unified representation for the information in a video. We reduce the dimensionality of the signal with either a singular-value decomposition (on the semantic and image data) or mel-frequency cepstral coefficients (on the audio data) and then concatenate the vectors to form a multi-dimensional representation of the video. Using scale-space techniques, we find large jumps in the video's path, which we call edges. We use these techniques to analyze the temporal properties of the audio and image data in a video. This analysis creates a hierarchical segmentation of the video, or a table of contents, from the audio, semantic and image data.
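The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes per-frame feature matrices are already available, uses random synthetic data in place of real image features and mel-frequency cepstral coefficients, centers the data before the singular-value decomposition, and treats local maxima of the smoothed path's frame-to-frame change as edges. The function names (`reduce_svd`, `smooth_path`, `find_edges`) and the `rel_threshold` parameter are hypothetical.

```python
import numpy as np

def reduce_svd(features, k):
    """Project (frames, dims) features onto their top-k singular directions.
    Centering the data first is an assumption of this sketch."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

def gaussian_kernel(sigma):
    """Sampled, normalized Gaussian covering +/- 4 sigma."""
    radius = max(1, int(np.ceil(4 * sigma)))
    t = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (t / sigma) ** 2)
    return k / k.sum()

def smooth_path(path, sigma):
    """Low-pass filter each dimension of a (frames, dims) path along time."""
    k = gaussian_kernel(sigma)
    radius = len(k) // 2
    # Replicate the first and last frames so the output keeps its length.
    padded = np.pad(path, ((radius, radius), (0, 0)), mode="edge")
    columns = [np.convolve(padded[:, d], k, mode="valid")
               for d in range(path.shape[1])]
    return np.stack(columns, axis=1)

def find_edges(path, sigma, rel_threshold=0.5):
    """Return indices i where the smoothed path jumps most between frames
    i and i+1: local maxima of the step size above a relative threshold."""
    smoothed = smooth_path(path, sigma)
    step = np.linalg.norm(np.diff(smoothed, axis=0), axis=1)
    mid = step[1:-1]
    is_peak = (mid > step[:-2]) & (mid >= step[2:]) \
        & (mid >= rel_threshold * step.max())
    return np.flatnonzero(is_peak) + 1

# Synthetic stand-ins: 200 frames with a change of content at frame 100.
rng = np.random.default_rng(0)
image = np.vstack([rng.normal(0, 0.1, (100, 20)),
                   rng.normal(0, 0.1, (100, 20)) + 1.0])
audio = np.vstack([rng.normal(0, 0.1, (100, 13)),
                   rng.normal(0, 0.1, (100, 13)) - 1.0])

# Concatenate the reduced image vectors with the audio vectors per frame.
path = np.hstack([reduce_svd(image, 5), audio])

# Larger sigma keeps only coarse edges; smaller sigma reveals finer ones.
for sigma in (2.0, 4.0, 8.0):
    print(sigma, find_edges(path, sigma))
```

Running the edge detector at several values of `sigma` gives a coarse-to-fine view of the same path. Linking edges that persist across scales is one way to obtain a hierarchy, although this sketch does not perform that linking step.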
