Saying What it Means: Semi-Automated (News) Media Annotation

This paper considers the automated and semi-automated annotation of audiovisual media in a new type of production framework, A4SM (Authoring System for Syntactic, Semantic and Semiotic Modelling). We present the architecture of the framework and describe a prototypical camera, a handheld device for basic semantic annotation, and an editing suite. Together these demonstrate how video material can be annotated in real time and how this information can be used not only for retrieval but also during the different phases of the production process itself. We then outline the underlying XML Schema-based content description structures of A4SM and discuss the pros and cons of our approach of evolving semantic networks as the basis for audiovisual content description.
