Paper information

CONTENTUS—technologies for next generation multimedia libraries

An ever-growing amount of digitized content urges libraries and archives to integrate new media types from a large number of origins, such as publishers, record labels and film archives, into their existing collections. This is a challenging task, since the multimedia content itself as well as the associated metadata is inherently heterogeneous: the different sources lead to different data structures, data quality and trustworthiness. This paper presents the CONTENTUS approach towards an automated media processing chain for cultural heritage organizations and content holders. Our workflow allows for unattended processing from media ingest to availability through our search and retrieval interface. We aim to provide a set of tools for the processing of digitized print media, audio/visual media, and speech and music recordings. Media-specific functionalities include quality control for the digitization of still image and audio/visual media and restoration of the most common quality issues encountered with these media. Furthermore, the CONTENTUS tools include modules for content analysis, such as segmentation of printed, audio and audio/visual media, optical character recognition (OCR), speech-to-text transcription, speaker recognition and the extraction of musical features from audio recordings, all aimed at a textual representation of the information inherent in the media assets. Once the information is extracted and transcribed in textual form, media-independent processing modules offer extraction and disambiguation of named entities and text classification. All CONTENTUS modules are designed to be flexibly recombined within a scalable workflow environment using cloud computing techniques. In the next step, analyzed media assets can be retrieved and consumed through a search interface using all available metadata.
The search engine combines Semantic Web technologies for representing relations between the media and entities such as persons, locations and organizations with a full-text approach for searching within the transcribed information gathered through the preceding processing steps. The CONTENTUS unified search interface integrates text, images, audio and audio/visual content. Queries can be narrowed and expanded in an exploratory manner, and search results can be refined by disambiguating entities and topics. Further, semantic relationships not only become apparent but can also be navigated.
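
The processing chain described above can be illustrated with a minimal Python sketch. It is not the CONTENTUS implementation: every name in it (`Asset`, `TRANSCRIBERS`, `GAZETTEER`, `run_chain`, `search`) is hypothetical, the "transcribers" are trivial stand-ins for OCR and speech-to-text, and a toy dictionary lookup replaces real named entity extraction and disambiguation. It only shows the shape of the idea: a media-specific step produces text, media-independent steps enrich it, and search combines a full-text match with an entity filter.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set


@dataclass
class Asset:
    """A media asset moving through the chain (hypothetical structure)."""
    asset_id: str
    media_type: str                      # e.g. "print" or "audio"
    payload: str                         # stand-in for the raw media content
    text: str = ""                       # textual representation from analysis
    entities: Set[str] = field(default_factory=set)


# Media-specific step: stand-ins for OCR / speech-to-text. The payload is
# already text here, so each "transcriber" only normalizes it.
TRANSCRIBERS: Dict[str, Callable[[str], str]] = {
    "print": lambda payload: payload.strip(),
    "audio": lambda payload: payload.strip().lower(),
}

# Media-independent step: a toy gazetteer in place of real named entity
# extraction and disambiguation (assumption for illustration only).
GAZETTEER: Dict[str, str] = {
    "berlin": "location:Berlin",
    "bach": "person:J. S. Bach",
}


def transcribe(asset: Asset) -> Asset:
    """Fill in the textual representation using the media-specific module."""
    asset.text = TRANSCRIBERS[asset.media_type](asset.payload)
    return asset


def extract_entities(asset: Asset) -> Asset:
    """Attach entity identifiers found in the transcribed text."""
    for token in asset.text.lower().split():
        key = token.strip(".,;:!?")
        if key in GAZETTEER:
            asset.entities.add(GAZETTEER[key])
    return asset


def run_chain(asset: Asset, steps: List[Callable[[Asset], Asset]]) -> Asset:
    """Apply the processing steps in order; steps can be freely recombined."""
    for step in steps:
        asset = step(asset)
    return asset


def search(assets: List[Asset], query: str,
           entity: Optional[str] = None) -> List[str]:
    """Full-text match on all query terms, optionally narrowed by an entity."""
    terms = query.lower().split()
    hits = []
    for asset in assets:
        text = asset.text.lower()
        text_match = all(term in text for term in terms)
        entity_match = entity is None or entity in asset.entities
        if text_match and entity_match:
            hits.append(asset.asset_id)
    return hits


STEPS = [transcribe, extract_entities]
assets = [
    run_chain(Asset("a1", "print", "  Bach concert in Berlin. "), STEPS),
    run_chain(Asset("a2", "audio", "Concert review from Leipzig"), STEPS),
]
print(search(assets, "concert"))                             # ['a1', 'a2']
print(search(assets, "concert", entity="location:Berlin"))   # ['a1']
```

Narrowing the full-text query with an entity identifier mirrors, in a very reduced form, how a result set can be refined by disambiguated entities; a real system would use an inverted index and an RDF store rather than linear scans.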
