Possible innovation from the perspective of MultiMedia Information Retrieval (MMIR)

The principles and practice of MultiMedia Information Retrieval (MMIR), the organic complex of Visual Retrieval (VR), Video Retrieval (VDR), Audio Retrieval (AR) and Text Retrieval (TR) systems, are well known to computer scientists, engineers and mathematicians, but it is now necessary that librarians too become familiar with MMIR technologies. The fields affected by MMIR innovations are many and varied, from medicine to geography, from engineering to the visual arts and music, each introducing specific demands and challenges. This article outlines the state of the art of MMIR systems, with an introduction to the evolution of the multimedia retrieval concept across the classical Information Retrieval (IR) architectures of past and current multimedia archives. The point is that, in the area of information searching, operating in terms of a generic IR can be limiting. In traditional practice, every kind of document search is forced into the conditions of a search through textual language; it is necessary, on the contrary, to consider the broader criterion of MMIR, by which every kind of digital document is processed and searched through the elements of language most proper to its own nature. Within a more general methodology of multimedia searching, it is then possible to distinguish a method of TR based on textual information for the search of textual documents from a method of VR based on visual data for the search of visual documents, of VDR based on video data for the search of videos, and of AR based on sounds for the search of audio documents. In databases where the content of the documents is substantially textual, it is appropriate that the access keys be terms and phrases extracted from within that content; in multimedia databases, instead, it is inaccurate to attribute, from the outside, a textual description to contents that are grounded in a different structure of sense.
Besides, while for texts it can also be suitable to analyse their concept and assign a descriptor to it, this is not equally effective for images or sounds, where the subjective limits of the analysis are greater and where concepts are not always of more interest than the concrete content of the documents, such as forms, colours, movements, noises or music. Innovative and efficient MMIR systems take an information retrieval approach that directly treats the objective content of the documents, and for this reason it is defined as content-based, in opposition to the traditional systems of indexing and searching based on terms describing such material content, defined as term-based. It is thus possible to retrieve multimedia documents by applying storage and retrieval techniques that operate directly on the audiovisual contents of the digital objects in a database. Thanks to the possibilities and tools offered by digital technologies, MMIR systems allow the retrieval of still images, audiovisual pieces and audio contents by exploiting the language-specific features of each document. Using similarity and other methods, such as approximation and relationships between measures and values, users can perform queries with figures, textures, shapes, colours, sounds, frames, movements, etc. as retrieval keys. MMIR is a revolutionary metasystem, highly specialized for the efficient treatment of digital multimedia objects. It must be admitted, however, that a good level of precision in the retrieval of documents can be reached only by combining search techniques and technologies based both on the definition of concepts, through controlled terms, and on the representation of content, through visual, audio and audiovisual elements. The two systems can, in fact, be harmonized. A term-based query can be a good preliminary method for selecting part of a huge quantity of documents according to thematic areas, titles or authors.
It can also be a final way of cleaning up the inevitable noise specific to a content-based query. Above all, however, the two procedures can operate in constant interaction within a single search form, composing a query that, by combining figures, sounds and texts, is useful for searching very complex documents. The article is divided into three parts, each dedicated to one MMIR sub-system, introducing Visual Retrieval, Video Retrieval and Audio Retrieval. Each part presents an overview of the methods and techniques currently available and a brief survey of research areas and outputs from national and international researchers, experts and scholars in the computer science and information retrieval community. The aim of the authors is to contribute to the dissemination of MultiMedia Information Retrieval concepts and systems in Italy, envisioning their application by public and private organizations in a variety of fields within advanced documentation.
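
As a purely illustrative aid, the interaction between term-based and content-based search described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a description of any actual MMIR system: the `Document` class, the `combined_query` function, the three-value feature vectors (standing in for, say, a normalised colour histogram) and the use of cosine similarity as the similarity measure are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Document:
    doc_id: str
    terms: set[str]        # controlled terms (term-based description)
    features: list[float]  # hypothetical content features, e.g. a colour histogram


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def combined_query(docs, query_terms, query_features, top_k=3):
    """Term-based pre-selection followed by content-based ranking."""
    # Step 1 (term-based): keep only documents indexed with all query terms.
    candidates = [d for d in docs if query_terms <= d.terms]
    # Step 2 (content-based): score the candidates by similarity to the example.
    scored = [(d.doc_id, cosine_similarity(d.features, query_features))
              for d in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy collection: the terms and feature values are invented for illustration.
collection = [
    Document("img1", {"landscape", "painting"}, [0.7, 0.2, 0.1]),
    Document("img2", {"landscape", "photograph"}, [0.1, 0.1, 0.8]),
    Document("img3", {"portrait", "painting"}, [0.7, 0.2, 0.1]),
    Document("img4", {"landscape", "painting"}, [0.2, 0.6, 0.2]),
]

# Query: the term "landscape" plus an example feature vector as the retrieval key.
results = combined_query(collection, {"landscape"}, [0.8, 0.1, 0.1])
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

In this sketch the controlled term narrows the collection first, and the content features then order what remains. A real system would extract the features from the images, videos or sounds themselves and would probably use more elaborate similarity measures.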