Integrated Multimodal Data Mining Using Self-Organizing Maps

We argue that current data mining systems have failed to keep up with the rapid increase in multimedia data available to the public, and that improved content-based indexing methods should be used to solve this problem. This entails bridging the semantic gap between the high-level semantic concepts of humans and the low-level statistical representations used by computer systems. Furthermore, in a multimedia scenario, the problem of multimodal fusion of information coming from different data types must be solved. PicSOM, the content-based information retrieval system described in this paper, addresses both of these problems in a consistent manner by introducing multimodal object hierarchies and by utilizing the strong data mining and discovery properties of Self-Organizing Maps (SOMs). We demonstrate this with two real-world examples. First, we show how semantic associations emerge from images and subjective evaluations of personal items collected at an art installation. Second, we review our results from the TRECVID video retrieval evaluations and show how our methodology helped in finding semantically similar objects in a large multimodal database.
