Information fusion in content based image retrieval: A comprehensive overview

An overview of information fusion in Content-Based Image Retrieval (CBIR). Analysis of each component in the fusion processing pipeline. Classification of the main categories into which fusion approaches can be grouped. Details of some representative methods for each fusion category.

An ever-increasing share of communication between people involves the use of pictures, thanks to the cheap availability of powerful smartphone cameras and of storage space. The rising popularity of social networking applications such as Facebook, Twitter, and Instagram, and of instant messaging applications such as WhatsApp and WeChat, is clear evidence of this phenomenon: they let each individual share, in real time, a pictorial representation of the context he or she is living in. The media rapidly exploited this phenomenon, using the same channels either to publish their reports or to gather additional information on an event from the community of users. While the real-time use of images is managed through the metadata associated with each image (e.g., timestamp, geolocation, tags), retrieving images from an archive can be far from trivial, because an image bears a rich semantic content that goes beyond the description provided by its metadata. After more than 20 years of research on Content-Based Image Retrieval (CBIR), the enormous growth in the number and variety of images available in digital format is still challenging the research community. Any approach aiming to face such challenges must rely on different image representations, which need to be suitably fused in order to adapt to the subjectivity of image semantics. This paper offers a journey through the main information fusion ingredients that a recipe for the design of a CBIR system should include to meet the demanding needs of users.
