Multimodal Information Access and Retrieval: Notable Work and Milestones

We live in an era of information explosion: documents of every genre are available at our fingertips. The proliferation of digital data such as images, audio, and video, both on the Internet and on users' personal computers, has led researchers across the globe to develop efficient access and retrieval techniques that can meet users' real-time needs. Because information exists in diverse modalities, we need models in which different modalities can be used to access information present in a single modality. The explosion of data, together with users' need for instant and accurate information, has made Multimodal Information Access and Retrieval (MIAR) a major research challenge across domains including the medical, defense, and aircraft domains. This paper introduces the important hurdles encountered by MIAR systems and the prominent research that has been undertaken to address them. It further highlights problems in MIAR systems that remain unaddressed. The objective of this work is to lay a foundation for researchers wishing to work in this area, so that they can gain a deep understanding of this challenging and exciting research field.
