Content based video retrieval system based on multimodal feature grouping by KFCM clustering algorithm to promote human–computer interaction

Content-Based Video Retrieval (CBVR) has gained popularity with the growing use of video-based analytical systems. Video analytics can be more effective than image analysis because a video captures a series of actions rather than a single moment, which supports better decision making. CBVR systems also play an important role in improving human–computer interaction. This paper presents a multimodal CBVR system that takes both visual and audio information into account when retrieving videos relevant to the user. The system uses two modules, one for video data and one for audio data. The video data is processed to detect the significant frames from shots, and this selection is performed with the Lion Optimization Algorithm (LOA). Features are then extracted from the visual data, while Mean Hilbert Envelope Coefficient (MHEC) and Linear Predictive Cepstral Coefficient (LPCC) features are extracted from the audio data. The extracted features are clustered with the Kernelized Fuzzy C-Means (KFCM) algorithm. Finally, the feature database is formed and used for query matching during the testing phase. The performance of the proposed work is evaluated in terms of precision, recall, F-measure, and time consumption. The attained results show that the proposed CBVR system performs better than the existing approaches it is compared with.
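The clustering step can be illustrated with a minimal sketch of kernelized fuzzy c-means in plain Python. The abstract does not give the exact formulation used, so everything here is an assumption: the Gaussian kernel K(x, v) = exp(-||x - v||^2 / sigma^2), the fuzzifier m = 2, the farthest-first initialization, and the function names (`kfcm`, `memberships_for`, `farthest_first_centers`) are all hypothetical and chosen only for illustration. The updates follow the commonly used Gaussian-kernel form, in which memberships are proportional to (1 - K(x, v))^(-1/(m-1)) and each center is a mean of the points weighted by u^m · K(x, v).

```python
import math

def gaussian_kernel(x, v, sigma):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / sigma^2) (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / sigma ** 2)

def farthest_first_centers(points, n_clusters):
    """Deterministic initialization (an assumption): start at the first point,
    then repeatedly add the point farthest from the centers chosen so far."""
    centers = [list(points[0])]
    while len(centers) < n_clusters:
        def gap(p):
            return min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
        centers.append(list(max(points, key=gap)))
    return centers

def memberships_for(points, centers, m, sigma, eps=1e-12):
    """Fuzzy memberships: u_ik proportional to (1 - K(x_k, v_i))^(-1/(m-1))."""
    rows = []
    for x in points:
        w = [(1.0 - gaussian_kernel(x, v, sigma) + eps) ** (-1.0 / (m - 1.0))
             for v in centers]
        total = sum(w)
        rows.append([wi / total for wi in w])  # each row sums to 1
    return rows

def kfcm(points, n_clusters, m=2.0, sigma=1.0, n_iter=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    centers = farthest_first_centers(points, n_clusters)
    for _ in range(n_iter):
        u = memberships_for(points, centers, m, sigma)
        updated = []
        for i, v in enumerate(centers):
            # Each point's weight is u_ik^m * K(x_k, v_i), so distant points
            # contribute very little to the center.
            weights = [u[k][i] ** m * gaussian_kernel(x, v, sigma)
                       for k, x in enumerate(points)]
            total = sum(weights)
            if total == 0.0:  # all kernel values underflowed; keep the old center
                updated.append(v)
                continue
            updated.append([sum(w * x[d] for w, x in zip(weights, points)) / total
                            for d in range(len(v))])
        centers = updated
    return centers, memberships_for(points, centers, m, sigma)

# Toy usage: two-dimensional stand-ins for extracted feature vectors.
features = [(0.0, 0.0), (0.2, 0.1), (0.1, -0.2), (-0.1, 0.1),
            (5.0, 5.0), (5.2, 4.9), (4.9, 5.1), (5.1, 5.2)]
centers, memberships = kfcm(features, n_clusters=2, sigma=2.0)
labels = [row.index(max(row)) for row in memberships]
```

In a retrieval setting, the cluster centers would index the feature database and a query's membership vector would narrow the candidates to compare against; that use is an interpretation of the abstract, not a detail it states.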
