Video Big Data Retrieval Over Media Cloud: A Context-Aware Online Learning Approach

Online video sharing (e.g., via YouTube or YouKu) has emerged as one of the most important services on today's Internet, where billions of videos on the cloud are awaiting exploration. Hence, a personalized video retrieval system is needed to help users find interesting videos within this big data content. Two main challenges are processing the growing volume of video big data and efficiently resolving the accompanying “cold start” issue. A third challenge is satisfying users’ need for personalized retrieval results whose accuracy is unknown. In this paper, we formulate the personalized video big data retrieval problem as an interaction between the user and the system via a stochastic process, rather than only as a similarity-matching, accuracy (feedback) model of retrieval; we introduce users’ real-time context into the retrieval system; and we propose a general framework for this problem. Using a novel contextual multi-armed bandit-based algorithm to balance accuracy and efficiency, we propose a context-based, online, big-data-oriented personalized video retrieval system. This system supports datasets that grow dynamically in size and has the property of cross-modal retrieval. Our approach provides accurate retrieval results with sublinear regret and linear storage complexity, and it significantly improves the learning speed. Furthermore, by learning for a cluster of similar contexts simultaneously, we can achieve sublinear storage complexity with the same regret, at the cost of slightly poorer performance on the “cold start” issue than the former approach. We validate our theoretical results experimentally on a very large dataset; the results demonstrate that the proposed algorithms outperform existing bandit-based online learning methods in terms of accuracy and efficiency, and that the adaptation from the bandit framework offers additional benefits.
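To make the contextual bandit setting concrete, the following minimal sketch runs the standard UCB1 rule independently inside each cell of a uniform grid over a one-dimensional context in [0, 1]. It is not the algorithm proposed in this paper; the class name `ContextualUCB`, the function `simulate`, the two-video catalogue, and the click probabilities are all hypothetical and chosen only for illustration. The sketch shows the exploration/exploitation trade-off and why storage grows linearly with the number of context cells when each cell keeps its own statistics.

```python
import math
import random


class ContextualUCB:
    """Standard UCB1 run separately in each cell of a uniform context grid.

    Illustrative only: this is a generic baseline, not the paper's method.
    """

    def __init__(self, n_arms, n_bins):
        self.n_arms = n_arms
        self.n_bins = n_bins
        # One row of statistics per context cell, so storage is linear in n_bins.
        self.counts = [[0] * n_arms for _ in range(n_bins)]
        self.means = [[0.0] * n_arms for _ in range(n_bins)]

    def cell(self, context):
        # Map a context value in [0, 1] to a grid cell index.
        return min(int(context * self.n_bins), self.n_bins - 1)

    def select(self, context):
        c = self.cell(context)
        counts = self.counts[c]
        # Try every arm once in this cell before using confidence bounds.
        for arm, n in enumerate(counts):
            if n == 0:
                return arm
        total = sum(counts)
        scores = [
            self.means[c][a] + math.sqrt(2.0 * math.log(total) / counts[a])
            for a in range(self.n_arms)
        ]
        return max(range(self.n_arms), key=scores.__getitem__)

    def update(self, context, arm, reward):
        c = self.cell(context)
        self.counts[c][arm] += 1
        n = self.counts[c][arm]
        # Incremental update of the sample-mean reward.
        self.means[c][arm] += (reward - self.means[c][arm]) / n


def simulate(rounds=5000, seed=0):
    """Simulate user feedback with made-up click probabilities."""
    rng = random.Random(seed)

    def click_prob(context, arm):
        # Hypothetical ground truth: video 0 suits contexts below 0.5, video 1 the rest.
        best = 0 if context < 0.5 else 1
        return 0.8 if arm == best else 0.2

    learner = ContextualUCB(n_arms=2, n_bins=2)
    regret = 0.0
    for _ in range(rounds):
        x = rng.random()                      # the user's current context
        arm = learner.select(x)               # the video shown to the user
        reward = 1.0 if rng.random() < click_prob(x, arm) else 0.0
        learner.update(x, arm, reward)
        regret += 0.8 - click_prob(x, arm)    # expected loss versus the best video
    return learner, regret


if __name__ == "__main__":
    learner, regret = simulate()
    print("cumulative expected regret:", round(regret, 1))
    print("pull counts per context cell:", learner.counts)
```

Under these assumptions, a policy that picked videos uniformly at random would accumulate expected regret of 0.3 per round, growing linearly in the number of rounds, whereas the per-cell UCB1 learner concentrates its pulls on the better video in each cell. Sharing statistics across similar contexts, as the clustering idea in the abstract suggests, would reduce the number of rows that need to be stored.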
