Deepqoe: A Unified Framework for Learning to Predict Video QoE

Motivated by the prowess of deep learning (DL) based techniques in prediction, generalization, and representation learning, we develop a novel framework called DeepQoE to predict video quality of experience (QoE). The end-to-end framework first uses a combination of DL techniques (e.g., word embeddings) to extract generalized features. Next, these features are combined and fed into a neural network for representation learning. Such representations serve as inputs for classification or regression tasks. Evaluating the performance of DeepQoE with two datasets, we show that for the small dataset, the accuracy of all shallow learning algorithms is improved by using the representation derived from DeepQoE. For the large dataset, our DeepQoE framework achieves significant performance improvement in comparison to the best baseline method (90.94% vs. 82.84%). Moreover, DeepQoE, also released as an open source tool, provides video QoE research much-needed flexibility in fitting different datasets, extracting generalized features, and learning representations.

[1]  Guangming Shi,et al.  Just Noticeable Difference Estimation for Images With Free-Energy Principle , 2013, IEEE Transactions on Multimedia.

[2]  D. Whitney,et al.  Serial dependence in visual perception , 2011 .

[3]  Srinivasan Seshan,et al.  Developing a predictive model of quality of experience for internet video , 2013, SIGCOMM.

[4]  Steven Bohez,et al.  Sensor fusion for robot control through deep reinforcement learning , 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[5]  Vyas Sekar,et al.  Understanding the impact of video quality on user engagement , 2011, SIGCOMM.

[6]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[7]  Fei-Fei Li,et al.  Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Yonggang Wen,et al.  QoE-Driven Cache Management for HTTP Adaptive Bit Rate Streaming Over Wireless Networks , 2012, IEEE Transactions on Multimedia.

[9]  Mohammed Ghanbari,et al.  Temporal Aspect of Perceived Quality in Mobile Video Broadcasting , 2008, IEEE Transactions on Broadcasting.

[10]  Yicong Zhou,et al.  QoE Evaluation of Multimedia Services Based on Audiovisual Quality and User Interest , 2016, IEEE Transactions on Multimedia.

[11]  Ahmet M. Kondoz,et al.  Automatic QOE Prediction in Stereoscopic Videos , 2012, 2012 IEEE International Conference on Multimedia and Expo Workshops.

[12]  Chang Wen Chen,et al.  Mobile JND: environment adapted perceptual model and mobile video quality enhancement , 2012, MMSys '12.

[13]  Mohamed-Chaker Larabi,et al.  Influence of video resolution, viewing device and audio quality on perceived multimedia quality for steaming applications , 2014, 2014 5th European Workshop on Visual Information Processing (EUVIP).

[14]  Qian Liu,et al.  QoE in Video Transmission: A User Experience-Driven Strategy , 2017, IEEE Communications Surveys & Tutorials.

[15]  Lorenzo Torresani,et al.  C3D: Generic Features for Video Analysis , 2014, ArXiv.

[16]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[17]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[18]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[19]  Yanjiao Chen,et al.  From QoS to QoE: A Tutorial on Video Quality Assessment , 2015, IEEE Communications Surveys & Tutorials.

[20]  Xin Jin,et al.  VideoSet: A large-scale compressed video quality dataset based on JND measurement , 2017, J. Vis. Commun. Image Represent..

[21]  Alan C. Bovik,et al.  Recurrent and Dynamic Models for Predicting Streaming Video Quality of Experience , 2018, IEEE Transactions on Image Processing.