Multimedia Analysis with Deep Learning

Deep learning has attracted growing attention from researchers owing to its success in a range of computer vision tasks. A number of studies apply deep learning to multimedia analysis, and this work falls mainly into six tasks: classification, retrieval, segmentation, tracking, detection, and recommendation. To our knowledge, no existing survey covers these studies, so a review of the subject would be valuable to the community. In this paper, we discuss the application of deep learning to these six multimedia analysis tasks and point out future directions for deep learning in multimedia analysis.
