Exploiting Feature and Class Relationships in Video Categorization with Regularized Deep Neural Networks
Shih-Fu Chang | Xiangyang Xue | Yu-Gang Jiang | Jun Wang | Zuxuan Wu
[1] Yoshua Bengio,et al. Multi-Task Learning for Stock Selection , 1996, NIPS.
[2] Rich Caruana,et al. Multitask Learning , 1998, Encyclopedia of Machine Learning and Data Mining.
[3] Milind R. Naphade,et al. A probabilistic framework for semantic video indexing, filtering, and retrieval , 2001, IEEE Trans. Multim..
[4] John R. Smith,et al. Multimedia semantic indexing using model vectors , 2003, 2003 International Conference on Multimedia and Expo (ICME '03).
[5] Barbara Caputo,et al. Recognizing human actions: a local SVM approach , 2004, ICPR 2004.
[6] Michael I. Jordan,et al. Multiple kernel learning, conic duality, and the SMO algorithm , 2004, ICML.
[7] David G. Lowe,et al. Distinctive Image Features from Scale-Invariant Keypoints , 2004, International Journal of Computer Vision.
[8] Antonio Torralba,et al. Contextual Priming for Object Detection , 2003, International Journal of Computer Vision.
[9] Ronen Basri,et al. Actions as space-time shapes , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.
[10] Thomas Serre,et al. A Biologically Inspired System for Action Recognition , 2007, 2007 IEEE 11th International Conference on Computer Vision.
[11] Jiebo Luo,et al. Kodak consumer video benchmark data set: concept definition and annotation , 2008 .
[12] Massimiliano Pontil,et al. Convex multi-task feature learning , 2008, Machine Learning.
[13] Tao Mei,et al. Correlative multi-label video annotation , 2007, ACM Multimedia.
[14] Andrea Vedaldi,et al. Objects in Context , 2007, 2007 IEEE 11th International Conference on Computer Vision.
[15] T. Stanford,et al. Multisensory integration: current issues from the perspective of the single neuron , 2008, Nature Reviews Neuroscience.
[16] Cordelia Schmid,et al. Learning realistic human actions from movies , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[17] Jean-Philippe Vert,et al. Clustered Multi-Task Learning: A Convex Formulation , 2008, NIPS.
[18] Subhransu Maji,et al. Classification using intersection kernel support vector machines is efficient , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[20] Jiebo Luo,et al. Heterogeneous feature machines for visual recognition , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[21] C. Schmid,et al. Actions in context , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[22] Andrew Zisserman,et al. Multiple kernels for object detection , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[23] Shih-Fu Chang,et al. Short-term audio-visual atoms for generic video concept classification , 2009, ACM Multimedia.
[24] Jieping Ye,et al. Multi-Task Feature Learning Via Efficient l2,1-Norm Minimization , 2009, UAI.
[25] Xiangyang Xue,et al. A novel audio fingerprinting method robust to time scale modification and pitch shifting , 2010, ACM Multimedia.
[26] Juan Carlos Niebles,et al. Modeling Temporal Structure of Decomposable Motion Segments for Activity Classification , 2010, ECCV.
[27] Dit-Yan Yeung,et al. A Convex Formulation for Learning Task Relationships in Multi-Task Learning , 2010, UAI.
[28] Mohan S. Kankanhalli,et al. Multimodal fusion for multimedia analysis: a survey , 2010, Multimedia Systems.
[29] Alexander C. Loui,et al. Audio-visual grouplet: temporal audio-visual interactions for general video concept classification , 2011, ACM Multimedia.
[30] Jiayu Zhou,et al. Integrating low-rank and group-sparse structures for robust multi-task learning , 2011, KDD.
[31] Alexander Zien,et al. lp-Norm Multiple Kernel Learning , 2011, J. Mach. Learn. Res..
[32] M. Kloft,et al. lp-Norm Multiple Kernel Learning , 2011 .
[33] Juhan Nam,et al. Multimodal Deep Learning , 2011, ICML.
[34] Thomas Serre,et al. HMDB: A large video database for human motion recognition , 2011, 2011 International Conference on Computer Vision.
[35] Shih-Fu Chang,et al. Consumer video understanding: a benchmark database and an evaluation of human and machine performance , 2011, ICMR.
[36] Jiayu Zhou,et al. A multi-task learning formulation for predicting disease progression , 2011, KDD.
[37] Kristen Grauman,et al. Learning with Whom to Share in Multi-task Feature Learning , 2011, ICML.
[38] G. DeAngelis,et al. A Normalization Model of Multisensory Integration , 2011, Nature Neuroscience.
[39] Hongliang Fei,et al. Structured Feature Selection and Task Relationship Inference for Multi-task Learning , 2011, ICDM.
[40] Yu-Gang Jiang,et al. SUPER: towards real-time event recognition in internet videos , 2012, ICMR.
[41] Chong-Wah Ngo,et al. Fast Semantic Diffusion for Large-Scale Context-Based Image and Video Annotation , 2012, IEEE Transactions on Image Processing.
[42] Nitish Srivastava,et al. Multimodal learning with deep Boltzmann machines , 2012, J. Mach. Learn. Res..
[43] Daoqiang Zhang,et al. Multi-modal multi-task learning for joint prediction of multiple regression and classification variables in Alzheimer's disease , 2012, NeuroImage.
[44] Chin-Hui Lee,et al. Explicit Performance Metric Optimization for Fusion-Based Video Retrieval , 2012, ECCV Workshops.
[45] Dong Liu,et al. Robust late fusion with rank minimization , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[46] Shuang Wu,et al. Multimodal feature fusion for robust event detection in web videos , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[47] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[48] Mubarak Shah,et al. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild , 2012, ArXiv.
[49] Yung-Yu Chuang,et al. Cross-Domain Multicue Fusion for Concept-Based Video Indexing , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[50] Andrew Zisserman,et al. Three things everyone should know to improve object retrieval , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[51] Cordelia Schmid,et al. Action and Event Recognition with Fisher Vectors on a Compact Feature Set , 2013, 2013 IEEE International Conference on Computer Vision.
[52] Xiangyang Xue,et al. Multiple Task Learning Using Iteratively Reweighted Least Square , 2013, IJCAI.
[53] Dong Liu,et al. Large-Scale Video Hashing via Structure Learning , 2013, 2013 IEEE International Conference on Computer Vision.
[54] Samy Bengio,et al. Using Web Co-occurrence Statistics for Improving Image Categorization , 2013, ArXiv.
[55] Dong Liu,et al. Discovering joint audio–visual codewords for video event detection , 2013, Machine Vision and Applications.
[56] Florian Metze,et al. CMU-Informedia @ TRECVID 2013 Multimedia Event Detection , 2013 .
[57] Ming Yang,et al. 3D Convolutional Neural Networks for Human Action Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[58] Daniel P. W. Ellis,et al. Subband autocorrelation features for video soundtrack classification , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[59] Dong Liu,et al. Sample-Specific Late Fusion for Visual Category Recognition , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[60] Patrick Bouthemy,et al. Better Exploiting Motion for Better Action Recognition , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[61] Cordelia Schmid,et al. The AXES submissions at TRECVID 2013 , 2013, TRECVID.
[62] Thomas Mensink,et al. Image Classification with the Fisher Vector: Theory and Practice , 2013, International Journal of Computer Vision.
[63] Cordelia Schmid,et al. Action Recognition with Improved Trajectories , 2013, 2013 IEEE International Conference on Computer Vision.
[64] Nicu Sebe,et al. Feature Weighting via Optimal Thresholding for Video Analysis , 2013, 2013 IEEE International Conference on Computer Vision.
[65] Lynne E. Parker,et al. Simplex-Based 3D Spatio-temporal Feature Description for Action Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[66] Trevor Darrell,et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[67] Pong C. Yuen,et al. Reduced Analytic Dependency Modeling: Robust Fusion for Visual Recognition , 2014, International Journal of Computer Vision.
[68] Marc'Aurelio Ranzato,et al. Multi-GPU Training of ConvNets , 2013, ICLR.
[69] Andrew Zisserman,et al. Improving Human Action Recognition Using Score Distribution and Ranking , 2014, ACCV.
[70] Yi Li,et al. Mariana: Tencent Deep Learning Platform and its Applications , 2014, Proc. VLDB Endow..
[71] Mubarak Shah,et al. Video Classification Using Semantic Concept Co-occurrences , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[72] Honglak Lee,et al. Improved Multimodal Deep Learning with Variation of Information , 2014, NIPS.
[73] Andrew Zisserman,et al. Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.
[74] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[75] Bingbing Ni,et al. Beta Process Multiple Kernel Learning , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[76] Samy Bengio,et al. Large-Scale Object Classification Using Label Relation Graphs , 2014, ECCV.
[77] Cees Snoek,et al. COSTA: Co-Occurrence Statistics for Zero-Shot Classification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[78] Bernt Schiele,et al. 2D Human Pose Estimation: New Benchmark and State of the Art Analysis , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[79] Jun Wang,et al. Exploring Inter-feature and Inter-class Relationships with Deep Neural Networks for Video Classification , 2014, ACM Multimedia.
[80] Bernard Ghanem,et al. ActivityNet: A large-scale video benchmark for human activity understanding , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[81] Bhiksha Raj,et al. Beyond Gaussian Pyramid: Multi-skip Feature Stacking for action recognition , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[82] Xi Wang,et al. Modeling Spatial-Temporal Clues in a Hybrid Deep Learning Framework for Video Classification , 2015, ACM Multimedia.
[83] Matthew J. Hausknecht,et al. Beyond short snippets: Deep networks for video classification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[84] Tinne Tuytelaars,et al. Modeling video evolution for action recognition , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[85] Nitish Srivastava,et al. Unsupervised Learning of Video Representations using LSTMs , 2015, ICML.
[86] Chong-Wah Ngo,et al. Multimedia Event Detection , 2015 .
[87] Dong Liu,et al. EventNet: A Large Scale Structured Concept Library for Complex Event Detection in Video , 2015, ACM Multimedia.
[88] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[89] Andrew Zisserman,et al. Convolutional Two-Stream Network Fusion for Video Action Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).