How Deep Features Have Improved Event Recognition in Multimedia
[1] Farid Melgani,et al. A pool of deep models for event recognition , 2017, 2017 IEEE International Conference on Image Processing (ICIP).
[2] Andrew Zisserman,et al. Reading Text in the Wild with Convolutional Neural Networks , 2014, International Journal of Computer Vision.
[3] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Yi Yang,et al. Uncovering the Temporal Context for Video Question Answering , 2017, International Journal of Computer Vision.
[5] Francesco G. B. De Natale,et al. Robust event discovery from photo collections using Signature Image Bases (SIBs) , 2012, Multimedia Tools and Applications.
[6] Changsheng Li,et al. Combining remote sensing and ground census data to develop new maps of the distribution of rice agriculture in China , 2002 .
[7] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[8] Gernot A. Fink,et al. A Bag-of-Features approach to acoustic event detection , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Xiu-Shen Wei,et al. Deep Spatial Pyramid Ensemble for Cultural Event Recognition , 2015, 2015 IEEE International Conference on Computer Vision Workshop (ICCVW).
[10] Yiannis S. Boutalis,et al. Selection of the proper Compact Composite Descriptor for improving content based image retrieval , 2009 .
[11] Justin Salamon,et al. A Dataset and Taxonomy for Urban Sound Research , 2014, ACM Multimedia.
[12] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[13] Yi Yang,et al. DevNet: A Deep Event Network for multimedia event detection and evidence recounting , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Xiao Liu,et al. Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[15] Nicu Sebe,et al. Learning Deep Representations of Appearance and Motion for Anomalous Event Detection , 2015, BMVC.
[16] Matthieu Guillaumin,et al. Event Recognition in Photo Collections with a Stopwatch HMM , 2013, 2013 IEEE International Conference on Computer Vision.
[17] Ebroul Izquierdo,et al. MediaEval Benchmark: Social Event Detection in collaborative photo collections , 2011, MediaEval.
[18] Michael Riegler,et al. Social media and satellites , 2019, Multimedia Tools and Applications.
[19] Shih-Fu Chang,et al. Deep Cross Residual Learning for Multitask Visual Recognition , 2016, ACM Multimedia.
[20] David G. Lowe. Distinctive Image Features from Scale-Invariant Keypoints , 2004 .
[21] Trevor Darrell,et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[22] Larry S. Davis,et al. Selecting Relevant Web Trained Concepts for Automated Event Retrieval , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[23] Nicola Conci,et al. Convolutional Neural Networks for Disaster Images Retrieval , 2017, MediaEval.
[24] Florian Metze,et al. Detection for Real Life Audio DCASE Challenge , 2016 .
[25] Shih-Fu Chang,et al. Exploiting Feature and Class Relationships in Video Categorization with Regularized Deep Neural Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[26] Liang Lin,et al. Deep feature learning with relative distance comparison for person re-identification , 2015, Pattern Recognit..
[27] Nicolai Petkov,et al. Reliable detection of audio events in highly noisy environments , 2015, Pattern Recognit. Lett..
[28] Birger Kollmeier,et al. On the use of spectro-temporal features for the IEEE AASP challenge ‘detection and classification of acoustic scenes and events’ , 2013, 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
[29] Reza Fuad Rachmadi,et al. Spatial Pyramid Convolutional Neural Network for Social Event Detection in Static Image , 2016, ArXiv.
[30] Yiannis Kompatsiaris,et al. CERTH @ MediaEval 2013 Social Event Detection Task , 2013, MediaEval.
[31] Abhinav Gupta,et al. Videos as Space-Time Region Graphs , 2018, ECCV.
[32] Ying Liu,et al. Geological Disaster Recognition on Optical Remote Sensing Images Using Deep Learning , 2016 .
[33] Otávio A. B. Penatti,et al. Exploiting ConvNet Diversity for Flooding Identification , 2017, IEEE Geoscience and Remote Sensing Letters.
[34] Ainuddin Wahid Abdul Wahab,et al. An Overview of Audio Event Detection Methods from Feature Extraction to Classification , 2017, Appl. Artif. Intell..
[35] Chuang Gan,et al. End-to-End Learning of Motion Representation for Video Understanding , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[36] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.
[37] Xin Liu,et al. Exploiting Feature Hierarchies with Convolutional Neural Networks for Cultural Event Recognition , 2015, 2015 IEEE International Conference on Computer Vision Workshop (ICCVW).
[38] Larry S. Davis,et al. Exploiting local features from deep networks for image retrieval , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[39] Luc Van Gool,et al. AENet: Learning Deep Audio Features for Video Analysis , 2017, IEEE Transactions on Multimedia.
[40] Jun Wang,et al. Solving the Multiple-Instance Problem: A Lazy Learning Approach , 2000, ICML.
[41] Dahua Lin,et al. Recognize complex events from static images by fusing deep channels , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[42] Sanjay Chawla,et al. Nazr-CNN: Fine-Grained Classification of UAV Imagery for Damage Assessment , 2016, 2017 IEEE International Conference on Data Science and Advanced Analytics (DSAA).
[43] Andreas Kamilaris,et al. Disaster Monitoring using Unmanned Aerial Vehicles and Deep Learning , 2018, ArXiv.
[44] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[45] Huy Phan,et al. Robust Audio Event Recognition with 1-Max Pooling Convolutional Neural Networks , 2016, INTERSPEECH.
[46] Yongdong Zhang,et al. Deep Fusion of Multiple Semantic Cues for Complex Event Recognition , 2016, IEEE Transactions on Image Processing.
[47] Francesco G. B. De Natale,et al. A hierarchical approach to event discovery from single images using MIL framework , 2016, 2016 IEEE Global Conference on Signal and Information Processing (GlobalSIP).
[48] Xavier Serra,et al. Freesound technical demo , 2013, ACM Multimedia.
[49] Alexander G. Hauptmann,et al. MoSIFT: Recognizing Human Actions in Surveillance Videos (CMU-CS-09-161) , 2009 .
[50] Florian Metze,et al. A first attempt at polyphonic sound event detection using connectionist temporal classification , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[51] Koen E. A. van de Sande,et al. Selective Search for Object Recognition , 2013, International Journal of Computer Vision.
[52] Kyogu Lee,et al. Rare Sound Event Detection Using 1D Convolutional Recurrent Neural Networks , 2017, DCASE.
[53] C.-C. Jay Kuo,et al. Where am I? Scene Recognition for Mobile Robots using Audio Features , 2006, 2006 IEEE International Conference on Multimedia and Expo.
[54] Ankit Shah,et al. DCASE2017 Challenge Setup: Tasks, Datasets and Baseline System , 2017, DCASE.
[55] G. Carbone,et al. Monitoring agricultural drought for arid and humid regions using multi-sensor remote sensing data , 2010 .
[56] Heikki Huttunen,et al. Polyphonic sound event detection using multi label deep neural networks , 2015, 2015 International Joint Conference on Neural Networks (IJCNN).
[57] Sergey Ioffe,et al. Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[58] Shu-Ching Chen,et al. Automatic Video Event Detection for Imbalance Data Using Enhanced Ensemble Deep Learning , 2017, Int. J. Semantic Comput..
[59] Michael Riegler,et al. CNN and GAN Based Satellite and Social Media Data Fusion for Disaster Detection , 2017, MediaEval.
[60] David A. Shamma,et al. YFCC100M , 2015, Commun. ACM.
[61] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[62] Georgios Petkos,et al. Social Event Detection at MediaEval : a three-year retrospect of tasks and results , 2014 .
[63] Michael Riegler,et al. LIRE: open source visual information retrieval , 2016, MMSys.
[64] D. T. Lee,et al. Video Event Detection via Multi-modality Deep Learning , 2014, 2014 22nd International Conference on Pattern Recognition.
[65] Karol J. Piczak. ESC: Dataset for Environmental Sound Classification , 2015, ACM Multimedia.
[66] Tuomas Virtanen,et al. Sound Event Detection in Multichannel Audio Using Spatial and Harmonic Features , 2017, DCASE.
[67] Christopher Joseph Pal,et al. Describing Videos by Exploiting Temporal Structure , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[68] Francesco G. B. De Natale,et al. A saliency-based approach to event recognition , 2018, Signal Process. Image Commun..
[69] Yu Tsao,et al. FOR TASK 3 : SOUND EVENT DETECTION IN REAL LIFE AUDIO , 2016 .
[70] Abhinav Gupta,et al. Non-local Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[71] Ramakant Nevatia,et al. Video-based event recognition: activity representation and probabilistic recognition methods , 2004, Comput. Vis. Image Underst..
[72] Francesco G. B. De Natale,et al. USED: a large-scale social event detection dataset , 2016, MMSys.
[73] Dmitrii Ubskii,et al. Sound Event Detection in Real-Life Audio , 2016 .
[74] Yi Yang,et al. Semantic Pooling for Complex Event Analysis in Untrimmed Videos , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[75] Mark D. McDonnell,et al. Understanding Data Augmentation for Classification: When to Warp? , 2016, 2016 International Conference on Digital Image Computing: Techniques and Applications (DICTA).
[76] Tuomas Virtanen,et al. Filterbank learning for deep neural network based polyphonic sound event detection , 2016, 2016 International Joint Conference on Neural Networks (IJCNN).
[77] Florian Metze,et al. CMU-Informedia @ TRECVID 2013 Multimedia Event Detection , 2013 .
[78] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[79] Ramakant Nevatia,et al. VERL: An Ontology Framework for Representing and Annotating Video Events , 2005, IEEE Multim..
[80] Ji-Hwan Kim,et al. Audio Event Classification Using Deep Neural Networks , 2015 .
[81] Sergio Escalera,et al. ChaLearn Looking at People 2015: Apparent Age and Cultural Event Recognition Datasets and Results , 2015, 2015 IEEE International Conference on Computer Vision Workshop (ICCVW).
[82] Nicu Sebe,et al. Event-based media processing and analysis: A survey of the literature , 2016, Image Vis. Comput..
[83] Tao Mei,et al. Multigranular Event Recognition of Personal Photo Albums , 2018, IEEE Transactions on Multimedia.
[84] Qiang Ji,et al. Video event recognition with deep hierarchical context model , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[85] Thomas S. Huang,et al. Album-based object-centric event recognition , 2011, 2011 IEEE International Conference on Multimedia and Expo.
[86] Xiao Liu,et al. Multimodal Keyless Attention Fusion for Video Classification , 2018, AAAI.
[87] Nadjia Benblidia,et al. Event recognition in photo albums using probabilistic graphical models and feature relevance , 2016, J. Vis. Commun. Image Represent..
[88] Moncef Gabbouj,et al. Supervised model training for overlapping sound events based on unsupervised source separation , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[89] Janto Skowronek,et al. Automatic surveillance of the acoustic activity in our living environment , 2005, 2005 IEEE International Conference on Multimedia and Expo.
[90] Dennis Koelma,et al. The ImageNet Shuffle: Reorganized Pre-training for Video Event Detection , 2016, ICMR.
[91] Minh-Son Dao,et al. A Domain-based Late-Fusion for Disaster Image Retrieval from Social Media , 2017, MediaEval.
[92] Dimitar Filev,et al. Induced ordered weighted averaging operators , 1999, IEEE Trans. Syst. Man Cybern. Part B.
[93] Yiannis Kompatsiaris,et al. Visual and Textual Analysis of Social Media and Satellite Images for Flood Detection @ Multimedia Satellite Task MediaEval 2017 , 2017, MediaEval.
[94] Amaia Salvador,et al. Cultural Event recognition with visual ConvNets and temporal models , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[95] Yi Yang,et al. You Lead, We Exceed: Labor-Free Video Concept Learning by Jointly Exploiting Web Videos and Images , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[96] Zhe Wang,et al. Better Exploiting OS-CNNs for Better Event Recognition in Images , 2015, 2015 IEEE International Conference on Computer Vision Workshop (ICCVW).
[97] Li Fei-Fei,et al. Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos , 2015, International Journal of Computer Vision.
[98] Yiannis S. Boutalis,et al. CEDD: Color and Edge Directivity Descriptor: A Compact Descriptor for Image Indexing and Retrieval , 2008, ICVS.
[99] Cordelia Schmid,et al. AXES at TRECVID 2012: KIS, INS, and MED , 2012, TRECVID.
[100] Heikki Huttunen,et al. Recognition of acoustic events using deep neural networks , 2014, 2014 22nd European Signal Processing Conference (EUSIPCO).
[101] Quoc V. Le,et al. AutoAugment: Learning Augmentation Policies from Data , 2018, ArXiv.
[102] Nitish Srivastava,et al. Exploiting Image-trained CNN Architectures for Unconstrained Video Classification , 2015, BMVC.
[103] Ebroul Izquierdo,et al. Social event detection and retrieval in collaborative photo collections , 2012, ICMR '12.
[104] Fei-Fei Li,et al. What, where and who? Classifying events by scene and object recognition , 2007, 2007 IEEE 11th International Conference on Computer Vision.
[105] Francesco G. B. De Natale,et al. Event recognition in personal photo collections via multiple instance learning-based classification of multiple images , 2017, J. Electronic Imaging.
[106] Chen Sun,et al. Webly-Supervised Video Recognition by Mutually Voting for Relevant Web Images and Web Video Frames , 2016, ECCV.
[107] Muhammad Hanif,et al. Flood detection using Social Media Data and Spectral Regression based Kernel Discriminant Analysis , 2017, MediaEval.
[108] Benjamin Bischke,et al. The Multimedia Satellite Task at MediaEval 2018: Emergency Response for Flooding Events , 2018 .
[109] Shengchen Li,et al. Sound Event Detection in Real Life Audio Using Multi-Model System , 2017 .
[110] Alberto Del Bimbo,et al. Deep networks for audio event classification in soccer videos , 2009, 2009 IEEE International Conference on Multimedia and Expo.
[111] Andreas Dengel,et al. Contextual Enrichment of Remote-Sensed Events with Social Media Streams , 2016, ACM Multimedia.
[112] Zi Huang,et al. Robust spatial-temporal deep model for multimedia event detection , 2016, Neurocomputing.
[113] Florian Metze,et al. Recurrent Support Vector Machines for Audio-Based Multimedia Event Detection , 2016, ICMR.
[114] Lin Li,et al. Data-Driven Flood Detection using Neural Networks , 2017, MediaEval.
[115] Xiaoming Liu,et al. Sports Videos in the Wild (SVW): A video dataset for sports analysis , 2015, 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG).
[116] Annamaria Mesaros,et al. Metrics for Polyphonic Sound Event Detection , 2016 .
[117] Farid Melgani,et al. Ensemble of Deep Models for Event Recognition , 2018, ACM Trans. Multim. Comput. Commun. Appl..
[118] Qiang Chen,et al. Network In Network , 2013, ICLR.
[119] Nicola Conci,et al. Event Recognition in Personal Photo Collections: An Active Learning Approach , 2018, Visual Information Processing and Communication.
[120] Daniel P. W. Ellis,et al. IBM Research and Columbia University TRECVID-2011 Multimedia Event Detection (MED) System , 2011, TRECVID.
[121] Lothar Thiele,et al. Efficient Convolutional Neural Network For Audio Event Detection , 2017, ArXiv.
[122] Wei Zhang,et al. Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[123] Daniel P. W. Ellis,et al. Spectral vs. spectro-temporal features for acoustic event detection , 2011, 2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).
[124] Heikki Huttunen,et al. Recurrent neural networks for polyphonic sound event detection in real life recordings , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[125] Nam Soo Kim,et al. DNN-Based Sound Event Detection with Exemplar-Based Approach for Noise Reduction , 2016 .
[126] T. Andringa,et al. DARES-G1: Database of Annotated Real-world Everyday Sounds , 2009 .
[127] Mubarak Shah,et al. Real-World Anomaly Detection in Surveillance Videos , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[128] Samira Pouyanfar,et al. Semantic Event Detection Using Ensemble Deep Learning , 2016, 2016 IEEE International Symposium on Multimedia (ISM).
[129] Liang Wang,et al. Learning Representative Deep Features for Image Set Analysis , 2015, IEEE Transactions on Multimedia.
[130] Georges Quénot,et al. TRECVID 2015 - An Overview of the Goals, Tasks, Data, Evaluation Mechanisms and Metrics , 2011, TRECVID.
[131] Bolei Zhou,et al. Learning Deep Features for Scene Recognition using Places Database , 2014, NIPS.
[132] Gernot A. Fink,et al. Bag-of-Features Acoustic Event Detection for Sensor Networks , 2016 .
[133] Archontis Politis,et al. Multichannel Sound Event Detection Using 3D Convolutional Neural Networks for Learning Inter-channel Features , 2018, 2018 International Joint Conference on Neural Networks (IJCNN).
[134] Xinmei Tian,et al. Event recognition in personal photo collections using hierarchical model and multiple features , 2015, 2015 IEEE 17th International Workshop on Multimedia Signal Processing (MMSP).
[135] Nicolai Petkov,et al. Audio Surveillance of Roads: A System for Detecting Anomalous Sounds , 2016, IEEE Transactions on Intelligent Transportation Systems.
[136] Yi Yang,et al. A discriminative CNN video representation for event detection , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[137] Vasileios Mezaris. Social event detection at MediaEval: a 3-year retrospect of tasks and results , 2014 .
[138] Dong Liu,et al. EventNet: A Large Scale Structured Concept Library for Complex Event Detection in Video , 2015, ACM Multimedia.
[139] R. Eberhart,et al. Empirical study of particle swarm optimization , 1999, Proceedings of the 1999 Congress on Evolutionary Computation-CEC99 (Cat. No. 99TH8406).
[140] Luc Van Gool,et al. Deep Convolutional Neural Networks and Data Augmentation for Acoustic Event Detection , 2016 .
[141] Kyogu Lee,et al. Ensemble of Convolutional Neural Networks for Weakly-supervised Sound Event Detection Using Multiple Scale Input , 2017, DCASE.
[142] Zvi Kons,et al. Audio event classification using deep neural networks , 2013, INTERSPEECH.
[143] Soma Shiraishi,et al. Analysis of satellite images for disaster detection , 2016, 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS).
[144] Florian Metze,et al. Audio-based multimedia event detection using deep recurrent neural networks , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[145] Benoit Huet,et al. Heterogeneous features and model selection for event-based media classification , 2013, ICMR.
[146] Alexander G. Hauptmann,et al. Leveraging high-level and low-level features for multimedia event detection , 2012, ACM Multimedia.
[147] Joost van de Weijer,et al. Multi-modal Deep Learning Approach for Flood Detection , 2017, MediaEval.
[148] Francesco G. B. De Natale,et al. A Comparative Study of Global and Deep Features for the Analysis of User-Generated Natural Disaster Related Images , 2018, 2018 IEEE 13th Image, Video, and Multidimensional Signal Processing Workshop (IVMSP).
[149] Xiaoqiang Lu,et al. Deep Representation for Abnormal Event Detection in Crowded Scenes , 2016, ACM Multimedia.
[150] Andrew Zisserman,et al. Representing shape with a spatial pyramid kernel , 2007, CIVR '07.
[151] Christopher Hunt,et al. Notes on the OpenSURF Library , 2009 .
[152] Andreas Dengel,et al. Detection of Flooding Events in Social Multimedia and Satellite Imagery using Deep Neural Networks , 2017, MediaEval.
[153] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[154] Yiannis Kompatsiaris,et al. Social Event Detection at MediaEval 2012: Challenges, Dataset and Evaluation , 2012, MediaEval.
[155] Dan Stowell,et al. Detection and classification of acoustic scenes and events: An IEEE AASP challenge , 2013, 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
[156] Tomoki Toda,et al. BLSTM-HMM hybrid system combined with sound activity detection network for polyphonic Sound Event Detection , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[157] Cordelia Schmid,et al. Action recognition by dense trajectories , 2011, CVPR 2011.
[158] Tao Chen,et al. DeepSentiBank: Visual Sentiment Concept Classification with Deep Convolutional Neural Networks , 2014, ArXiv.
[159] Tomoki Toda,et al. Bidirectional LSTM-HMM Hybrid System for Polyphonic Sound Event Detection , 2016, DCASE.
[160] Alexander G. Hauptmann,et al. MoSIFT: Recognizing Human Actions in Surveillance Videos , 2009 .
[161] Graham W. Taylor,et al. Dataset Augmentation in Feature Space , 2017, ICLR.
[162] Cordelia Schmid,et al. Action Recognition with Improved Trajectories , 2013, 2013 IEEE International Conference on Computer Vision.
[163] Nojun Kwak,et al. Cultural event recognition by subregion classification with convolutional neural network , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[164] Luc Van Gool,et al. Transferring Deep Object and Scene Representations for Event Recognition in Still Images , 2017, International Journal of Computer Vision.
[165] Antonio Torralba,et al. SoundNet: Learning Sound Representations from Unlabeled Video , 2016, NIPS.
[166] John R. Smith,et al. Large-scale concept ontology for multimedia , 2006, IEEE MultiMedia.
[167] Tao Mei,et al. Relaxing from Vocabulary: Robust Weakly-Supervised Deep Learning for Vocabulary-Free Image Tagging , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[168] Tuomas Virtanen,et al. Acoustic event detection in real life recordings , 2010, 2010 18th European Signal Processing Conference.
[169] Tuomas Virtanen,et al. Detection and Classification of Acoustic Scenes and Events , 2018 .