Björn W. Schuller | Mark D. Plumbley | Zhao Ren | Qiuqiang Kong | Jing Han
[1] Changsheng Xu,et al. Cross-Domain Feature Learning in Multimedia , 2015, IEEE Transactions on Multimedia.
[2] Colin Raffel,et al. Onsets and Frames: Dual-Objective Piano Transcription , 2017, ISMIR.
[3] Hayit Greenspan,et al. GAN-based Synthetic Medical Image Augmentation for increased CNN Performance in Liver Lesion Classification , 2018, Neurocomputing.
[4] Maarten De Vos,et al. DNN Filter Bank Improves 1-Max Pooling CNN for Single-Channel EEG Automatic Sleep Stage Classification , 2018, 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC).
[5] Mark D. Plumbley,et al. Attention-based Atrous Convolutional Neural Networks: Visualisation and Understanding Perspectives of Acoustic Scenes , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Jaume Amores,et al. Multiple instance classification: Review, taxonomy and comparative study , 2013, Artif. Intell.
[7] Yuxin Peng,et al. Life-long Cross-media Correlation Learning , 2018, ACM Multimedia.
[8] Björn W. Schuller,et al. Large-scale audio feature extraction and SVM for acoustic scene classification , 2013, 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
[9] Dacheng Tao,et al. Database Saliency for Fast Image Retrieval , 2015, IEEE Transactions on Multimedia.
[10] Kun Qian,et al. Teaching Machines on Snoring: A Benchmark on Computer Audition for Snore Sound Excitation Localisation , 2018 .
[11] Prasanta Kumar Ghosh,et al. Spectrogram Enhancement Using Multiple Window Savitzky-Golay (MWSG) Filter for Robust Bird Sound Detection , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[12] Ling-Yu Duan,et al. Unified Spatio-Temporal Attention Networks for Action Recognition in Videos , 2019, IEEE Transactions on Multimedia.
[13] Shuicheng Yan,et al. Conditional Convolutional Neural Network for Modality-Aware Face Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[14] Tomoki Toda,et al. Bidirectional LSTM-HMM Hybrid System for Polyphonic Sound Event Detection , 2016, DCASE.
[15] Aren Jansen,et al. CNN architectures for large-scale audio classification , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Sanjeev Khudanpur,et al. Deep Neural Network Embeddings for Text-Independent Speaker Verification , 2017, INTERSPEECH.
[17] Sercan Ömer Arik,et al. Deep Voice 2: Multi-Speaker Neural Text-to-Speech , 2017, NIPS.
[18] Eric Martinson,et al. Robotic Discovery of the Auditory Scene , 2007, Proceedings 2007 IEEE International Conference on Robotics and Automation.
[19] Sanjeev Khudanpur,et al. Deep neural network-based speaker embeddings for end-to-end speaker verification , 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[20] Kun Qian,et al. Learning Multi-Resolution Representations for Acoustic Scene Classification via Neural Networks , 2020 .
[21] Yoshua Bengio,et al. Professor Forcing: A New Algorithm for Training Recurrent Networks , 2016, NIPS.
[22] George Papandreou,et al. Rethinking Atrous Convolution for Semantic Image Segmentation , 2017, ArXiv.
[23] Gerhard Widmer,et al. Exploiting Parallel Audio Recordings to Enforce Device Invariance in CNN-based Acoustic Scene Classification , 2019, DCASE.
[24] Björn Schuller,et al. Wavelets Revisited for the Classification of Acoustic Scenes , 2017, DCASE.
[25] Vishal M. Patel,et al. CNN-Based cascaded multi-task learning of high-level prior and density estimation for crowd counting , 2017, 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).
[26] Stefano Squartini,et al. A convolutional neural network approach for acoustic scene classification , 2017, 2017 International Joint Conference on Neural Networks (IJCNN).
[27] Yusuke Ijima,et al. DNN-Based Speech Synthesis Using Speaker Codes , 2018, IEICE Trans. Inf. Syst.
[28] Huibing Wang,et al. Deep CNNs With Spatially Weighted Pooling for Fine-Grained Car Recognition , 2017, IEEE Transactions on Intelligent Transportation Systems.
[29] Jian Sun,et al. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[30] Thomas Brox,et al. U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.
[31] Yi-Hsuan Yang,et al. Weakly-supervised audio event detection using event-specific Gaussian filters and fully convolutional networks , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[32] Björn Schuller,et al. Deep Sequential Image Features on Acoustic Scene Classification , 2017, DCASE.
[33] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[34] Jian Sun,et al. Convolutional neural networks at constrained time cost , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Tao Xiang,et al. Bayesian Joint Modelling for Object Localisation in Weakly Labelled Images , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[36] Ronan Sicre,et al. Particular object retrieval with integral max-pooling of CNN activations , 2015, ICLR.
[37] Ashu Goyal,et al. Identification of source mobile hand sets using audio latency feature , 2019, Forensic science international.
[38] Arkady B. Zaslavsky,et al. Context Aware Computing for The Internet of Things: A Survey , 2013, IEEE Communications Surveys & Tutorials.
[39] Eduardo Coutinho,et al. Dynamic Difficulty Awareness Training for Continuous Emotion Prediction , 2018, IEEE Transactions on Multimedia.
[40] Huy Phan,et al. Audio Scene Classification with Deep Recurrent Neural Networks , 2017, INTERSPEECH.
[41] Tuomas Virtanen,et al. A multi-device dataset for urban acoustic scene classification , 2018, DCASE.
[42] Franz Pernkopf,et al. Acoustic Scene Classification with Mismatched Recording Devices Using Mixture of Experts Layer , 2019, 2019 IEEE International Conference on Multimedia and Expo (ICME).
[43] Tuomas Virtanen,et al. Detection and Classification of Acoustic Scenes and Events , 2018 .
[44] Tapio Lokki,et al. Techniques and Applications of Wearable Augmented Reality Audio , 2003 .
[45] Zhao Ren,et al. Exploring Deep Spectrum Representations via Attention-Based Recurrent and Convolutional Neural Networks for Speech Emotion Recognition , 2019, IEEE Access.
[46] Xin Xu,et al. Statistical Learning in Multiple Instance Problems , 2003 .
[47] Roberto Cipolla,et al. Multi-task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[48] Anurag Kumar,et al. Knowledge Transfer from Weakly Labeled Audio Using Convolutional Neural Network for Sound Events and Scenes , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[49] Mark D. Plumbley,et al. Acoustic Scene Classification: Classifying environments from the sounds they produce , 2014, IEEE Signal Processing Magazine.
[50] Justin Salamon,et al. Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification , 2016, IEEE Signal Processing Letters.
[51] Björn Schuller,et al. Sequence to Sequence Autoencoders for Unsupervised Representation Learning from Audio , 2017, DCASE.
[52] Kun Qian,et al. Deep Scalogram Representations for Acoustic Scene Classification , 2018, IEEE/CAA Journal of Automatica Sinica.
[53] Yoshua Bengio,et al. Attention-Based Models for Speech Recognition , 2015, NIPS.
[54] Guillaume Gravier,et al. One-Step Time-Dependent Future Video Frame Prediction with a Convolutional Encoder-Decoder Neural Network , 2016, ICIAP.
[55] Alain Trémeau,et al. Multi-task, multi-domain learning: Application to semantic segmentation and pose regression , 2017, Neurocomputing.
[56] Jean Ponce,et al. A Theoretical Analysis of Feature Pooling in Visual Recognition , 2010, ICML.
[57] Donald A. Adjeroh,et al. Unified Deep Supervised Domain Adaptation and Generalization , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[58] Björn W. Schuller,et al. Learning Image-based Representations for Heart Sound Classification , 2018, DH.