Xin Shu | Dawei Liang | Yang Yang | Junhui Liu | Shaoyong Jia | Qiyue Liu