Slow-Fast Auditory Streams for Audio Recognition
Dima Damen | Evangelos Kazakos | Andrew Zisserman | Arsha Nagrani
[1] Elia Formisano, et al. Spectro-Temporal Processing in a Two-Stream Computational Model of Auditory Cortex, 2020, Frontiers in Computational Neuroscience.
[2] Yonghong Yan, et al. Integrating the Data Augmentation Scheme with Various Classifiers for Acoustic Scene Modeling, 2019, ArXiv.
[3] Xavier Serra, et al. Timbre analysis of music audio signals with convolutional neural networks, 2017, 2017 25th European Signal Processing Conference (EUSIPCO).
[4] Akshita Gupta, et al. Acoustic Features Fusion using Attentive Multi-channel Deep Architecture, 2018, 5th International Workshop on Speech Processing in Everyday Environments (CHiME 2018).
[5] Yann LeCun, et al. A Closer Look at Spatiotemporal Convolutions for Action Recognition, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[6] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Xinxing Chen, et al. Acoustic scene classification using multi-scale features, 2018, DCASE.
[8] Chuang Gan, et al. Deep Audio Priors Emerge From Harmonic Convolutional Networks, 2020, ICLR.
[9] Luc Van Gool, et al. Temporal Segment Networks: Towards Good Practices for Deep Action Recognition, 2016, ECCV.
[10] Gerhard Widmer, et al. The Receptive Field as a Regularizer in Deep Convolutional Neural Networks for Acoustic Scene Classification, 2019, 2019 27th European Signal Processing Conference (EUSIPCO).
[11] Xinyu Li, et al. Multi-stream Network With Temporal Attention For Environmental Sound Classification, 2019, INTERSPEECH.
[12] Jitendra Malik, et al. SlowFast Networks for Video Recognition, 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[13] Justin Salamon, et al. Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification, 2016, IEEE Signal Processing Letters.
[14] Yong Jae Lee, et al. Audiovisual SlowFast Networks for Video Recognition, 2020, ArXiv.
[15] Quoc V. Le, et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition, 2019, INTERSPEECH.
[16] Tan Lee, et al. Time-Frequency Feature Decomposition Based on Sound Duration for Acoustic Scene Classification, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Taejin Lee, et al. Designing Acoustic Scene Classification Models with CNN Variants Technical Report, 2020.
[18] Chin-Hui Lee, et al. Device-Robust Acoustic Scene Classification Based on Two-Stage Categorization and Data Augmentation, 2020, ArXiv.
[19] Jingyu Wang, et al. Environment Sound Classification Using a Two-Stream CNN Based on Decision-Level Fusion, 2019, Sensors.
[20] Andrew Zisserman, et al. Vggsound: A Large-Scale Audio-Visual Dataset, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Kyu J. Han, et al. State-of-the-Art Speech Recognition Using Multi-Stream Self-Attention with Dilated 1D Convolutions, 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[22] Antonio Torralba, et al. SoundNet: Learning Sound Representations from Unlabeled Video, 2016, NIPS.
[23] Mark D. McDonnell, et al. Acoustic Scene Classification Using Deep Residual Networks with Late Fusion of Separated High and Low Frequency Paths, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] D. Damen, et al. Rescaling Egocentric Vision: Collection, Pipeline and Challenges for EPIC-KITCHENS-100, 2020, International Journal of Computer Vision.
[25] Muhammad Huzaifah, et al. Comparison of Time-Frequency Representations for Environmental Sound Classification using Convolutional Neural Networks, 2017, ArXiv.
[26] Essa Yacoub, et al. Encoding of Natural Sounds at Multiple Spectral and Temporal Resolutions in the Human Auditory Cortex, 2014, PLoS Comput. Biol.
[27] Aren Jansen, et al. CNN architectures for large-scale audio classification, 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).