Joint Time–Frequency Scattering
[1] B. Kollmeier,et al. Separable spectro-temporal Gabor filter bank features: Reducing the complexity of robust features for automatic speech recognition , 2015, The Journal of the Acoustical Society of America.
[2] Roy D. Patterson,et al. Auditory images: How complex sounds are represented in the auditory system , 2000 .
[3] Beth Logan,et al. Mel Frequency Cepstral Coefficients for Music Modeling , 2000, ISMIR.
[4] B. Kollmeier,et al. Spectro-temporal modulation subspace-spanning filter bank features for robust automatic speech recognition , 2012, The Journal of the Acoustical Society of America.
[5] H. Hermansky,et al. The modulation spectrum in the automatic recognition of speech , 1997, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings.
[6] Michal Valko,et al. Compressing the Input for CNNs with the First-Order Scattering Transform , 2018, ECCV.
[7] Sebastian Tschiatschek,et al. Frame and Segment Level Recurrent Neural Networks for Phone Classification , 2017, INTERSPEECH.
[8] Stephen McAdams,et al. A Comparison of Approaches to Timbre Descriptors in Music Information Retrieval and Music Psychology , 2016 .
[9] Judith C. Brown,et al. An efficient algorithm for the calculation of a constant Q transform , 1992 .
[10] Weihua Li,et al. Wavelet transform based convolutional neural network for gearbox fault classification , 2017, 2017 Prognostics and System Health Management Conference (PHM-Harbin).
[11] Jürgen Schmidhuber,et al. Deep learning in neural networks: An overview , 2014, Neural Networks.
[12] Matthias Mauch,et al. MedleyDB: A Multitrack Dataset for Annotation-Intensive MIR Research , 2014, ISMIR.
[13] M. Picheny,et al. Comparison of Parametric Representation for Monosyllabic Word Recognition in Continuously Spoken Sentences , 2017 .
[14] Corinna Cortes,et al. Support-Vector Networks , 1995, Machine Learning.
[15] Andrew Zisserman,et al. Look, Listen and Learn , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[16] Justin Salamon,et al. A Dataset and Taxonomy for Urban Sound Research , 2014, ACM Multimedia.
[17] Tara N. Sainath,et al. Deep Scattering Spectrum with deep neural networks , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Panu Somervuo,et al. Parametric Representations of Bird Sounds for Automatic Species Recognition , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[19] Stéphane Mallat,et al. Manifold Learning for Latent Variable Inference in Dynamical Systems , 2015, IEEE Transactions on Signal Processing.
[20] Diemo Schwarz,et al. State of the Art in Sound Texture Synthesis , 2011 .
[21] Maarten Versteegh,et al. A deep scattering spectrum — Deep Siamese network pipeline for unsupervised acoustic modeling , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Sadaoki Furui,et al. Speaker-independent isolated word recognition using dynamic features of speech spectrum , 1986, IEEE Trans. Acoust. Speech Signal Process..
[23] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[24] Richard F. Lyon,et al. On the importance of time—a temporal representation of sound , 1993 .
[25] Ronen Talmon,et al. Dynamical system classification with diffusion embedding for ECG-based person identification , 2017, Signal Process..
[26] Toshiya Hachisuka,et al. Wavelet Convolutional Neural Networks , 2018, ArXiv.
[27] Hsiao-Wuen Hon,et al. Speaker-independent phone recognition using hidden Markov models , 1989, IEEE Trans. Acoust. Speech Signal Process..
[28] Pedro J. Moreno,et al. On the use of support vector machines for phonetic classification , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).
[29] Geoffrey E. Hinton,et al. On the importance of initialization and momentum in deep learning , 2013, ICML.
[30] Stéphane Mallat,et al. Group Invariant Scattering , 2011, ArXiv.
[31] Stéphane Mallat,et al. Invariant Scattering Convolution Networks , 2012, IEEE transactions on pattern analysis and machine intelligence.
[32] Irène Waldspurger,et al. Exponential decay of scattering coefficients , 2016, 2017 International Conference on Sampling Theory and Applications (SampTA).
[33] Les E. Atlas,et al. A non-uniform modulation transform for audio coding with increased time resolution , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..
[34] Mark A. Richards,et al. Fundamentals of Radar Signal Processing , 2005 .
[35] Karol J. Piczak. ESC: Dataset for Environmental Sound Classification , 2015, ACM Multimedia.
[36] Xavier Serra,et al. Cross-Collection Evaluation for Music Classification Tasks , 2016, ISMIR.
[37] Antonio Torralba,et al. SoundNet: Learning Sound Representations from Unlabeled Video , 2016, NIPS.
[38] Nima Mesgarani,et al. Discrimination of speech from nonspeech based on multiscale spectro-temporal Modulations , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[39] Chih-Jen Lin,et al. LIBSVM: A library for support vector machines , 2011, TIST.
[40] Kevin Gimpel,et al. Discriminative segmental cascades for feature-rich phone recognition , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[41] Yann LeCun,et al. Convolutional networks and applications in vision , 2010, Proceedings of 2010 IEEE International Symposium on Circuits and Systems.
[42] Michael S. Lewicki,et al. Efficient auditory coding , 2006, Nature.
[43] Joakim Andén,et al. Deep Scattering Spectrum , 2013, IEEE Transactions on Signal Processing.
[44] Stéphane Mallat,et al. Understanding deep convolutional networks , 2016, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences.
[45] Cheng Shi,et al. 3D multi-resolution wavelet convolutional neural networks for hyperspectral image classification , 2017, Inf. Sci..
[46] B. Kollmeier,et al. Modeling auditory processing of amplitude modulation. I. Detection and masking with narrow-band carriers , 1997, The Journal of the Acoustical Society of America.
[47] Joakim Andén,et al. Scattering Transform for Intrapartum Fetal Heart Rate Variability Fractal Analysis: A Case-Control Study , 2014, IEEE Transactions on Biomedical Engineering.
[48] Powen Ru,et al. Multiresolution spectrotemporal analysis of complex sounds , 2005, The Journal of the Acoustical Society of America.
[49] Tuomas Virtanen,et al. Detection and Classification of Acoustic Scenes and Events , 2018 .
[50] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[51] Thomas Grill,et al. Inside the spectrogram: Convolutional Neural Networks in audio processing , 2017, 2017 International Conference on Sampling Theory and Applications (SampTA).
[52] Hermann Ney,et al. Acoustic modeling with deep neural networks using raw time signal for LVCSR , 2014, INTERSPEECH.
[53] Arthur Flexer,et al. Basic filters for convolutional neural networks applied to music: Training or design? , 2017, Neural Computing and Applications.
[54] S. Mallat. A wavelet tour of signal processing , 1998 .
[55] Benjamin Schrauwen,et al. Transfer Learning by Supervised Pre-training for Audio-based Music Classification , 2014, ISMIR.
[56] Gaël Richard,et al. Temporal Integration for Audio Classification With Application to Musical Instrument Classification , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[57] David Gelbart,et al. Improving word accuracy with Gabor feature extraction , 2002, INTERSPEECH.
[58] Stéphane Mallat,et al. Audio Texture Synthesis with Scattering Moments , 2013, ArXiv.
[59] Tara N. Sainath,et al. Learning the speech front-end with raw waveform CLDNNs , 2015, INTERSPEECH.
[60] Eero P. Simoncelli,et al. Sound Texture Perception via Statistics of the Auditory Periphery: Evidence from Sound Synthesis , 2022 .
[61] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..
[62] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[63] Stéphane Mallat,et al. Rotation, Scaling and Deformation Invariant Scattering for Texture Discrimination , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[64] Justin Salamon,et al. Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification , 2016, IEEE Signal Processing Letters.
[65] Matthew E. P. Davies,et al. Transfer Learning In Mir: Sharing Learned Latent Representations For Music Audio Classification And Similarity , 2013, ISMIR.
[66] Sergey Zagoruyko,et al. Scaling the Scattering Transform: Deep Hybrid Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).