Deep Scattering Spectrum
[1] Daniel P. W. Ellis,et al. Classifying soundtracks with audio texture features , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Tara N. Sainath,et al. A convex hull approach to sparse representations for exemplar-based speech recognition , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[3] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[4] Eero P. Simoncelli,et al. Sound Texture Perception via Statistics of the Auditory Periphery: Evidence from Sound Synthesis , 2011, Neuron.
[5] Juan Pablo Bello,et al. Learning a robust Tonnetz-space transform for automatic chord recognition , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Malcolm Slaney,et al. Solving Demodulation as an Optimization Problem , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[7] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[8] B. Kollmeier,et al. Modeling auditory processing of amplitude modulation. I. Detection and masking with narrow-band carriers. , 1997, The Journal of the Acoustical Society of America.
[9] Powen Ru,et al. Multiresolution spectrotemporal analysis of complex sounds. , 2005, The Journal of the Acoustical Society of America.
[10] Pedro J. Moreno,et al. On the use of support vector machines for phonetic classification , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).
[11] H. Hermansky,et al. The modulation spectrum in the automatic recognition of speech , 1997, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings.
[12] Jae Lim,et al. Signal estimation from modified short-time Fourier transform , 1984 .
[13] S. W. Beet,et al. Visual representations of speech signals , 1993 .
[14] Joakim Andén,et al. Multiscale Scattering for Audio Classification , 2011, ISMIR.
[15] Andrew K. Halberstadt. Heterogeneous acoustic measurements and multiple classifiers for speech recognition , 1999 .
[16] Richard E. Turner,et al. Probabilistic amplitude and frequency demodulation , 2011, NIPS.
[17] Bob L. Sturm. An analysis of the GTZAN music genre dataset , 2012, MIRUM '12.
[18] Joakim Andén,et al. Scattering transform for intrapartum fetal heart rate characterization and acidosis detection , 2013, 2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC).
[19] Geoffroy Peeters,et al. Audio identification based on spectral modeling of bark-bands energy and synchronization through onset detection , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Hsiao-Wuen Hon,et al. Speaker-independent phone recognition using hidden Markov models , 1989, IEEE Trans. Acoust. Speech Signal Process..
[21] S. Mallat. A wavelet tour of signal processing , 1998 .
[22] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[23] Joan Bruna. Scattering Representations for Recognition , 2013 .
[24] Roy D. Patterson,et al. Auditory images: How complex sounds are represented in the auditory system , 2000 .
[25] Geoffrey E. Hinton,et al. Acoustic Modeling Using Deep Belief Networks , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[26] Kun-Ming Yu,et al. Automatic Music Genre Classification Based on Modulation Spectral Analysis of Spectral and Cepstral Features , 2009, IEEE Transactions on Multimedia.
[27] Les E. Atlas,et al. A non-uniform modulation transform for audio coding with increased time resolution , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..
[28] Stéphane Mallat,et al. Group Invariant Scattering , 2011, ArXiv.
[29] Douglas Eck,et al. Learning Features from Music Audio with Deep Belief Networks , 2010, ISMIR.
[30] Stéphane Mallat,et al. Invariant Scattering Convolution Networks , 2012, IEEE transactions on pattern analysis and machine intelligence.
[31] George Tzanetakis,et al. Proceedings of the second international ACM workshop on Music information retrieval with user-centered and multimodal strategies , 2011, MM 2011.
[32] David Wessel,et al. Analyzing Drum Patterns Using Conditional Deep Belief Networks , 2012, ISMIR.
[33] Michael S. Lewicki,et al. Efficient auditory coding , 2006, Nature.
[34] Les E. Atlas,et al. Coherent envelope detection for modulation filtering of speech , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..
[35] George Tzanetakis,et al. Musical genre classification of audio signals , 2002, IEEE Trans. Speech Audio Process..
[36] Xu Chen,et al. Music genre classification using multiscale scattering and sparse representations , 2013, 2013 47th Annual Conference on Information Sciences and Systems (CISS).
[37] Yann LeCun,et al. Convolutional networks and applications in vision , 2010, Proceedings of 2010 IEEE International Symposium on Circuits and Systems.
[38] Juhan Nam,et al. Learning Sparse Feature Representations for Music Annotation and Retrieval , 2012, ISMIR.
[39] Stéphane Mallat,et al. Phase Retrieval for the Cauchy Wavelet Transform , 2014, ArXiv.
[40] Alexandre d'Aspremont,et al. Phase recovery, MaxCut and complex semidefinite programming , 2012, Math. Program..
[41] Stéphane Mallat,et al. Rotation, Scaling and Deformation Invariant Scattering for Texture Discrimination , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[42] Les E. Atlas,et al. Scalable and progressive audio codec , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[43] Honglak Lee,et al. Unsupervised feature learning for audio classification using convolutional deep belief networks , 2009, NIPS.
[44] Li Deng,et al. A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[45] Richard F. Lyon,et al. On the importance of time—a temporal representation of sound , 1993 .
[46] Hung-An Chang,et al. Hierarchical large-margin Gaussian mixture models for phonetic classification , 2007, 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU).
[47] Les E. Atlas,et al. EURASIP Journal on Applied Signal Processing 2003:7, 668–675 c ○ 2003 Hindawi Publishing Corporation Joint Acoustic and Modulation Frequency , 2003 .
[48] Joakim Andén,et al. Representing environmental sounds using the separable scattering transform , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[49] Nima Mesgarani,et al. Discrimination of speech from nonspeech based on multiscale spectro-temporal Modulations , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[50] Chih-Jen Lin,et al. LIBSVM: A library for support vector machines , 2011, TIST.
[51] Yann LeCun,et al. Unsupervised Learning of Sparse Features for Scalable Audio Classification , 2011, ISMIR.
[52] L. Lucy. An iterative technique for the rectification of observed distributions , 1974 .
[53] Yonina C. Eldar,et al. Phase Retrieval via Matrix Completion , 2011, SIAM Rev..