Discriminative frequency filter banks learning with neural networks
[1] Bin Gao,et al. Cochleagram-based audio pattern separation using two-dimensional non-negative matrix factorization with automatic sparsity adaptation , 2014, The Journal of the Acoustical Society of America.
[2] Biing-Hwang Juang,et al. An application of discriminative feature extraction to filter-bank-based speech recognition , 2001, IEEE Trans. Speech Audio Process..
[3] Jun Guo,et al. DNN Filter Bank Cepstral Coefficients for Spoofing Detection , 2017, IEEE Access.
[4] S Rosen,et al. Auditory filter nonlinearity at 2 kHz in normal hearing listeners , 1998, The Journal of the Acoustical Society of America.
[5] Roy D. Patterson,et al. A Dynamic Compressive Gammachirp Auditory Filterbank , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[6] Tara N. Sainath,et al. Learning filter banks within a deep neural network framework , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[7] Mark D. Plumbley,et al. Deep Neural Network Baseline for DCASE Challenge 2016 , 2016, DCASE.
[8] Seiichi Nakagawa,et al. A deep neural network integrated with filterbank learning for speech recognition , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Roger Hsiao,et al. Discriminative training of auditory filters of different shapes for robust speech recognition , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..
[10] P ? ? ? ? ? ? ? % ? ? ? ? , 1991 .
[11] S. S. Stevens,et al. The Relation of Pitch to Frequency: A Revised Scale , 1940 .
[12] Daniele Battaglino,et al. Acoustic scene classification using convolutional neural networks , 2016 .
[13] Steffen Roch,et al. C* - Algebras and Numerical Analysis , 2000 .
[14] Yoon Kim,et al. Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.
[15] Adi Ben-Israel,et al. Generalized inverses: theory and applications , 1974 .
[16] M. James,et al. The generalised inverse , 1978, The Mathematical Gazette.
[17] Richard Lippmann,et al. Speech recognition by machines and humans , 1997, Speech Commun..
[19] Jont B. Allen,et al. Short term spectral analysis, synthesis, and modification by discrete Fourier transform , 1977 .
[20] Jason Weston,et al. Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..
[21] Hemant A. Patil,et al. Novel Unsupervised Auditory Filterbank Learning Using Convolutional RBM for Speech Recognition , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[22] Meir Tzur,et al. Speech reconstruction from mel frequency cepstral coefficients and pitch frequency , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[23] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[24] Alain Biem,et al. Feature extraction based on minimum classification error/generalized probabilistic descent method , 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[25] Nicki Holighaus,et al. Theory, implementation and applications of nonstationary Gabor frames , 2011, J. Comput. Appl. Math..
[26] Thibaud Necciari,et al. A Perceptually Motivated Filter Bank with Perfect Reconstruction for Audio Signal Processing , 2016, ArXiv.
[27] E. Lopez-Poveda,et al. A human nonlinear cochlear filterbank , 2001, The Journal of the Acoustical Society of America.
[28] Steve Young,et al. The HTK book , 1995 .
[29] P. P. Vaidyanathan,et al. New results and open problems on nonuniform filter-banks , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).
[30] Xu Shao,et al. Speech reconstruction from mel-frequency cepstral coefficients using a source-filter model , 2002, INTERSPEECH.
[31] R. Fay,et al. Auditory perception of sound sources , 2007 .
[32] Richard F. Lyon,et al. History and future of auditory filter models , 2010, Proceedings of 2010 IEEE International Symposium on Circuits and Systems.
[33] David L. Donoho,et al. De-noising by soft-thresholding , 1995, IEEE Trans. Inf. Theory.
[34] Takumi Kobayashi,et al. Discriminatively learned filter bank for acoustic features , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[35] E. Zwicker,et al. Analytical expressions for critical‐band rate and critical bandwidth as a function of frequency , 1980 .
[36] Stan Davis,et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Se , 1980 .
[37] Jonathan Le Roux,et al. Consistent Wiener Filtering for Audio Source Separation , 2013, IEEE Signal Processing Letters.
[38] Ingrid Daubechies,et al. The wavelet transform, time-frequency localization and signal analysis , 1990, IEEE Trans. Inf. Theory.
[39] Alain Rakotomamonjy,et al. Histogram of gradients of Time-Frequency Representations for Audio scene detection , 2015, ArXiv.
[40] Huy Phan,et al. Audio Scene Classification with Deep Recurrent Neural Networks , 2017, INTERSPEECH.
[41] Piotr Majdak,et al. A time-frequency method for increasing the signal-to-noise ratio in system identification with exponential sweeps , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[42] Tsuyoshi Murata,et al. {m , 1934, ACML.
[43] Kai Yu,et al. Deep features for automatic spoofing detection , 2016, Speech Communication.
[44] Alfred Mertins,et al. Analysis and design of gammatone signal models , 2009, The Journal of the Acoustical Society of America.
[45] Huy Phan,et al. Improved Audio Scene Classification Based on Label-Tree Embeddings and Convolutional Neural Networks , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[46] Tara N. Sainath,et al. Learning the speech front-end with raw waveform CLDNNs , 2015, INTERSPEECH.
[47] David G. Stork,et al. Pattern Classification (2nd ed.) , 1999 .
[48] Brian R Glasberg,et al. Derivation of auditory filter shapes from notched-noise data , 1990, Hearing Research.
[49] Pravin Varaiya,et al. Bounded-input bounded-output stability of nonlinear time-varying differential systems , 1966 .
[50] DeLiang Wang,et al. CASA-Based Robust Speaker Identification , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[51] Hemant A. Patil,et al. Filterbank learning using Convolutional Restricted Boltzmann Machine for speech recognition , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[52] David G. Stork,et al. Pattern Classification , 1973 .
[53] Tuomas Virtanen,et al. TUT database for acoustic scene classification and sound event detection , 2016, 2016 24th European Signal Processing Conference (EUSIPCO).
[54] Bo Chen,et al. Robust deep feature for spoofing detection - the SJTU system for ASVspoof 2015 challenge , 2015, INTERSPEECH.
[55] Biing-Hwang Juang,et al. Discriminative feature extraction for speech recognition , 1993, Neural Networks for Signal Processing III - Proceedings of the 1993 IEEE-SP Workshop.
[56] Frank Lad,et al. Two Moments of the Logitnormal Distribution , 2008, Commun. Stat. Simul. Comput..
[57] Xu Shao,et al. Prediction of Fundamental Frequency and Voicing From Mel-Frequency Cepstral Coefficients for Unconstrained Speech Reconstruction , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[58] Antonio M. Peinado,et al. An application of minimum classification error to feature space transformations for speech recognition , 1996, Speech Commun..