Learning vocal mode classifiers from heterogeneous data sources
[1] Olli Viikki, et al. Cepstral domain segmental feature vector normalization for noise robust speech recognition, 1998, Speech Commun.
[2] Olli Viikki, et al. On combining vocal tract length normalisation and speaker adaptation for noise robust speech recognition, 1999, EUROSPEECH.
[3] Theodoros Giannakopoulos. pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis, 2015, PLoS ONE.
[4] José L. Pérez-Córdoba, et al. Histogram equalization of speech representation for robust speech recognition, 2005, IEEE Transactions on Speech and Audio Processing.
[5] Hermann Ney, et al. Quantile based histogram equalization for online applications, 2002, INTERSPEECH.
[6] Yusuke Kida, et al. Voice Activity Detection: Merging Source and Filter-based Information, 2016, IEEE Signal Processing Letters.
[7] Heikki Huttunen, et al. Polyphonic sound event detection using multi label deep neural networks, 2015, 2015 International Joint Conference on Neural Networks (IJCNN).
[8] José Miguel Díaz-Báñez, et al. Characterization and Similarity in A Cappella Flamenco Cantes, 2010, ISMIR.
[9] Hermann Ney, et al. Enhanced histogram normalization in the acoustic feature space, 2002, INTERSPEECH.
[10] Florian Metze, et al. A comparison of Deep Learning methods for environmental sound detection, 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] David Pearce, et al. The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions, 2000, INTERSPEECH.
[12] Fred Cummins, et al. Speaker Identification Using Instantaneous Frequencies, 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[13] S. Molau, et al. Feature space normalization in adverse acoustic conditions, 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
[14] Karol J. Piczak. ESC: Dataset for Environmental Sound Classification, 2015, ACM Multimedia.
[15] Ning Ma, et al. The PASCAL CHiME speech separation and recognition challenge, 2013, Comput. Speech Lang.
[16] John Salvatier, et al. Theano: A Python framework for fast computation of mathematical expressions, 2016, ArXiv.
[17] Gautham J. Mysore, et al. Speaker and noise independent voice activity detection, 2013, INTERSPEECH.
[18] Simon Haykin, et al. Neural Networks: A Comprehensive Foundation, 1998.
[19] Matthias Mauch, et al. MedleyDB: A Multitrack Dataset for Annotation-Intensive MIR Research, 2014, ISMIR.
[20] Justin Salamon, et al. A Dataset and Taxonomy for Urban Sound Research, 2014, ACM Multimedia.
[21] Hermann Ney, et al. Quantile based histogram equalization for noise robust large vocabulary speech recognition, 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[22] Tobias Watzka, et al. Proceedings of the Detection and Classification of Acoustic Scenes and Events 2018 Workshop (DCASE2018), 2018.