A Study of Child Speech Extraction Using Joint Speech Enhancement and Separation in Realistic Conditions
Chin-Hui Lee | Lei Sun | Jun Du | Alejandrina Cristia | Xin Wang
[1] Jill Gilkerson, et al. Transcriptional Analyses of the LENA Natural Language Corpus, 2009.
[2] Jun Du, et al. SNR-Based Progressive Learning of Deep Neural Network for Speech Enhancement, 2016, INTERSPEECH.
[3] Treebank Penn, et al. Linguistic Data Consortium, 1999.
[4] Kenneth Ward Church, et al. The Second DIHARD Diarization Challenge: Dataset, task, and baselines, 2019, INTERSPEECH.
[5] Patrick Kenny, et al. Front-End Factor Analysis for Speaker Verification, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[7] Li-Rong Dai, et al. A Regression Approach to Speech Enhancement Based on Deep Neural Networks, 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[8] Anne S Warlaumont, et al. Infant-adult vocal interaction dynamics depend on infant vocal type, child-directedness of adult speech, and timeframe, 2019, Infant behavior & development.
[9] Alejandrina Cristia, et al. A step-by-step guide to collecting and analyzing long-format speech environment (LFSE) recordings, 2019, Collabra: Psychology.
[10] Geoffrey Zweig, et al. An introduction to computational networks and the computational network toolkit (invited talk), 2014, INTERSPEECH.
[11] Hao Zheng, et al. AISHELL-1: An open-source Mandarin speech corpus and a speech recognition baseline, 2017, 2017 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment (O-COCOSDA).
[12] Jesper Jensen, et al. An Algorithm for Intelligibility Prediction of Time–Frequency Weighted Noisy Speech, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[13] F. Tion, et al. Reliability of the LENA™ Language Environment Analysis System in Young Children's Natural Home Environment, 2009.
[14] Sanjeev Khudanpur, et al. Librispeech: An ASR corpus based on public domain audio books, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Ronald Rousseau, et al. Similarity measures in scientometric research: The Jaccard index versus Salton's cosine formula, 1989, Inf. Process. Manag.
[16] Geoffrey E. Hinton, et al. Visualizing Data using t-SNE, 2008.
[17] DeLiang Wang, et al. Ideal ratio mask estimation using deep neural networks for robust speech recognition, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[18] Björn W. Schuller, et al. Discriminatively trained recurrent neural networks for single-channel speech separation, 2014, 2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP).
[19] Yee Whye Teh, et al. A Fast Learning Algorithm for Deep Belief Nets, 2006, Neural Computation.
[20] Jun Du, et al. A Progressive Deep Learning Approach to Child Speech Separation, 2018, 2018 11th International Symposium on Chinese Spoken Language Processing (ISCSLP).
[21] Alejandrina Cristià, et al. Talker Diarization in the Wild: the Case of Child-centered Daylong Audio-recordings, 2018, INTERSPEECH.
[22] Jun Du, et al. Speech separation based on improved deep neural networks with dual outputs of speech features for both target and interfering speakers, 2014, The 9th International Symposium on Chinese Spoken Language Processing.