The Use of Voice Source Features for Sung Speech Recognition
[1] Yiming Wang,et al. Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI , 2016, INTERSPEECH.
[2] Syed Shahnawazuddin,et al. Pitch-Normalized Acoustic Features for Robust Children's Speech Recognition , 2017, IEEE Signal Processing Letters.
[3] E. Thomas Doherty,et al. Acoustic characteristics of vocal oscillations: Vibrato, exaggerated vibrato, trill, and trillo , 1988 .
[4] Hervé Bourlard,et al. Using pitch frequency information in speech recognition , 2003, INTERSPEECH.
[5] J. M. Troost,et al. Ascending and Descending Melodic Intervals: Statistical Findings and Their Perceptual Relevance , 1989 .
[6] Ye Wang,et al. The NUS sung and spoken lyrics corpus: A quantitative comparison of singing and speech , 2013, 2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference.
[7] Benoît Favre,et al. Speaker adaptation of DNN-based ASR with i-vectors: does it actually adapt models to speakers? , 2014, INTERSPEECH.
[8] Thierry Dutoit,et al. A comparative study of pitch extraction algorithms on a large variety of singing sounds , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[9] Jyh-Shing Roger Jang,et al. On the Improvement of Singing Voice Separation for Monaural Recordings Using the MIR-1K Dataset , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[10] Lin-Shan Lee,et al. Transcribing Lyrics from Commercial Song Audio: the First Step Towards Singing Content Processing , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Anna M. Kruspe,et al. Bootstrapping a System for Phoneme Recognition and Keyword Spotting in Unaccompanied Singing , 2016, ISMIR.
[12] Abeer Alwan,et al. Joint Robust Voicing Detection and Pitch Estimation Based on Residual Harmonics , 2019, INTERSPEECH.
[13] Jon Barker,et al. Automatic Lyric Transcription from Karaoke Vocal Tracks: Resources and a Baseline System , 2019, INTERSPEECH.
[14] Haihua Xu,et al. Mandarin tone modeling using recurrent neural networks , 2017, ArXiv.
[15] J. Merrill,et al. Vocal Features of Song and Speech: Insights from Schoenberg's Pierrot Lunaire , 2017, Front. Psychol.
[16] Yiming Wang,et al. Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks , 2018, INTERSPEECH.
[17] Tuomas Virtanen,et al. Recognition of phonemes and words in singing , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[18] Mari Ostendorf,et al. Modeling lexical tones for Mandarin large vocabulary continuous speech recognition , 2006 .
[19] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[20] Seiichi Nakagawa,et al. Lyric recognition in monophonic singing using pitch-dependent DNN , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Jessica M. Foxton,et al. Speech Intonation Perception Deficits in Musical Tone Deafness (Congenital Amusia) , 2008 .
[22] C. Mezzedimi,et al. Singing voice: acoustic parameters after vocal warm-up and cool-down , 2018, Logopedics Phoniatrics Vocology.
[23] João Paulo Teixeira,et al. Vocal Acoustic Analysis - Jitter, Shimmer and HNR Parameters , 2013, CENTERIS 2013 - Conference on ENTERprise Information Systems / HCIST 2013 - International Conference on Health and Social Care Information Systems and Technologies.
[24] Syed Shahnawazuddin,et al. Pitch-Adaptive Front-End Features for Robust Children's ASR , 2016, INTERSPEECH.
[25] Kyu J. Han,et al. The CAPIO 2017 Conversational Speech Recognition System , 2017, ArXiv.
[26] Meysam Asgari,et al. Improving the accuracy and the robustness of harmonic model for pitch estimation , 2013, INTERSPEECH.
[27] Patrick Kenny,et al. Front-End Factor Analysis for Speaker Verification , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[28] J. Sundberg,et al. The Science of the Singing Voice , 1987 .
[29] David Talkin,et al. A Robust Algorithm for Pitch Tracking (RAPT) , 2005 .
[30] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] Paul Boersma,et al. Praat, a system for doing phonetics by computer , 2002 .
[33] Mireia Farrús,et al. Jitter and shimmer measurements for speaker recognition , 2007, INTERSPEECH.