Automatic Lyric Transcription from Karaoke Vocal Tracks: Resources and a Baseline System
[1] Sanjeev Khudanpur, et al. Librispeech: An ASR corpus based on public domain audio books, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Andreas Stolcke, et al. SRILM - an extensible language modeling toolkit, 2002, INTERSPEECH.
[3] Marly Mageau, et al. Foreign Accents in Song and Speech, 2016.
[4] Keikichi Hirose, et al. Results of aligning and reformatting the dictionary as a corpus of joint sequences. A ‘,’ indicates a one-to-many relationship, while ‘, 2016.
[5] Tuomas Virtanen, et al. Recognition of phonemes and words in singing, 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[6] Paul Deléglise, et al. TED-LIUM: an Automatic Speech Recognition dedicated corpus, 2012, LREC.
[7] Lauren Brittany Collister, et al. Comparison of Word Intelligibility in Spoken and Sung Phrases, 2008.
[8] Sanjeev Khudanpur, et al. End-to-end Speech Recognition Using Lattice-free MMI, 2018, INTERSPEECH.
[9] Lin-Shan Lee, et al. Transcribing Lyrics from Commercial Song Audio: the First Step Towards Singing Content Processing, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Andy Gibson, et al. Production and perception of vowels in New Zealand popular music, 2010.
[11] Yiming Wang, et al. Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI, 2016, INTERSPEECH.
[12] Anna M. Kruspe, et al. Bootstrapping a System for Phoneme Recognition and Keyword Spotting in Unaccompanied Singing, 2016, ISMIR.
[13] Carlos Gussenhoven, et al. Singing Your Accent Away, and Why It Works, 2011, ICPhS.
[14] Yiming Wang, et al. Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks, 2018, INTERSPEECH.
[15] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.
[16] Seiichi Nakagawa, et al. Speech analysis of sung-speech and lyric recognition in monophonic singing, 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Janet M. Baker, et al. The Design for the Wall Street Journal-based CSR Corpus, 1992, HLT.
[18] Kyu J. Han, et al. The CAPIO 2017 Conversational Speech Recognition System, 2017, ArXiv.
[19] Monika Konert-Panek, et al. Overshooting Americanisation. Accent stylisation in pop singing – acoustic properties of the bath and trap vowels in focus, 2017.