暂无分享,去创建一个
[1] Jonathan G. Fiscus,et al. Darpa Timit Acoustic-Phonetic Continuous Speech Corpus CD-ROM {TIMIT} | NIST , 1993 .
[2] Matthias Sperber,et al. Self-Attentional Acoustic Models , 2018, INTERSPEECH.
[3] Alexei Baevski,et al. Effectiveness of self-supervised pre-training for speech recognition , 2019, ArXiv.
[4] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[5] Maria Leonor Pacheco,et al. of the Association for Computational Linguistics: , 2001 .
[6] Alexei Baevski,et al. vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations , 2019, ICLR.
[7] Ke Xu,et al. Self-Attention Attribution: Interpreting Information Interactions Inside Transformer , 2020, AAAI.
[8] Jonathan G. Fiscus,et al. DARPA TIMIT:: acoustic-phonetic continuous speech corpus CD-ROM, NIST speech disc 1-1.1 , 1993 .
[9] Unto K. Laine,et al. An improved speech segmentation quality measure: the r-value , 2009, INTERSPEECH.
[10] Furu Wei,et al. Visualizing and Understanding the Effectiveness of BERT , 2019, EMNLP.
[11] Xiangang Li,et al. Improving Transformer-based Speech Recognition Using Unsupervised Pre-training , 2019, ArXiv.
[12] Hung-yi Lee,et al. Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] K. Sri Rama Murty,et al. Unsupervised Segmentation of Speech Signals Using Kernel-Gram Matrices , 2017, NCVPRIPG.
[15] Dipanjan Das,et al. BERT Rediscovers the Classical NLP Pipeline , 2019, ACL.
[16] Omer Levy,et al. What Does BERT Look at? An Analysis of BERT’s Attention , 2019, BlackboxNLP@ACL.
[17] Pengwei Wang,et al. Large-Scale Unsupervised Pre-Training for End-to-End Spoken Language Understanding , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Najim Dehak,et al. Unsupervised Acoustic Segmentation and Clustering Using Siamese Network Embeddings , 2019, INTERSPEECH.
[19] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[20] Nikos Fakotakis,et al. Phonetic segmentation using multiple speech features , 2008, Int. J. Speech Technol..
[21] Alex Wang,et al. What do you learn from context? Probing for sentence structure in contextualized word representations , 2019, ICLR.
[22] 知秀 柴田. 5分で分かる!? 有名論文ナナメ読み:Jacob Devlin et al. : BERT : Pre-training of Deep Bidirectional Transformers for Language Understanding , 2020 .
[23] Cassia Valentini-Botinhao,et al. Blind speech segmentation using spectrogram image-based features and Mel cepstral coefficients , 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[24] Anna Rumshisky,et al. Revealing the Dark Secrets of BERT , 2019, EMNLP.
[25] Jan Niehues,et al. Very Deep Self-Attention Networks for End-to-End Speech Recognition , 2019, INTERSPEECH.
[26] Kevin Gimpel,et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations , 2019, ICLR.
[27] Odette Scharenborg,et al. Unsupervised speech segmentation: an analysis of the hypothesized phone boundaries. , 2010, The Journal of the Acoustical Society of America.
[28] Morgan Sonderegger,et al. Montreal Forced Aligner: Trainable Text-Speech Alignment Using Kaldi , 2017, INTERSPEECH.
[29] Omer Levy,et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach , 2019, ArXiv.
[30] Alexander Löser,et al. VisBERT: Hidden-State Visualizations for Transformers , 2020, WWW.
[31] Yiming Yang,et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding , 2019, NeurIPS.
[32] Guangsen Wang,et al. Speech-XLNet: Unsupervised Acoustic Model Pretraining For Self-Attention Networks , 2020, INTERSPEECH.
[33] Hitoshi Yamamoto,et al. Attention Mechanism in Speaker Recognition: What Does it Learn in Deep Speaker Embedding? , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).