[1] Oriol Vinyals, et al. Representation Learning with Contrastive Predictive Coding, 2018, ArXiv.
[2] Chengzhu Yu, et al. Unsupervised Speech Recognition via Segmental Empirical Output Distribution Matching, 2018, ICLR.
[3] Odette Scharenborg, et al. Finding Maximum Margin Segments in Speech, 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[4] Morgan Sonderegger, et al. Montreal Forced Aligner: Trainable Text-Speech Alignment Using Kaldi, 2017, INTERSPEECH.
[5] Mark Hasegawa-Johnson, et al. Accurate speech segmentation by mimicking human auditory processing, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[6] Omer Levy, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019, ArXiv.
[7] Andrew L. Maas. Rectifier Nonlinearities Improve Neural Network Acoustic Models, 2013.
[8] Cynthia G. Clopper, et al. Automatic measurement of vowel duration via structured prediction, 2016, The Journal of the Acoustical Society of America.
[9] Joseph Keshet, et al. Phoneme Boundary Detection Using Learnable Segmental Features, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Joseph Keshet, et al. Automatic analysis of slips of the tongue: Insights into the cognitive architecture of speech production, 2016, Cognition.
[11] Jonathan G. Fiscus, et al. DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM (TIMIT), NIST, 1993.
[12] Bernd Pompino-Marschall, et al. Theoretical principles concerning segmentation, labelling strategies and levels of categorical annotation for spoken language database systems, 1993, EUROSPEECH.
[13] Jörg Franke, et al. Phoneme Boundary Detection using Deep Bidirectional LSTMs, 2016, ITG Symposium on Speech Communication.
[14] Mohammad Hossein Moattar, et al. A review on speaker diarization systems and approaches, 2012, Speech Commun.
[15] Yoram Singer, et al. Phoneme alignment based on discriminative learning, 2005, INTERSPEECH.
[16] Okko Johannes Räsänen, et al. Improving Phoneme segmentation with Recurrent Neural Networks, 2016, ArXiv.
[17] Hsiao-Chuan Wang, et al. Blind phone segmentation based on spectral change detection using Legendre polynomial approximation, 2015, The Journal of the Acoustical Society of America.
[18] Unto K. Laine, et al. An improved speech segmentation quality measure: the r-value, 2009, INTERSPEECH.
[19] Richard M. Schwartz, et al. Transcribing radio news, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[20] Lawrence R. Rabiner, et al. On the Relation between Maximum Spectra Boundaries, 2006.
[21] Okko Johannes Räsänen, et al. Blind Phoneme Segmentation With Temporal Prediction Errors, 2016, ACL.
[22] Gayatri M. Bhandari, et al. Audio Segmentation for Speech Recognition Using Segment Features, 2014.
[23] Okko Johannes Räsänen, et al. Basic cuts revisited: Temporal segmentation of speech into phone-like units with statistical learning at a pre-linguistic level, 2014, CogSci.
[24] Joseph Keshet, et al. Automatic Tools for Analyzing Spoken Hebrew, 2016.
[25] Unto K. Laine, et al. Blind Segmentation of Speech Using Non-Linear Filtering Methods, 2011.
[26] Hung-yi Lee, et al. Gate Activation Signal Analysis for Gated Recurrent Neural Networks and its Correlation with Phoneme Boundaries, 2017, INTERSPEECH.
[27] Constantine Kotropoulos, et al. Phonemic segmentation using the generalised Gamma distribution and small sample Bayesian information criterion, 2008, Speech Commun.
[28] Geoffrey E. Hinton, et al. A Simple Framework for Contrastive Learning of Visual Representations, 2020, ICML.
[29] Samy Bengio, et al. Discriminative keyword spotting, 2009, Speech Commun.
[30] Joseph Keshet, et al. Vowel duration measurement using deep neural networks, 2015, 2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP).
[31] William D. Raymond, et al. The Buckeye corpus of conversational speech: labeling conventions and a test of transcriber reliability, 2005, Speech Commun.
[32] Aapo Hyvärinen, et al. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models, 2010, AISTATS.
[33] Sanjeev Khudanpur, et al. Librispeech: An ASR corpus based on public domain audio books, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[34] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[35] Ronan Collobert, et al. wav2vec: Unsupervised Pre-training for Speech Recognition, 2019, INTERSPEECH.
[36] Carla Teixeira Lopes, et al. TIMIT Acoustic-Phonetic Continuous Speech Corpus, 2012.