[1] Okko Johannes Räsänen, et al. Computational modeling of phonetic and lexical learning in early language acquisition: Existing models and future directions, 2012, Speech Commun.
[2] Karen Livescu, et al. Unsupervised Pre-Training of Bidirectional Speech Encoders via Masked Reconstruction, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Aren Jansen, et al. The Zero Resource Speech Challenge 2015: Proposed Approaches and Results, 2016, SLTU.
[4] Thomas Schatz, et al. Neural network vs. HMM speech recognition systems as models of human cross-linguistic phonetic perception, 2018.
[5] Herman Kamper, et al. Unsupervised Feature Learning for Speech Using Correspondence and Siamese Networks, 2020, IEEE Signal Processing Letters.
[6] Gaetan Hadjeres, et al. Vector Quantized Contrastive Predictive Coding for Template-based Music Generation, 2020, ArXiv.
[7] Sakriani Sakti, et al. The Zero Resource Speech Challenge 2019: TTS without T, 2019, INTERSPEECH.
[8] Sanjeev Khudanpur, et al. Unsupervised Learning of Acoustic Sub-word Units, 2008, ACL.
[9] Alexander Kain, et al. Spectral voice conversion for text-to-speech synthesis, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[10] Haizhou Li, et al. VQVAE Unsupervised Unit Discovery and Multi-scale Code2Spec Inverter for Zerospeech Challenge 2019, 2019, INTERSPEECH.
[11] Hao Tang, et al. An Unsupervised Autoregressive Model for Speech Representation Learning, 2019, INTERSPEECH.
[12] Oriol Vinyals, et al. Neural Discrete Representation Learning, 2017, NIPS.
[13] Thomas Hain, et al. Unsupervised Acoustic Unit Representation Learning for Voice Conversion using WaveNet Auto-encoders, 2020, INTERSPEECH.
[14] Lorenzo Rosasco, et al. Discovering discrete subword units with binarized autoencoders and hidden-Markov-model encoders, 2015, INTERSPEECH.
[15] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[16] Armand Joulin, et al. Libri-Light: A Benchmark for ASR with Limited or No Supervision, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] James R. Glass, et al. Unsupervised Lexicon Discovery from Acoustic Input, 2015, TACL.
[18] Satoshi Nakamura, et al. Learning Supervised Feature Transformations on Zero Resources for Improved Acoustic Unit Discovery, 2018, IEICE Trans. Inf. Syst.
[19] Zhizheng Wu, et al. Merlin: An Open Source Neural Network Speech Synthesis System, 2016, SSW.
[20] S. Sakti, et al. Development of HMM-based Indonesian Speech Synthesis, 2008.
[21] Aren Jansen, et al. The zero resource speech challenge 2017, 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[22] Micha Elsner, et al. Measuring the perceptual availability of phonological features during language acquisition using unsupervised binary stochastic autoencoders, 2019, NAACL.
[23] Ron J. Weiss, et al. Unsupervised Speech Representation Learning Using WaveNet Autoencoders, 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[24] Yoshua Bengio, et al. Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation, 2013, ArXiv.
[25] Nicolas Usunier, et al. Joint Learning of Speaker and Phonetic Similarities with Siamese Networks, 2016, INTERSPEECH.
[26] Thomas Drugman, et al. Towards Achieving Robust Universal Neural Vocoding, 2018, INTERSPEECH.
[27] James R. Glass, et al. A Nonparametric Bayesian Approach to Acoustic Model Discovery, 2012, ACL.
[28] Herbert Gish, et al. Unsupervised training of an HMM-based self-organizing unit recognizer with applications to topic classification and keyword discovery, 2014, Comput. Speech Lang.
[29] James Glass, et al. Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech, 2020, ICLR.
[30] F. Pellegrino, et al. Different languages, similar encoding efficiency: Comparable information rates across the human communicative niche, 2019, Science Advances.
[31] Oriol Vinyals, et al. Representation Learning with Contrastive Predictive Coding, 2018, ArXiv.
[32] Lin-Shan Lee, et al. Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations, 2018, INTERSPEECH.
[33] Aren Jansen, et al. Evaluating speech features with the minimal-pair ABX task: analysis of the classical MFC/PLP pipeline, 2013, INTERSPEECH.
[34] Satoshi Nakamura, et al. Development of Indonesian Large Vocabulary Continuous Speech Recognition System within A-STAR Project, 2008, IJCNLP.
[35] Hao Wu, et al. Mixed Precision Training, 2017, ICLR.
[36] Alexei Baevski, et al. vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations, 2019, ICLR.
[37] Armand Joulin, et al. Unsupervised Pretraining Transfers Well Across Languages, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[38] J. Flanagan. Speech Analysis, Synthesis and Perception, 1971.
[39] Ewald van der Westhuizen, et al. Unsupervised acoustic unit discovery for speech synthesis using discrete latent-variable neural networks, 2019, INTERSPEECH.
[40] Heiga Zen, et al. WaveNet: A Generative Model for Raw Audio, 2016, SSW.
[41] Aurko Roy, et al. Fast Decoding in Sequence Models using Discrete Latent Variables, 2018, ICML.
[42] Adam Finkelstein, et al. FFTNet: A Real-Time Speaker-Dependent Neural Vocoder, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[43] Lukás Burget, et al. Variational Inference for Acoustic Unit Discovery, 2016, Workshop on Spoken Language Technologies for Under-resourced Languages.