Ewald van der Westhuizen | Ryan Eloff | André Nortje | Benjamin van Niekerk | Avashna Govender | Leanne Nortje | Arnu Pretorius | Elan Van Biljon | Lisa van Staden | Herman Kamper
[1] Lin-Shan Lee, et al. Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations, 2018, INTERSPEECH.
[2] Aren Jansen, et al. Segmental acoustic indexing for zero resource keyword search, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Aren Jansen, et al. The Zero Resource Speech Challenge 2015: Proposed Approaches and Results, 2016, SLTU.
[4] Alan W. Black, et al. Automatic discovery of a phonetic inventory for unwritten languages for statistical speech synthesis, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Karen Livescu, et al. Query-by-Example Search with Discriminative Neural Acoustic Word Embeddings, 2017, INTERSPEECH.
[6] Alexander Kain, et al. Spectral voice conversion for text-to-speech synthesis, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98.
[7] Micha Elsner, et al. Measuring the perceptual availability of phonological features during language acquisition using unsupervised binary stochastic autoencoders, 2019, NAACL.
[8] Tapani Raiko, et al. Techniques for Learning Binary Stochastic Feedforward Neural Networks, 2014, ICLR.
[9] Oriol Vinyals, et al. Neural Discrete Representation Learning, 2017, NIPS.
[10] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[11] Lorenzo Rosasco, et al. Discovering discrete subword units with binarized autoencoders and hidden-Markov-model encoders, 2015, INTERSPEECH.
[12] Sakriani Sakti, et al. The Zero Resource Speech Challenge 2019: TTS without T, 2019, INTERSPEECH.
[13] Yu Zhang, et al. Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data, 2017, NIPS.
[14] Adam Finkelstein, et al. FFTNet: A Real-Time Speaker-Dependent Neural Vocoder, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Heiga Zen, et al. WaveNet: A Generative Model for Raw Audio, 2016, SSW.
[16] David Minnen, et al. Variable Rate Image Compression with Recurrent Neural Networks, 2015, ICLR.
[17] Lukás Burget, et al. Variational Inference for Acoustic Unit Discovery, 2016, Workshop on Spoken Language Technologies for Under-resourced Languages.
[18] Ron J. Weiss, et al. Unsupervised Speech Representation Learning Using WaveNet Autoencoders, 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[19] Ben Poole, et al. Categorical Reparameterization with Gumbel-Softmax, 2016, ICLR.
[20] Yoshua Bengio, et al. Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation, 2013, ArXiv.
[21] Zhizheng Wu, et al. Merlin: An Open Source Neural Network Speech Synthesis System, 2016, SSW.
[22] S. Sakti, et al. Development of HMM-based Indonesian Speech Synthesis, 2008.
[23] Giorgio Metta, et al. An auto-encoder based approach to unsupervised learning of subword units, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Thomas L. Griffiths, et al. Learning phonetic categories by learning a lexicon, 2009.
[25] Florian Metze, et al. Linguistic Unit Discovery from Multi-Modal Inputs in Unwritten Languages: Summary of the “Speaking Rosetta” JSALT 2017 Workshop, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Aren Jansen, et al. The zero resource speech challenge 2017, 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[27] Tetsuji Ogawa, et al. Speaker Invariant Feature Extraction for Zero-Resource Languages with Adversarial Learning, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Thomas Schatz, et al. Neural network vs. HMM speech recognition systems as models of human cross-linguistic phonetic perception, 2018.
[29] James R. Glass, et al. Unsupervised spoken keyword spotting via segmental DTW on Gaussian posteriorgrams, 2009, 2009 IEEE Workshop on Automatic Speech Recognition & Understanding.
[30] David Minnen, et al. Full Resolution Image Compression with Recurrent Neural Networks, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Steven B. Davis, et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Se, 1980.
[32] Ewan Dunbar, et al. A hybrid dynamic time warping-deep neural network architecture for unsupervised acoustic modeling, 2015, INTERSPEECH.
[33] Geoffrey E. Hinton, et al. On rectified linear units for speech processing, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[34] Kenneth Ward Church, et al. A summary of the 2012 JHU CLSP workshop on zero resource speech technologies and models of early language acquisition, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[35] Aren Jansen, et al. Weak top-down constraints for unsupervised acoustic model training, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[36] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[37] James R. Glass, et al. Unsupervised Lexicon Discovery from Acoustic Input, 2015, TACL.
[38] Aren Jansen, et al. Evaluating speech features with the minimal-pair ABX task: analysis of the classical MFC/PLP pipeline, 2013, INTERSPEECH.
[39] Satoshi Nakamura, et al. Development of Indonesian Large Vocabulary Continuous Speech Recognition System within A-STAR Project, 2008, IJCNLP.
[40] Okko Johannes Räsänen, et al. Computational modeling of phonetic and lexical learning in early language acquisition: Existing models and future directions, 2012, Speech Commun.
[41] Aren Jansen, et al. A comparison of neural network methods for unsupervised representation learning on the zero resource speech challenge, 2015, INTERSPEECH.
[42] Yee Whye Teh, et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables, 2016, ICLR.
[43] Daniel Rueckert, et al. Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[44] Tanja Schultz, et al. Automatic speech recognition for under-resourced languages: A survey, 2014, Speech Commun.
[45] Satoshi Nakamura, et al. Feature optimized DPGMM clustering for unsupervised subword modeling: A contribution to zerospeech 2017, 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).