Liang Tao | Hon Keung Kwan | Jian Zhou | Teng Gao | Huabin Wang | Qing Pan
[1] Tomoki Toda, et al. NAM-to-speech conversion with Gaussian mixture models, 2005, INTERSPEECH.
[2] Smita Krishnaswamy, et al. TraVeLGAN: Image-To-Image Translation by Transformation Vector Learning, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Wenming Zheng, et al. Whisper to Normal Speech Conversion Using Sequence-to-Sequence Mapping Model With Auditory Attention, 2019, IEEE Access.
[4] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.
[5] Han Zhang, et al. Self-Attention Generative Adversarial Networks, 2018, ICML.
[6] Kazuya Takeda, et al. Acoustic analysis and recognition of whispered speech, 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[7] Miroslaw Bober, et al. Siamese Network of Deep Fisher-Vector Descriptors for Image Retrieval, 2017, ArXiv.
[8] Deniz Başkent, et al. Pitch and spectral resolution: A systematic comparison of bottom-up cues for top-down repair of degraded speech, 2016, The Journal of the Acoustical Society of America.
[9] Marco Pasini. MelGAN-VC: Voice Conversion and Audio Style Transfer on arbitrarily long samples using Spectrograms, 2019, ArXiv.
[10] A. Gray, et al. Distance measures for speech processing, 1976.
[11] Bin Ma, et al. Voice conversion: From spoken vowels to singing vowels, 2010, 2010 IEEE International Conference on Multimedia and Expo.
[12] Jae S. Lim, et al. Signal estimation from modified short-time Fourier transform, 1983, ICASSP.
[13] Ian Vince McLoughlin, et al. Reconstruction of Normal Sounding Speech for Laryngectomy Patients Through a Modified CELP Codec, 2010, IEEE Transactions on Biomedical Engineering.
[14] Yuichi Yoshida, et al. Spectral Normalization for Generative Adversarial Networks, 2018, ICLR.
[15] J. Berger, et al. P.563—The ITU-T Standard for Single-Ended Speech Quality Assessment, 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[16] Jesper Jensen, et al. A short-time objective intelligibility measure for time-frequency weighted noisy speech, 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[17] Hirokazu Kameoka, et al. Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks, 2017, ArXiv.
[18] Esa Rahtu, et al. Siamese network features for image matching, 2016, 2016 23rd International Conference on Pattern Recognition (ICPR).
[19] Lior Wolf, et al. Unsupervised Cross-Domain Image Generation, 2016, ICLR.
[20] Prasanta Kumar Ghosh, et al. Whispered Speech to Neutral Speech Conversion Using Bidirectional LSTMs, 2018, INTERSPEECH.
[21] Hon Keung Kwan, et al. Multimodal Voice Conversion Under Adverse Environment Using a Deep Convolutional Neural Network, 2019, IEEE Access.
[22] Sugato Chakravarty, et al. Method for the subjective assessment of intermediate quality levels of coding systems, 2001.
[23] W. Heeren. Vocalic correlates of pitch in whispered versus normal speech, 2015, The Journal of the Acoustical Society of America.
[24] Hemant A. Patil, et al. Effectiveness of Cross-Domain Architectures for Whisper-to-Normal Speech Conversion, 2019, 2019 27th European Signal Processing Conference (EUSIPCO).
[25] Takumi Sugiyama, et al. Study report on "Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks", 2017.
[26] R. Kubichek, et al. Mel-cepstral distance measure for objective speech quality assessment, 1993, Proceedings of IEEE Pacific Rim Conference on Communications Computers and Signal Processing.