[1] Kou Tanaka, et al. CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] S. R. Livingstone, et al. The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English, 2018, PLoS ONE.
[3] Masanori Morise, et al. WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications, 2016, IEICE Trans. Inf. Syst.
[4] Björn Schuller, et al. Computational Paralinguistics, 2013.
[5] Joon Son Chung, et al. Utterance-level Aggregation for Speaker Recognition in the Wild, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Joon Son Chung, et al. VoxCeleb2: Deep Speaker Recognition, 2018, INTERSPEECH.
[7] George Trigeorgis, et al. Adieu features? End-to-end speech emotion recognition using a deep convolutional recurrent network, 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Björn Schuller, et al. Emotional expression in psychiatric conditions: New technology for clinicians, 2018, Psychiatry and Clinical Neurosciences.
[9] Lauri Juvela, et al. Speaking Style Conversion from Normal to Lombard Speech Using a Glottal Vocoder and Bayesian GMMs, 2017, INTERSPEECH.
[10] Marilyn A. Walker, et al. Using Linguistic Cues for the Automatic Recognition of Personality in Conversation and Text, 2007, J. Artif. Intell. Res.
[11] Steve J. Young, et al. A system for transforming the emotion in speech: combining data-driven conversion techniques for prosody and voice quality, 2007, INTERSPEECH.
[12] Haizhou Li, et al. Spoofing and countermeasures for speaker verification: A survey, 2015, Speech Commun.
[13] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.
[14] L. Cosmides, et al. Adaptations in humans for assessing physical strength from the voice, 2010, Proceedings of the Royal Society B: Biological Sciences.
[15] Takumi Sugiyama, et al. A study report on "Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks", 2017.
[16] Andrew Zisserman, et al. Multi-task Self-Supervised Visual Learning, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[17] Hideki Kawahara, et al. Fast and Reliable F0 Estimation Method Based on the Period Extraction of Vocal Fold Vibration of Singing Voice and Speech, 2009.
[18] B. Yegnanarayana, et al. Voice conversion: Factors responsible for quality, 1985, ICASSP '85. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[19] Fabio Valente, et al. The INTERSPEECH 2013 computational paralinguistics challenge: social signals, conflict, emotion, autism, 2013, INTERSPEECH.
[20] R. Krauss, et al. Inferring speakers' physical attributes from their voices, 2002.
[21] Todor Ganchev, et al. Estimation of unknown speaker's height from speech, 2009, Int. J. Speech Technol.
[22] Masanori Morise, et al. CheapTrick, a spectral envelope estimator for high-quality speech synthesis, 2015, Speech Commun.
[23] Linlin Chen, et al. Hidebehind: Enjoy Voice Input with Voiceprint Unclonability and Anonymity, 2018, SenSys.
[24] Scott R. Peppet. Regulating the Internet of Things: First Steps Toward Managing Discrimination, Privacy, Security & Consent, 2014.
[25] Masanori Morise, et al. D4C, a band-aperiodicity estimator for high-quality speech synthesis, 2016, Speech Commun.
[26] Constantinos Patsakis, et al. Monkey Says, Monkey Does: Security and Privacy on Voice Assistants, 2017, IEEE Access.