Ryosuke Yamanishi, Yoichi Yamashita, Shinnosuke Takamichi, Takahiro Fukumori, Yuki Okamoto, Keisuke Imoto
[1] Xin Wang et al., "Zero-Shot Multi-Speaker Text-To-Speech with State-of-the-Art Neural Speaker Embeddings," ICASSP, 2020.
[2] Chen Fang et al., "Visual to Sound: Generating Natural Sound for Videos in the Wild," CVPR, 2018.
[3] Naga K. Govindaraju et al., "Sound synthesis for impact sounds in video games," SI3D, 2011.
[4] Tuomas Virtanen et al., "Automated audio captioning with recurrent neural networks," WASPAA, 2017.
[5] Félix Gontier et al., "Privacy Aware Acoustic Scene Synthesis Using Deep Spectral Feature Inversion," ICASSP, 2020.
[6] Liyuan Liu et al., "On the Variance of the Adaptive Learning Rate and Beyond," ICLR, 2019.
[7] Shrikanth S. Narayanan et al., "Vector-based Representation and Clustering of Audio Using Onomatopoeia Words," AAAI Fall Symposium: Aurally Informed Performance, 2006.
[8] Yong Xu et al., "Acoustic Scene Generation with Conditional SampleRNN," ICASSP, 2019.
[9] Ryosuke Yamanishi et al., "Overview of Tasks and Investigation of Subjective Evaluation Methods in Environmental Sound Synthesis and Conversion," arXiv, 2019.
[10] Satoshi Nakamura et al., "Sound scene data collection in real acoustical environments," 1999.
[11] Jae S. Lim et al., "Signal estimation from modified short-time Fourier transform," ICASSP, 1983.
[12] Ryosuke Yamanishi et al., "RWCP-SSD-Onomatopoeia: Onomatopoeic Word Dataset for Environmental Sound Synthesis," arXiv, 2020.
[13] Quoc V. Le et al., "Sequence to Sequence Learning with Neural Networks," NIPS, 2014.
[14] Kunio Kashino et al., "Neural Audio Captioning Based on Conditional Sequence-to-Sequence Model," DCASE, 2019.
[15] Samy Bengio et al., "Tacotron: Towards End-to-End Speech Synthesis," INTERSPEECH, 2017.
[16] Patrick Nguyen et al., "Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis," NeurIPS, 2018.
[17] Wei Ping et al., "Multi-Speaker End-to-End Speech Synthesis," arXiv, 2019.
[18] Justin Salamon et al., "Scaper: A library for soundscape synthesis and augmentation," WASPAA, 2017.
[19] Satoshi Nakamura et al., "Acoustical Sound Database in Real Environments for Sound Scene Understanding and Hands-Free Speech Recognition," LREC, 2000.
[20] Kai Wang et al., "Efficient sound synthesis for natural scenes," IEEE VR, 2017.
[21] D. Rocchesso et al., "On the effectiveness of vocal imitations and verbal descriptions of sounds," The Journal of the Acoustical Society of America, 2014.
[22] Kunio Kashino et al., "Generating Sound Words from Audio Signals of Acoustic Events with Sequence-to-Sequence Model," ICASSP, 2018.