TransVoice: Real-Time Voice Conversion for Augmenting Near-Field Speech Communication

Despite promising initial studies, the speaker's original voice remains an obstacle to applying real-time voice conversion (data-driven speaker conversion) in daily life, particularly in near-field communication, because the original speech overlaps with the converted speech and degrades the sense of immersion in it. We present TransVoice, a real-time voice conversion system that physically confines the original speech with a mask-shaped device. Our preliminary study shows that the proposed device significantly reduces the volume of the original speech, while an integrated filter that attenuates the low-frequency range ameliorates the degraded conversion quality of the deep neural network (DNN). We discuss novel applications of TransVoice that can augment our communication.