Bone-conducted (BC) speech is robust against surrounding noise, so in extremely noisy environments it may be usable in place of air-conducted (AC) speech for communication. However, transmission through bone conduction gives it very poor sound quality and degrades its intelligibility. The voice quality and intelligibility of BC speech therefore need to be improved blindly, that is, without access to the corresponding AC speech, in actual speech communication, and this is a challenging new topic in the speech signal processing field. In a previous study, we proposed a linear prediction (LP)-based model that restores BC speech to improve its voice quality. Whereas other methods, such as the long-term Fourier transform, need numerous AC speech parameters to restore BC speech, the proposed model can restore BC speech blindly by predicting AC-LP coefficients from BC-LP coefficients. We improved the proposed model by (1) extending long-term processing to frame-basis processing, (2) using line spectral frequency (LSF) coefficients as the LP representation, and (3) using a recurrent neural network to predict the parameters. We compared the improved model with other models to determine whether it could adequately improve the voice quality and intelligibility of BC speech, using objective measures (LSD, MCD, and LCD) and Modified Rhyme Tests (MRTs). The evaluation of these three improvements to the LP-based model demonstrated the practicability of blind BC restoration.
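To illustrate improvement (2), the following is a minimal sketch of the standard textbook conversion between LP coefficients and LSFs, not the implementation used in this study. The function names `lp_to_lsf` and `lsf_to_lp` are hypothetical, the LP polynomial is assumed to be minimum-phase with the convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, and the inverse conversion assumes an even LP order.

```python
import numpy as np


def lp_to_lsf(a):
    """Convert LP coefficients [1, a1, ..., ap] to LSFs in radians (0, pi)."""
    a = np.asarray(a, dtype=float)
    ext = np.concatenate([a, [0.0]])
    # Symmetric and antisymmetric polynomials:
    # P(z) = A(z) + z^-(p+1) A(1/z),  Q(z) = A(z) - z^-(p+1) A(1/z)
    p_poly = ext + ext[::-1]
    q_poly = ext - ext[::-1]
    roots = np.concatenate([np.roots(p_poly), np.roots(q_poly)])
    angles = np.angle(roots)
    # Keep one root of each conjugate pair and drop the trivial roots at z = +1, -1.
    eps = 1e-6
    return np.sort(angles[(angles > eps) & (angles < np.pi - eps)])


def lsf_to_lp(lsf):
    """Rebuild LP coefficients [1, a1, ..., ap] from sorted LSFs (even p only)."""
    lsf = np.asarray(lsf, dtype=float)
    if len(lsf) % 2 != 0:
        raise ValueError("this sketch handles even LP orders only")
    p_poly = np.array([1.0, 1.0])   # trivial root of P at z = -1
    q_poly = np.array([1.0, -1.0])  # trivial root of Q at z = +1
    # Sorted LSFs alternate between P and Q, starting with P.
    for w in lsf[0::2]:
        p_poly = np.convolve(p_poly, [1.0, -2.0 * np.cos(w), 1.0])
    for w in lsf[1::2]:
        q_poly = np.convolve(q_poly, [1.0, -2.0 * np.cos(w), 1.0])
    # A(z) = (P(z) + Q(z)) / 2; the last coefficient is zero by construction.
    return (0.5 * (p_poly + q_poly))[:-1]


if __name__ == "__main__":
    # Illustrative 4th-order minimum-phase polynomial built from chosen poles.
    poles = [0.9 * np.exp(1j * 0.5), 0.9 * np.exp(-1j * 0.5),
             0.8 * np.exp(1j * 1.5), 0.8 * np.exp(-1j * 1.5)]
    a = np.real(np.poly(poles))
    lsf = lp_to_lsf(a)
    print("LSFs (rad):", lsf)
    print("round-trip error:", np.max(np.abs(lsf_to_lp(lsf) - a)))
```

One commonly cited reason for predicting LSFs rather than raw LP coefficients is that any sorted set of LSFs in (0, π) maps back to a stable synthesis filter, so a small prediction error is less likely to produce an unstable filter.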