Inverse filter based excitation model for HMM-based speech synthesis system

Even today, speech generated by hidden Markov model (HMM)-based speech synthesis systems (HTS) still sounds buzzy because the excitation signal is modelled improperly. This study proposes an efficient excitation modelling approach for improving the quality of HTS. In the proposed method, the residual signal obtained by inverse filtering is parameterised as excitation features, and HMMs are used to model these excitation parameters. During synthesis, the excitation signal is constructed by overlap-adding natural residual segments and is then modified according to the target source features generated from the HMMs. The proposed approach is incorporated into HTS. Performance evaluation results indicate that the proposed method enhances synthesis quality and performs better than the state-of-the-art approaches used for modelling the excitation signal.
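
The two signal-processing steps named above, inverse filtering to obtain a residual and overlap-adding residual segments, can be illustrated with a minimal sketch. The abstract does not state which inverse filter or which parameterisation is used, so the sketch assumes a linear prediction (LPC) inverse filter estimated by the autocorrelation method; the function names `lpc_coefficients`, `inverse_filter` and `overlap_add` are hypothetical and are not taken from HTS or from the proposed system.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Estimate the inverse filter A(z) = 1 + a1*z^-1 + ... + ap*z^-p.

    Assumption: autocorrelation-method linear prediction; the source
    does not specify how its inverse filter is obtained.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R * predictor = r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R = R + 1e-9 * r[0] * np.eye(order)  # tiny ridge for numerical safety
    predictor = np.linalg.solve(R, r[1:])
    return np.concatenate(([1.0], -predictor))


def inverse_filter(frame, a):
    """Residual e[n] = sum_k a[k] * s[n-k], i.e. FIR filtering by A(z)."""
    frame = np.asarray(frame, dtype=float)
    return np.convolve(frame, a)[:len(frame)]


def overlap_add(segments, hop):
    """Sum equal-length segments placed `hop` samples apart."""
    seg_len = len(segments[0])
    out = np.zeros(hop * (len(segments) - 1) + seg_len)
    for i, seg in enumerate(segments):
        out[i * hop:i * hop + seg_len] += seg
    return out
```

In this sketch the residual of each analysis frame would be computed with `inverse_filter(frame, lpc_coefficients(frame, order))`, and windowed residual segments would be joined with `overlap_add`. How the segments are selected and how the excitation is then modified towards the HMM-generated source features is specific to the proposed method and is not reproduced here.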