A pitch synchronous scheme for very low bit rate speech coding

Fixed-length speech frames are analysed to generate LPC coefficients describing the short-term spectrum. Voiced frames are examined to determine sets of excitation points corresponding to instants of glottal closure. A voiced excitation signal is constructed in the form of a prototype pulse train, with each pulse positioned at an excitation point. The amplitudes of the pulses are assumed to vary linearly across a voiced speech frame, and an optimal linear regression line describing the variation is found for each frame. A new and reliable algorithm for finding excitation points is based on a model of the glottal cycle as consisting of closed, opening and closing phases. The vocal tract resonances are excited mainly at glottal closure, with less significant secondary excitation at opening. The excitation signal is convolved with the impulse response of the LPC synthesis filter to produce frames of reconstructed voiced speech. A hybrid coder which switches between this pitch synchronous coding scheme for voiced frames and CELP for unvoiced frames has been tested in simulation. With the parameters quantised to achieve a bit rate of 3.1 kbit/s, good quality decoded speech has been obtained.
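
The voiced-frame reconstruction described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the excitation points and pulse amplitudes are already known, that the prototype pulse is a unit impulse, that the synthesis filter is all-pole with the convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, and that a frame is 160 samples long. All function names, the filter coefficient and the numerical values are hypothetical and chosen only for illustration.

```python
def fit_amplitude_line(positions, amplitudes):
    """Least-squares line: amplitude = slope * position + intercept."""
    n = len(positions)
    mean_x = sum(positions) / n
    mean_y = sum(amplitudes) / n
    sxx = sum((x - mean_x) ** 2 for x in positions)
    if sxx == 0.0:  # a single pulse: fall back to a flat line
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(positions, amplitudes))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def build_excitation(frame_len, positions, slope, intercept, prototype=(1.0,)):
    """Place a scaled prototype pulse at each excitation point.

    The unit-impulse default prototype is an assumption of this sketch.
    """
    excitation = [0.0] * frame_len
    for pos in positions:
        gain = slope * pos + intercept  # amplitude read off the fitted line
        for j, p in enumerate(prototype):
            if pos + j < frame_len:
                excitation[pos + j] += gain * p
    return excitation


def lpc_impulse_response(lpc, length):
    """Impulse response of the all-pole filter 1 / A(z), truncated to `length`."""
    h = [0.0] * length
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * h[n - k]
        h[n] = acc
    return h


def synthesise_frame(excitation, h):
    """Convolve the excitation with h, keeping only samples inside the frame."""
    out = [0.0] * len(excitation)
    for n, e in enumerate(excitation):
        if e == 0.0:
            continue
        for m, hv in enumerate(h):
            if n + m >= len(out):
                break
            out[n + m] += e * hv
    return out


# Hypothetical voiced frame: three excitation points with decaying amplitudes.
positions = [10, 50, 90]
slope, intercept = fit_amplitude_line(positions, [1.0, 0.8, 0.6])
excitation = build_excitation(160, positions, slope, intercept)
frame = synthesise_frame(excitation, lpc_impulse_response([-0.9], 160))
```

In this sketch only the slope and intercept of the regression line, the excitation positions and the LPC coefficients are needed to rebuild the frame. Truncating the convolution at the frame boundary is a simplification: a practical decoder would presumably carry the filter tail over into the next frame.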