On parameter filtering in continuous subword-unit-based speech recognition

Simple IIR or FIR filters have been widely used in isolated and connected word recognition tasks to filter the time sequence of speech spectral parameters because, despite their simplicity, they significantly improve recognition performance. When applied to continuous speech recognition, where phoneme-sized modelling units are used, these filters induce spectral transition spreading and a cross-boundary effect. The authors show how the use of context-dependent units reduces these side effects and may result in improved recognition performance. Filtering appears to be especially useful when dynamic parameters are not used, even for clean speech; when dynamic parameters are used, the filters still perform well under mismatched training and testing conditions.
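The transition spreading mentioned above can be illustrated with a minimal sketch. The abstract does not specify the filters used, so the 4-tap moving-average FIR filter, the function name `fir_filter_trajectory`, and the toy one-parameter trajectory below are all assumptions chosen only for illustration, not the authors' configuration.

```python
def fir_filter_trajectory(frames, taps):
    """Filter each parameter's time trajectory with a causal FIR filter.

    frames: list of frames, each a list of spectral parameters.
    taps:   FIR coefficients h[0..K-1], so that y[t] = sum_k h[k] * x[t-k].
    Frames before the start are taken equal to the first frame (edge padding).
    """
    n_params = len(frames[0])
    filtered = []
    for t in range(len(frames)):
        row = []
        for p in range(n_params):
            acc = 0.0
            for k, h in enumerate(taps):
                src = max(t - k, 0)  # clamp to the first frame at the left edge
                acc += h * frames[src][p]
            row.append(acc)
        filtered.append(row)
    return filtered


# Toy trajectory (assumed): one parameter, 5 frames of unit A, then 5 of unit B.
frames = [[0.0]] * 5 + [[1.0]] * 5
taps = [0.25, 0.25, 0.25, 0.25]  # hypothetical 4-tap moving average

smoothed = fir_filter_trajectory(frames, taps)
values = [row[0] for row in smoothed]
print(values)
```

In this toy case the abrupt change between frames 4 and 5 becomes a ramp over frames 5 to 7, so the first frames of unit B still carry energy from unit A. With phoneme-sized units, that leakage depends on the neighbouring unit, which is one way to read why context-dependent units can absorb it.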