Paper information - Equalizing sub-band error rates in speaker recognition

Equalizing sub-band error rates in speaker recognition

Recent work on ASR [1] [2] shows that band splitting gives recognition accuracy comparable with the conventional full-band approach. The sub-bands have different bandwidths, spaced on a mel scale. Interestingly, in the context of speaker recognition, improved accuracy has been reported for a full-band approach using a linear scale. We demonstrate that both of these scales are likely to be suboptimal in the context of band splitting. We then describe how sub-band error profiles can lead to a new scale, which lies between linear and mel spacing, giving both an equalised sub-band error profile and improved overall recognition accuracy.
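To make the idea of a scale "between" linear and mel spacing concrete, the following is a minimal sketch that computes sub-band edge frequencies for both scales and for a simple blend of the two. It is an illustration only, not the method of the paper: the paper derives its scale from sub-band error profiles, whereas the blend factor `alpha`, the function names, and the choice of the common 2595·log10(1 + f/700) mel formula are all assumptions introduced here.

```python
import math


def hz_to_mel(f_hz):
    # Common mel formula (an assumption; the paper may use a different variant).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    # Inverse of hz_to_mel above.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def band_edges(n_bands, f_low, f_high, alpha):
    """Return n_bands + 1 edge frequencies in Hz.

    alpha = 0.0 gives linear spacing, alpha = 1.0 gives mel spacing, and
    intermediate values interpolate each edge between the two (hypothetical
    blend, used only to illustrate an in-between scale).
    """
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    edges = []
    for i in range(n_bands + 1):
        t = i / n_bands
        linear_edge = f_low + t * (f_high - f_low)
        mel_edge = mel_to_hz(m_low + t * (m_high - m_low))
        edges.append((1.0 - alpha) * linear_edge + alpha * mel_edge)
    return edges


if __name__ == "__main__":
    # Compare the three spacings for 4 sub-bands over 0 to 4000 Hz.
    for name, alpha in [("linear", 0.0), ("blend", 0.5), ("mel", 1.0)]:
        print(name, [round(e) for e in band_edges(4, 0.0, 4000.0, alpha)])
```

With mel spacing the low-frequency bands are narrower than with linear spacing; an intermediate `alpha` moves each interior edge part of the way between the two. In practice the edges would be tuned so that each sub-band yields a similar error rate, which this sketch does not attempt.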

Roland Auckenthaler | John S. D. Mason

[1] John S. D. Mason, et al. Optimization of perceptually-based spectral transforms in speaker identification, 1991, EUROSPEECH.

[2] Misha Pavel, et al. Towards ASR on partially corrupted speech, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[3] Hervé Bourlard, et al. A new ASR approach based on independent processing and recombination of partial frequency bands, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.