Equalizing sub-band error rates in speaker recognition
Recent work on ASR [1] [2] shows that band splitting gives recognition accuracy comparable with the conventional full-band approach. The sub-bands have different bandwidths, spaced on a mel scale. Interestingly, in the context of speaker recognition, improved accuracy has been reported for a full-band approach using a linear scale. We demonstrate that both of these scales are likely to be suboptimal in the context of band splitting. We then describe how sub-band error profiles can lead to a new scale, which lies between linear and mel spacing, giving both an equalised sub-band error profile and improved overall recognition accuracy.
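To make the idea of a spacing "between" linear and mel concrete, the following is a minimal sketch and not the method of the paper, which derives its scale from measured sub-band error profiles. It assumes the common mel formula 2595 log10(1 + f/700), and the function `band_edges` and its blend parameter `alpha` are hypothetical names introduced only for illustration.

```python
import math


def hz_to_mel(f_hz):
    # One common mel convention (an assumption; the abstract does not name one).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    # Inverse of hz_to_mel above.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def band_edges(n_bands, f_max_hz, alpha):
    """Return n_bands + 1 sub-band edge frequencies in Hz.

    alpha is a hypothetical blend weight: 0.0 gives equal-width (linear)
    bands, 1.0 gives bands of equal width on the mel scale, and values
    in between interpolate each edge between the two.
    """
    m_max = hz_to_mel(f_max_hz)
    edges = []
    for i in range(n_bands + 1):
        frac = i / n_bands
        linear_edge = frac * f_max_hz
        mel_edge = mel_to_hz(frac * m_max)
        edges.append((1.0 - alpha) * linear_edge + alpha * mel_edge)
    return edges
```

Calling `band_edges(4, 4000.0, 0.5)` would give four sub-bands over 0 to 4 kHz whose interior edges fall between the mel and linear ones. In the setting the abstract describes, the spacing would instead be adjusted until the per-band error rates are roughly equal.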
[1] John S. D. Mason, et al. Optimization of perceptually-based spectral transforms in speaker identification, 1991, EUROSPEECH.
[2] Misha Pavel, et al. Towards ASR on partially corrupted speech, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[3] Hervé Bourlard, et al. A new ASR approach based on independent processing and recombination of partial frequency bands, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.