WARPING FUNCTION FOR SUB-BAND ERROR EQUALISATION IN SPEAKER RECOGNITION

Sub-band processing may offer benefits over the conventional full-band approach in speech and speaker recognition [1] [2] [3] [4]. In our previous work [4] we examined error rates across sub-bands using the standard mel frequency warping and a linear frequency scale, and showed that an optimum was likely to lie somewhere between the two. Here, two new warping functions are derived which give not only a more even distribution of sub-band errors but also an overall improvement in recognition rates compared with the standard mel case. This is particularly true for a set of female speakers, for which the mel scale is shown to be distinctly sub-optimal and error rates are reduced by more than 50%.
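
To make the two reference cases concrete, the following is a minimal sketch of how sub-band edges might be placed under a mel warping and under a linear scale. It assumes the widely used form mel = 2595 log10(1 + f/700); the exact mel variant, bandwidth, and number of sub-bands used in this work are not stated here, and the function names and the 4 kHz, four-band example are illustrative only. The two new warping functions are not reproduced.

```python
import math


def hz_to_mel(f_hz):
    """Map frequency in Hz to mels (assumed variant: 2595*log10(1 + f/700))."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def band_edges(f_max_hz, n_bands, warp="mel"):
    """Return n_bands + 1 edge frequencies (Hz) spanning 0..f_max_hz.

    warp="linear": edges equally spaced in Hz.
    warp="mel":    edges equally spaced on the warped (mel) axis,
                   then mapped back to Hz.
    """
    if warp == "linear":
        return [f_max_hz * i / n_bands for i in range(n_bands + 1)]
    if warp == "mel":
        m_max = hz_to_mel(f_max_hz)
        return [mel_to_hz(m_max * i / n_bands) for i in range(n_bands + 1)]
    raise ValueError("warp must be 'mel' or 'linear'")


if __name__ == "__main__":
    # Hypothetical configuration: 4 kHz bandwidth, 4 sub-bands.
    for warp in ("linear", "mel"):
        edges = band_edges(4000.0, 4, warp)
        print(warp, [round(e) for e in edges])
```

Under the mel warping the low-frequency sub-bands are narrower in Hz than the high-frequency ones, whereas the linear scale gives every sub-band the same width; a warping that lies between the two would place its edges between these extremes.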