Evaluating feature set performance using the F-ratio and J-measures

Several methods of measuring class separability in a feature space used to model speech sounds are described. A simple one-dimensional feature space is considered first, where class discrimination is measured using the F-ratio. Using a conventional feature set comprising static, velocity and acceleration MFCCs, the discriminative ability of each coefficient is ranked for both a digit and an alphabet vocabulary. These rankings are shown to be quite similar for the two vocabularies. Discrimination measures are then extended to multi-dimensional feature spaces using the J-measures. It is postulated that a high correlation exists between the measured class discrimination of a feature set and the recognition accuracy it gives. Experiments are presented which measure this correlation and use it to predict recognition accuracy for a given set of features. These estimates are shown to be accurate for previously unseen combinations of features. A brief analysis of the effect of linear discriminant analysis (LDA) on the feature space is made using these measures of separability. It is shown that LDA and separability measures are closely linked.
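
As an illustration of the one-dimensional case, the following is a minimal Python sketch of a per-coefficient F-ratio ranking. It assumes the common definition of the F-ratio as the variance of the class means divided by the mean within-class variance; the abstract does not state the exact formula used. The function names f_ratio and rank_features and the toy data are hypothetical.

```python
from statistics import fmean, pvariance


def f_ratio(values, labels):
    """F-ratio of one feature (assumed definition):
    variance of the class means / mean of the within-class variances."""
    by_class = {}
    for value, label in zip(values, labels):
        by_class.setdefault(label, []).append(value)
    class_means = [fmean(v) for v in by_class.values()]
    class_vars = [pvariance(v) for v in by_class.values()]
    return pvariance(class_means) / fmean(class_vars)


def rank_features(frames, labels):
    """Score every coefficient independently and return
    (indices sorted from most to least discriminative, scores)."""
    n_features = len(frames[0])
    scores = [
        f_ratio([frame[i] for frame in frames], labels)
        for i in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda i: scores[i], reverse=True)
    return order, scores


# Toy data: coefficient 0 separates the two classes, coefficient 1 does not.
frames = [(0.0, 1.0), (1.0, 0.0), (10.0, 1.0), (11.0, 0.0)]
labels = ["a", "a", "b", "b"]
order, scores = rank_features(frames, labels)
```

Each coefficient is scored on its own here, so correlations between coefficients are ignored; that limitation is what motivates moving to multi-dimensional measures such as the J-measures.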