Wordspotter training using figure-of-merit back propagation

A new approach to wordspotter training is presented that directly maximizes the figure of merit (FOM), defined as the average detection rate over a specified range of false alarm rates. This systematic approach to discriminant training for wordspotters eliminates the need for ad hoc thresholds and tuning. It improves the FOM of wordspotters tested using cross-validation on the credit-card speech corpus training conversations by 4 to 5 percentage points, to roughly 70%. This improved performance requires little extra complexity during wordspotting and only two extra passes through the training data during training. The FOM gradient is computed analytically for each putative hit, back-propagated through HMM word models using the Viterbi alignment, and used to adjust the RBF hidden node centers and the state weights associated with every node in the HMM keyword models.
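As an illustration of the quantity being maximized, the following minimal sketch computes an FOM from a list of scored putative hits. It is not the paper's implementation. It assumes the "specified range of false alarm rates" is expressed as a count of false alarms (`max_false_alarms`) and that the detection rate is sampled just before each successive false alarm. The function name `figure_of_merit` and its parameters are hypothetical.

```python
from typing import List, Tuple


def figure_of_merit(
    hits: List[Tuple[float, bool]],
    num_true: int,
    max_false_alarms: int,
) -> float:
    """Average detection rate over the first `max_false_alarms` false alarms.

    hits: (score, is_true_hit) pairs for every putative hit (assumed format).
    num_true: total number of true keyword occurrences in the test data.
    """
    # Rank putative hits from most to least confident.
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)

    detected = 0
    rates: List[float] = []
    for _score, is_true in ranked:
        if is_true:
            detected += 1
        else:
            # Record the detection rate reached just before this false alarm.
            rates.append(detected / num_true)
            if len(rates) == max_false_alarms:
                break

    # If the list holds fewer false alarms than requested, the remaining
    # operating points keep the final detection rate.
    while len(rates) < max_false_alarms:
        rates.append(detected / num_true)

    return sum(rates) / max_false_alarms


example_hits = [
    (0.9, True), (0.8, False), (0.7, True),
    (0.6, True), (0.5, False), (0.4, True),
]
fom = figure_of_merit(example_hits, num_true=4, max_false_alarms=2)
```

Because the FOM depends only on how true hits rank relative to false alarms, changing a single putative hit's score matters only when it alters that ordering, which is why no fixed detection threshold is needed in this formulation.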
