Experimental Evaluation of Probabilistic Similarity for Spoken Term Detection

This paper investigates the use of probabilistic similarity and the likelihood ratio for spoken term detection. The objective of spoken term detection is to rank retrieved spoken terms according to their distance from a query. First, we evaluate several probabilistic similarity functions for use as a more sophisticated distance measure. In particular, we investigate probabilistic similarity for Gaussian mixture models using closed-form solutions and a pseudo-sampling approximation of the Kullback–Leibler divergence. We then propose additive scoring factors based on the likelihood ratio of each individual subword. An experimental evaluation demonstrates that detection performance improves when probabilistic similarity functions are used and the likelihood ratio is applied.
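To make the two kinds of Kullback–Leibler computation mentioned above concrete, the following is a minimal sketch and not the authors' implementation. It assumes diagonal-covariance Gaussians, represents a mixture as a hypothetical `(weights, means, variances)` tuple of NumPy arrays, and uses plain Monte Carlo sampling as a stand-in for the sampling-based approximation, which may differ from the pseudo-sampling scheme evaluated in the paper. All function names are illustrative.

```python
import numpy as np


def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(p || q) between two diagonal-covariance Gaussians."""
    mu_p, var_p, mu_q, var_q = (
        np.asarray(a, dtype=float) for a in (mu_p, var_p, mu_q, var_q)
    )
    return 0.5 * float(
        np.sum(np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)
    )


def gmm_logpdf(x, weights, means, variances):
    """Log density of a diagonal-covariance GMM at each row of x."""
    x = np.atleast_2d(x)[:, None, :]  # shape (n, 1, d)
    # Per-component Gaussian log densities, shape (n, k).
    log_comp = -0.5 * np.sum(
        np.log(2.0 * np.pi * variances) + (x - means) ** 2 / variances, axis=2
    )
    a = log_comp + np.log(weights)
    # Log-sum-exp over components for numerical stability.
    m = a.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=1, keepdims=True))).ravel()


def gmm_sample(n, weights, means, variances, rng):
    """Draw n samples: pick a component, then sample from its Gaussian."""
    comp = rng.choice(len(weights), size=n, p=weights)
    noise = rng.standard_normal((n, means.shape[1]))
    return means[comp] + np.sqrt(variances[comp]) * noise


def kl_gmm_monte_carlo(p, q, n=20000, seed=0):
    """Estimate KL(p || q) between two GMMs by averaging log p(x) - log q(x)
    over samples x drawn from p. There is no closed form for this quantity
    in general, which is why an approximation is needed."""
    rng = np.random.default_rng(seed)
    x = gmm_sample(n, *p, rng)
    return float(np.mean(gmm_logpdf(x, *p) - gmm_logpdf(x, *q)))


def symmetric_kl(kl_pq, kl_qp):
    """Symmetrize two directed divergences by summing them, so the result
    does not depend on which model is treated as the query."""
    return kl_pq + kl_qp
```

For two unit-variance Gaussians whose means differ by 1, `kl_diag_gauss([0.0], [1.0], [1.0], [1.0])` evaluates to 0.5, and the Monte Carlo estimate for the equivalent single-component mixtures should approach the same value as `n` grows. A divergence of this kind can then serve as the distance used to rank candidate terms against a query model.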
