Semi-supervised meeting event recognition with adapted HMMs

This paper investigates the use of unlabeled data to help labeled data for audio-visual event recognition in meetings. To deal with situations in which it is difficult to collect enough labeled data to capture event characteristics, but collecting a large amount of unlabeled data is easy, we present a semi-supervised framework using HMM adaptation techniques. Instead of directly training one model for each event, we first train a well-estimated general event model for all events using both labeled and unlabeled data, and then adapt the general model to each specific event model using its own labeled data. We illustrate the proposed approach with a set of eight audio-visual events defined in meetings. Experiments and comparison with the fully-supervised baseline method show the validity of the proposed semi-supervised approach.

[1]  Yoshua Bengio,et al.  Markovian Models for Sequential Data , 2004 .

[2]  M. Seeger Learning with labeled and unlabeled dataMatthias , 2001 .

[3]  Biing-Hwang Juang,et al.  Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[4]  Samy Bengio,et al.  ADAPTED GENERATIVE MODELS FOR FACE VERIFICATION , 2003 .

[5]  Samy Bengio,et al.  Modeling human interaction in meetings , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..

[6]  Samy Bengio,et al.  Face verification using adapted generative models , 2004, Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004. Proceedings..

[7]  Douglas A. Reynolds,et al.  Speaker Verification Using Adapted Gaussian Mixture Models , 2000, Digit. Signal Process..

[8]  Chin-Hui Lee,et al.  Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains , 1994, IEEE Trans. Speech Audio Process..

[9]  Eric Horvitz,et al.  Layered representations for learning and inferring office activity from multiple sensory channels , 2004, Comput. Vis. Image Underst..

[10]  Stuart J. Russell,et al.  Dynamic bayesian networks: representation, inference and learning , 2002 .

[11]  Jeff A. Bilmes,et al.  A gentle tutorial of the em algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models , 1998 .