The VidTIMIT Database

This communication describes the multi-modal VidTIMIT database, which can be useful for research involving mono- or multi-modal speech recognition or person authentication. It comprises video and corresponding audio recordings of 43 volunteers reciting short sentences selected from the NTIMIT corpus.
