Paper information - Aspects of speaking-face data corpus design methodology

This paper develops a methodology for the design of audio-video data corpora of the speaking face. Existing corpora are surveyed, and the principles of data specification, data description and statistical representation are analysed from both an application-driven and a scientifically motivated perspective. Furthermore, the possibility of “opportunistic” design of speaking-face data corpora is considered.

Michael Wagner | Roland Göcke | J. Bruce Millar

[1] Sara H. Basson, et al. NTIMIT: a phonetically balanced, continuous speech, telephone bandwidth speech database, 1990, International Conference on Acoustics, Speech, and Signal Processing.

[2] J. Bruce Millar. Customisation and quality assessment of spoken language description, 1998, ICSLP.

[3] Yoni Bauduin, et al. Audio-Visual Speech Recognition, 2004.

[4] J.N. Gowdy, et al. CUAVE: A new audio-visual database for multimodal human-computer interface research, 2002, IEEE International Conference on Acoustics, Speech, and Signal Processing.

[5] Roland Göcke, et al. The audio-video Australian English speech data corpus AVOZES, 2012, INTERSPEECH.

[6] Jiri Matas, et al. XM2VTSDB: The Extended M2VTS Database, 1999.

[7] Jiri Matas, et al. Acquisition of a Large Database for Biometric Identity Verification, 1998.

[8] Kuldip K. Paliwal, et al. Fast features for face authentication under illumination direction changes, 2003, Pattern Recognit. Lett.

[9] Jean-Philippe Thiran, et al. The BANCA Database and Evaluation Protocol, 2003, AVBPA.

[10] J. Bruce Millar. A structure for comprehensive spoken language description, 1998.