Paper Information - Spontaneous Speech Corpus of Japanese

Spontaneous Speech Corpus of Japanese

Design issues of a spontaneous speech corpus are described. The corpus under compilation will contain 800-1000 hours of spontaneously uttered Common Japanese speech together with morphologically annotated transcriptions. In addition, segmental and intonation labeling will be provided for a subset of the corpus. The primary application domain of the corpus is recognition of spontaneous speech, but we also plan to make it useful for natural language processing and for phonetic and linguistic studies.

[1] Mary Beth Beckman et al. Tagging prosody and discourse structure in elicited spontaneous speech, 2000.

[2] J. Pierrehumbert et al. Japanese Tone Structure, 1988.

[3] Mary E. Beckman et al. A Typology of Spontaneous Speech, 1997, Computing Prosody.

[4] Yoshinori Sagisaka et al. Computing Prosody, Computational Models for Processing Spontaneous Speech, 2011.

[5] Nick Campbell. The ToBI (Tones and Break Indices) system and its application to Japanese, 1997.