Paper information - A logistic regression model for detecting prominences

The paper describes the development of a model for identifying points of prominence in speech. The model can serve as a first step in the intonational labeling of corpora used by some speech synthesis systems (A. Black and P. Taylor, 1995). The working definition of prominence is that the starred ToBI accents (K. Silverman et al., 1992), namely H*, L*, L*+H, L+H*, and H+!H*, are prominent. The prominence detection model developed here is based on the sums-of-products vowel duration model (J.P.H. van Santen, 1992). The model was trained and tested on different portions of the Boston University Radio News corpus and achieves 86.3% correct identification with 12.5% false detection. These results are comparable to those of previous work (C.W. Wightman and W.N. Campbell, 1995): 85.9% correct identification with 10.7% false detection. The advantage of this model is that it can be trained quickly on as few as 600 data points, reducing the need for large corpora.
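As a rough illustration of the kind of classifier and scoring the abstract refers to, the following is a minimal sketch, not the paper's implementation. It fits a plain logistic regression by gradient descent on a single hypothetical feature (a normalized vowel duration) and reports a correct-identification rate and a false-detection rate. The feature, the toy data, the function names, and the exact definitions of the two rates (hits over truly prominent items, false alarms over truly non-prominent items) are all assumptions; the abstract does not specify them, and the paper's actual inputs derive from the sums-of-products duration model.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x, threshold=0.5):
    """Return 1 (prominent) if the modeled probability reaches the threshold."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= threshold else 0


def detection_rates(predicted, actual):
    """Assumed definitions: hits / prominent items, false alarms / non-prominent items."""
    hits = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    false_alarms = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    n_pos = sum(actual)
    n_neg = len(actual) - n_pos
    return hits / n_pos, false_alarms / n_neg


# Hypothetical toy data: one normalized-duration feature per syllable,
# label 1 = carries a starred accent (prominent), 0 = not prominent.
X = [[-1.2], [-0.8], [-0.5], [-0.1], [0.2], [0.6], [0.9], [1.4]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic(X, y)
preds = [predict(w, b, x) for x in X]
correct, false_det = detection_rates(preds, y)
print(f"correct identification: {correct:.3f}, false detection: {false_det:.3f}")
```

The toy data is cleanly separable, so it says nothing about realistic accuracy; it only shows how the two reported figures could be computed from a set of binary predictions.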

Arman Maghbouleh

[1] T. H. Crystal, et al. Segmental durations in connected speech signals, 1981.

[2] Alan W. Black, et al. CHATR: a generic speech synthesis system, 1994, COLING.

[3] T. Crystal, et al. Segmental durations in connected-speech signals: Syllabic stress, 1988.

[4] K. D. Jong. The supraglottal articulation of prominence in English: Linguistic stress as localized hyperarticulation, 1995.

[5] Dwight L. Bolinger, et al. Intonation and Its Uses: Melody in Grammar and Discourse, 1989.

[6] Jan P. H. van Santen, et al. Assignment of segmental duration in text-to-speech synthesis, 1994, Comput. Speech Lang.

[7] Mari Ostendorf, et al. TOBI: a standard for labeling English prosody, 1992, ICSLP.

[8] Mari Ostendorf, et al. Automatic labeling of prosodic patterns, 1994, IEEE Trans. Speech Audio Process.

[9] T. Crystal, et al. Segmental durations in connected-speech signals: Current results, 1988.

[10] Jan P. H. van Santen, et al. Contextual effects on vowel duration, 1992, Speech Commun.

[11] D. Klatt. Linguistic uses of segmental duration in English: acoustic and perceptual evidence, 1976, The Journal of the Acoustical Society of America.

[12] D. Bolinger. Intonation and its parts: melody in spoken English, 1987.