In this paper, we present models for predicting major phrase boundary locations and pause insertion from an input part-of-speech (POS) sequence using a stochastic context-free grammar (SCFG). The two models share the same design because major phrase boundaries and pauses have similar characteristics. In both models, word attributes and left/right-branching probability parameters, which represent stochastic phrasing characteristics, serve as inputs to a feed-forward neural network that makes the prediction. To obtain these probabilities, major phrase characteristics and pause characteristics are first learned by training the SCFG with the inside-outside algorithm; the probability of each bracketing structure is then computed using the trained SCFG. Experiments were carried out to confirm the effectiveness of these stochastic models for predicting major phrase boundary locations and pause locations. In a test on unseen data, 92.9% of the major phrase boundaries were correctly predicted with a 16.9% false insertion rate. For pause prediction on unseen data, 85.2% of the pause boundaries were correctly predicted with a 9.1% false insertion rate.
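The inside pass that the inside-outside algorithm builds on can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the grammar is a toy SCFG in Chomsky normal form with a single nonterminal `S`, the POS tags `N`, `P`, `V` and all rule probabilities are invented for the example, and the function name `inside_probabilities` is hypothetical.

```python
from collections import defaultdict


def inside_probabilities(tags, lexical, binary):
    """Inside pass for an SCFG in Chomsky normal form.

    tags    : list of POS tags (the input sequence)
    lexical : {(A, tag): prob} for rules A -> tag
    binary  : {(A, B, C): prob} for rules A -> B C
    Returns beta, where beta[(i, j)][A] is the probability that
    nonterminal A derives tags[i..j] (inclusive indices).
    """
    n = len(tags)
    beta = defaultdict(lambda: defaultdict(float))

    # Width-1 spans: apply lexical rules A -> tag.
    for i, tag in enumerate(tags):
        for (a, t), p in lexical.items():
            if t == tag:
                beta[(i, i)][a] += p

    # Wider spans: sum over every split point k and every rule A -> B C.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width - 1
            for k in range(i, j):
                for (a, b, c), p in binary.items():
                    left = beta[(i, k)].get(b, 0.0)
                    right = beta[(k + 1, j)].get(c, 0.0)
                    if left and right:
                        beta[(i, j)][a] += p * left * right
    return beta


# Toy grammar (all values are assumptions made for illustration only).
lexical = {("S", "N"): 0.4, ("S", "P"): 0.2, ("S", "V"): 0.1}
binary = {("S", "S", "S"): 0.3}

tags = ["N", "P", "V"]
beta = inside_probabilities(tags, lexical, binary)

# Probability of the whole sequence, summed over both bracketings:
# the left-branching ((N P) V) and the right-branching (N (P V)).
print(beta[(0, 2)]["S"])
```

In this toy grammar the split at `k = 1` corresponds to the left-branching bracketing and the split at `k = 0` to the right-branching one, so the per-split terms in the inner loop are the kind of quantity from which branching probabilities could be derived. How the paper actually defines its left/right-branching parameters is not stated in the abstract.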