Paper information - Context-modulated vowel discrimination using connectionist networks

Abstract: A method for constructing isomorphic context-specific connectionist networks for phoneme recognition is introduced. It is shown that such networks can be merged into a single context-modulated network that makes use of second-order unit interconnections. This is accomplished by computing a minimal basis for the set of context-specific weight vectors using the singular value decomposition algorithm. Compact networks are thus obtained in which the phoneme discrimination surfaces are modulated by phonetic context. These methods are demonstrated on a small but non-trivial vowel recognition problem. It is shown that a context-modulated network can achieve a lower error rate than a context-independent network by a factor of 7. Similar results are obtained using optimized rather than constructed networks.
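The merging step described in the abstract can be illustrated with a small NumPy sketch. This is only one possible reading, not the paper's actual construction: it assumes each context-specific network reduces to a single weight vector over the same inputs, stacks those vectors as rows of a matrix, and uses the singular value decomposition to find a minimal basis for them. The names `minimal_basis` and `modulated_output`, the tolerance, the sigmoid unit, and the random data are all hypothetical.

```python
import numpy as np

def minimal_basis(context_weights, tol=1e-10):
    """Factor a (contexts x inputs) weight matrix as coeffs @ basis.

    The rows of `basis` span the context-specific weight vectors;
    row k of `coeffs` holds the mixing coefficients for context k.
    """
    u, s, vt = np.linalg.svd(context_weights, full_matrices=False)
    rank = int(np.sum(s > tol * s[0]))   # numerical rank of the weight set
    coeffs = u[:, :rank] * s[:rank]      # scale left vectors by singular values
    basis = vt[:rank]                    # orthonormal basis rows
    return coeffs, basis

def modulated_output(x, context, coeffs, basis):
    """Sigmoid unit whose effective weights depend on the context index.

    Each term is a product of a context coefficient and a basis projection
    of the input, i.e. a second-order (multiplicative) interaction.
    """
    activation = coeffs[context] @ (basis @ x)
    return 1.0 / (1.0 + np.exp(-activation))

# Hypothetical data: 4 contexts, 3 inputs, weights lying in a 2-D subspace.
rng = np.random.default_rng(0)
context_weights = rng.normal(size=(4, 2)) @ rng.normal(size=(2, 3))

coeffs, basis = minimal_basis(context_weights)
x = rng.normal(size=3)
outputs = [modulated_output(x, k, coeffs, basis) for k in range(4)]
```

In this sketch the merged unit stores two basis vectors plus a short coefficient vector per context instead of four full weight vectors, and for every context it reproduces the output of the corresponding context-specific unit.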

Raymond L. Watrous
