Automatic language identification: Using intonation as a discriminating feature

Current research into automatic language identification systems sees the problem as being related to speaker-independent speech recognition and speaker identification. In particular, speaker identification methods appear to outperform all other methods, and the incorporation of prosodic information has contributed only marginally to their success. This is a counterintuitive result, suggesting that the brute-force application of standard pattern recognition methods may be inappropriate, not least because it ignores the linguistic cues that human beings use so easily and efficiently. It has been proposed that ranking parameter extraction with respect to a taxonomy of linguistic complexity would give results more in keeping with our own ability to discriminate between languages. For example, the difficulty of discriminating between grossly different languages such as Mandarin Chinese and English would be low compared with that of distinguishing between two quite similar languages such as Dutch and German. The present work aims to differentiate between the two broadest groups, tone languages and stress languages, by using parameters which best model the linguistic differences between those groups. In particular, the supra-segmental feature of intonation is modelled as a memory effect which can be measured using the Hurst exponent.
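
To make the idea of measuring a memory effect with the Hurst exponent concrete, the following is a minimal sketch of one common estimator, classical rescaled-range (R/S) analysis. The abstract does not say which estimator the work uses, so this choice is an assumption, as are the function names `rescaled_range` and `hurst_rs`, the `min_window` parameter, and the idea of feeding in a regularly sampled pitch (F0) contour as the `series` argument. An estimate near 0.5 suggests no long-run memory, while values approaching 1 suggest strong persistence.

```python
import math


def rescaled_range(window):
    """R/S for one window: range of the cumulative mean-adjusted sums
    divided by the window's standard deviation (None if the window is constant)."""
    n = len(window)
    mean = sum(window) / n
    cumulative = 0.0
    lowest = highest = 0.0
    for x in window:
        cumulative += x - mean
        lowest = min(lowest, cumulative)
        highest = max(highest, cumulative)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    return (highest - lowest) / std if std > 0 else None


def hurst_rs(series, min_window=8):
    """Estimate the Hurst exponent as the least-squares slope of
    log(mean R/S) against log(window size), doubling the window each step."""
    log_sizes, log_rs = [], []
    size = min_window
    while size <= len(series) // 2:
        values = []
        # Non-overlapping windows of the current size.
        for start in range(0, len(series) - size + 1, size):
            rs = rescaled_range(series[start:start + size])
            if rs is not None and rs > 0:
                values.append(rs)
        if values:
            log_sizes.append(math.log(size))
            log_rs.append(math.log(sum(values) / len(values)))
        size *= 2
    if len(log_sizes) < 2:
        raise ValueError("series too short for the chosen min_window")
    mean_x = sum(log_sizes) / len(log_sizes)
    mean_y = sum(log_rs) / len(log_rs)
    slope_num = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_sizes, log_rs))
    slope_den = sum((x - mean_x) ** 2 for x in log_sizes)
    return slope_num / slope_den
```

In a hypothetical pipeline, `hurst_rs` would be called on the F0 values of voiced frames from an utterance, and the resulting exponent used as one feature for separating tone languages from stress languages. R/S estimates are known to be biased upward on short series, so any decision threshold would need to be calibrated on data rather than fixed at 0.5.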
