Speech recognition by using fractal transformation for English and Polish vowels

We consider several aspects of speech recognition: the extraction of basic phonemes, words, and syllables from continuous speech. For this purpose we used a recent holistic method based on fractal transformation. We critically examine the possibility of employing an iterated function system (IFS) and its parameters as a feature generator for the recognition of basic phonemes. We found the results obtained in recognizing English and Polish vowels very unsatisfactory and unreliable, which calls into question recently published results. We see a more promising application of IFS in extracting words and syllables.
