Perceptual Features Based Isolated Digit and Continuous Speech Recognition Using Iterative Clustering Approach

The main objective of this paper is to explore the effectiveness of perceptual features for isolated digit and continuous speech recognition. The proposed perceptual features are extracted, and training models are developed using the K-means clustering procedure. The speech recognition system is evaluated on clean and noisy test speech, and digits and continuous speech are recognized based on the minimum distance between the test features and the clusters. The features are tested on speech randomly chosen from the “TI Digits_1”, “TI Digits_2” and “TIMIT” databases. They achieve an accuracy of 91% for speaker-independent isolated digit recognition, 99.5% for clean continuous speech recognition and 87% for noisy continuous speech recognition. In this work, the F-ratio is computed as a theoretical measure to validate the experimental results, and a distribution is used to justify the good experimental results statistically.
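
The training and recognition scheme summarized above (one set of K-means clusters per class, then a decision by minimum distance between test features and clusters) can be illustrated with a minimal sketch. This is not the authors' implementation. The perceptual feature extraction is not shown, and the function names, the number of clusters `k`, the fixed iteration count, and the choice to average each test vector's distance to its nearest centroid are all assumptions made for illustration.

```python
import math
import random


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain K-means: return k centroids for a list of equal-length vectors."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct training vectors (an assumption).
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        # Assignment step: group each vector with its nearest centroid.
        groups = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda j: math.dist(v, centroids[j]))
            groups[nearest].append(v)
        # Update step: move each centroid to the mean of its group.
        for j, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster went empty
                centroids[j] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def train_models(training_data, k):
    """Build one codebook (list of centroids) per class label."""
    return {label: kmeans(vecs, k) for label, vecs in training_data.items()}


def recognize(test_vectors, models):
    """Return the label whose codebook gives the smallest average distance."""
    def avg_min_distance(codebook):
        # For each test vector take the distance to its closest centroid,
        # then average over the whole utterance (an assumed scoring rule).
        total = sum(min(math.dist(v, c) for c in codebook)
                    for v in test_vectors)
        return total / len(test_vectors)

    return min(models, key=lambda label: avg_min_distance(models[label]))
```

Here `training_data` is a hypothetical dictionary mapping each digit or utterance label to a list of feature vectors, and `recognize` takes the feature vectors of one test utterance. In a real system each vector would hold the perceptual features of one speech frame.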
