Voice onset/offset based local features (VOOLF) for Arabic speaker recognition

Local features in a pattern recognition system are based on information extracted locally. In this paper, we develop a local feature extraction technique that captures the formant transitions and voice onset/offset of a speaker. We call this technique voice onset/offset local features (VOOLF). The features are extracted in the time-spectrum domain by taking a moving average along the diagonal directions. The proposed features are compared with MFCC in a speaker recognition system. The results show that the proposed technique performs better than the commonly used MFCC. Because the proposed method captures the formant transitions and the onset/offset of the speaker, it yields a higher recognition rate than the other speech features.
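The diagonal moving average mentioned above can be illustrated with a minimal sketch. The abstract does not give the window length, the spectral representation, or the boundary handling, so everything here is an assumption made for illustration: the function name `diagonal_moving_average`, the 3-point window, the `spec[t][f]` layout (frame index, then frequency band), and the choice to average only over cells that fall inside the matrix. It should not be read as the authors' implementation.

```python
def diagonal_moving_average(spec, step, window=3):
    """Average `window` cells along one diagonal of a time-frequency matrix.

    spec   : list of rows, spec[t][f] = energy at frame t, frequency band f
    step   : +1 follows rising diagonals (t+1, f+1),
             -1 follows falling diagonals (t+1, f-1)
    window : number of cells averaged (assumed value, not from the paper)
    """
    n_frames, n_bands = len(spec), len(spec[0])
    half = window // 2
    out = [[0.0] * n_bands for _ in range(n_frames)]
    for t in range(n_frames):
        for f in range(n_bands):
            total, count = 0.0, 0
            for k in range(-half, half + 1):
                tt, ff = t + k, f + step * k
                # Cells outside the matrix are skipped (assumed boundary rule).
                if 0 <= tt < n_frames and 0 <= ff < n_bands:
                    total += spec[tt][ff]
                    count += 1
            out[t][f] = total / count
    return out


# Toy 4x4 "spectrogram" whose energy rises along the main diagonal,
# a crude stand-in for an upward formant transition.
spec = [[1.0 if t == f else 0.0 for f in range(4)] for t in range(4)]
rising = diagonal_moving_average(spec, step=+1)
falling = diagonal_moving_average(spec, step=-1)
```

In this toy case the rising-diagonal average keeps the full energy along the transition, while the falling-diagonal average smears it out. That contrast is the intuition behind using diagonal directions to pick up formant movement and onsets/offsets.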
