Paper information - Pitch detection and formant analysis of Arabic speech processing

Pitch detection and formant analysis of Arabic speech processing

Abstract Speech processing and synthesis have been well-researched areas for many years, with renewed interest coming from electronics, artificial intelligence, telecommunications, and even medicine. The most promising applications in this field include the implementation of speaker recognition systems, the development of new low-bit-rate coders, speech synthesis and assistance for people with disabilities, and the identification of some neurological and ENT (ORL) pathologies by vocal analysis. In these applications, speech processing is an essential stage for extracting and identifying the vocal parameters (pitch, formants, timbre…) that depend on the physical, physiological and linguistic structure of the spoken language. Moreover, the variability of the speech signal (children's, male and female voices) and its prosodic aspects (shouted or sung sounds…) make processing more difficult and require observing and acquiring a large quantity of speech signals in order to extract what is relevant. We have therefore improved the processing stage by developing a user-friendly hardware and software environment under MATLAB 5.2. The originality of the work is that the developed program runs in real time when combined with the MATLAB real-time toolbox. The new speech processing program computes the pitch period, extracts the formant frequencies of Arabic speech and identifies the speaker's vocal timbre. The database consists of phonetically balanced Arabic sentences pronounced by several speakers. After acquisition, conversion and segmentation, we classify speech as voiced or unvoiced (V/UV) by analysing the evolution of its zero crossings. We then compute the fundamental frequency, the formants and the spectral envelope (vocal timbre). These parameters are used not only in speech synthesis and recognition but also in predicting the speaker's emotional and psychological state.
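The V/UV decision and pitch computation described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' MATLAB program: the abstract does not say which pitch method they use, so a plain autocorrelation peak search is swapped in here as an assumption, and the function names, the zero-crossing threshold of 0.1, and the 60 to 400 Hz search range are all hypothetical choices for illustration.

```python
import math
import random


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def is_voiced(frame, zcr_threshold=0.1):
    """Assumed V/UV rule: a low zero-crossing rate suggests voiced speech.

    The threshold is illustrative, not a value taken from the paper.
    """
    return zero_crossing_rate(frame) < zcr_threshold


def autocorr_pitch(frame, fs, f_min=60.0, f_max=400.0):
    """Estimate F0 in Hz from the largest (unnormalised) autocorrelation
    value among the lags that correspond to the range [f_min, f_max]."""
    lag_min = int(fs / f_max)
    lag_max = min(int(fs / f_min), len(frame) - 1)
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        val = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return fs / best_lag


# Synthetic test frames (50 ms at 8 kHz): a 125 Hz tone stands in for a
# voiced segment, uniform noise for an unvoiced one.
fs = 8000
n_samples = 400
voiced_frame = [math.sin(2 * math.pi * 125 * t / fs) for t in range(n_samples)]
rng = random.Random(0)
noise_frame = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]

if is_voiced(voiced_frame):
    print("estimated F0 (Hz):", autocorr_pitch(voiced_frame, fs))
print("noise frame voiced?", is_voiced(noise_frame))
```

On real speech the frame would normally be windowed and the decision combined with an energy measure; the sketch omits both to keep the two steps named in the abstract visible.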

Lamia Bouafif | Adnène Cherif | Turkia Dabbabi
