Frame distance array algorithm parameter tune-up for TIMIT corpus automatic speech segmentation

This work concerns unsupervised automatic speech segmentation. An experiment was carried out on the Frame Distance Array (FDA) algorithm, with the main goal of tuning its parameters. The algorithm was applied to the TIMIT corpus, using MFCCs as the speech signal features. The parameters tuned were the frame length, the frame increment, the number of test frames and the test frame step size. The best combination of values was chosen by observing the detection rate, the miss rate and the false boundary rate. The best results were obtained with a frame length of 23 ms, a frame increment of 1.5 ms, 9 test frames and a test frame step size of 2 frames.
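To make the reported values concrete, the following minimal sketch shows how a 23 ms frame length and a 1.5 ms frame increment translate into samples at the 16 kHz sampling rate of TIMIT, and how boundary-level rates might be scored. It does not implement the FDA algorithm itself. The function names, the 20 ms matching tolerance and the exact definitions of the three rates are assumptions made for illustration and may differ from those used in the experiment.

```python
import numpy as np

SAMPLE_RATE = 16000  # TIMIT audio is sampled at 16 kHz


def ms_to_samples(ms, sample_rate=SAMPLE_RATE):
    """Convert a duration in milliseconds to a whole number of samples."""
    return int(round(ms * sample_rate / 1000.0))


def frame_signal(signal, frame_len_ms=23.0, frame_inc_ms=1.5,
                 sample_rate=SAMPLE_RATE):
    """Split a 1-D signal into overlapping frames (hypothetical helper)."""
    signal = np.asarray(signal)
    frame_len = ms_to_samples(frame_len_ms, sample_rate)
    frame_inc = ms_to_samples(frame_inc_ms, sample_rate)
    if len(signal) < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (len(signal) - frame_len) // frame_inc
    # Row i holds samples [i * frame_inc, i * frame_inc + frame_len)
    idx = (np.arange(frame_len)[None, :]
           + frame_inc * np.arange(n_frames)[:, None])
    return signal[idx]


def score_boundaries(detected, reference, tolerance_s=0.020):
    """Score detected boundaries (seconds) against reference boundaries.

    Assumed definitions, not taken from the paper:
      detection rate      = matched reference boundaries / reference boundaries
      miss rate           = 1 - detection rate
      false boundary rate = unmatched detected boundaries / detected boundaries
    Each detected boundary can match at most one reference boundary.
    """
    detected = sorted(detected)
    reference = sorted(reference)
    used = set()
    hits = 0
    for ref in reference:
        best = None
        for i, det in enumerate(detected):
            if i in used or abs(det - ref) > tolerance_s:
                continue
            if best is None or abs(det - ref) < abs(detected[best] - ref):
                best = i
        if best is not None:
            used.add(best)
            hits += 1
    detection_rate = hits / len(reference) if reference else 0.0
    miss_rate = 1.0 - detection_rate if reference else 0.0
    false_rate = (len(detected) - hits) / len(detected) if detected else 0.0
    return detection_rate, miss_rate, false_rate


if __name__ == "__main__":
    one_second = np.zeros(SAMPLE_RATE)
    frames = frame_signal(one_second)
    print(ms_to_samples(23.0), ms_to_samples(1.5), frames.shape)
    print(score_boundaries([0.10, 0.21, 0.50], [0.10, 0.20, 0.30]))
```

Under these assumptions a 23 ms frame spans 368 samples and consecutive frames are shifted by 24 samples, so neighbouring frames overlap heavily. The number of test frames and the test frame step size are internal to the FDA algorithm and are not modelled here.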
