Automatic Segmentation of Indonesian Speech into Syllables using Fuzzy Smoothed Energy Contour with Local Normalization, Splitting, and Assimilation

This paper discusses the use of the short-term energy contour of speech, smoothed by a fuzzy-based method, to automatically segment speech into syllabic units. Two additional procedures, local normalization and postprocessing, are proposed to adapt the method to the Indonesian language. Tests on 220 Indonesian utterances showed that local normalization significantly improved the performance of the fuzzy-based smoothing. In the postprocessing procedure, splitting and assimilation work in different ways. Splitting missed short syllables sharply reduced deletions but slightly increased insertions. Conversely, assimilating a single-consonant segment into the expected previous or next segment slightly reduced insertions but increased deletions. Splitting alone gave higher accuracy than either assimilation alone or the combined splitting-assimilation procedure, since in many cases assimilation keeps the unexpected insertions and overmerges the expected segments.
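
The pipeline described above (energy contour, smoothing, local normalization, boundary picking) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the fuzzy smoothing rules, so a plain centered moving average stands in for them here, and the local normalization is assumed to divide each frame by the peak energy inside a sliding window. All function names, parameters (`frame_len`, `hop`, `smooth_width`, `norm_window`, `threshold`), and the minimum-picking rule are hypothetical. Splitting and assimilation are not shown.

```python
def short_term_energy(samples, frame_len, hop):
    """Sum of squared samples for each frame of the signal."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def moving_average(values, width):
    """Centered moving average.

    Assumption: a simple stand-in for the fuzzy-based smoothing,
    whose rules are not given in the abstract.
    """
    half = width // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def local_normalize(values, window):
    """Divide each value by the peak inside a surrounding window.

    Assumption: one plausible reading of "local normalization"; it keeps
    weak syllables from being dwarfed by a loud neighbor far away.
    """
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        peak = max(values[lo:hi])
        out.append(values[i] / peak if peak > 0 else 0.0)
    return out


def boundary_frames(contour, threshold):
    """Frame indices of local minima that fall below the threshold."""
    return [
        i
        for i in range(1, len(contour) - 1)
        if contour[i] < contour[i - 1]
        and contour[i] <= contour[i + 1]
        and contour[i] < threshold
    ]


def segment(samples, frame_len, hop, smooth_width, norm_window, threshold):
    """Chain the four steps and return candidate boundary frame indices."""
    energy = short_term_energy(samples, frame_len, hop)
    smoothed = moving_average(energy, smooth_width)
    normalized = local_normalize(smoothed, norm_window)
    return boundary_frames(normalized, threshold)
```

A call such as `segment(samples, frame_len=200, hop=80, smooth_width=5, norm_window=25, threshold=0.3)` would return frame indices to be converted to time by multiplying by `hop` and dividing by the sampling rate; the parameter values shown are placeholders, not values from the paper.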
