Paper Information - Automatic Segmentation of Indonesian Speech into Syllables using Fuzzy Smoothed Energy Contour with Local Normalization, Splitting, and Assimilation

Automatic Segmentation of Indonesian Speech into Syllables using Fuzzy Smoothed Energy Contour with Local Normalization, Splitting, and Assimilation

This paper discusses the use of the short-term energy contour of speech, smoothed by a fuzzy-based method, to automatically segment speech into syllabic units. Two additional procedures, local normalization and postprocessing, are proposed to adapt the method to the Indonesian language. Tests on 220 Indonesian utterances showed that local normalization significantly improved the performance of the fuzzy-based smoothing. In the postprocessing procedure, splitting and assimilation work in different ways. Splitting missed short syllables sharply reduced deletions but slightly increased insertions. Assimilating a single-consonant segment into the expected previous or next segment, on the other hand, slightly reduced insertions but increased deletions. Splitting alone gave higher accuracy than assimilation alone or the combined splitting-assimilation procedure, since in many cases assimilation keeps unexpected insertions and overmerges expected segments.
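To make the pipeline in the abstract more concrete, here is a minimal Python sketch of segmenting speech at the minima of a short-term energy contour. It is an illustration under stated assumptions, not the authors' method: the abstract does not specify the fuzzy smoother, so a plain triangular moving average stands in for it, and "local normalization" is assumed to mean dividing each frame by the peak of a surrounding window. The function names, frame sizes, and window widths are all hypothetical. The splitting and assimilation postprocessing steps are not sketched because the abstract gives no rules for them.

```python
import math


def short_term_energy(samples, frame_len, hop):
    """Sum of squared samples in each frame of the signal."""
    return [
        sum(x * x for x in samples[start:start + frame_len])
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]


def smooth(contour, half_width):
    """Triangular weighted moving average.

    Assumption: this is only a stand-in for the paper's fuzzy-based smoother.
    """
    n = len(contour)
    out = []
    for i in range(n):
        total = 0.0
        weight_sum = 0.0
        for k in range(-half_width, half_width + 1):
            j = i + k
            if 0 <= j < n:
                w = half_width + 1 - abs(k)  # largest weight at the centre
                total += w * contour[j]
                weight_sum += w
        out.append(total / weight_sum)
    return out


def local_normalize(contour, half_window):
    """Divide each value by the maximum found in a window around it.

    Assumption: one plausible reading of "local normalization"; it keeps
    quiet syllables from being dwarfed by a loud neighbour far away.
    """
    n = len(contour)
    out = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        peak = max(contour[lo:hi])
        out.append(contour[i] / peak if peak > 0 else 0.0)
    return out


def boundary_frames(contour):
    """Frame indices of local minima, taken as candidate syllable boundaries."""
    return [
        i
        for i in range(1, len(contour) - 1)
        if contour[i] < contour[i - 1] and contour[i] <= contour[i + 1]
    ]


# Synthetic example: a loud burst, a short pause, then a much quieter burst.
signal = (
    [math.sin(0.3 * n) for n in range(400)]
    + [0.0] * 100
    + [0.1 * math.sin(0.3 * n) for n in range(400)]
)
energy = short_term_energy(signal, frame_len=50, hop=25)
contour = local_normalize(smooth(energy, half_width=2), half_window=6)
print(boundary_frames(contour))
```

In this sketch the smoothing width trades insertions against deletions in the same spirit as the abstract describes: a narrow window leaves spurious minima (extra boundaries), while a wide one can merge a short syllable into its neighbour.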

Agfianto Eko Putra | Suyanto Suyanto
