A Preliminary Plains Cree Speech Synthesizer

This paper discusses the development and evaluation of a speech synthesizer for Plains Cree, an Algonquian language of North America. Synthesis was achieved using Simple4All, and evaluation was performed using a modified Cluster Identification task, a Semantically Unpredictable Sentence task, and a basic dichotomized judgment task. The resulting synthesis was not well received; however, the work yielded observations about the process of evaluating speech synthesis in North American Indigenous communities, chiefly that tolerance for variation is often much lower in these communities than among speakers of majority languages. The evaluator did not recognize grammatically consistent but semantically nonsensical strings as licit language. As a result, monosyllabic clusters and semantically unpredictable sentences proved not to be the most appropriate evaluation tools. Alternative evaluation methods are discussed.
