Editorial: Speech Synthesis Comes of Age
It gives me great pleasure to introduce the readership of the International Journal of Speech Technology to the first part of a double Special Issue on the topic of Speech Synthesis. This double Special Issue documents a selection of the papers presented at the 4th International Speech Communication Association (ISCA) Speech Synthesis Workshop held in Pitlochry in the beautiful highlands of Scotland, in August/September 2001. The Workshop was ably chaired by Paul Taylor, assisted by a Local Organising Committee of Andy Breen, John Local and myself. In all, 41 papers were presented in 7 sessions (6 oral and 1 poster) and more than 40 practical systems were demonstrated by 18 participants in 20 languages. The workshop was full, with 120 registrants attending. I hope and believe that these two issues will not only stand as testimony to a very successful workshop, but will also form a long-lasting reference to the state of the art of speech synthesis at the beginning of the new millennium.
When the first workshop in this series was held in Autrans, France, in 1990 under the joint chairmanship of Gérard Bailly and Christian Benoît, ISCA did not then exist in its present form. It was in those days the European Speech Communication Association and the event was billed as the 1st ESCA International Conference on Speech Synthesis. Autrans set the trend for small four-yearly workshops held in a scenic mountain location with excellent food and drink, where participants could live and work together for several uninterrupted days. The trend continued with workshops at Lake Mohonk in New York State in 1994, chaired by Julia Hirschberg, and Jenolan Caves in the Blue Mountains of New South Wales, Australia, in 1998, chaired by Julie Vonwiller. From a personal perspective, each of these previous workshops was notable in retrospect for the emergence of a central theme.
Others may disagree but, for me, Autrans marked the arrival of concatenative synthesis, Mohonk brought the importance of prosodic modeling to the fore, and Jenolan emphasised the importance of component and system evaluation for both research and development of synthesis.
So what was the emergent theme of Pitlochry? Perhaps it is too early to say with complete certainty, since hindsight takes a while to do its work, but I was definitely struck by the steady move, under way since 1990, away from basic science and towards commercial exploitation of the technology. The evidence of this move was not just in the presented papers, but in the exhibition of more than 40 speech synthesis systems. This makes it fitting that these papers should be published in IJST with its emphasis on practical exploitation of speech technology, and this is the sense in which I claim (cf. the title of this editorial) that speech synthesis came of age in Pitlochry.
The papers contained herein, and those to appear in the second of the two Special Issues, are based on some of the best presentations at the Workshop. All papers were extended to include additional material relevant to an archival journal which could not fit into the time and space constraints of the workshop itself. They have been thoroughly reviewed and I wish to express thanks to those peers who have helped with this important task. All too often, there is little or no public recognition of the effort that goes into reviewing, without which scientific and technical publishing would grind to a halt, so it gives me pleasure to acknowledge the help of the following colleagues: