Multilingualization of Speech Processing

Speech-to-speech translation is a technology that connects people who speak different languages, and its multilingualization dramatically expands the circle of people connected. “Population” in Table 1.1a shows the potential number of people who can join that circle when the corresponding language benefits from the technology. However, the same table also shows that the languages of the world are remarkably diverse, so multilingualization is not an easy task. Nevertheless, methods of processing speech sounds have been devised and developed uniformly, regardless of language differences. What made this possible is the wide commonality across languages that stems from the nature of language: it is a spontaneous tool created for the single purpose of mutual communication between humans, who basically share the same biological hardware. This chapter describes the multilingualization of automatic speech recognition (ASR) and text-to-speech synthesis (TTS), the two speech-related components among the three that constitute speech-to-speech translation technology.