PARSING REAL INPUT IN JANUS: A CONCEPT-BASED APPROACH TO SPOKEN LANGUAGE TRANSLATION

As part of the JANUS speech-to-speech translation project [5], we have developed a translation system that parses full utterances and copes effectively with spontaneous speech, which is often syntactically ill-formed. The system is concept-based: it has no explicit notion of a sentence but instead views each input utterance as a potential sequence of concepts. Generation is performed by translating each of these concepts as a whole phrase into the target language, consulting lookup tables only for low-level concepts such as numbers. Currently, we are working on an appointment scheduling task, parsing English, German, Spanish, and Korean input and producing output in those same languages as well as Japanese.
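
To make the concept-based idea concrete, the following is a minimal Python sketch of the general approach, not the JANUS implementation. The concept names, regular-expression patterns, German phrase templates, and lookup tables are all hypothetical and chosen only for illustration. The sketch scans an utterance for concepts wherever they occur, ignores material that matches no concept, and translates each concept as a whole phrase, using lookup tables only for low-level slot values such as days and numbers.

```python
import re

# Hypothetical concept patterns (illustrative only, not the JANUS grammars).
# Each concept is spotted independently; no sentence-level structure is assumed.
CONCEPT_PATTERNS = [
    ("greeting", re.compile(r"\b(?:hello|hi)\b")),
    ("suggest_time", re.compile(
        r"\bhow about (?P<day>monday|tuesday) at (?P<hour>two|three)\b")),
    ("reject", re.compile(r"\b(?:that is|that's) (?:bad|no good) for me\b")),
]

# Hypothetical whole-phrase templates for one target language (German).
TEMPLATES = {
    "greeting": "hallo",
    "suggest_time": "geht es am {day} um {hour} Uhr",
    "reject": "das passt mir nicht",
}

# Lookup tables are consulted only for low-level slot values.
LOOKUP = {
    "day": {"monday": "Montag", "tuesday": "Dienstag"},
    "hour": {"two": "zwei", "three": "drei"},
}


def parse_concepts(utterance):
    """Return the concepts found in the utterance, in order of appearance.

    Words that belong to no concept (fillers, false starts) are skipped.
    """
    text = utterance.lower()
    found = []
    for name, pattern in CONCEPT_PATTERNS:
        for match in pattern.finditer(text):
            found.append((match.start(), name, match.groupdict()))
    found.sort(key=lambda item: item[0])
    return [(name, slots) for _, name, slots in found]


def generate(concepts):
    """Translate each concept as a whole phrase and join the phrases."""
    phrases = []
    for name, slots in concepts:
        translated = {key: LOOKUP[key][value] for key, value in slots.items()}
        phrases.append(TEMPLATES[name].format(**translated))
    return ", ".join(phrases)
```

Under these assumptions, `generate(parse_concepts("uh hello um how about tuesday at two i mean that's bad for me"))` yields `"hallo, geht es am Dienstag um zwei Uhr, das passt mir nicht"`: the fillers are dropped, and the three concepts are rendered independently even though the input is not a well-formed sentence.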