Punctuating confusion networks for speech translation

Translating from confusion networks (CNs) has been shown to be more effective than translating from single-best hypotheses. Moreover, it is widely accepted that the availability of good punctuation marks in the input can improve translation quality. At present, no ASR system can generate punctuation marks in its word graphs, so CNs lack punctuation. In this paper, we address the problem of adding punctuation marks to confusion networks. We examine different punctuation strategies and show that the use of multiple hypotheses improves translation quality in a large-vocabulary speech translation task.