We present a system that enables efficient, collaborative human correction of ASR transcripts and is designed to operate in real time, for example when post-editing live captions generated for news broadcasts. In the system, confusion networks derived from ASR lattices are used to highlight low-confidence words and to present alternatives to the user for quick correction. The system uses a client-server architecture in which information about each manual edit is posted to the server. This information can be used to dynamically update the one-best ASR output for all utterances currently in the editing pipeline. We propose to make these updates in three ways: by finding a new one-best path through an existing ASR lattice that is consistent with the correction received; by identifying further instances of out-of-vocabulary terms entered by the user; and by adapting the language model on the fly. Updates are received asynchronously by the client.
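To illustrate the highlighting mechanism described above, the following is a minimal sketch (in Python) of how a confusion network could be scanned for low-confidence words whose alternatives are then offered to the editor. The `ConfusionBin` structure, the 0.8 confidence threshold, and all names here are illustrative assumptions for exposition; they are not the paper's implementation.

```python
# Illustrative sketch only: flag low-confidence words in a confusion
# network and rank their alternatives for one-click correction.
# The data structure, threshold, and names are assumptions, not the
# system described in the paper.

from dataclasses import dataclass, field


@dataclass
class ConfusionBin:
    """One time slot of a confusion network: competing words with
    posterior probabilities summing to (approximately) 1."""
    alternatives: dict[str, float] = field(default_factory=dict)

    def best(self) -> tuple[str, float]:
        word = max(self.alternatives, key=self.alternatives.get)
        return word, self.alternatives[word]


def correction_candidates(network: list[ConfusionBin],
                          threshold: float = 0.8):
    """Yield (position, best word, ranked alternatives) for every bin
    whose top hypothesis falls below the confidence threshold, i.e.
    the words the editing interface would highlight."""
    for i, slot in enumerate(network):
        word, confidence = slot.best()
        if confidence < threshold:
            ranked = sorted(slot.alternatives.items(),
                            key=lambda kv: -kv[1])
            yield i, word, ranked


# Toy confusion network for the utterance "the cat sat".
network = [
    ConfusionBin({"the": 0.95, "a": 0.05}),
    ConfusionBin({"cat": 0.55, "cap": 0.30, "cut": 0.15}),
    ConfusionBin({"sat": 0.90, "sad": 0.10}),
]

for pos, word, ranked in correction_candidates(network):
    print(f"bin {pos}: '{word}' is uncertain; alternatives: {ranked}")
```

In a full system, a confirmed correction for a flagged bin would be posted to the server, which could then constrain lattice rescoring for the remaining utterances in the pipeline, as the abstract outlines.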