Paper information - A system for automatic broadcast news summarisation, geolocation and translation

A system for automatic broadcast news summarisation, geolocation and translation

An increasing amount of news content is produced in audio-video form every day. To analyse and monitor this multilingual data stream effectively, we require methods to extract and present audio content in accessible ways. In this paper, we describe an end-to-end system for processing and browsing audio news data. This fully automated system brings together our recent research on audio scene analysis, speech recognition, summarisation, named entity detection, geolocation, and machine translation. The graphical interface allows users to visualise the distribution of news content by entity names and story location. Browsing of news events is facilitated through extractive summaries and the ability to view transcripts in multiple languages.
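As a rough illustration of what "distribution of news content by entity names and story location" could mean in practice, the sketch below counts stories per entity and per location. It is not the system described in the paper: the `Story` record, its fields, and the `distribution` function are hypothetical names, and the sample stories are invented toy data.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Story:
    """Hypothetical record for one processed news story (not from the paper)."""
    summary: str                                        # extractive summary text
    entities: List[str] = field(default_factory=list)   # detected entity names
    location: Optional[str] = None                      # resolved story location, if any


def distribution(stories: List[Story]) -> Tuple[Counter, Counter]:
    """Count how many stories mention each entity and how many fall in each location."""
    by_entity: Counter = Counter()
    by_location: Counter = Counter()
    for story in stories:
        # Count an entity once per story, even if it is listed several times.
        by_entity.update(set(story.entities))
        if story.location is not None:
            by_location[story.location] += 1
    return by_entity, by_location


# Invented toy data, for illustration only.
sample = [
    Story("Flooding closes roads.", ["Environment Agency"], "York"),
    Story("Council debates budget.", ["City Council", "City Council"], "Edinburgh"),
    Story("Festival opens.", ["City Council"], "Edinburgh"),
    Story("Markets steady.", [], None),
]
by_entity, by_location = distribution(sample)
```

Counts of this kind are what a browsing interface could plot on a map or show as an entity list; how the actual system computes or stores them is not stated in the abstract.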

Peter Bell | Alexandra Birch | Clare Llewellyn | Catherine Lai | Mark Sinclair
