The Computable News project: Research in the Newsroom

We report on a four-year academic research project to build a natural language processing platform in support of a large media company. The Computable News platform processes news stories, producing a layer of structured data that can be used to build rich applications. We describe the underlying platform and the research tasks that we explored while building it. The platform supports a wide range of prototype applications designed for different newsroom functions. We hope that this qualitative review provides some insight into the challenges involved in this type of project.
