Stand-off Annotation of Web Content as a Legally Safer Alternative to Crawling for Distribution

Funding from the European Union Seventh Framework Programme FP7/2007-2013 under grant agreement PIAP-GA-2012-324414 (Abu-MaTran) is acknowledged.
