Paper information - TermWise: A computer assisted translation tool with context-sensitive terminological support

• The user uploads a new translation assignment in the client (CAT tool).
• The document is uploaded to the server and segmented into sentences.
• N-grams from the database are detected in each segment.
• The server performs two types of similarity calculation (bag-of-n-grams, cosine):
  1. Segments are matched against all sentences in the Staatsblad (comparable to a TM fuzzy match).
  2. The assignment's similarity to all documents in the Staatsblad is computed.
• For the identified n-grams, concordances are retrieved from the Staatsblad, together with the metadata of the document they occurred in.
• Concordances are sorted by the similarity of their source document to the current assignment (relevancy) and categorized by translation (disambiguation).
• The output from the server is sent back to the client as an XML file.

Kris Heylen*, Stephen Bond*, Dirk De Hertog*, Ivan Vulić†, Hendrik Kockaert*‡
*QLVL Linguistics Department (KU Leuven), †LIIR – Department of Computer Science (KU Leuven), ‡University of the Free State, Bloemfontein
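The server-side similarity and ranking steps in the workflow above can be illustrated with a minimal sketch. It is not the TermWise implementation: the whitespace tokenization, the n-gram length limit, the function names (`bag_of_ngrams`, `cosine`, `rank_concordances`), and the tuple layout of a concordance are all assumptions made for illustration. The sketch computes a cosine similarity over bag-of-n-grams counts, then sorts concordances by the similarity of their source document to the assignment and groups them by translation.

```python
import math
from collections import Counter, defaultdict


def ngrams(tokens, n_max=3):
    """Return all word n-grams of length 1..n_max as tuples."""
    return [tuple(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def bag_of_ngrams(text, n_max=3):
    """Lowercase, split on whitespace (an assumption), and count n-grams."""
    return Counter(ngrams(text.lower().split(), n_max))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_concordances(assignment_text, concordances, documents):
    """Group concordances by translation, most relevant document first.

    concordances: list of (doc_id, source_sentence, translation) tuples
                  (hypothetical layout).
    documents:    dict mapping doc_id to the full document text.
    """
    assignment_bag = bag_of_ngrams(assignment_text)
    doc_sim = {doc_id: cosine(assignment_bag, bag_of_ngrams(text))
               for doc_id, text in documents.items()}
    grouped = defaultdict(list)
    # Sort by document similarity (relevancy), then bucket by translation
    # (disambiguation); sorted() is stable, so ties keep their input order.
    for doc_id, sentence, translation in sorted(
            concordances, key=lambda c: doc_sim[c[0]], reverse=True):
        grouped[translation].append((doc_id, sentence))
    return dict(grouped)


# Toy data, invented purely for illustration.
documents = {
    "d1": "royal decree on tax law",
    "d2": "ministerial order on road traffic",
}
concordances = [
    ("d2", "order on road traffic", "T_A"),
    ("d1", "decree on tax law", "T_B"),
    ("d1", "royal decree", "T_A"),
]
ranked = rank_concordances("new decree on tax law amendments",
                           concordances, documents)
```

The same `cosine` function could also be applied per segment against individual corpus sentences to approximate the fuzzy-match step; a production system would need an index rather than a pairwise scan over the whole corpus.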

Hendrik Kockaert | Kris Heylen | Dirk De Hertog | Stephen Bond | Ivan Vulić
