
Asiya: An Open Toolkit for Automatic Machine Translation (Meta-)Evaluation

This article describes the Asiya Toolkit for Automatic Machine Translation Evaluation and Meta-evaluation, an open framework offering system and metric developers a text interface to a rich repository of metrics and meta-metrics.

Lluís Màrquez i Villodre | Jesús Giménez

[1] Julio Gonzalo,et al. MT Evaluation: Human-Like vs. Human Acceptable , 2006, ACL.

[2] Hermann Ney,et al. Accelerated DP based search for statistical translation , 1997, EUROSPEECH.

[3] Bruno Pouliquen,et al. Automatic Identification of Document Translations in Large Multilingual Document Collections , 2006, ArXiv.

[4] Mariona Taulé,et al. AnCora: Multilevel Annotated Corpora for Catalan and Spanish , 2008, LREC.

[5] Philipp Koehn,et al. Statistical Significance Tests for Machine Translation Evaluation , 2004, EMNLP.

[6] Beatrice Santorini,et al. Building a Large Annotated Corpus of English: The Penn Treebank , 1993, CL.

[7] Lluís Màrquez i Villodre,et al. A Graphical Interface for MT Evaluation and Error Analysis , 2012, ACL.

[8] Ding Liu,et al. Syntactic Features for Evaluation of Machine Translation , 2005, IEEvaluation@ACL.

[9] R. Fisher. On the "Probable Error" of a Coefficient of Correlation Deduced from a Small Sample , 1921 .

[10] Johan Bos,et al. Wide-Coverage Semantic Analysis with Boxer , 2008, STEP.

[11] Pascal Denis,et al. Coupling an Annotated Corpus and a Morphosyntactic Lexicon for State-of-the-Art POS Tagging with Less Human Effort , 2009, PACLIC.

[12] Mihai Surdeanu,et al. Semantic Role Labeling Using Complete Syntactic Analysis , 2005, CoNLL.

[13] Alon Lavie,et al. METEOR-NEXT and the METEOR Paraphrase Tables: Improved Evaluation Support for Five Target Languages , 2010, WMT@ACL.

[14] Johan Bos,et al. Linguistically Motivated Large-Scale NLP with C&C and Boxer , 2007, ACL.

[15] Hermann Ney,et al. An Evaluation Tool for Machine Translation: Fast Evaluation for MT Research , 2000, LREC.

[16] R. Fisher. On a Distribution Yielding the Error Functions of Several Well Known Statistics , 1924 .

[17] Francis M. Tyers,et al. Free/Open-Source Resources in the Apertium Platform for Machine Translation Research and Development , 2010, Prague Bull. Math. Linguistics.

[18] Erhard W. Hinrichs,et al. The Tüba-D/Z Treebank: Annotating German with a Context-Free Backbone , 2004, LREC.

[19] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[20] Vladimir I. Levenshtein,et al. Binary codes capable of correcting deletions, insertions, and reversals , 1965 .

[21] B. Navarro,et al. Syntactic, semantic and pragmatic annotation in Cast3LB , 2003 .

[22] Robert Tibshirani,et al. Bootstrap Methods for Standard Errors, Confidence Intervals, and Other Measures of Statistical Accuracy , 1986 .

[23] M. Kendall. Rank Correlation Methods , 1949 .

[24] Philipp Koehn,et al. Findings of the 2009 Workshop on Statistical Machine Translation , 2009, WMT@EACL.

[25] Lluís Màrquez i Villodre,et al. Linguistic Features for Automatic Evaluation of Heterogenous MT Systems , 2007, WMT@ACL.

[26] Dan Klein,et al. Learning Accurate, Compact, and Interpretable Tree Annotation , 2006, ACL.

[27] W. Hoeffding,et al. Rank Correlation Methods , 1949 .

[28] George R. Doddington,et al. Automatic Evaluation of Machine Translation Quality Using N-gram Co-Occurrence Statistics , 2002 .

[29] Enrique Amigó,et al. IQmt: A Framework for Automatic Machine Translation Evaluation , 2006, LREC.

[30] I. Dan Melamed,et al. Precision and Recall of Machine Translation , 2003, NAACL.

[31] C. Spearman. The proof and measurement of association between two things. By C. Spearman, 1904 , 1987, The American journal of psychology.

[32] Joakim Nivre,et al. Benchmarking of Statistical Dependency Parsers for French , 2010, COLING.

[33] Xavier Carreras,et al. FreeLing: An Open-Source Suite of Language Analyzers , 2004, LREC.

[34] Margaret King,et al. Using Test Suites in Evaluation of Machine Translation Systems , 1990, COLING.

[35] Ralph Weischedel,et al. A Study of Translation Error Rate with Targeted Human Annotation , 2005 .

[36] Chin-Yew Lin,et al. ORANGE: a Method for Evaluating Automatic Evaluation Metrics for Machine Translation , 2004, COLING.

[37] Joakim Nivre,et al. A Dependency-Driven Parser for German Dependency and Constituency Representations , 2008, ACL 2008.

[38] C. Spearman. The proof and measurement of association between two things. , 2015, International journal of epidemiology.

[39] Alon Lavie,et al. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments , 2005, IEEvaluation@ACL.

[40] Dekang Lin,et al. Dependency-Based Evaluation of Minipar , 2003 .

[41] Yuji Matsumoto. MaltParser: A language-independent system for data-driven dependency parsing , 2005 .

[42] Matthew G. Snover,et al. A Study of Translation Edit Rate with Targeted Human Annotation , 2006, AMTA.

[43] Alex Kulesza,et al. Confidence Estimation for Machine Translation , 2004, COLING.

[44] Lucia Specia,et al. Machine translation evaluation versus quality estimation , 2010, Machine Translation.

[45] Lluís Màrquez i Villodre,et al. Linguistic measures for automatic machine translation evaluation , 2010, Machine Translation.

[46] Lluís Màrquez i Villodre,et al. SVMTool: A general POS Tagger Generator Based on Support Vector Machines , 2004, LREC.

[47] K. Pearson,et al. The Life, Letters and Labours of Francis Galton , 1931, Nature.

[48] Julio Gonzalo,et al. QARLA: A Framework for the Evaluation of Text Summarization Systems , 2005, ACL.

[49] Sabine Brants,et al. The TIGER Treebank , 2001 .

[50] Chin-Yew Lin,et al. Automatic Evaluation of Machine Translation Quality Using Longest Common Subsequence and Skip-Bigram Statistics , 2004, ACL.

[51] Mihai Surdeanu,et al. Named entity recognition from spontaneous open-domain speech , 2005, INTERSPEECH.

[52] Mihai Surdeanu,et al. A Robust Combination Strategy for Semantic Role Labeling , 2005, HLT.

[53] Lluís Màrquez i Villodre,et al. Fast and accurate part-of-speech tagging: The SVM approach revisited , 2003, RANLP.

[54] Philipp Koehn,et al. Findings of the 2010 Joint Workshop on Statistical Machine Translation and Metrics for Machine Translation , 2010, WMT@ACL.

[55] Evgeniy Gabrilovich,et al. Computing Semantic Relatedness Using Wikipedia-based Explicit Semantic Analysis , 2007, IJCAI.

[56] Eugene Charniak,et al. Coarse-to-Fine n-Best Parsing and MaxEnt Discriminative Reranking , 2005, ACL.

[57] Andreas Stolcke,et al. SRILM - an extensible language modeling toolkit , 2002, INTERSPEECH.

[58] Daniel Gildea,et al. The Proposition Bank: An Annotated Corpus of Semantic Roles , 2005, CL.

[59] Pascal Denis,et al. Statistical French Dependency Parsing: Treebank Conversion and First Results , 2010, LREC.

[60] Nitin Madnani,et al. Fluency, Adequacy, or HTER? Exploring Different Human Judgments with a Tunable MT Metric , 2009, WMT@EACL.