Machine Translation from Text

Machine translation (MT) from text, the topic of this chapter, is perhaps the heart of the GALE project. Beyond being a well-defined application that stands on its own, MT from text is the link between the automatic speech recognition component and the distillation component. The focus of MT in GALE is on translating from Arabic or Chinese into English. These three languages span a wide range of linguistic diversity, which makes the GALE MT task both challenging and exciting.

[1]  Kristina Toutanova,et al.  Applying Morphology Generation Models to Machine Translation , 2008, ACL.

[2]  David Chiang,et al.  Forest Rescoring: Faster Decoding with Integrated Language Models , 2007, ACL.

[3]  Alexander M. Fraser,et al.  Semi-Supervised Training for Statistical Word Alignment , 2006, ACL.

[4]  José B. Mariño,et al.  System Combination for Machine Translation of Spoken and Written Language , 2008, IEEE Transactions on Audio, Speech, and Language Processing.

[5]  Marine Carpuat,et al.  HKUST statistical machine translation experiments for IWSLT 2007 , 2007, IWSLT.

[6]  Eiichiro Sumita,et al.  Chinese word segmentation and statistical machine translation , 2008, TSLP.

[7]  Mikko Kurimo,et al.  Minimum Bayes Risk Combination of Translation Hypotheses from Alternative Morphological Decompositions , 2009, NAACL.

[8]  Hermann Ney,et al.  Improvements in beam search , 1994, ICSLP.

[9]  Hermann Ney,et al.  Do We Need Chinese Word Segmentation for Statistical Machine Translation? , 2004, SIGHAN@ACL.

[10]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[11]  William J. Byrne,et al.  Large-Scale Statistical Machine Translation with Weighted Finite State Transducers , 2009, FSMNLP.

[12]  Martin Rajman,et al.  Lattice Parsing for Speech Recognition , 1999 .

[13]  Stanley F. Chen,et al.  An empirical study of smoothing techniques for language modeling , 1999 .

[14]  Hermann Ney,et al.  Learning to Combine Machine Translation Systems , 2008 .

[15]  Jan Niehues,et al.  The ISL Phrase-Based MT System for the 2007 ACL Workshop on Statistical Machine Translation , 2007, WMT@ACL.

[16]  Philip Resnik,et al.  Online Large-Margin Training of Syntactic and Structural Translation Features , 2008, EMNLP.

[17]  Günter Neumann,et al.  Arabic Computational Morphology: Knowledge-based and Empirical Methods , 2007 .

[18]  Yuji Matsumoto,et al.  Phrase reordering for statistical machine translation based on predicate-argument structure , 2006, IWSLT.

[19]  Michael Schiehlen Learning Tense Translation from Bilingual Corpora , 1998, COLING-ACL.

[20]  Marine Carpuat,et al.  How phrase sense disambiguation outperforms word sense disambiguation for statistical machine translation , 2007, TMI.

[21]  Wen Wang,et al.  Development of SRI's translation systems for broadcast news and broadcast conversations , 2008, INTERSPEECH.

[22]  Andreas Stolcke,et al.  SRILM - an extensible language modeling toolkit , 2002, INTERSPEECH.

[23]  Daniel Gildea,et al.  The Proposition Bank: An Annotated Corpus of Semantic Roles , 2005, CL.

[24]  Stuart M. Shieber,et al.  Synchronous Tree-Adjoining Grammars , 1990, COLING.

[25]  Sean R. Eddy,et al.  Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids , 1998 .

[26]  Richard Zens,et al.  The JHU workshop 2006 IWSLT system , 2006, IWSLT.

[27]  Hermann Ney,et al.  A Systematic Comparison of Various Statistical Alignment Models , 2003, CL.

[28]  Andreas Zollmann,et al.  Syntax Augmented Machine Translation via Chart Parsing , 2006, WMT@HLT-NAACL.

[29]  Hermann Ney,et al.  Discriminative Reordering Models for Statistical Machine Translation , 2006, WMT@HLT-NAACL.

[30]  Philipp Koehn,et al.  (Meta-) Evaluation of Machine Translation , 2007, WMT@ACL.

[31]  Kevin Knight,et al.  An Overview of Probabilistic Tree Transducers for Natural Language Processing , 2005, CICLing.

[32]  Michael Paul,et al.  Overview of the IWSLT06 evaluation campaign , 2006, IWSLT.

[33]  Richard M. Schwartz,et al.  Combining Outputs from Multiple Machine Translation Systems , 2007, NAACL.

[34]  Richard M. Schwartz,et al.  Incremental Hypothesis Alignment for Building Confusion Networks with Application to Machine Translation System Combination , 2008, WMT@ACL.

[35]  Ondrej Bojar,et al.  English-to-Czech Factored Machine Translation , 2007, WMT@ACL.

[36]  Kristina Toutanova,et al.  Generating Complex Morphology for Machine Translation , 2007, ACL.

[37]  Mengqiu Wang,et al.  A Dual-layer CRFs Based Joint Decoding Method for Cascaded Segmentation and Labeling Tasks , 2007, IJCAI.

[38]  Marine Carpuat,et al.  Improving Statistical Machine Translation Using Word Sense Disambiguation , 2007, EMNLP.

[39]  Michael Gamon,et al.  Normalizing German and English inflectional morphology to improve statistical word alignment , 2004, AMTA.

[40]  Alexander M. Fraser,et al.  A Smorgasbord of Features for Statistical Machine Translation , 2004, NAACL.

[41]  Qun Liu,et al.  HHMM-based Chinese Lexical Analyzer ICTCLAS , 2003, SIGHAN.

[42]  Marine Carpuat,et al.  Evaluation of Context-Dependent Phrasal Translation Lexicons for Statistical Machine Translation , 2008, LREC.

[43]  Thomas Emerson,et al.  The Second International Chinese Word Segmentation Bakeoff , 2005, IJCNLP.

[44]  Yang Ye,et al.  Latent Features in Automatic Tense Translation between Chinese and English , 2006, SIGHAN@COLING/ACL.

[45]  Kristina Toutanova,et al.  Generating Case Markers in Machine Translation , 2007, NAACL.

[46]  Wei Jiang,et al.  Chinese Word Segmentation based on Mixing Model , 2005, SIGHAN@IJCNLP 2005.

[47]  H. Sebastian Seung,et al.  Query by committee , 1992, COLT '92.

[48]  Philipp Koehn,et al.  Enriching Morphologically Poor Languages for Statistical Machine Translation , 2008, ACL.

[49]  Hermann Ney,et al.  Using POS information for statistical machine translation into morphologically rich languages , 2003, Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - EACL '03.

[50]  Hermann Ney,et al.  Symmetric Word Alignments for Statistical Machine Translation , 2004, COLING.

[51]  Hermann Ney,et al.  HMM-Based Word Alignment in Statistical Translation , 1996, COLING.

[52]  Kenneth Ward Church,et al.  Using Suffix Arrays to Compute Term Frequency and Document Frequency for All Substrings in a Corpus , 2001, Computational Linguistics.

[53]  Pascale Fung,et al.  Automatic Learning of Chinese English Semantic Structure Mapping , 2006, 2006 IEEE Spoken Language Technology Workshop.

[54]  Noah A. Smith,et al.  The Web as a Parallel Corpus , 2003, CL.

[55]  Adam Kilgarriff,et al.  The Senseval-3 English lexical sample task , 2004, SENSEVAL@ACL.

[56]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[57]  Hermann Ney,et al.  Word Re-ordering and DP-based Search in Statistical Machine Translation , 2000, COLING.

[58]  Philipp Koehn,et al.  Proceedings of the Workshop on Statistical Machine Translation , 2006 .

[59]  Daniel Marcu,et al.  SPMT: Statistical Machine Translation with Syntactified Target Language Phrases , 2006, EMNLP.

[60]  I. Dan Melamed,et al.  Statistical Machine Translation by Parsing , 2004, ACL.

[61]  Beatrice Santorini,et al.  Building a Large Annotated Corpus of English: The Penn Treebank , 1993, CL.

[62]  Jonathan G. Fiscus,et al.  A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER) , 1997, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings.

[63]  Hermann Ney,et al.  Are Very Large N-Best Lists Useful for SMT? , 2007, HLT-NAACL.

[64]  Dana Shapira,et al.  Edit distance with move operations , 2002, J. Discrete Algorithms.

[65]  R. Bellman Dynamic programming. , 1957, Science.

[66]  Dekai Wu,et al.  Stochastic Inversion Transduction Grammars and Bilingual Parsing of Parallel Corpora , 1997, CL.

[67]  Ben Taskar,et al.  Alignment by Agreement , 2006, NAACL.

[68]  Philipp Koehn,et al.  Proceedings of the Third Workshop on Statistical Machine Translation , 2008, WMT@ACL.

[69]  Chao Wang,et al.  Chinese Syntactic Reordering for Statistical Machine Translation , 2007, EMNLP.

[70]  Changning Huang,et al.  Chinese Word Segmentation and Named Entity Recognition: A Pragmatic Approach , 2005, CL.

[71]  Philipp Koehn,et al.  Moses: Open Source Toolkit for Statistical Machine Translation , 2007, ACL.

[72]  David Chiang,et al.  Hierarchical Phrase-Based Translation , 2007, CL.

[73]  Hwee Tou Ng,et al.  Chinese Part-of-Speech Tagging: One-at-a-Time or All-at-Once? Word-Based or Character-Based? , 2004, EMNLP.

[74]  Hermann Ney,et al.  Integrated Chinese Word Segmentation in Statistical Machine Translation , 2005, IWSLT.

[75]  Sanjeev Khudanpur,et al.  Machine Translation System Combination using ITG-based Alignments , 2008, ACL.

[76]  Jianfeng Gao,et al.  Indirect-HMM-based Hypothesis Alignment for Combining Outputs from Machine Translation Systems , 2008, EMNLP.

[77]  Andrew McCallum,et al.  Reducing Weight Undertraining in Structured Discriminative Learning , 2006, NAACL.

[78]  Masaki Murata,et al.  Using a Support-Vector Machine for Japanese-to-English Translation of Tense, Aspect, and Modality , 2001, DDMMT@ACL.

[79]  Daniel Povey,et al.  Large scale discriminative training of hidden Markov models for speech recognition , 2002, Comput. Speech Lang..

[80]  Salim Roukos,et al.  Feature-based language understanding , 1997, EUROSPEECH.

[81]  Mei Yang,et al.  Phrase-Based Backoff Models for Machine Translation of Highly Inflected Languages , 2006, EACL.

[82]  Hermann Ney,et al.  Data driven search organization for continuous speech recognition , 1992, IEEE Trans. Signal Process..

[83]  Stephan Vogel,et al.  Language Model Adaptation for Statistical Machine Translation via Structured Query Models , 2004, COLING.

[84]  James H. Martin,et al.  Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition, 2nd Edition , 2000, Prentice Hall series in artificial intelligence.

[85]  Rémi Zajac,et al.  SYSTRAN's Chinese Word Segmentation , 2003, SIGHAN.

[86]  H. Sebastian Seung,et al.  Selective Sampling Using the Query by Committee Algorithm , 1997, Machine Learning.

[87]  Eiichiro Sumita,et al.  Corpus-based Generation of Numeral Classifier using Phrase Alignment , 2002, COLING.

[88]  J. Lafferty,et al.  Mixed-membership models of scientific publications , 2004, Proceedings of the National Academy of Sciences of the United States of America.

[89]  Christoph Tillmann,et al.  A Unigram Orientation Model for Statistical Machine Translation , 2004, NAACL.

[90]  Giorgio Satta,et al.  Efficient Parsing for Bilexical Context-Free Grammars and Head Automaton Grammars , 1999, ACL.

[91]  Nizar Habash,et al.  Using Shallow Syntax Information to Improve Word Alignment and Reordering for SMT , 2008, WMT@ACL.

[92]  Joshua Goodman,et al.  Semiring Parsing , 1999, CL.

[93]  Mari Ostendorf,et al.  Integration of Diverse Recognition Methodologies Through Reevaluation of N-Best Sentence Hypotheses , 1991, HLT.

[94]  Nizar Habash,et al.  Arabic Morphological Representations for Machine Translation , 2007 .

[95]  Dilek Z. Hakkani-Tür,et al.  Active learning for automatic speech recognition , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[96]  Frank Vanden Berghen,et al.  CONDOR, a new parallel, constrained extension of Powell's UOBYQA algorithm: experimental results and comparison with the DFO algorithm , 2005 .

[97]  Hermann Ney,et al.  Statistical Machine Translation with Scarce Resources Using Morpho-syntactic Information , 2004, CL.

[98]  Richard M. Schwartz,et al.  Language Model Adaptation in Machine Translation from Speech , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[99]  Philipp Koehn,et al.  Clause Restructuring for Statistical Machine Translation , 2005, ACL.

[100]  Smaranda Muresan,et al.  Generalizing Word Lattice Translation , 2008, ACL.

[101]  Alon Itai,et al.  Word Sense Disambiguation Using a Second Language Monolingual Corpus , 1994, CL.

[102]  Daniel Gildea,et al.  Synchronous Binarization for Machine Translation , 2006, NAACL.

[103]  Daniel Jurafsky,et al.  A Conditional Random Field Word Segmenter for Sighan Bakeoff 2005 , 2005, IJCNLP.

[104]  Hichem Sahbi,et al.  Consensus Network Decoding for Statistical Machine Translation System Combination , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[105]  Robert L. Mercer,et al.  The Mathematics of Statistical Machine Translation: Parameter Estimation , 1993, CL.

[106]  Alon Lavie,et al.  Multi-engine machine translation guided by explicit word matching , 2005, EAMT.

[107]  Hermann Ney,et al.  N-Gram Posterior Probabilities for Statistical Machine Translation , 2006, WMT@HLT-NAACL.

[108]  Taro Watanabe,et al.  Online Large-Margin Training for Statistical Machine Translation , 2007, EMNLP.

[109]  M. Held,et al.  A dynamic programming approach to sequencing problems , 1962, ACM National Meeting.

[110]  Eric P. Xing,et al.  HM-BiTAM: Bilingual Topic Exploration, Word Alignment, and Translation , 2007, NIPS.

[111]  Stephan Vogel,et al.  Combination of Machine Translation Systems via Hypothesis Selection from Combined N-Best Lists , 2008, AMTA 2008.


[113]  Judea Pearl,et al.  Chapter 2 – BAYESIAN INFERENCE , 1988 .

[114]  Lluís Màrquez i Villodre,et al.  A Smorgasbord of Features for Automatic MT Evaluation , 2008, WMT@ACL.

[115]  Roger K. Moore Computer Speech and Language , 1986 .

[116]  William H. Press,et al.  Numerical recipes in C , 2002 .

[117]  P. Donnelly,et al.  Inference of population structure using multilocus genotype data. , 2000, Genetics.

[118]  Mari Ostendorf,et al.  Relevance weighting for combining multi-domain data for n-gram language modeling , 1999, Comput. Speech Lang..

[119]  Salim Roukos,et al.  A Maximum Entropy Word Aligner for Arabic-English Machine Translation , 2005, HLT.

[120]  Philip Resnik,et al.  Soft Syntactic Constraints for Hierarchical Phrased-Based Translation , 2008, ACL.

[121]  W. Press,et al.  Numerical Recipes: The Art of Scientific Computing , 1987 .

[122]  Noah A. Smith,et al.  Compiling Comp Ling: Weighted Dynamic Programming and the Dyna Language , 2005, HLT.

[123]  Kemal Oflazer,et al.  Exploring Different Representational Units in English-to-Turkish Statistical Machine Translation , 2007, WMT@ACL.

[124]  Marine Carpuat,et al.  Word Sense Disambiguation vs. Statistical Machine Translation , 2005, ACL.

[125]  Lillian Lee,et al.  Mostly-unsupervised statistical segmentation of Japanese kanji sequences , 2002, Natural Language Engineering.

[126]  Robert C. Moore A Discriminative Framework for Bilingual Word Alignment , 2005, HLT.

[127]  Philippe Langlais,et al.  Translating Unknown Words by Analogical Learning , 2007, EMNLP.

[128]  Hermann Ney,et al.  Chunk-Level Reordering of Source Language Sentences with Automatically Learned Rules for Statistical Machine Translation , 2007, SSST@HLT-NAACL.

[129]  Richard M. Schwartz,et al.  Improved Word-Level System Combination for Machine Translation , 2007, ACL.

[130]  Ying Zhang,et al.  Distributed Language Modeling for N-best List Re-ranking , 2006, EMNLP.

[131]  Chris Quirk,et al.  Dependency Treelet Translation: Syntactically Informed Phrasal SMT , 2005, ACL.

[132]  Ralf D. Brown,et al.  Example-Based Machine Translation in the Pangloss System , 1996, COLING.

[133]  Liang Huang,et al.  Statistical Syntax-Directed Translation with Extended Domain of Locality , 2006, AMTA.

[134]  Sharon Goldwater,et al.  Improving Statistical MT through Morphological Analysis , 2005, HLT.

[135]  Daniel Marcu,et al.  Binarizing Syntax Trees to Improve Syntax-Based Machine Translation Accuracy , 2007, EMNLP.

[136]  David Chiang,et al.  A Hierarchical Phrase-Based Model for Statistical Machine Translation , 2005, ACL.

[137]  Fei Huang,et al.  Confidence Measure for Word Alignment , 2009, ACL.

[138]  Hermann Ney,et al.  Phrase-Based Statistical Machine Translation , 2002, KI.

[139]  Josep M. Crego,et al.  Integration of POStag-based Source Reordering into SMT Decoding by an Extended Search Graph , 2006, AMTA.

[140]  Chris Callison-Burch,et al.  Demonstration of Joshua: An Open Source Toolkit for Parsing-based Machine Translation , 2009, ACL.

[141]  Adam Kilgarriff,et al.  English Lexical Sample Task Description , 2001, *SEMEVAL.

[142]  Michel Simard,et al.  Statistical Phrase-Based Post-Editing , 2007, NAACL.

[143]  Hermann Ney,et al.  Alignment templates: the RWTH SMT system , 2004, IWSLT.

[144]  Giorgio Satta,et al.  Generalized Multitext Grammars , 2004, ACL.

[145]  Michael Collins,et al.  Head-Driven Statistical Models for Natural Language Parsing , 2003, CL.

[146]  Hermann Ney,et al.  Improved chunk-level reordering for statistical machine translation , 2007, IWSLT.

[147]  George A. Miller,et al.  Introduction to WordNet: An On-line Lexical Database , 1990 .

[148]  Christoph Tillmann,et al.  A Rule-Driven Dynamic Programming Decoder for Statistical MT , 2008, SSST@ACL.

[149]  Wayne A. Lea,et al.  Trends in Speech Recognition , 1980 .

[150]  Nianwen Xue,et al.  Semantic role labeling of nominalized predicates in Chinese , 2006, NAACL.


[152]  Hermann Ney,et al.  Improved Alignment Models for Statistical Machine Translation , 1999, EMNLP.

[153]  Richard Zens,et al.  Speech Translation by Confusion Network Decoding , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[154]  Niladri Chatterjee,et al.  Identification of divergence for English to Hindi EBMT , 2003, MTSUMMIT.

[155]  José B. Mariño,et al.  N-gram-based SMT System Enhanced with Reordering Patterns , 2006, WMT@HLT-NAACL.

[156]  Matt Post,et al.  Syntax-based language models for statistical machine translation , 2010 .

[157]  Galen Andrew,et al.  A Hybrid Markov/Semi-Markov Conditional Random Field for Sequence Segmentation , 2006, EMNLP.

[158]  Wen Wang,et al.  Improving Alignments for Better Confusion Networks for Combining Machine Translation Systems , 2008, COLING.

[159]  Michael Collins,et al.  A Discriminative Model for Tree-to-Tree Translation , 2006, EMNLP.

[160]  Marc Dymetman,et al.  Experiments in Discriminating Phrase-Based Translations on the Basis of Syntactic Coupling Features , 2008, SSST@ACL.

[161]  Michel Simard,et al.  NRC‘s PORTAGE System for WMT 2007 , 2007, WMT@ACL.

[162]  Tanja Schultz,et al.  Bilingual-LSA Based LM Adaptation for Spoken Language Translation , 2007, ACL.

[163]  Pushpak Bhattacharyya,et al.  Simple Syntactic and Morphological Processing Can Help English-Hindi Statistical Machine Translation , 2008, IJCNLP.

[164]  Gerard Salton,et al.  Research and Development in Information Retrieval , 1982, Lecture Notes in Computer Science.

[165]  Ben Taskar,et al.  A Discriminative Matching Approach to Word Alignment , 2005, HLT.

[166]  Kevin Knight,et al.  A Decoder for Syntax-based Statistical MT , 2002, ACL.

[167]  Christopher Dyer The University of maryland translation system for IWSLT 2007 , 2007, IWSLT.

[168]  Jorge Civera Saiz Novel statistical approaches to text classification, machine translation and computer-assisted translation , 2011 .

[169]  Nizar Habash,et al.  On Arabic Transliteration , 2007 .

[170]  William J. Byrne,et al.  Phrasal Segmentation Models for Statistical Machine Translation , 2008, COLING.

[171]  Hermann Ney,et al.  Word Reordering and a Dynamic Programming Beam Search Algorithm for Statistical Machine Translation , 2003, CL.

[172]  Gary Geunbae Lee,et al.  Transformation-based Sentence Splitting method for Statistical Machine Translation , 2008, IJCNLP.

[173]  Geoffrey Zweig,et al.  Anatomy of an extremely fast LVCSR decoder , 2005, INTERSPEECH.

[174]  Hermann Ney,et al.  Word-Level Confidence Estimation for Machine Translation , 2007, CL.

[175]  Adam L. Berger,et al.  A Maximum Entropy Approach to Natural Language Processing , 1996, CL.

[176]  David A. Smith,et al.  Minimum Risk Annealing for Training Log-Linear Models , 2006, ACL.

[177]  Gerard G. L. Meyer,et al.  Selective sampling of training data for speech recognition , 2002 .

[178]  Philipp Koehn,et al.  CCG Supertags in Factored Statistical Machine Translation , 2007, WMT@ACL.

[179]  Philipp Koehn,et al.  Factored Translation Models , 2007, EMNLP.

[180]  Nizar Habash,et al.  Combination of Arabic Preprocessing Schemes for Statistical Machine Translation , 2006, ACL.

[181]  William J. Byrne,et al.  HMM Word and Phrase Alignment for Statistical Machine Translation , 2005, HLT.

[182]  William J. Byrne,et al.  MTTK: An Alignment Toolkit for Statistical Machine Translation , 2006, NAACL.

[183]  Noah A. Smith,et al.  Annealing Techniques For Unsupervised Statistical Language Learning , 2004, ACL.

[184]  Liang Huang,et al.  Forest Reranking: Discriminative Parsing with Non-Local Features , 2008, ACL.

[185]  Franz Josef Och,et al.  Minimum Error Rate Training in Statistical Machine Translation , 2003, ACL.

[186]  Dan Klein,et al.  A* Parsing: Fast Exact Viterbi Parse Selection , 2003, NAACL.

[187]  Noah A. Smith,et al.  Proceedings of EMNLP , 2007 .

[188]  Frederick Jelinek,et al.  Statistical methods for speech recognition , 1997 .

[189]  José B. Mariño,et al.  On the impact of morphology in English to Spanish statistical MT , 2008, Speech Commun..

[190]  Adria de Gispert Ramis Introducing linguistic knowledge into statistical machine translation , 2007 .

[191]  Ben Taskar,et al.  An End-to-End Discriminative Approach to Machine Translation , 2006, ACL.

[192]  Lalit R. Bahl,et al.  A Maximum Likelihood Approach to Continuous Speech Recognition , 1983, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[193]  Joel D. Martin,et al.  PORTAGE: A Phrase-Based Machine Translation System , 2005, ParallelText@ACL.

[194]  Joshua Goodman,et al.  A bit of progress in language modeling , 2001, Comput. Speech Lang..

[195]  Hermann Ney,et al.  Improved backing-off for M-gram language modeling , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[196]  Philipp Koehn,et al.  Explorer Edinburgh System Description for the 2005 IWSLT Speech Translation Evaluation , 2005 .

[197]  Eugene Charniak,et al.  Edge-Based Best-First Chart Parsing , 1998, VLC@COLING/ACL.

[198]  D. Aldous Exchangeability and related topics , 1985 .

[199]  Daniel Jurafsky,et al.  Support Vector Learning for Semantic Argument Classification , 2005, Machine Learning.

[200]  Lluís Màrquez i Villodre,et al.  Linguistic Features for Automatic Evaluation of Heterogenous MT Systems , 2007, WMT@ACL.

[201]  Hermann Ney,et al.  Morpho-syntactic Arabic Preprocessing for Arabic to English Statistical Machine Translation , 2006, WMT@HLT-NAACL.

[202]  Yuan Ding,et al.  Machine Translation Using Probabilistic Synchronous Dependency Insertion Grammars , 2005, ACL.

[203]  Alon Lavie,et al.  METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments , 2005, IEEvaluation@ACL.

[204]  Hermann Ney,et al.  A Systematic Comparison of Training Criteria for Statistical Machine Translation , 2007, EMNLP-CoNLL.

[205]  Andy Way,et al.  Supertagged Phrase-Based Statistical Machine Translation , 2007, ACL.

[206]  Anthony Skjellum,et al.  Using MPI - portable parallel programming with the message-parsing interface , 1994 .

[207]  Hwee Tou Ng,et al.  Word Sense Disambiguation Improves Statistical Machine Translation , 2007, ACL.

[208]  Sanjeev Khudanpur,et al.  Efficient Extraction of Oracle-best Translations from Hypergraphs , 2009, HLT-NAACL.

[209]  Koby Crammer,et al.  Online Large-Margin Training of Dependency Parsers , 2005, ACL.

[210]  Christopher D. Manning,et al.  Optimizing Chinese Word Segmentation for Machine Translation Performance , 2008, WMT@ACL.

[211]  Daniel Marcu,et al.  Scalable Inference and Training of Context-Rich Syntactic Translation Models , 2006, ACL.

[212]  Richard Zens,et al.  Phrase based statistical machine translation: models, search, raining , 2008 .

[213]  Ming Zhou,et al.  Measure Word Generation for English-Chinese SMT Systems , 2008, ACL.

[214]  Jan Niehues,et al.  Discriminative Word Alignment via Alignment Matrix Modeling , 2008, WMT@ACL.

[215]  Philipp Koehn,et al.  Proceedings of the Fourth Workshop on Statistical Machine Translation, WMT@EACL 2009, Athens, Greece, March 30-31, 2009 , 2009, WMT@EACL.

[216]  Anoop Sarkar,et al.  Discriminative Reranking for Machine Translation , 2004, NAACL.

[217]  Hermann Ney,et al.  Can We Translate Letters? , 2007, WMT@ACL.

[218]  John D. Lafferty,et al.  Inducing Features of Random Fields , 1995, IEEE Trans. Pattern Anal. Mach. Intell..

[219]  Rebecca Hwa,et al.  Sample Selection for Statistical Parsing , 2004, CL.

[220]  Bruce Lowerre,et al.  The Harpy speech understanding system , 1990 .

[221]  Nizar Habash,et al.  Arabic Tokenization, Part-of-Speech Tagging and Morphological Disambiguation in One Fell Swoop , 2005, ACL.

[222]  Franck Thollard,et al.  Proceedings of COLING , 2004 .

[223]  William T. Freeman,et al.  Understanding belief propagation and its generalizations , 2003 .

[224]  Hermann Ney,et al.  Analysing soft syntax features and heuristics for hierarchical phrase based machine translation. , 2008, IWSLT.

[225]  Hermann Ney,et al.  Sentence segmentation using IBM word alignment model 1 , 2005, EAMT.

[226]  Gunnar Evermann,et al.  Posterior probability decoding, confidence estimation and system combination , 2000 .

[227]  Mei Yang,et al.  Improved Language Modeling for Statistical Machine Translation , 2005, ParallelText@ACL.

[228]  Alexander H. Waibel,et al.  Language Model Adaptation for Statistical Machine Translation Based on Information Retrieval , 2004, LREC.

[229]  Robert L. Mercer,et al.  Word-Sense Disambiguation Using Statistical Methods , 1991, ACL.

[230]  Kevin Knight,et al.  Training Tree Transducers , 2004, NAACL.

[231]  Alexandre Allauzen,et al.  Combining Morphosyntactic Enriched Representation with n-best Reranking in Statistical Translation , 2007, SSST@HLT-NAACL.

[232]  Hermann Ney,et al.  Discriminative Training and Maximum Entropy Models for Statistical Machine Translation , 2002, ACL.

[233]  Bonnie J. Dorr,et al.  Machine Translation Divergences: A Formal Description and Proposed Solution , 1994, CL.

[234]  Mauro Cettolo,et al.  Reordering rules for phrase-based statistical machine translation , 2006, IWSLT.

[235]  Hermann Ney,et al.  Augmenting a Small Parallel Text with Morpho-Syntactic Language , 2005, ParallelText@ACL.

[236]  Kristina Toutanova,et al.  Learning to Predict Case Markers in Japanese , 2006, ACL.

[237]  C. Fellbaum An Electronic Lexical Database , 1998 .

[238]  Hermann Ney,et al.  Improvements in dynamic programming beam search for phrase-based statistical machine translation. , 2008, IWSLT.

[239]  Daniel Jurafsky,et al.  Automatic Labeling of Semantic Roles , 2002, CL.

[240]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[241]  Averill M. Law,et al.  The art and theory of dynamic programming , 1977 .

[242]  Jinxi Xu,et al.  A New String-to-Dependency Machine Translation Algorithm with a Target Dependency Language Model , 2008, ACL.

[243]  Kemal Oflazer,et al.  A MT system from Turkmen to Turkish employing finite state and statistical methods , 2007, MTSUMMIT.

[244]  Giuseppe Riccardi,et al.  Computing consensus translation from multiple machine translation systems , 2001, IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01..

[245]  Roland Kuhn,et al.  Tighter Integration of Rule-Based and Statistical MT in Serial System Combination , 2008, COLING.

[246]  Dragos Stefan Munteanu,et al.  Improving Machine Translation Performance by Exploiting Non-Parallel Corpora , 2005, CL.

[247]  Alex Waibel,et al.  Adaptation of the translation model for statistical machine translation based on information retrieval , 2005, EAMT.

[248]  Brian Roark,et al.  Discriminative n-gram language modeling , 2007, Comput. Speech Lang..

[249]  Daniel Gildea,et al.  Loosely Tree-Based Alignment for Machine Translation , 2003, ACL.

[250]  Michael I. Jordan,et al.  A generalized mean field algorithm for variational inference in exponential families , 2002, UAI.

[251]  Kevin Knight,et al.  Decoding Complexity in Word-Replacement Translation Models , 1999, Comput. Linguistics.

[252]  Jeff A. Bilmes,et al.  Factored Language Models and Generalized Parallel Backoff , 2003, NAACL.

[253]  Philipp Koehn,et al.  Statistical Post Editing and Dictionary Extraction: Systran/Edinburgh Submissions for ACL-WMT2009 , 2009, WMT@EACL.

[254]  Marta R. Costa-jussà,et al.  Analysis of Statistical and Morphological Classes to Generate Weigthed Reordering Hypotheses on a Statistical Machine Translation System , 2007, WMT@ACL.

[255]  David Chiang,et al.  Better k-best Parsing , 2005, IWPT.

[256]  Roland Kuhn,et al.  Mixture-Model Adaptation for SMT , 2007, WMT@ACL.

[257]  David Haussler,et al.  Dirichlet mixtures: a method for improved detection of weak but significant protein sequence homology , 1996, Comput. Appl. Biosci..

[258]  Kevin Knight,et al.  A Syntax-based Statistical Translation Model , 2001, ACL.

[259]  John DeNero,et al.  The Complexity of Phrase Alignment Problems , 2008, ACL.

[260]  Daniel Jurafsky,et al.  Shallow Semantc Parsing of Chinese , 2004, HLT-NAACL.

[261]  Salim Roukos,et al.  Direct Translation Model 2 , 2007, HLT-NAACL.

[262]  Nizar Habash,et al.  Large Scale Lexeme Based Arabic Morphological Generation , 2004 .

[263]  Andy Way,et al.  A Syntactic Skeleton for Statistical Machine Translation , 2006, EAMT.

[264]  Ann Bies,et al.  Developing an Arabic Treebank: Methods, Guidelines, Procedures, and Tools , 2004 .

[265]  Alon Lavie,et al.  METEOR: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments , 2007, WMT@ACL.

[266]  David M. Magerman Statistical Decision-Tree Models for Parsing , 1995, ACL.

[267]  Jason Eisner,et al.  Learning Non-Isomorphic Tree Mappings for Machine Translation , 2003, ACL.

[268]  Mark Dras,et al.  Syntax-based word reordering in phrase-based statistical machine translation: why does it work? , 2007, MTSUMMIT.

[269]  Zhao Tie Increasing Accuracy of Chinese Segmentation with Strategy of Multi step Processing , 2001 .

[270]  Christoph Tillmann,et al.  Efficient Dynamic Programming Search Algorithms for Phrase-Based SMT , 2006 .

[271]  Greg Schohn,et al.  Less is More: Active Learning with Support Vector Machines , 2000, ICML.

[272]  Fredric C. Gey,et al.  Proceedings of LREC , 2010 .

[273]  Alex Waibel,et al.  Low Cost Portability for Statistical Machine Translation based on N-gram Coverage , 2005 .

[274]  Michael Collins,et al.  Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms , 2002, EMNLP.

[275]  Shankar Kumar,et al.  Minimum Bayes-Risk Decoding for Statistical Machine Translation , 2004, NAACL.

[276]  Shankar Kumar,et al.  Lattice Minimum Bayes-Risk Decoding for Statistical Machine Translation , 2008, EMNLP.

[277]  Thomas L. Griffiths,et al.  Contextual Dependencies in Unsupervised Word Segmentation , 2006, ACL.

[278]  Daniel Marcu,et al.  What’s in a translation rule? , 2004, NAACL.

[279]  Daniel Marcu,et al.  Statistical Phrase-Based Translation , 2003, NAACL.

[280]  Jiajun Zhang,et al.  Sentence Type Based Reordering Model for Statistical Machine Translation , 2008, COLING.

[281]  Andy Way,et al.  Wrapper Syntax for Example-Based Machine Translation , 2006 .

[282]  Ming Zhou,et al.  A Probabilistic Approach to Syntax-based Reordering for Statistical Machine Translation , 2007, ACL.

[283]  Mathias Creutz,et al.  Morphology-aware statistical machine translation based on morphs induced in an unsupervised manner , 2007, MTSUMMIT.

[284]  Hermann Ney,et al.  Computing Consensus Translation for Multiple Machine Translation Systems Using Enhanced Hypothesis Alignment , 2006, EACL.

[285]  David Yarowsky,et al.  A method for disambiguating word senses in a large corpus , 1992, Comput. Humanit..

[286]  Donald Geman,et al.  Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images , 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[287]  Hermann Ney,et al.  Toward hierarchical models for statistical machine translation of inflected languages , 2001, DDMMT@ACL.

[288]  Philipp Koehn,et al.  Statistical Significance Tests for Machine Translation Evaluation , 2004, EMNLP.

[289]  Phil Blunsom,et al.  Discriminative Word Alignment with Conditional Random Fields , 2006, ACL.

[290]  David Yarowsky,et al.  Minimally Supervised Morphological Segmentation with Applications to Machine Translation , 2006, AMTA.

[291]  Stephan Vogel,et al.  Parallel Implementations of Word Alignment Tool , 2008, SETQALNLP.

[292]  Douglas W. Oard,et al.  Dictionary-based techniques for cross-language information retrieval , 2005, Inf. Process. Manag..

[293]  Nizar Habash,et al.  Arabic Preprocessing Schemes for Statistical Machine Translation , 2006 .

[294]  Hermann Ney,et al.  AER: do we need to “improve” our alignments? , 2006, IWSLT.

[295]  Andreas Stolcke,et al.  Finding consensus in speech recognition: word error minimization and other applications of confusion networks , 2000, Comput. Speech Lang..

[296]  Marta R. Costa-jussà,et al.  Statistical Machine Reordering , 2006, EMNLP.

[297]  Hermann Ney,et al.  Improved Statistical Alignment Models , 2000, ACL.

[298]  Ying Zhang,et al.  An efficient phrase-to-phrase alignment model for arbitrarily long phrase and large corpora , 2005, EAMT.

[299]  Philipp Koehn,et al.  Statistical Machine Translation , 2010, EAMT.

[300]  Kemal Oflazer,et al.  Initial Explorations in English to Turkish Statistical Machine Translation , 2006, WMT@HLT-NAACL.

[301]  Phil Blunsom,et al.  A Discriminative Latent Variable Model for Statistical Machine Translation , 2008, ACL.

[302]  Philipp Koehn,et al.  Statistical Post-Editing on SYSTRAN’s Rule-Based Translation System , 2007, WMT@ACL.

[303]  Qun Liu,et al.  Improving Statistical Machine Translation Performance by Training Data Selection and Optimization , 2007, EMNLP-CoNLL.

[304]  Nianwen Xue,et al.  Automatic Semantic Role Labeling for Chinese Verbs , 2005, IJCAI.

[305]  Dekai Wu,et al.  Machine Translation with a Stochastic Grammatical Channel , 1998, COLING-ACL.

[306]  Andreas Stolcke,et al.  Entropy-based Pruning of Backoff Language Models , 2000, ArXiv.

[307]  Philipp Koehn,et al.  Feature-Rich Statistical Translation of Noun Phrases , 2003, ACL.

[308]  Joshua Goodman,et al.  A bit of progress in language modeling , 2001, Comput. Speech Lang..

[309]  John Cocke,et al.  A Statistical Approach to Machine Translation , 1990, CL.

[310]  Kemal Oflazer,et al.  Machine Translation between Turkic Languages , 2007, ACL.

[311]  Khalil Sima'an,et al.  Data-Oriented Parsing , 2003 .

[312]  David Yarowsky,et al.  Statistical Machine Translation Using Coercive Two-Level Syntactic Transduction , 2003, EMNLP.

[313]  Colin Cherry,et al.  A Probability Model to Improve Word Alignment , 2003, ACL.

[314]  Hermann Ney,et al.  The Alignment Template Approach to Statistical Machine Translation , 2004, CL.

[315]  T. Tanaka.  Translation selection for Japanese-English noun-noun compounds , 2003 .

[316]  William J. Byrne,et al.  European Language Translation with Weighted Finite State Transducers: The CUED MT System for the 2008 ACL Workshop on SMT , 2008, WMT@ACL.

[317]  Ronald Rosenfeld,et al.  Adaptive Statistical Language Modeling; A Maximum Entropy Approach , 1994 .

[318]  Ossama Emam,et al.  Language Model Based Arabic Word Segmentation , 2003, ACL.

[319]  Philipp Koehn,et al.  Pharaoh: A Beam Search Decoder for Phrase-Based Statistical Machine Translation Models , 2004, AMTA.

[320]  Hermann Ney,et al.  Bayesian Semi-Supervised Chinese Word Segmentation for Statistical Machine Translation , 2008, COLING.

[321]  Eugene W. Myers,et al.  Suffix arrays: a new method for on-line string searches , 1993, SODA '90.

[322]  William W. Cohen,et al.  NER Systems that Suit User’s Preferences: Adjusting the Recall-Precision Trade-off for Entity Extraction , 2006, NAACL.

[323]  Nitin Madnani,et al.  Fluency, Adequacy, or HTER? Exploring Different Human Judgments with a Tunable MT Metric , 2009, WMT@EACL.

[324]  James R. Glass,et al.  Segmentation for English-to-Arabic Statistical Machine Translation , 2008, ACL.

[325]  Gina-Anne Levow,et al.  The Third International Chinese Language Processing Bakeoff: Word Segmentation and Named Entity Recognition , 2006, SIGHAN@COLING/ACL.

[326]  Christoph Tillmann,et al.  A Projection Extension Algorithm for Statistical Machine Translation , 2003, EMNLP.

[327]  Adam Lopez,et al.  Hierarchical Phrase-Based Translation with Suffix Arrays , 2007, EMNLP.

[328]  Sanjeev Khudanpur,et al.  Variational Decoding for Statistical Machine Translation , 2009, ACL.

[329]  Philipp Koehn,et al.  Re-evaluating the Role of Bleu in Machine Translation Research , 2006, EACL.

[330]  Jean-Cédric Chappelier,et al.  A Generalized CYK Algorithm for Parsing Stochastic CFG , 1998, TAPD.

[331]  École d'été de probabilités de Saint-Flour,et al.  École d'été de probabilités de Saint-Flour XIII - 1983 , 1985 .

[332]  Young-Suk Lee,et al.  Morphological Analysis for Statistical Machine Translation , 2004, NAACL.

[333]  Lluís Màrquez i Villodre,et al.  Context-aware Discriminative Phrase Selection for Statistical Machine Translation , 2007, WMT@ACL.

[334]  William A. Gale,et al.  Good-Turing Frequency Estimation Without Tears , 1995, J. Quant. Linguistics.

[335]  Richard M. Schwartz,et al.  Incremental Hypothesis Alignment with Flexible Matching for Building Confusion Networks: BBN System Description for WMT09 System Combination Task , 2009, WMT@EACL.

[336]  Fei Xia,et al.  Improving a Statistical MT System with Automatically Learned Rewrite Patterns , 2004, COLING.

[337]  Alexander M. Fraser,et al.  Squibs and Discussions: Measuring Word Alignment Quality for Statistical Machine Translation , 2007, CL.

[338]  Hermann Ney,et al.  Improvements in Phrase-Based Statistical Machine Translation , 2004, NAACL.

[339]  Zhifei Li,et al.  First- and Second-Order Expectation Semirings with Applications to Minimum-Risk Training on Translation Forests , 2009, EMNLP.

[340]  Eric P. Xing,et al.  BiTAM: Bilingual Topic AdMixture Models for Word Alignment , 2006, ACL.

[341]  Frederick Jelinek,et al.  Interpolated estimation of Markov source parameters from sparse data , 1980 .

[342]  Adam Kilgarriff,et al.  Framework and Results for English SENSEVAL , 2000, Comput. Humanit..

[343]  Stuart M. Shieber,et al.  Principles and Implementation of Deductive Parsing , 1994, J. Log. Program..

[344]  Salim Roukos,et al.  Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.