Crowdsourcing Translation by Leveraging Tournament Selection and Lattice-Based String Alignment

Crowdsourced translation tasks typically suffer from poor-quality and spam translations. We propose a novel method for generating large multilingual text corpora that leverages Tournament Selection and Lattice-Based String Alignment and requires neither expert involvement nor gold-standard data. We first use crowdsourcing to gather a set of candidate translations of a given source sentence. A crowdsourced Tournament Selection then prunes this set to an n-best list. Finally, a lattice-based string alignment combines parts of these n-best translations into a consensus translation. This is work in progress and the evaluation is still ongoing, but we are confident in the potential of this method to generate good-quality translations.
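
The pruning step can be illustrated with a minimal sketch. It is not the implementation described in this work: it assumes a simplified round-robin tournament in which every pair of candidates is compared once, and the names n_best_by_tournament, judge, and prefer_longer are hypothetical. The judge callable stands in for a crowd worker's pairwise vote, and the toy judge used here simply prefers the longer string.

```python
from itertools import combinations
from typing import Callable, Dict, List


def n_best_by_tournament(
    candidates: List[str],
    judge: Callable[[str, str], str],
    n_best: int,
) -> List[str]:
    """Return the n_best candidates ranked by pairwise wins (round-robin)."""
    # Drop exact duplicates while preserving submission order.
    unique = list(dict.fromkeys(candidates))
    wins: Dict[str, int] = {c: 0 for c in unique}
    # Every pair is compared once; `judge` stands in for a crowd vote
    # and must return one of its two arguments.
    for left, right in combinations(unique, 2):
        winner = judge(left, right)
        if winner not in (left, right):
            raise ValueError("judge must return one of the two candidates")
        wins[winner] += 1
    # sorted() is stable, so tied candidates keep submission order.
    ranked = sorted(unique, key=lambda c: wins[c], reverse=True)
    return ranked[:n_best]


def prefer_longer(a: str, b: str) -> str:
    """Toy judge (assumption): a real system would ask crowd workers."""
    return a if len(a) >= len(b) else b


if __name__ == "__main__":
    pool = ["a", "bb", "ccc", "dddd", "bb"]
    print(n_best_by_tournament(pool, prefer_longer, n_best=2))
```

A round-robin needs a number of comparisons that grows quadratically with the candidate set, so a sampled or bracket-style tournament would likely be preferable for large sets. The sketch does not cover the lattice-based alignment step.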