Many Czech References for 50 Sentences Selected from WMT11 Data
This dataset contains the complete set of Czech reference translations produced for 50 English source sentences taken from the WMT11 test set (http://www.statmt.org/wmt11).
In total, there are 15,431,447 Czech sentences, i.e. roughly 309k reference translations per English source sentence on average, although the exact number varies greatly across sentences.
You can find more details in the included README file.
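The per-sentence average follows directly from the two figures stated in this description. A minimal sketch of that arithmetic is given here; the constant and function names are illustrative only and do not come from the dataset or its README, and nothing is assumed about the file layout.

```python
# Figures taken from the dataset description above.
TOTAL_TRANSLATIONS = 15_431_447  # total number of Czech sentences
SOURCE_SENTENCES = 50            # number of English source sentences


def mean_references(total: int, sources: int) -> float:
    """Return the mean number of reference translations per source sentence."""
    return total / sources


average = mean_references(TOTAL_TRANSLATIONS, SOURCE_SENTENCES)
print(f"{average:,.2f}")  # prints 308,628.94
```

This is only the mean; the actual count for any individual source sentence may lie far from it.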
If you use this dataset, please cite the following paper, which describes the technique used to construct the Czech translations:
Bojar Ondřej, Macháček Matouš, Tamchyna Aleš, Zeman Daniel:
Scratching the Surface of Possible Translations.
Lecture Notes in Computer Science, Vol. 8082, Text, Speech and Dialogue: 16th
International Conference, TSD 2013. Proceedings, Copyright © Springer Verlag,
Berlin / Heidelberg, ISBN 978-3-642-40584-6, ISSN 0302-9743, pp. 465-474, 2013, DOI: 10.1007/978-3-642-40585-3_59