A Factory of Comparable Corpora from Wikipedia
[1] Birger Hjørland,et al. Organizing Knowledge. An Introduction to Managing Access to Information, 2009, J. Documentation.
[2] Belinda Maia. What are comparable corpora, 2003.
[3] Darren Gergle,et al. The tower of Babel meets web 2.0: user-generated content and its applications in a multilingual context , 2010, CHI.
[4] Andreas Stolcke,et al. SRILM - an extensible language modeling toolkit , 2002, INTERSPEECH.
[5] Steffen Staab,et al. Explicit Versus Latent Concept Models for Cross-Language Information Retrieval , 2009, IJCAI.
[6] Jörg Tiedemann,et al. Parallel Data, Tools and Interfaces in OPUS , 2012, LREC.
[7] Pablo Gamallo Otero,et al. Wikipedia as Multilingual Source of Comparable Corpora, 2011.
[8] J. Brown,et al. Organizing Knowledge, 1998.
[9] Inguna Skadina,et al. ACCURAT Toolkit for Multi-Level Alignment and Information Extraction from Comparable Corpora , 2012, ACL.
[10] Bruno Pouliquen,et al. Automatic Identification of Document Translations in Large Multilingual Document Collections , 2006, ArXiv.
[11] Inguna Skadina,et al. A Collection of Comparable Corpora for Under-resourced Languages , 2010, Baltic HLT.
[12] James Mayfield,et al. Character N-Gram Tokenization for European Language Text Retrieval , 2004, Information Retrieval.
[13] Martin F. Porter,et al. An algorithm for suffix stripping , 1997, Program.
[14] Maarten de Rijke,et al. Finding Similar Sentences across Multiple Languages in Wikipedia, 2006.
[15] Philipp Koehn,et al. Moses: Open Source Toolkit for Statistical Machine Translation , 2007, ACL.
[16] Miles Osborne,et al. Statistical Machine Translation , 2010, Encyclopedia of Machine Learning and Data Mining.
[17] Jakob Uszkoreit,et al. Large Scale Parallel Document Mining for Machine Translation , 2010, COLING.
[18] Sabine Hunsicker,et al. Hybrid Parallel Sentence Mining from Comparable Corpora , 2012, EAMT.
[19] Philipp Koehn,et al. Europarl: A Parallel Corpus for Statistical Machine Translation , 2005, MTSUMMIT.
[20] Junichi Tsujii,et al. Bilingual Dictionary Extraction from Wikipedia , 2009, MTSUMMIT.
[21] Philippe Langlais,et al. Identifying Parallel Documents from a Large Bilingual Collection of Texts: Application to Parallel Article Extraction in Wikipedia, 2011, BUCC@ACL.
[22] Pascale Fung,et al. A Statistical View on Bilingual Lexicon Extraction: From Parallel Corpora to Non-parallel Corpora , 1998, AMTA.
[23] Philipp Cimiano,et al. Exploiting Wikipedia for cross-lingual and multilingual information retrieval, 2012, Data Knowl. Eng.
[24] Jacob Cohen. A Coefficient of Agreement for Nominal Scales, 1960.
[25] Franz Josef Och,et al. Minimum Error Rate Training in Statistical Machine Translation , 2003, ACL.
[26] Pascale Fung,et al. Rare Word Translation Extraction from Aligned Comparable Documents , 2011, ACL.
[27] Michel Simard,et al. Using cognates to align sentences in bilingual corpora , 1993, TMI.
[28] András A. Benczúr,et al. Cross-Language Retrieval with Wikipedia , 2008, CLEF.
[29] Kristina Toutanova,et al. Extracting Parallel Sentences from Comparable Corpora using Document Level Alignment , 2010, NAACL.
[30] Iryna Gurevych,et al. Extracting Lexical Semantic Knowledge from Wikipedia and Wiktionary , 2008, LREC.
[31] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[32] Takahiro Hara,et al. An Approach for Extracting Bilingual Terminology from Wikipedia , 2008, DASFAA.
[33] Benno Stein,et al. A Wikipedia-Based Multilingual Retrieval Model , 2008, ECIR.
[34] Pascale Fung,et al. Mining Very-Non-Parallel Corpora: Parallel Sentence and Lexicon Extraction via Bootstrapping and E , 2004, EMNLP.
[35] Hermann Ney,et al. A Systematic Comparison of Various Statistical Alignment Models , 2003, CL.
[36] Reinhard Rapp,et al. Identifying Word Translations in Non-Parallel Texts , 1995, ACL.
[37] Eiichiro Sumita,et al. Method for Building Sentence-Aligned Corpus from Wikipedia, 2008.
[38] Pascale Fung,et al. Inversion Transduction Grammar Constraints for Mining Parallel Sentences from Quasi-Comparable Corpora , 2005, IJCNLP.
[39] Pierre Zweigenbaum,et al. Overviewing Important Aspects of the Last Twenty Years of Research in Comparable Corpora , 2013, Building and Using Comparable Corpora.
[40] Chenhui Chu,et al. Iterative Bilingual Lexicon Extraction from Comparable Corpora with Topical and Contextual Knowledge , 2014, CICLing.
[41] Qin Lu,et al. Corpus Exploitation from Wikipedia for Ontology Construction , 2008, LREC.
[42] Martin Volk,et al. Towards a Wikipedia-extracted alpine corpus, 2012.
[43] Pascale Fung,et al. Compiling Bilingual Lexicon Entries From a Non-Parallel English-Chinese Corpus , 1995, VLC@ACL.