A Manual for Web Corpus Crawling of Low Resource Languages
[1] Holly Hearon, et al. Orality and Literacy, 2016.
[2] M. Cysouw. Disentangling geography from genealogy, 2013.
[3] Thomas Eckart, et al. Building Large Monolingual Dictionaries at the Leipzig Corpora Collection: From 100 to 200 Languages, 2012, LREC.
[4] Duygu Özge Demir. Dying Words: Endangered Languages and What They Have to Tell Us, 2012.
[5] András Kornai. Digital language death, 2013.
[6] Andrei Z. Broder, et al. Graph structure in the Web, 2000, Comput. Networks.
[7] Silvia Bernardini, et al. The WaCky wide web: a collection of very large linguistically processed web-crawled corpora, 2009, Lang. Resour. Evaluation.
[8] Éva Csató Johanson, et al. The Turkic Languages, 1998.
[9] Jörg Tiedemann, et al. Parallel Data, Tools and Interfaces in OPUS, 2012, LREC.
[10] Lindsay J. Whaley, et al. Dying words: endangered languages and what they have to tell us, 2011.
[11] Sharon L. Milgram, et al. The Small World Problem, 1967.
[12] Gregory Grefenstette, et al. Web as Corpus, 2003.
[13] Kevin P. Scannell. The Crúbadán Project: Corpus building for under-resourced languages, 2007.
[14] B. Comrie, et al. Appendixes to Some observations on typological features of hunter-gatherer languages, 2013.
[15] John F. Dooley, et al. History of Cryptography and Cryptanalysis, 2018, History of Computing.
[16] Robert W. Gehl, et al. Weaving the Dark Web: Legitimacy on Freenet, Tor, and I2p, 2018.
[17] Antal van den Bosch, et al. Estimating search engine index size variability: a 9-year longitudinal study, 2016, Scientometrics.
[18] Silvia Bernardini, et al. BootCaT: Bootstrapping Corpora and Terms from the Web, 2004, LREC.