EMILLE, A 67-Million Word Corpus of Indic Languages: Data Collection, Mark-up and Harmonisation
Tony McEnery | Andrew Hardie | Hamish Cunningham | Robert J. Gaizauskas | Paul Baker
[1] Tony McEnery, et al. A new agenda for corpus linguistics - working with all of the world's languages, 2000.
[2] Bidyut B. Chaudhuri, et al. Computer recognition of printed Bangla script, 1995.
[3] Tony McEnery, et al. Validation techniques for language corpora: a report from the front, 1998, LREC.
[4] Anthony McEnery, et al. Building a corpus of spoken Sylheti, 1999.
[5] Tony McEnery, et al. Building a parallel corpus of English/Panjabi, 2000.
[6] Geoffrey Leech, et al. Spoken English on Computer: Transcription, Mark-Up and Application, 1995.
[7] Tony McEnery, et al. Corpus Resources and Minority Language Engineering, 2000, LREC.
[8] Kalina Bontcheva, et al. Experience using GATE for NLP R&D, 2000, COLING 2000.
[9] Geoffrey Leech, et al. Standards for Tagsets, 1999.
[10] Signe Oksefjell, et al. A description of the English-Norwegian parallel corpus: Compilation and further developments, 1999.