Developing computational infrastructure for the CorCenCC corpus: The National Corpus of Contemporary Welsh
[1] Mark Davies,et al. The Corpus of Contemporary American English as the first reliable monitor corpus of English , 2010, Lit. Linguistic Comput..
[2] Douglas Biber,et al. Representativeness in corpus design , 1993 .
[3] Briony Williams,et al. A Welsh speech database: preliminary results , 1999, EUROSPEECH.
[4] Elhuyar Fundazioa,et al. ZT Corpus Annotation and tools for Basque corpora , .
[5] Delyth Prys,et al. Gathering Data for Speech Technology in the Welsh Language: A Case Study , 2018, LREC 2018.
[6] Erik Duval,et al. Metadata Principles and Practicalities , 2002, D Lib Mag..
[7] J. Herring,et al. Building bilingual corpora , 2014 .
[8] Adam Kilgarriff,et al. The Sketch Engine: ten years on , 2014 .
[9] Marc Kupietz,et al. The German Reference Corpus DeReKo: New Developments - New Opportunities , 2018, LREC.
[10] Atro Voutilainen,et al. A language-independent system for parsing unrestricted text , 1995 .
[11] Andrew Hardie,et al. CQPweb — combining power, flexibility and usability in a corpus analysis tool , 2012 .
[12] Dawn Knight,et al. Towards a Welsh Semantic Annotation System , 2018, LREC.
[13] Catherine Smith,et al. Crowdsourcing formulaic phrases: towards a new type of spoken corpus , 2020, Corpora.
[14] Kevin P. Scannell. The Crúbadán Project: Corpus building for under-resourced languages , 2007 .
[15] Robbie Love. Overcoming Challenges in Corpus Construction , 2020 .
[16] Daren C. Brabham. Crowdsourcing as a Model for Problem Solving , 2008 .
[17] Anthony McEnery,et al. The UCREL Semantic Analysis System , 2004 .
[18] Charles F. Meyer. English Corpus Linguistics: Frontmatter , 2002 .
[19] Riitta Jääskeläinen. Think-aloud protocol , 2010 .
[20] Karel Kucera. The Czech National Corpus: Principles, Design, and Results , 2002, Lit. Linguistic Comput..
[21] B. MacWhinney. The CHILDES project: tools for analyzing talk , 1992 .
[22] Dawn Knight,et al. Formality in Digital Discourse: A Study of Hedging in CANELC , 2013 .
[23] Dawn Knight,et al. The CorCenCC crowdsourcing app: a bespoke tool for the user-driven creation of the national corpus of contemporary Welsh , 2017 .
[24] Gemma Boleda,et al. CUCWeb: A Catalan corpus built from the Web , 2006 .
[25] Martin Wynne,et al. Developing Linguistic Corpora: a Guide to Good Practice , 2005 .
[26] Dawn Knight,et al. Leveraging Lexical Resources and Constraint Grammar for Rule-Based Part-of-Speech Tagging in Welsh , 2018, LREC.
[27] Fernando González-Ladrón-de-Guevara,et al. Towards an integrated crowdsourcing definition , 2012, J. Inf. Sci..
[28] G. Leech. The state of the art in corpus linguistics , 2014 .
[29] Fred Karlsson,et al. Constraint Grammar as a Framework for Parsing Running Text , 1990, COLING.
[30] Adam Kilgarriff,et al. The TenTen Corpus Family , 2013 .
[31] Thomas Schmidt,et al. The Database for Spoken German ― DGD2 , 2014, LREC.
[32] Nicholas Ostler,et al. Corpus Design Criteria , 1992 .
[33] Deborah E. White,et al. Thematic Analysis , 2017 .
[34] Rita C Simpson-Vlach,et al. The MICASE Handbook: A Resource for Users of the Michigan Corpus of Academic Spoken English , 2006 .
[35] Tony McEnery,et al. Introduction: compiling and analysing the Spoken British National Corpus 2014 , 2017 .
[36] Michael McCarthy,et al. Exploring Spoken English , 1997 .
[37] Guy Aston,et al. The BNC Handbook: Exploring the British National Corpus with SARA , 1998 .
[38] Siqi Liu. Overcoming Challenges in Corpus Construction: The Spoken British National Corpus 2014, by Robbie Love. New York: Routledge, 2020. ISBN 978-1-138-36737-1, xviii + 202 pages , 2021 .
[39] R. Carter,et al. Talking, Creating: Interactional Language, Creativity, and Context , 2004 .
[40] Robert Fuchs,et al. Expanding horizons in the study of World Englishes with the 1.9 billion word Global Web-based English Corpus (GloWbE) , 2015 .