A Large Portuguese Corpus On-Line: Cleaning and Preprocessing
[1] Michel Généreux, et al. Lexical analysis of pre and post revolution discourse in Portugal, 2010, LREC 2010.
[2] Amália Mendes, et al. On the use of comparable corpora of African varieties of Portuguese for linguistic description and teaching/learning applications, 2008, LREC 2008.
[3] António Horta Branco, et al. Contractions: Breaking the Tokenization-Tagging Circularity, 2003, PROPOR.
[4] Thorsten Joachims, et al. Learning to classify text using support vector machines - methods, theory and algorithms, 2002, The Kluwer international series in engineering and computer science.
[5] W. B. Cavnar, et al. N-gram-based text categorization, 1994.
[6] Amália Mendes, et al. Open Resources and Tools for the Shallow Processing of Portuguese: The TagShare Project, 2006, LREC.
[7] Tony Berber Sardinha. History and compilation of a large register-diversified corpus of Portuguese at CEPRIL, 2007.
[8] Sandra M. Aluísio, et al. The Lácio-Web: Corpora and Tools to Advance Brazilian Portuguese Language Investigations and Computational Linguistic Tools, 2004, LREC.
[9] Walter Daelemans, et al. Memory-Based Morphological Analysis, 1999, ACL.
[10] Thiago Alexandre Salgueiro Pardo, et al. Computational Processing of the Portuguese Language - 11th International Conference, PROPOR 2014, São Carlos/SP, Brazil, October 6-8, 2014. Proceedings, 2014, Lecture Notes in Computer Science.
[11] Luísa Pereira, et al. Portuguese Corpora at CLUL, 2000, LREC.
[12] António Branco, et al. A Suite of Shallow Processing Tools for Portuguese: LX-Suite, 2006, EACL.
[13] Sandra M. Aluísio, et al. An Account of the Challenge of Tagging a Reference Corpus for Brazilian Portuguese, 2003, PROPOR.
[14] Walter Daelemans, et al. Memory-Based Language Processing, 2009, Studies in natural language processing.
[15] Diana Santos. Linguateca's infrastructure for Portuguese and how it allows the detailed study of language varieties, 2011.
[16] Walter Daelemans, et al. MBT: A Memory-Based Part of Speech Tagger-Generator, 1996, VLC@COLING.
[17] Stefan Evert. A Lightweight and Efficient Tool for Cleaning Web Pages, 2008, LREC.
[18] Diana Santos, et al. Evaluating CETEMPúblico, a Free Resource for Portuguese, 2001, ACL.
[19] António Branco, et al. Evaluating Solutions for the Rapid Development of State-of-the-Art POS Taggers for Portuguese, 2004, LREC.