Modélisation du prétraitement des textes (Modeling Text Preprocessing)
[1] Pasi Tapanainen, et al. What is a word, What is a sentence? Problems of Tokenization, 1994.
[2] Martti Juhola, et al. Stemming and lemmatization in the clustering of Finnish text documents, 2004, CIKM '04.
[3] Gilles Adda, et al. Towards tokenization evaluation, 1998, LREC.
[4] Andrei Mikheev, et al. Document centered approach to text normalization, 2000, SIGIR '00.
[5] José Gabriel Pereira Lopes, et al. Extraction automatique d'associations textuelles à partir de corpora non traités, 2000.
[6] Lori Lamel, et al. Text normalization and speech recognition in French, 1997, EUROSPEECH.
[7] David A. Hull. Stemming Algorithms: A Case Study for Detailed Evaluation, 1996, J. Am. Soc. Inf. Sci.
[8] Thomas Heitz, et al. From the Texts to the Contexts They Contain: A Chain of Linguistic Treatments, 2004, TREC.
[9] Tong Zhang, et al. Updating an NLP system to fit new domains: an empirical study on the sentence segmentation problem, 2003, CoNLL.
[10] Stephen Tomlinson, et al. Lexical and Algorithmic Stemming Compared for 9 European Languages with Hummingbird SearchServer™ at CLEF 2003, 2003, CLEF.
[11] Éric Villemonte de la Clergerie, et al. MAF: a Morphosyntactic Annotation Framework, 2005.