N-Grams and Morphological Normalization in Text Classification: A Comparison on a Croatian-English Parallel Corpus
Annie Morin | Bojana Dalbelo Basic | Jean-Hugues Chauchat | Artur Silic
[1] Jan Snajder, et al. Language morphology offset: Text classification on a Croatian-English parallel corpus, 2008, Inf. Process. Manag.
[2] David D. Lewis, et al. An evaluation of phrasal and clustered representations on a text categorization task, 1992, SIGIR '92.
[3] Ted E. Dunning, et al. Statistical Identification of Language, 1994.
[4] Thorsten Joachims, et al. Learning to classify text using support vector machines - methods, theory and algorithms, 2002, The Kluwer international series in engineering and computer science.
[5] Yiming Yang, et al. A Comparative Study on Feature Selection in Text Categorization, 1997, ICML.
[6] Shi Bing, et al. Inductive learning algorithms and representations for text categorization, 2006.
[7] Thorsten Joachims, et al. Text Categorization with Support Vector Machines: Learning with Many Relevant Features, 1998, ECML.
[8] Jacques Savoy, et al. Light stemming approaches for the French, Portuguese, German and Hungarian languages, 2006, SAC.
[9] Dunja Mladenic, et al. Using String Kernels for Classification of Slovenian Web Documents, 2005, GfKl.
[10] W. B. Cavnar, et al. N-gram-based text categorization, 1994.
[11] Marko Tadic. Building the Croatian-English Parallel Corpus, 2000, LREC.
[12] C. E. Shannon, et al. A mathematical theory of communication, 1948, MOCO.
[13] Fabrizio Sebastiani, et al. Machine learning in automated text categorization, 2001, CSUR.
[14] F. Saric, et al. TMT: Object-Oriented Text Classification Library, 2007, 2007 29th International Conference on Information Technology Interfaces.
[15] R. Jalam. Apprentissage automatique et catégorisation de textes multilingues, 2003.
[16] R. Jalam, et al. Kernel-based text categorisation, 2001, IJCNN'01. International Joint Conference on Neural Networks. Proceedings (Cat. No.01CH37222).
[17] Dan Shen, et al. Performance and Scalability of a Large-Scale N-gram Based Information Retrieval System, 2000, J. Digit. Inf.
[18] Sang Joon Kim, et al. A Mathematical Theory of Communication, 2006.
[19] Martin F. Porter, et al. An algorithm for suffix stripping, 1997, Program.
[20] Jean-Hugues Chauchat, et al. Pourquoi les n-grammes permettent de classer des textes? Recherche de mots-clefs pertinents à l'aide des n-grammes caractéristiques, 2002.
[21] Marko Grobelnik, et al. Feature selection using linear classifier weights: interaction with classification models, 2004, SIGIR '04.
[22] Michael F. Lynch, et al. Stemming and N-gram matching for term conflation in Turkish texts, 1996, Information Research.
[23] Wessel Kraaij, et al. Variations on language modeling for information retrieval, 2005, SIGF.