Defeating the Homogeneity Assumption
[1] Adam Kilgarriff, et al. Corpora from the Web, 2005.
[2] Gabriela Cavaglia. Measuring corpus homogeneity using a range of measures for inter-document distance, 2002, LREC.
[3] Hinrich Schütze, et al. Book Reviews: Foundations of Statistical Natural Language Processing, 1999, CL.
[4] Gabriela Cavaglia. ITRI-02-07 Measuring the homogeneity of different varieties of language, 2002.
[5] Paul Rayson, et al. Comparing Corpora using Frequency Profiling, 2000, Proceedings of the workshop on Comparing corpora.
[6] Kenneth Ward Church. Empirical Estimates of Adaptation: The chance of Two Noriegas is closer to p/2 than p², 2000, COLING.
[7] Alexander Franz. Independence Assumptions Considered Harmful, 1997, ACL.
[8] Tony G. Rose, et al. The Effects of Corpus Size and Homogeneity on Language Model Quality, 1997, VLC.
[9] Adam Kilgarriff, et al. Using Word Frequency Lists to Measure Corpus Homogeneity and Similarity between Corpora, 1997, VLC.
[10] Slava M. Katz. Distribution of content words and phrases in text and language modelling, 1996, Natural Language Engineering.
[11] Adam Kilgarriff, et al. Which words are particularly characteristic of a text? a survey of statistical approaches, 1996.
[12] Ted Dunning, et al. Accurate Methods for the Statistics of Surprise and Coincidence, 1993, CL.