A Very Very Large Corpus Doesn’t Always Yield Reliable Estimates
[1] Mark Stevenson, et al. The Reuters Corpus Volume 1 - from Yesterday’s News to Tomorrow’s Language Resources, 2002, LREC.
[2] Stanley F. Chen, et al. An Empirical Study of Smoothing Techniques for Language Modeling, 1996, ACL.
[3] Penn Treebank. Linguistic Data Consortium, 1999.
[4] Martin Volk, et al. Exploiting the WWW as a corpus to resolve PP attachment ambiguities, 2001.
[5] John D. Lafferty, et al. A Model of Lexical Attraction and Repulsion, 1997, ACL.
[6] V. V. Petrov. Limit Theorems of Probability Theory: Sequences of Independent Random Variables, 1995.
[7] Frank Keller, et al. Using the Web to Overcome Data Sparseness, 2002, EMNLP.
[8] James R. Curran, et al. Scaling Context Space, 2002, ACL.
[9] Michele Banko, et al. Scaling to Very Very Large Corpora for Natural Language Disambiguation, 2001, ACL.