Hypergeometric language models for republished article finding
[1] Alberto Barrón-Cedeño,et al. Reducing the Plagiarism Detection Search Space on the Basis of the Kullback-Leibler Distance , 2009, CICLing.
[2] W. Bruce Croft,et al. Finding text reuse on the web , 2009, WSDM '09.
[3] David Kauchak,et al. Modeling word burstiness using the Dirichlet distribution , 2005, ICML.
[4] Andrei Z. Broder,et al. On the resemblance and containment of documents , 1997, Proceedings. Compression and Complexity of SEQUENCES 1997 (Cat. No.97TB100171).
[5] Alexander Löser,et al. Near-duplicate detection for web-forums , 2009, IDEAS '09.
[6] John D. Lafferty,et al. A study of smoothing methods for language models applied to Ad Hoc information retrieval , 2001, SIGIR '01.
[7] Luis Gravano,et al. dSCAM: finding document copies across multiple databases , 1996, Fourth International Conference on Parallel and Distributed Information Systems.
[8] Leo Egghe,et al. Duality in information retrieval and the hypergeometric distribution , 1997, J. Documentation.
[9] Rada Mihalcea,et al. Wikify!: linking documents to encyclopedic knowledge , 2007, CIKM '07.
[10] Jong Wook Kim,et al. Efficient overlap and content reuse detection in blogs and online news articles , 2009, WWW '09.
[11] Gianni Amati. Information Theoretic Approach to Information Extraction , 2006, FQAS.
[12] Kenneth T. Wallenius,et al. Biased Sampling; the Noncentral Hypergeometric Probability Distribution , 1963 .
[13] W. John Wilbur,et al. Retrieval Testing with Hypergeometric Document Models , 1993, J. Am. Soc. Inf. Sci..
[14] Xuanjing Huang,et al. Efficient partial-duplicate detection based on sequence matching , 2010, SIGIR.
[15] W. Bruce Croft,et al. Local text reuse detection , 2008, SIGIR '08.
[16] Agner Fog,et al. Calculation Methods for Wallenius' Noncentral Hypergeometric Distribution , 2008, Commun. Stat. Simul. Comput..
[17] Jong Wook Kim,et al. Organization and Tagging of Blog and News Entries Based on Content Reuse , 2010, J. Signal Process. Syst..
[18] Andrew Trotman,et al. Overview of the INEX 2010 Link the Wiki Track , 2010, INEX.
[19] Richard M. Schwartz,et al. A hidden Markov model information retrieval system , 1999, SIGIR '99.
[20] Iadh Ounis,et al. Combining fields for query expansion and adaptive query expansion , 2007, Inf. Process. Manag..
[21] Leo Egghe,et al. A Theoretical Study of Recall and Precision Using a Topological Approach to Information Retrieval , 1998, Inf. Process. Manag..
[22] Monika Henzinger,et al. Detecting the origin of text segments efficiently , 2009, WWW '09.
[23] Djoerd Hiemstra,et al. Bayesian extension to the language model for ad hoc information retrieval , 2003, SIGIR.
[24] Daisuke Ikeda,et al. Automatically Linking News Articles to Blog Entries , 2006, AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs.
[25] D. S. Moore,et al. The Basic Practice of Statistics , 2001 .
[26] Bill N. Schilit,et al. Generating links by mining quotations , 2008, Hypertext.
[27] Gianni Amati,et al. Frequentist and Bayesian Approach to Information Retrieval , 2006, ECIR.
[28] M. de Rijke,et al. Linking online news and social media , 2011, WSDM '11.
[29] Felipe Bravo-Marquez,et al. Hypergeometric Language Model and Zipf-Like Scoring Function for Web Document Similarity Retrieval , 2010, SPIRE.
[30] Monika Henzinger,et al. Finding near-duplicate web pages: a large-scale evaluation of algorithms , 2006, SIGIR.
[31] James Allan,et al. Topic detection and tracking: event-based information organization , 2002 .
[32] Gurmeet Singh Manku,et al. Detecting near-duplicates for web crawling , 2007, WWW '07.
[33] Jenq-Haur Wang,et al. Finding Event-Relevant Content from the Web Using a Near-Duplicate Detection Approach , 2007, IEEE/WIC/ACM International Conference on Web Intelligence (WI'07).
[34] Ian H. Witten,et al. Learning to link with wikipedia , 2008, CIKM '08.
[35] Robert Burgin,et al. Performance Standards and Evaluations in IR Test Collections: Vector-Space and Other Retrieval Models , 1997, Inf. Process. Manag..
[36] O. Vorobyev,et al. Discrete multivariate distributions , 2008, arXiv:0811.0406.
[37] Charles Elkan,et al. Clustering documents with an exponential-family approximation of the Dirichlet compound multinomial distribution , 2006, ICML.
[38] S. Robertson. The probability ranking principle in IR , 1997 .
[39] Ram Akella,et al. A new probabilistic retrieval model based on the dirichlet compound multinomial distribution , 2008, SIGIR '08.
[40] Djoerd Hiemstra,et al. Twenty-One at TREC7: Ad-hoc and Cross-Language Track , 1998, TREC.
[41] Craig MacDonald,et al. Using Relevance Feedback in Expert Search , 2007, ECIR.
[42] David S. Moore,et al. The Basic Practice of Statistics [With CDROM] , 1999 .