Improving NCD accuracy by combining document segmentation and document distortion
[1] George Kingsley Zipf,et al. Human behavior and the principle of least effort , 1949 .
[2] Zhiguo Gong,et al. Web image indexing by using associated texts , 2005, Knowledge and Information Systems.
[3] R. A. Leibler,et al. On Information and Sufficiency , 1951 .
[4] W. John Wilbur,et al. The automatic identification of stop words , 1992, J. Inf. Sci..
[5] James Allan,et al. Approaches to passage retrieval in full text information systems , 1993, SIGIR.
[6] Alistair Moffat,et al. Efficient Retrieval of Partial Documents , 1995, Inf. Process. Manag..
[7] 李明,et al. New Information Distance Measure and Its Application in Question Answering System , 2008 .
[8] Jonathan D. Hirst,et al. Similarity by Compression , 2007, J. Chem. Inf. Model..
[9] Oren Etzioni,et al. Self-supervised Relation Extraction from the Web , 2006, ISMIS.
[10] Paul M. B. Vitányi,et al. The Google Similarity Distance , 2004, IEEE Transactions on Knowledge and Data Engineering.
[11] Bin Ma,et al. The similarity metric , 2001, IEEE Transactions on Information Theory.
[12] Wei-Ying Ma,et al. Block-based web search , 2004, SIGIR '04.
[13] Esko Ukkonen,et al. Approximate String Matching with q-grams and Maximal Matches , 1992, Theor. Comput. Sci..
[14] Tat-Seng Chua,et al. Mining dependency relations for query expansion in passage retrieval , 2006, SIGIR.
[15] Manuel Cebrián,et al. Contextual information retrieval based on algorithmic information theory and statistical outlier detection , 2007, 2008 IEEE Information Theory Workshop.
[16] David Camacho,et al. Is the contextual information relevant in text clustering by compression? , 2012, Expert Syst. Appl..
[17] Xiaojun Wan,et al. Beyond topical similarity: a structural similarity measure for retrieving highly similar documents , 2008, Knowledge and Information Systems.
[18] Jerry M. Mendel,et al. A vector similarity measure for linguistic approximation: Interval type-2 and type-1 fuzzy sets , 2008, Inf. Sci..
[19] Grzegorz Kondrak,et al. N-Gram Similarity and Distance , 2005, SPIRE.
[20] Jimmy J. Lin,et al. Quantitative evaluation of passage retrieval algorithms for question answering , 2003, SIGIR.
[21] Humberto Bustince,et al. Construction of fuzzy indices from fuzzy DI-subsethood measures: Application to the global comparison of images , 2007, Inf. Sci..
[22] Manuel Cebrián,et al. Evaluating the Impact of Information Distortion on Normalized Compression Distance , 2008, ICMCTA.
[23] Manuel Cebrián,et al. Reducing the Loss of Information through Annealing Text Distortion , 2011, IEEE Transactions on Knowledge and Data Engineering.
[24] Sally Temple,et al. Automatic Summarization of Changes in Biological Image Sequences Using Algorithmic Information Theory , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[25] Kimmo Kettunen,et al. Normalized Compression Distance Based Measures for MetricsMATR 2010 , 2010, WMT@ACL.
[26] Justin Zobel,et al. Passage retrieval revisited , 1997, SIGIR '97.
[27] Peter Schäuble,et al. Document and passage retrieval based on hidden Markov models , 1994, SIGIR '94.
[28] James P. Callan,et al. Passage-level evidence in document retrieval , 1994, SIGIR '94.
[29] Stefan Axelsson,et al. Similarity assessment for removal of noisy end user license agreements , 2011, Knowledge and Information Systems.
[30] Yiming Yang,et al. Noise reduction in a statistical approach to text categorization , 1995, SIGIR '95.
[31] Mohamed S. Kamel,et al. Document Similarity Using a Phrase Indexing Graph Model , 2003, Knowledge and Information Systems.
[32] Christian Plaunt,et al. Subtopic structuring for full-length document access , 1993, SIGIR.
[33] Jörg Tiedemann,et al. Simple is Best: Experiments with Different Document Segmentation Strategies for Passage Retrieval , 2008, COLING 2008.
[34] David Salomon,et al. Data Compression: The Complete Reference , 2006 .
[35] Alexander Dekhtyar,et al. Information Retrieval , 2018, Lecture Notes in Computer Science.
[36] Thanaruk Theeramunkong. Applying passage in Web text mining , 2004, Int. J. Intell. Syst..
[37] Tsachy Weissman,et al. The Information Lost in Erasures , 2008, IEEE Transactions on Information Theory.
[38] Mihai Datcu,et al. A Model Conditioned Data Compression Based Similarity Measure , 2008, Data Compression Conference (dcc 2008).
[39] Hans Peter Luhn,et al. The Automatic Creation of Literature Abstracts , 1958, IBM J. Res. Dev..
[40] David Salomon,et al. Data Compression , 2000, Springer Berlin Heidelberg.
[41] Humberto Bustince,et al. Relationship between restricted dissimilarity functions, restricted equivalence functions and normal EN-functions: Image thresholding invariant , 2008, Pattern Recognit. Lett..
[42] Paul M. B. Vitányi,et al. Clustering by compression , 2003, IEEE Transactions on Information Theory.
[43] Tao Li,et al. Using discriminant analysis for multi-class classification: an experimental investigation , 2006, Knowledge and Information Systems.
[44] Gerard Salton,et al. Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer , 1989 .
[45] Jerry M. Mendel,et al. A vector similarity measure for linguistic approximation , 2008 .
[46] Hui Xiong,et al. Enhancing data analysis with noise removal , 2006, IEEE Transactions on Knowledge and Data Engineering.
[47] R. Schiffer. Psychobiology of Language , 1986 .