Space-Constrained Gram-Based Indexing for Efficient Approximate String Search
Jiaheng Lu | Chen Li | Shengyue Ji | Alexander Behm
[1] Luis Gravano, et al. Approximate String Joins in a Database (Almost) for Free, 2001, VLDB.
[2] Rajeev Motwani, et al. Robust and efficient fuzzy match for online data cleaning, 2003, SIGMOD '03.
[3] Xiaohui Yu, et al. Hashed samples: selectivity estimators for set similarity selection queries, 2008, Proc. VLDB Endow.
[4] M. Douglas, et al. Development of a Spelling List, 1982.
[5] Luis Gravano, et al. Selectivity estimation for string predicates: overcoming the underestimation problem, 2004, Proceedings. 20th International Conference on Data Engineering.
[6] Surajit Chaudhuri, et al. A Primitive Operator for Similarity Joins in Data Cleaning, 2006, 22nd International Conference on Data Engineering (ICDE'06).
[7] Jeffrey Xu Yu, et al. Efficient similarity joins for near-duplicate detection, 2011, TODS.
[8] Raghav Kaushik, et al. Efficient exact set-similarity joins, 2006, VLDB.
[9] Xuemin Lin, et al. Ed-Join: an efficient algorithm for similarity joins with edit distance constraints, 2008, Proc. VLDB Endow.
[10] Chen Li, et al. Selectivity Estimation for Fuzzy String Predicates in Large Data Sets, 2005, VLDB.
[11] Divesh Srivastava, et al. Estimating the selectivity of approximate string queries, 2007, TODS.
[12] Divesh Srivastava, et al. Record linkage: similarity measures and algorithms, 2006, SIGMOD Conference.
[13] Gonzalo Navarro, et al. A guided tour to approximate string matching, 2001, CSUR.
[14] Divesh Srivastava, et al. Fast Indexes and Algorithms for Set Similarity Selection Queries, 2008, 2008 IEEE 24th International Conference on Data Engineering.
[15] Bin Wang, et al. Cost-based variable-length-gram selection for string collections to support approximate queries efficiently, 2008, SIGMOD Conference.
[16] Wenke Lee, et al. q-gram matching using tree models, 2006, IEEE Transactions on Knowledge and Data Engineering.
[17] M. D. McIlroy, et al. Development of a Spelling List, 1982, IEEE Trans. Commun.
[18] Lee Jae-Gil, et al. n-Gram/2L: A Space and Time Efficient Two-Level n-Gram Inverted Index Structure, 2006.
[19] Justin Zobel, et al. Inverted files for text search engines, 2006, CSUR.
[20] Divesh Srivastava, et al. Substring selectivity estimation, 1999, PODS '99.
[21] Kyuseok Shim, et al. Extending Q-Grams to Estimate Selectivity of String Matching with Low Edit Distance, 2007, VLDB.
[22] Sunita Sarawagi, et al. Efficient set joins on similarity predicates, 2004, SIGMOD '04.
[23] Alistair Moffat, et al. Self-indexing inverted files for fast text retrieval, 1996, TOIS.
[24] Athman Bouguettaya, et al. An Efficient Near-Duplicate Video Shot Detection Method Using Shot-Based Interest Points, 2009, IEEE Transactions on Multimedia.
[25] Z. Meral Özsoyoglu, et al. Distance based indexing for string proximity search, 2003, Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405).
[26] Peter Elias, et al. Universal codeword sets and representations of the integers, 1975, IEEE Trans. Inf. Theory.
[27] Bin Wang, et al. VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams, 2007, VLDB.
[28] Zvi Galil, et al. Data structures and algorithms for disjoint set union problems, 1991, CSUR.
[29] Piotr Indyk, et al. Approximate nearest neighbors: towards removing the curse of dimensionality, 1998, STOC '98.
[30] Hakan Hacigümüs, et al. Indexing text data under space constraints, 2004, CIKM '04.
[31] Alistair Moffat, et al. Inverted Index Compression Using Word-Aligned Binary Codes, 2004, Information Retrieval.
[32] Abraham Lempel, et al. Compression of individual sequences via variable-rate coding, 1978, IEEE Trans. Inf. Theory.
[33] P. Krishnan, et al. Estimating alphanumeric selectivity in the presence of wildcards, 1996, SIGMOD '96.
[34] Jiaheng Lu, et al. Efficient Merging and Filtering Algorithms for Approximate String Searches, 2008, 2008 IEEE 24th International Conference on Data Engineering.
[35] Solomon W. Golomb, et al. Run-length encodings (Corresp.), 1966, IEEE Trans. Inf. Theory.
[36] Roberto J. Bayardo, et al. Scaling up all pairs similarity search, 2007, WWW '07.