Indexing schemes for similarity search in datasets of short protein fragments
[1] Paolo Ciaccia,et al. Searching in metric spaces with user-defined and approximate distances , 2002 .
[2] Ambuj K. Singh,et al. Efficient Index Structures for String Databases , 2001, VLDB.
[3] Marco Patella,et al. A Query-sensitive Cost Model for Similarity Queries with M-tree , 1999, Australasian Database Conference.
[4] Pavel Zezula,et al. Processing Complex Similarity Queries with Distance-Based Access Methods , 1998, EDBT.
[5] James Ze Wang,et al. SST: an algorithm for finding near-exact sequence matches in time proportional to the logarithm of the database size , 2002, Bioinform..
[6] Tamer Kahveci,et al. An Efficient Index Structure for String Databases , 2001 .
[7] Paolo Vitolo. The representation of weighted quasimetric spaces , 1999 .
[8] Jeremy Buhler,et al. Efficient large-scale sequence comparison by locality-sensitive hashing , 2001, Bioinform..
[9] M. Gromov. Metric Structures for Riemannian and Non-Riemannian Spaces , 1999 .
[10] Robert Lowen,et al. Handbook of the History of General Topology , 1997 .
[11] Vladimir Pestov,et al. On the geometry of similarity search: Dimensionality curse and concentration of measure , 1999, Inf. Process. Lett..
[12] Juha Kärkkäinen,et al. Better Filtering with Gapped q-Grams , 2001, Fundam. Informaticae.
[13] D. Kitts,et al. Bioactive proteins and peptides from food sources. Applications of bioprocesses used in isolation and recovery. , 2003, Current pharmaceutical design.
[14] Christos Faloutsos,et al. The "DGX" distribution for mining massive, skewed data , 2001, KDD '01.
[15] Anthony K. H. Tung,et al. The ed-tree: an index for large DNA sequence databases , 2003, 15th International Conference on Scientific and Statistical Database Management, 2003..
[16] Esko Ukkonen,et al. Approximate String Matching with q-grams and Maximal Matches , 1992, Theor. Comput. Sci..
[17] Ambuj K. Singh,et al. ProGreSS: Simultaneous Searching of Protein Databases by Sequence and Structure , 2004, Pacific Symposium on Biocomputing.
[18] Malcolm P. Atkinson,et al. A Database Index to Large Biological Sequences , 2001, VLDB.
[19] Aleksandar Stojmirovic. Quasi-metric spaces with measure , 2003 .
[20] G. Gonnet,et al. Exhaustive matching of the entire protein sequence database. , 1992, Science.
[21] Michael G. Walker,et al. SST: An algorithm for searching sequence databases in time proportional to the logarithm of the database size , 2000 .
[22] Peter N. Yianilos,et al. Data structures and algorithms for nearest neighbor search in general metric spaces , 1993, SODA '93.
[23] Esko Ukkonen,et al. Constructing Suffix Trees On-Line in Linear Time , 1992, IFIP Congress.
[24] M S Waterman,et al. Identification of common molecular subsequences. , 1981, Journal of molecular biology.
[25] Gregory D. Schuler,et al. Database resources of the National Center for Biotechnology Information: update , 2004, Nucleic acids research.
[26] A. D. McLachlan,et al. Profile analysis: detection of distantly related proteins. , 1987, Proceedings of the National Academy of Sciences of the United States of America.
[27] Pavel Zezula,et al. M-tree: An Efficient Access Method for Similarity Search in Metric Spaces , 1997, VLDB.
[28] Dónall A. Mac Dónaill,et al. Representation of amino acids as five-bit or three-bit patterns for filtering protein databases , 2001, Bioinform..
[29] Hanan Samet,et al. Index-driven similarity search in metric spaces (Survey Article) , 2003, TODS.
[30] Edward M. McCreight. A Space-Economical Suffix Tree Construction Algorithm , 1976 .
[31] Stefan Kurtz,et al. Reducing the space requirement of suffix trees , 1999 .
[32] Hans-Peter A. Künzi,et al. Nonsymmetric Distances and Their Associated Topologies: About the Origins of Basic Ideas in the Area of Asymmetric Topology , 2001 .
[33] Z. Meral Özsoyoglu,et al. Indexing large metric spaces for similarity search queries , 1999, TODS.
[34] Rolf Apweiler,et al. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000 , 2000, Nucleic Acids Res..
[35] Maria Jesus Martin,et al. The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003 , 2003, Nucleic Acids Res..
[36] Marco Patella,et al. Bulk Loading the M-tree , 2001 .
[37] Marco Patella,et al. Searching in metric spaces with user-defined and approximate distances , 2002, TODS.
[38] W. J. Kent,et al. BLAT--the BLAST-like alignment tool. , 2002, Genome research.
[39] Richard Durbin,et al. Biological Sequence Analysis , 1998 .
[40] Gregory D. Schuler,et al. Database resources of the National Center for Biotechnology Information , 2021, Nucleic Acids Res..
[41] Pavel Zezula,et al. Processing M-trees with parallel resources , 1998, Proceedings Eighth International Workshop on Research Issues in Data Engineering. Continuous-Media Databases and Applications.
[42] Christos Faloutsos,et al. The R+-Tree: A Dynamic Index for Multi-Dimensional Objects , 1987, VLDB.
[43] Jonathan Goldstein,et al. When Is ''Nearest Neighbor'' Meaningful? , 1999, ICDT.
[44] Ricardo A. Baeza-Yates,et al. Searching in metric spaces , 2001, CSUR.
[45] Donald R. Morrison,et al. PATRICIA—Practical Algorithm To Retrieve Information Coded in Alphanumeric , 1968, J. ACM.
[46] Daniel P. Miranker,et al. An Assessment of a Metric Space Database Index to Support Sequence Homology , 2005, Int. J. Artif. Intell. Tools.
[47] Eugene W. Myers,et al. Suffix arrays: a new method for on-line string searches , 1993, SODA '90.
[48] S. Henikoff,et al. Amino acid substitution matrices from protein blocks. , 1992, Proceedings of the National Academy of Sciences of the United States of America.
[49] R. F. Smith,et al. Automatic generation of primary sequence patterns from sets of related protein sequences. , 1990, Proceedings of the National Academy of Sciences of the United States of America.
[50] David Haussler,et al. Dirichlet mixtures: a method for improved detection of weak but significant protein sequence homology , 1996, Comput. Appl. Biosci..
[51] Ela Hunt. Indexed Searching on Proteins Using a Suffix Sequoia , 2004, IEEE Data Eng. Bull..
[52] Vladimir Pestov,et al. Indexing Schemes for Similarity Search: an Illustrated Paradigm , 2002, Fundam. Informaticae.
[53] Hans-Peter A. Künzi,et al. Weighted Quasi‐Metrics , 1994 .
[54] M. O. Dayhoff,et al. Atlas of protein sequence and structure , 1965 .
[55] Dan Gusfield,et al. Algorithms on Strings, Trees, and Sequences - Computer Science and Computational Biology , 1997 .
[56] M. Ledoux. The concentration of measure phenomenon , 2001 .
[57] Edward M. McCreight,et al. A Space-Economical Suffix Tree Construction Algorithm , 1976, JACM.
[58] Ela Hunt. The Suffix Sequoia Index for Approximate String Matching , 2003 .
[59] Edward Fredkin,et al. Trie memory , 1960, Commun. ACM.
[60] M. O. Dayhoff,et al. A Model of Evolutionary Change in Proteins , 1978 .
[61] Liisa Holm,et al. RSDB: representative protein sequence databases have high information content , 2000, Bioinform..
[62] Tolga Bozkaya,et al. Distance-based indexing for high-dimensional metric spaces , 1997 .
[63] Gonzalo Navarro,et al. A Hybrid Indexing Method for Approximate String Matching , 2007 .
[64] Peter Weiner,et al. Linear Pattern Matching Algorithms , 1973, SWAT.
[65] S. B. Needleman,et al. A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins , 1970 .
[66] P. Bieniasz,et al. HIV-1 and Ebola virus encode small peptide motifs that recruit Tsg101 to sites of particle assembly to facilitate egress , 2001, Nature Medicine.