Lingos, Finite State Machines, and Fast Similarity Searching

We apply a recently published method of text-based molecular similarity searching (LINGO) to standard data sets in order to quantify the accuracy of the approach. Our implementation is based on a pattern-matching finite state machine (FSM), which results in fast search times. The accuracy of LINGO is shown to be comparable to that of a path-based fingerprint, and the method offers a simple yet effective approach to similarity searching.
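To make the idea of text-based similarity concrete, here is a minimal sketch in Python. It rests on assumptions not stated in the abstract: that a LINGO is an overlapping substring of fixed length `q` taken from a SMILES string (with `q = 4` as the default), and that two molecules are compared with a multiset Tanimoto coefficient over those substrings. The function names `lingos` and `lingo_tanimoto` are hypothetical, and the sketch uses a hash-based counter rather than the finite state machine the abstract describes, so it illustrates the comparison only, not the speed of the authors' implementation.

```python
from collections import Counter


def lingos(smiles: str, q: int = 4) -> Counter:
    """Count the overlapping q-character substrings of a SMILES string.

    Assumption: q = 4 is used as the default substring length.
    Strings shorter than q yield an empty counter.
    """
    return Counter(smiles[i:i + q] for i in range(len(smiles) - q + 1))


def lingo_tanimoto(a: str, b: str, q: int = 4) -> float:
    """Multiset Tanimoto coefficient over the substrings of two SMILES strings.

    Assumption: this is one common Tanimoto variant, not necessarily the
    exact formula used in the paper.
    """
    ca, cb = lingos(a, q), lingos(b, q)
    shared = sum((ca & cb).values())  # multiset intersection (minimum counts)
    total = sum((ca | cb).values())   # multiset union (maximum counts)
    return shared / total if total else 0.0


if __name__ == "__main__":
    # 1-butanol vs. 1-butylamine, written as non-canonical SMILES for brevity
    print(lingo_tanimoto("CCCCO", "CCCCN"))
```

In practice the SMILES strings would first be canonicalized so that the same molecule always yields the same text; that step is omitted here.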

Roger A. Sayle | J. Andrew Grant | Barry T. Pickup | Anthony Nicholls | James A. Haigh
