Paper information - Enrichment of Extremely Noisy High-Throughput Screening Data Using a Naïve Bayes Classifier

The noise level of a high-throughput screening (HTS) experiment depends on factors such as the quality and robustness of the assay and the quality of the robotic platform. Screening compound mixtures is noisier than screening a single compound per well. A classification model based on naïve Bayes (NB) may be used to enrich such data. The authors studied the ability of an NB classifier to prioritize noisy primary HTS data from compound mixtures (5 compounds/well) in 4 campaigns in which the percentage of noise, presumed to be inactive compounds, ranged between 81% and 91%. The top 10% of compounds ranked by the classifier captured between 26% and 45% of the active compounds. These results are reasonable and useful considering the poor quality of the training set and the short computing time needed to build and deploy the classifier. (Journal of Biomolecular Screening 2004:32-36)
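The workflow the abstract describes (train a naïve Bayes model on noisy pooled-screen labels, rank the compounds, then count how many true actives land in the top 10%) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the binary fingerprints are synthetic, the function names (`train_nb`, `score`, `fraction_captured`, `simulate_pooled_screen`) are hypothetical, the model is a Laplace-smoothed Bernoulli naïve Bayes, and the simulated noise level (every compound in an active well of 5 is labeled active, so 80% of the "active" labels are noise) only loosely mirrors the setting above. For simplicity the sketch ranks the same compounds it was trained on.

```python
import math
import random


def train_nb(fingerprints, labels, n_bits, alpha=1.0):
    """Fit Laplace-smoothed Bernoulli naive Bayes log-odds weights per bit."""
    n_act = sum(labels)
    n_inact = len(labels) - n_act
    on_act = [0] * n_bits
    on_inact = [0] * n_bits
    for fp, is_active in zip(fingerprints, labels):
        counts = on_act if is_active else on_inact
        for bit in fp:
            counts[bit] += 1
    weights = []
    for b in range(n_bits):
        p_act = (on_act[b] + alpha) / (n_act + 2 * alpha)
        p_inact = (on_inact[b] + alpha) / (n_inact + 2 * alpha)
        w_on = math.log(p_act / p_inact)               # evidence if bit is set
        w_off = math.log((1 - p_act) / (1 - p_inact))  # evidence if bit is unset
        weights.append((w_on, w_off))
    return weights


def score(fp, weights):
    """Sum per-bit log-odds; higher means 'more likely active'."""
    return sum(w_on if b in fp else w_off
               for b, (w_on, w_off) in enumerate(weights))


def fraction_captured(scores, true_active, top_fraction=0.10):
    """Fraction of all true actives found in the top-ranked slice."""
    n_top = max(1, int(len(scores) * top_fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = sum(true_active[i] for i in ranked[:n_top])
    total = sum(true_active)
    return hits / total if total else 0.0


def simulate_pooled_screen(n_compounds=2000, n_actives=100, n_bits=64,
                           pool_size=5, seed=0):
    """Synthetic data (assumption): actives favor bits 0-7; each active well
    labels all pool_size compounds as active, injecting label noise."""
    rng = random.Random(seed)
    true_active = [i < n_actives for i in range(n_compounds)]
    fingerprints = []
    for is_active in true_active:
        fp = set()
        for b in range(n_bits):
            p_on = 0.7 if (is_active and b < 8) else 0.2
            if rng.random() < p_on:
                fp.add(b)
        fingerprints.append(fp)
    inactive_ids = list(range(n_actives, n_compounds))
    rng.shuffle(inactive_ids)
    noisy_label = list(true_active)
    for j in inactive_ids[: n_actives * (pool_size - 1)]:
        noisy_label[j] = True  # inactive compound sharing an active well
    return fingerprints, noisy_label, true_active


if __name__ == "__main__":
    fps, noisy, truth = simulate_pooled_screen()
    weights = train_nb(fps, noisy, n_bits=64)       # train on noisy labels only
    scores = [score(fp, weights) for fp in fps]
    captured = fraction_captured(scores, truth, top_fraction=0.10)
    print(f"True actives captured in top 10%: {captured:.0%}")
```

The value printed depends entirely on how the synthetic data are generated, so it should not be compared with the 26% to 45% range reported in the abstract. The sketch shows only the mechanics: the model never sees the true labels, yet ranking by its score can still concentrate true actives near the top of the list.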
