Robust methods for accurate diagnosis using pan-microbiological oligonucleotide microarrays

Background: To address the limitations of traditional virus and pathogen detection methods in clinical diagnosis, researchers have developed high-throughput oligonucleotide microarrays that rapidly identify infectious agents. However, objectively identifying pathogens from the complex hybridization patterns of these massively multiplexed arrays remains challenging.

Methods: In this study, we developed an automated method based on the hypergeometric distribution for identifying pathogens in multiplexed arrays and compared it to five other methods. We evaluated two sets of metrics: 1) accurate prediction, that is, whether the top-ranked prediction(s) match the real virus(es); and 2) four accuracy scores.

Results: Although accurate prediction, high specificity, and high sensitivity can be achieved with several methods, the method based on the hypergeometric distribution provides a significant advantage in terms of positive predictive value, achieving two to sixty times the positive predictive value of the other methods.

Conclusion: The proposed multi-species array analysis based on the hypergeometric distribution addresses shortcomings of previous methods by enhancing the signals of positively hybridized probes.
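
As an illustration of how a hypergeometric test can rank candidate pathogens, the following minimal sketch scores each pathogen by the probability of seeing at least the observed number of positive probes among its probe set by chance. The abstract does not give the exact formulation, so the parameterization, the function names (`hypergeom_sf`, `rank_pathogens`), and the toy probe sets are assumptions made for illustration only. The sketch also does not model any signal-enhancement step.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Upper-tail probability P(X >= k) for a hypergeometric variable.

    Assumed parameterization (not taken from the paper):
      N = total probes on the array
      K = probes called positive on the array
      n = probes that target the candidate pathogen
      k = positive probes among those n
    """
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible terms drop out of the sum automatically.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def rank_pathogens(probe_sets, positive_probes, total_probes):
    """Return (pathogen, p_value) pairs sorted from most to least enriched."""
    positives = set(positive_probes)
    scores = {}
    for name, probes in probe_sets.items():
        probes = set(probes)
        hits = len(probes & positives)
        scores[name] = hypergeom_sf(hits, total_probes, len(positives), len(probes))
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical toy array: 10 probes, 3 of them positive.
probe_sets = {"virus_A": [0, 1, 2, 3], "virus_B": [4, 5, 6, 7]}
ranking = rank_pathogens(probe_sets, positive_probes=[0, 1, 2, 8], total_probes=10)
for name, p_value in ranking:
    print(f"{name}: p = {p_value:.4f}")
```

In this toy case, virus_A has three positive probes out of four and ranks first, while virus_B has none and receives a p-value of 1. A real analysis would also need a rule for calling probes positive and a correction for testing many pathogens at once, neither of which is specified here.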
