Multi-Node Graphs: A Framework for Multiplexed Biological Assays

Multiplex polymerase chain reaction (PCR) is an extension of the standard PCR protocol in which primers for multiple DNA loci are pooled in a single reaction tube, enabling simultaneous amplification of several sequences and thereby reducing cost and time. The potential cost savings and throughput improvements depend directly on the level of multiplexing achieved. Designing reliable, highly multiplexed assays is challenging because primers pooled in a single reaction tube may cross-hybridize. This can be addressed either by changing the choice of primers for one or more amplicons, or by changing how the DNA loci are partitioned into separate reaction tubes. In this paper, we introduce a new graph formalism called a multi-node graph and describe its application to the analysis of multiplex PCR scalability. We show, using random multi-node graphs, that the scalability of multiplex PCR is constrained by a phase transition, suggesting fundamental limits on efforts to improve the cost-effectiveness and throughput of standard multiplex PCR assays. In particular, we show that when the multiplexing level of the reaction tubes is roughly Θ(log(sn)), where s is the number of primer pair candidates per locus and n is the number of loci to be amplified, then with very high probability we can 'cover' all loci with a valid assignment to one of the tubes in the assay. However, when the multiplexing level exceeds this bound, no complete cover is possible and, moreover, the number of loci that can be covered drops dramatically. Simulations using a simple greedy algorithm on real DNA data confirm the presence of this phase transition. Our theoretical results suggest, moreover, that the phase transition is a fundamental characteristic of the problem rather than an artifact of the greedy algorithm, implying intrinsic limits on the development of future assay design algorithms.
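To make the setting concrete, the following is a minimal sketch of a greedy tube-filling simulation on a random multi-node graph. It is an illustration under stated assumptions, not the algorithm or model used in the paper: each locus is treated as a multi-node holding s candidate primer pairs, any two candidates from different loci are assumed to cross-hybridize independently with probability p, and the greedy rule simply takes the first compatible candidate for each locus. The function name `greedy_tube` and the parameter `p` are hypothetical.

```python
import random


def greedy_tube(n, s, p, seed=0):
    """Greedily fill one tube on a random multi-node graph (illustrative sketch).

    n: number of loci (multi-nodes)
    s: candidate primer pairs per locus
    p: assumed probability that two candidates from different loci conflict
    Returns the list of (locus, candidate) pairs placed in the tube.
    """
    rng = random.Random(seed)
    conflicts = {}  # conflict edges, sampled lazily and cached per pair

    def conflict(a, b):
        key = (a, b) if a < b else (b, a)
        if key not in conflicts:
            conflicts[key] = rng.random() < p
        return conflicts[key]

    chosen = []
    for locus in range(n):
        for cand in range(s):
            node = (locus, cand)
            # Accept the first candidate compatible with everything in the tube.
            if all(not conflict(node, other) for other in chosen):
                chosen.append(node)
                break  # at most one candidate per locus
    return chosen


if __name__ == "__main__":
    # Sweep the conflict probability and report how many loci fit in one tube.
    for p in (0.0, 0.05, 0.1, 0.2, 0.4):
        tube = greedy_tube(n=200, s=5, p=p, seed=1)
        print(f"p={p:.2f}: {len(tube)} of 200 loci placed in a single tube")
```

Sweeping n, s, or p in this toy model and recording the tube size is one way to explore how the achievable multiplexing level scales. The independent-edge conflict model is a simplification, since real cross-hybridization depends on the primer sequences.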
