Hybrid DNA artifact from PCR of closely related target sequences.

We have recently determined through standard molecular cloning methods that Xenopus laevis contains two different nonallelic preproinsulin genes which are very similar to each other (i.e., 94% identity in the coding region) (1). In order to obtain the remaining sequence of the 5'-untranslated regions, we used the anchored polymerase chain reaction method as described by others (2,3). We first reverse-transcribed Xenopus pancreatic RNA using an antisense oligonucleotide common to both preproinsulin I and II mRNAs (oligo X, fig. 1). The resulting single-stranded DNA was tailed using terminal deoxynucleotidyl transferase (TdT) and dATP. Amplification was accomplished by PCR using two oligonucleotides: a second antisense oligonucleotide common to both preproinsulin I and II mRNAs (oligo Y, fig. 1), and oligo-dT20. The resulting amplified DNA fragment, approximately 350 base-pairs in length, hybridized strongly to a radiolabeled preproinsulin cDNA probe, and was subcloned for subsequent dideoxy sequence analysis. The nucleotide sequence of three independent clones corresponded exactly to preproinsulin I and included the entire 5'-untranslated region (85 nucleotides in length). Similarly, an additional four clones corresponded exactly to preproinsulin II and included a similar but distinct 5'-untranslated region. Interestingly, a clone was characterized in which the first 186 nucleotides (5'-untranslated region plus the signal peptide and a portion of the B-chain) corresponded exactly to preproinsulin I, followed by 134 nucleotides (the remainder of the B-chain and a portion of the C-peptide) which corresponded exactly to preproinsulin II cDNA. We believe this hybrid clone is an artifact generated by PCR in which oligonucleotide Y initially annealed to a preproinsulin II cDNA, but was only partially extended by Taq polymerase.
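The composition of such a hybrid clone (one parent at the 5'-end, the other at the 3'-end) can be checked computationally. The following is a minimal sketch, not the procedure used in this study: it assumes the clone and both parental sequences have already been aligned to equal length without gaps, and the function names and the 20-nucleotide toy sequences are hypothetical illustrations, not Xenopus preproinsulin data. It scores every possible crossover position and reports the window in which the junction must lie.

```python
from itertools import accumulate


def mismatch_prefix(query, ref):
    """Return a list p where p[i] is the mismatch count of query[:i] vs ref[:i]."""
    return [0] + list(accumulate(int(q != r) for q, r in zip(query, ref)))


def find_crossover(clone, parent_a, parent_b):
    """Fit the model clone = parent_a[:k] + parent_b[k:] for every split k.

    Assumes the three sequences are pre-aligned, gap-free and equal in length.
    Returns (first_k, last_k, mismatches): the first and last split positions
    that give the minimal mismatch count, and that count.
    """
    if not (len(clone) == len(parent_a) == len(parent_b)):
        raise ValueError("sequences must be aligned to the same length")
    n = len(clone)
    pre_a = mismatch_prefix(clone, parent_a)
    pre_b = mismatch_prefix(clone, parent_b)
    # Cost of split k: mismatches to parent A before k plus to parent B from k on.
    costs = [pre_a[k] + (pre_b[n] - pre_b[k]) for k in range(n + 1)]
    best = min(costs)
    ks = [k for k, c in enumerate(costs) if c == best]
    return ks[0], ks[-1], best


# Hypothetical toy parents differing at positions 2, 7, 13 and 18 (0-based).
parent_a = "ACGTACGTACGTACGTACGT"
parent_b = "ACATACGCACGTATGTACAT"
# Simulated hybrid: parent A for the first 10 nucleotides, then parent B.
hybrid = parent_a[:10] + parent_b[10:]

print(find_crossover(hybrid, parent_a, parent_b))    # (8, 13, 0)
print(find_crossover(parent_a, parent_a, parent_b))  # (19, 20, 0)
```

Because the parents are identical between their diagnostic positions, the junction can only be localized to a window (here between positions 8 and 13), not to a single nucleotide. A window that reaches the full sequence length, as in the second call, means the clone is explained by one parent alone.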
Since preproinsulin I and II cDNAs are very similar, this partially extended DNA may have annealed to preproinsulin I cDNA during a subsequent cycle, with the preproinsulin I cDNA then serving as the template to complete the extension. The result would be a hybrid cDNA clone: preproinsulin I at its 5'-end, and preproinsulin II at its 3'-end. Partial extension could have occurred if Taq polymerase encountered a region of secondary structure. Indeed, a region of compression (i.e., secondary structure) was apparent in the sequencing gel near the junction in the hybrid clone. Alternatively, oligonucleotide X could have annealed to a partially degraded preproinsulin II mRNA, and been extended to the end of this mRNA during reverse transcription. During PCR, this partial-length cDNA could have annealed to a full-length preproinsulin I cDNA and completed its extension, resulting in a hybrid clone. Although we cannot rule out the possibility that mRNA corresponding to this hybrid clone exists in vivo, we believe this is most unlikely (2). We conclude that hybrid DNA formation may occur when amplifying target sequences which are members of a gene family, when characterizing each allele from a heterozygous subject (especially in forensic applications, in which partially degraded DNA may be used), or when characterizing a mutant allele causing a defined phenotypic abnormality.

References:
1. Shuldiner A.R., Phillips S., Roberts C.T. Jr., and Roth J. (1989) J. Biol. Chem., in press.
2. Frohman M.A., Dush M.K., and Martin G.R. (1988) Proc. Natl. Acad. Sci. USA 85:8998-9002.
3. Loh E.Y., Elliott J.F., Cwirla S., Lanier L.L., and Davis M.M. (1989) Science 243:217-220.