An introductory bioinformatics exercise to reinforce gene structure and expression and analyze the relationship between gene and protein sequences

We have developed an introductory bioinformatics exercise for sophomore biology and biochemistry students that reinforces their understanding of gene structure and of the principles and events involved in gene expression. The activity also illustrates the severe effect that a mutation in a gene sequence can have on the protein product. Students search GenBank for the wild‐type nucleotide sequence of the Caenorhabditis elegans unc‐22 gene, the amino acid sequence of its gene product, and the nucleotide sequence of the transposon Tc5. They then manipulate the nucleotide sequences using two programs in the Lasergene® software package from DNASTAR®. The first program, EditSeq®, lets students experience the meticulous process of precisely locating and removing intron sequences from the wild‐type unc‐22 allele to generate a cDNA sequence. Students generate the unc‐22(r466) allele by inserting the Tc5 sequence at the appropriate location in the third exon of unc‐22. They then locate and translate the open reading frames of both cDNAs. The second program, MegAlign®, aligns the wild‐type UNC‐22 protein sequence retrieved from GenBank with the wild‐type and mutant protein sequences that the students constructed. The degree of sequence similarity among the aligned proteins allows students to verify that they processed the gene correctly and to visualize the truncated protein product of the Tc5 mutant allele. Student feedback, possible modifications to the exercise, and supplemental exercises are also discussed.
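The sequence manipulations in the exercise (removing introns, inserting a transposon into an exon, translating the open reading frame, and comparing the resulting proteins) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the Lasergene workflow itself. The toy exon, intron, and insertion sequences are invented and are not the real unc‐22 or Tc5 sequences, the coordinates are hypothetical, and the shared‐prefix comparison is a simplified stand‐in for a MegAlign alignment.

```python
from itertools import product

# Standard genetic code, built in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def remove_introns(genomic, introns):
    """Join the exons by cutting out introns given as 0-based, half-open (start, end) pairs."""
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        exons.append(genomic[cursor:start])
        cursor = end
    exons.append(genomic[cursor:])
    return "".join(exons)


def insert_element(sequence, element, position):
    """Insert `element` into `sequence` immediately before index `position`."""
    return sequence[:position] + element + sequence[position:]


def translate_orf(cdna):
    """Translate from the first ATG up to (not including) the first in-frame stop codon."""
    start = cdna.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(cdna) - 2, 3):
        amino_acid = CODON_TABLE[cdna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def shared_prefix_length(a, b):
    """Count identical leading residues; a crude stand-in for a real alignment."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


# Hypothetical toy gene: two exons separated by one intron (GT...AG).
exon1 = "ATGGCTGAA"
intron = "GTAAGTCTAACAG"
exon2 = "AAACTGTGGTAA"
genomic = exon1 + intron + exon2
introns = [(len(exon1), len(exon1) + len(intron))]

# Hypothetical insertion element carrying an in-frame stop codon (TAG).
toy_element = "GCTTAGC"
# Insert 3 nt into the second exon; this lies downstream of the intron,
# so the intron coordinates above remain valid for the mutant allele.
insertion_site = introns[0][1] + 3
mutant_genomic = insert_element(genomic, toy_element, insertion_site)

wild_cdna = remove_introns(genomic, introns)
mutant_cdna = remove_introns(mutant_genomic, introns)

wild_protein = translate_orf(wild_cdna)
mutant_protein = translate_orf(mutant_cdna)

print("wild-type protein:", wild_protein)
print("mutant protein:   ", mutant_protein)
print("shared N-terminal residues:", shared_prefix_length(wild_protein, mutant_protein))
```

In this toy case the mutant protein matches the wild‐type protein at its N‐terminus and then terminates early, which mirrors the truncation that students observe for the Tc5 insertion allele. With real GenBank records, an insertion upstream of any intron would shift the downstream intron coordinates, so those coordinates would need to be adjusted before splicing.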