Information content of Caenorhabditis elegans splice site sequences varies with intron length.

A database of 139 intron sequences from the nematode Caenorhabditis elegans was analyzed using the information measure of Schneider et al. (1986) J. Mol. Biol. 188: 415-431. Statistically significant information is encoded by at least the first 30 nt and the last 20 nt of C. elegans introns. Both the quantity and the distribution of information in the 5' splice site sequences differ between the typical short (length less than 75 nt) and the rarer long (length greater than 75 nt) introns, with the 5' sites of long introns containing approximately one bit more information. The 3' splice site sequences of long and short C. elegans introns differ significantly in the region between -20 and -10 nt.
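
As a rough illustration of the kind of per-position information measure referred to above, the following Python sketch computes 2 - H(l) bits at each position of a set of aligned sequences, where H(l) is the Shannon entropy of the observed base frequencies at position l. It is a minimal sketch under stated assumptions, not the analysis used in the study: the function name `position_information`, the toy sequences, and the use of the approximate small-sample correction 3 / (2 ln 2 · n) are all assumptions made for illustration, and a uniform background base composition is assumed.

```python
import math
from collections import Counter


def position_information(seqs, correct=True):
    """Return the information (in bits) at each position of aligned DNA sequences.

    Assumes equal-length sequences over A, C, G, T and a uniform background,
    so the maximum is 2 bits per position.
    """
    n = len(seqs)
    length = len(seqs[0])
    # Approximate small-sample correction for a 4-letter alphabet
    # (an assumption here; exact corrections exist for small n).
    e_n = 3.0 / (2.0 * math.log(2) * n) if correct else 0.0
    info = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        # Shannon entropy of the observed base frequencies at position i.
        h = -sum((c / n) * math.log2(c / n) for c in counts.values())
        info.append(2.0 - h - e_n)
    return info


# Hypothetical toy alignment, not data from the study.
toy_sites = ["GTAAGT", "GTAAGA", "GTGAGT", "GTAAGT"]
bits = position_information(toy_sites, correct=False)
total_bits = sum(bits)
```

Summing the per-position values gives a total for the site, which is the sort of quantity that could be compared between two groups of sequences, such as short and long introns; with only a handful of sequences the correction term matters, so the uncorrected toy values above overstate the information.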