Random sequences.

The comparison of protein or nucleic acid sequences frequently leads to observations whose improbability can be tested only by Monte Carlo techniques that require randomizing the sequences being compared. Two decisions need to be made. The first is whether one demands that the resulting random sequence have exactly the properties of the original sequence (a shuffled sequence) or only expects it to have them on average (a representative sequence). The second concerns which properties of the sequence are to be preserved; two such properties are composition and nearest-neighbor frequencies. It is shown that biased nearest-neighbor frequencies can significantly affect the probability of observing a given result. Methods for producing random sequences according to these decisions are given.
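
The distinction between shuffled and representative sequences can be illustrated with a minimal Python sketch. It is not the paper's own method. The function names are hypothetical, and the nearest-neighbor variant rests on several assumptions made here for illustration: a first-order Markov chain, a non-empty input, a start at the original first residue, and a fallback to overall composition when a residue has no observed successor.

```python
import random
from collections import Counter, defaultdict


def shuffled_sequence(seq, rng):
    """Permute the residues: composition is preserved exactly (demanded)."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def representative_sequence(seq, rng):
    """Draw each residue independently from the original composition:
    composition is matched only in expectation (expected, not demanded)."""
    counts = Counter(seq)
    symbols = sorted(counts)
    weights = [counts[s] for s in symbols]
    return "".join(rng.choices(symbols, weights=weights, k=len(seq)))


def representative_neighbor_sequence(seq, rng):
    """Draw each residue conditioned on its predecessor, so that
    nearest-neighbor frequencies are matched in expectation.
    Assumption: first-order Markov chain started at seq[0]."""
    transitions = defaultdict(Counter)
    for a, b in zip(seq, seq[1:]):
        transitions[a][b] += 1
    overall = Counter(seq)
    out = [seq[0]]
    while len(out) < len(seq):
        # Assumption: fall back to overall composition when the current
        # residue was never followed by anything in the original.
        nxt = transitions.get(out[-1]) or overall
        symbols = sorted(nxt)
        weights = [nxt[s] for s in symbols]
        out.append(rng.choices(symbols, weights=weights)[0])
    return "".join(out)
```

A shuffled sequence that also preserves nearest-neighbor counts exactly requires a more involved construction and is not attempted in this sketch.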