The BamHI restriction-modification system was previously cloned into E. coli and maintained with an extra copy of the methylase gene on a high-copy vector (Brooks et al. (1989) Nucl. Acids Res. 17, 979-997). The nucleotide sequence of a 3014 bp region containing the endonuclease (R) and methylase (M) genes has now been determined. The sequence predicts a methylase protein of 423 amino acids, Mr 49,527, and an endonuclease protein of 213 amino acids, Mr 24,570. Between the two genes lies a small open reading frame capable of encoding a 102 amino acid protein, Mr 13,351. The M.BamHI enzyme has been purified from a high-expression clone, its amino-terminal sequence determined, and the nature of its substrate modification studied. The BamHI methylase modifies the internal C within its recognition sequence at the N4 position. The deduced amino acid sequence of M.BamHI has been compared with those available for other DNA methylases: several of them share five distinct regions, 12 to 22 amino acids in length, of pronounced sequence similarity. Finally, the stability and expression of the BamHI system in both E. coli and B. subtilis have been studied. The results suggest that R and M expression is carefully regulated in a 'natural' host like B. subtilis.
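As an illustration of which base is meant by "the internal C", the following minimal Python sketch locates BamHI sites in a DNA string and reports the position of the internal cytosine in each. It assumes the standard BamHI recognition sequence GGATCC, which is not spelled out in the text above; the function and constant names are hypothetical and chosen only for this example.

```python
import re
from typing import List

# Assumption: the standard BamHI recognition sequence (palindromic).
BAMHI_SITE = "GGATCC"
# 0-based offset of the internal C within the site: G G A T [C] C
INTERNAL_C_OFFSET = 4


def internal_c_positions(seq: str) -> List[int]:
    """Return 0-based top-strand positions of the internal C of each BamHI site."""
    seq = seq.upper()
    # GGATCC cannot overlap with itself, so a plain scan finds every site.
    return [m.start() + INTERNAL_C_OFFSET for m in re.finditer(BAMHI_SITE, seq)]


if __name__ == "__main__":
    # Hypothetical test sequence with a single site starting at index 2.
    print(internal_c_positions("AAGGATCCTT"))
```

Because the site is palindromic, running the same function on the reverse complement gives the corresponding position on the bottom strand.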