A strategy has been developed for rapid and accurate determination of the amino acid sequence of large proteins, such as many of the aminoacyl-tRNA synthetases. The strategy combines DNA sequencing of the gene for the protein of interest with gas chromatographic-mass spectrometric identification of tetra- and pentapeptides in partial hydrolysates of the entire protein or of very large fragments thereof. These peptides are matched to blocks of codons at locations scattered throughout the structural gene. Tetra- and pentapeptide sequences are long enough that they are unlikely to be repeated in the protein sequence or to occur in an incorrect reading frame; therefore, each can be placed at a unique cluster of codons on the DNA. This procedure rigorously establishes the proper phasing of the DNA throughout the entire length of the structural gene, and the protein sequence can thereby be read accurately from the DNA sequence. This approach is being used to determine the amino acid sequence of Escherichia coli alanine tRNA synthetase, a protein of approximately 900 amino acids. This paper reports the sequence of the first 165 amino acids from the NH2 terminus.
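The matching of short peptides to codon blocks can be illustrated with a minimal sketch. It assumes the standard genetic code and an exact string match between each peptide and the translated DNA; the function names `translate` and `place_peptides` and the toy DNA sequence are hypothetical, chosen for illustration only, and are not data or procedures from this work.

```python
# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame):
    """Translate `dna` starting at offset `frame` (0, 1, or 2)."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3)
    )


def place_peptides(dna, peptides):
    """Return (peptide, frame, nucleotide_start) for every exact match."""
    placements = []
    for peptide in peptides:
        for frame in range(3):
            protein = translate(dna, frame)
            pos = protein.find(peptide)
            while pos != -1:
                # Convert the residue index back to a nucleotide coordinate.
                placements.append((peptide, frame, frame + 3 * pos))
                pos = protein.find(peptide, pos + 1)
    return placements


# Toy example (made-up sequence): one extra leading base shifts the
# coding region into frame 1.
toy_dna = "GATGAGCAAAAGCACCGCTGAA"
print(place_peptides(toy_dna, ["KSTA"]))  # [('KSTA', 1, 7)]
```

A peptide that maps to exactly one frame and position, as in the toy example, fixes the phasing of the DNA at that point. A peptide that maps to several positions or frames would be ambiguous under this sketch and could not be used on its own to establish phasing.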