Using the concept of Chou's pseudo amino acid composition to predict enzyme family classes: an approach with support vector machine based on discrete wavelet transform.

Determining the family of a newly discovered enzyme at an early stage is important because the family is directly related to detailed information about which specific target the enzyme acts on, as well as to its catalytic process and biological function. Unfortunately, distinguishing enzyme classes experimentally remains difficult. With the enormous number of protein sequences uncovered by genome research, it is both challenging and indispensable to develop an automatic method for classifying enzyme families quickly and reliably. Using the concept of Chou's pseudo amino acid composition, we developed a new method, based on amino acid hydrophobicity, that couples the discrete wavelet transform with a support vector machine to predict the enzyme family. The overall success rate obtained by 10-fold cross-validation for identifying the six enzyme families was 91.9%, indicating that the current method could be an effective and promising high-throughput method in enzyme research.
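The feature-extraction idea described above can be illustrated with a minimal sketch. The abstract does not state which hydrophobicity scale, wavelet, decomposition depth, or summary statistics were used, so everything specific here is an assumption made for illustration: the Kyte-Doolittle scale, a hand-written Haar wavelet, three decomposition levels, and the mean and standard deviation of the coefficients at each level. The function names (`hydrophobicity_signal`, `haar_step`, `dwt_features`) are hypothetical. The point is only to show how a sequence of any length can be turned into a fixed-length vector of the kind a support vector machine accepts.

```python
import math

# Kyte-Doolittle hydrophobicity values (an assumed choice of scale;
# the abstract does not say which scale the authors used).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobicity_signal(seq):
    """Map a protein sequence to a numeric signal, skipping unknown letters."""
    return [KD[aa] for aa in seq.upper() if aa in KD]


def haar_step(signal):
    """One level of the Haar DWT: return (approximation, detail) coefficients."""
    if len(signal) % 2:
        # Pad odd-length signals by repeating the last sample.
        signal = signal + [signal[-1]]
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def _mean_std(values):
    """Mean and population standard deviation of a non-empty list."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(var)


def dwt_features(seq, levels=3):
    """Fixed-length feature vector: (mean, std) of the detail coefficients
    at each level, followed by (mean, std) of the final approximation."""
    approx = hydrophobicity_signal(seq)
    if not approx:
        raise ValueError("sequence contains no recognised amino acids")
    features = []
    for _ in range(levels):
        if len(approx) < 2:
            # Sequence too short to decompose further: pad with zeros.
            features.extend([0.0, 0.0])
            continue
        approx, detail = haar_step(approx)
        features.extend(_mean_std(detail))
    features.extend(_mean_std(approx))
    return features


if __name__ == "__main__":
    # Arbitrary example sequence, not taken from the paper's dataset.
    vec = dwt_features("MKTAYIAKQRQISFVKSHFSRQ")
    print(len(vec), vec)
```

Under these assumptions, each sequence yields a vector of length `2 * levels + 2` regardless of sequence length. Such vectors, paired with the six enzyme family labels, could then be passed to any SVM implementation and evaluated by 10-fold cross-validation.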