Application of Vector Quantization Algorithms to Protein Classification and Secondary Structure Computation

This paper describes a feature-map-based system that classifies proteins according to their circular dichroism (CD) spectra. The training algorithm is derived from Kohonen's LVQ (Learning Vector Quantization) and optimized for maximum efficiency. As a result, proteins with different secondary structures are clearly separated through a completely unsupervised training process. The algorithm extracts features from high-dimensional vectors (the CD spectra) and maps them onto a 2-dimensional network. A new tool has been developed to test LVQ performance; it can be used to fine-tune some of the LVQ algorithm's parameters. The secondary structure of unknown proteins can also be computed, giving better results than classical methods. A 3D solid representation is introduced to display 3D feature maps.
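
The mapping of high-dimensional vectors onto a 2-dimensional network can be illustrated with a generic Kohonen-style feature map. The sketch below is not the optimized training algorithm of this paper; it is a minimal textbook variant, and the function names (best_matching_unit, train_map), the map size, the linearly decaying learning rate and neighbourhood radius, and the toy input vectors standing in for CD spectra are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return (row, col) of the map node whose weight vector is closest to x."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            # Squared Euclidean distance between node weights and the input.
            dist = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_map(samples, rows=4, cols=4, epochs=50, lr0=0.5, radius0=2.0, seed=0):
    """Train a rows x cols feature map on unlabeled vectors (hypothetical settings)."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Random initial weight vector for every node of the 2-D grid.
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # learning rate decays linearly
        radius = max(radius0 * (1.0 - frac), 0.5)  # neighbourhood shrinks, floor 0.5
        for x in rng.sample(samples, len(samples)):  # shuffled pass over the data
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighbourhood measured on the grid, not in input space.
                    grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_dist2 / (2.0 * radius ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (x[i] - w[i])
    return weights


if __name__ == "__main__":
    # Toy 3-dimensional vectors forming two groups; real CD spectra would
    # have many more components.
    toy = [[0.0, 0.0, 0.1], [0.1, 0.0, 0.0], [1.0, 0.9, 1.0], [0.9, 1.0, 1.0]]
    trained = train_map(toy, rows=3, cols=3, epochs=20)
    for vec in toy:
        print(vec, "->", best_matching_unit(trained, vec))
```

Each input vector is assigned to the grid position of its closest node, so vectors that are similar in the original space tend to land on the same or neighbouring nodes of the 2-dimensional map. No class labels are used at any point in this sketch.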