Machine Discovery of Functional Components of Proteins from Amino-Acid Sequences Based on Rough Sets and Change of Representation

ABSTRACT

Protein structure analysis from DNA sequences is an important and fast-growing area in both computer science and biochemistry. Although interesting approaches have been studied, it is very difficult to capture the characteristics of a protein: even a simple protein is made of more than 100 amino acids, which makes it very difficult to detect functional components through biochemical experiments. For this reason, almost all the problems in this field remain unsolved, and it is very important to develop a system that helps researchers in molecular biology overcome the difficulties caused by combinatorial explosion. In this article we report a system, called MW1 (Molecular biologists' Workbench version 1.0), which extracts knowledge from amino-acid sequences by automatically controlling the application of domain knowledge. We apply this method to a comparative analysis of lysozyme and α-lactalbumin. The results show that the system obtains several interesting findings from amino-acid sequences, which have not be...