Classification of Protein-protein Interaction Types Using Support Vector Machines

Protein-protein interactions are of vital importance to many biological processes. However, not all the interactions present in a protein complex structure determined by X-ray crystallography are biologically relevant. Many of them form during the crystallization process and would not appear in vivo. Such crystal packing interactions are non-specific crystal artefacts with no biological function [1]. Determining the oligomeric state of protein complexes therefore remains a non-trivial problem [2]. The types of biological interactions are also diverse [3]. Protomers from obligate complexes do not exist as stable structures in vivo, whereas protomers of non-obligate complexes (e.g. transient complexes) may dissociate from each other and remain stable, functional units in vivo. We present a two-stage support vector machine (SVM) classifier for discriminating three types of protein-protein interactions: obligate, non-obligate and crystal packing interactions. We first analyzed five protein-protein interface properties in our interaction data, then combined these properties using an SVM algorithm to determine the type of each interaction. We achieved an overall accuracy of 91.1% with a leave-one-out cross-validation (LOOCV) procedure.
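As a rough illustration of how a two-stage SVM classifier can be evaluated with LOOCV, the following minimal Python sketch may help. It is not the implementation used in this work, and several details are assumptions made only for the example: that the first stage separates crystal packing from biological interactions and the second separates obligate from non-obligate ones, the linear kernel trained by batch subgradient descent, the hyperparameters (lam, lr, epochs), the function names, and the hand-made two-dimensional toy data, which stand in for the five interface properties.

```python
# Illustrative sketch only: a two-stage SVM cascade evaluated with LOOCV.
# The stage split, the linear kernel, the hyperparameters, and the toy data
# are all assumptions made for this example, not details taken from the study.

CRYSTAL, NON_OBLIGATE, OBLIGATE = "crystal", "non-obligate", "obligate"


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=300):
    """Fit a soft-margin linear SVM by batch subgradient descent on the
    L2-regularised hinge loss. Labels in y must be -1 or +1. Returns (w, b)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # this sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def decision(model, x):
    """Signed score w.x + b of a trained linear model."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def fit_two_stage(X, labels):
    """Stage 1: crystal packing (-1) vs biological (+1).
    Stage 2: non-obligate (-1) vs obligate (+1), biological samples only."""
    stage1 = train_linear_svm(X, [-1 if l == CRYSTAL else 1 for l in labels])
    bio = [(x, l) for x, l in zip(X, labels) if l != CRYSTAL]
    stage2 = train_linear_svm([x for x, _ in bio],
                              [1 if l == OBLIGATE else -1 for _, l in bio])
    return stage1, stage2


def predict_two_stage(models, x):
    """Send x through stage 1, and through stage 2 only if it looks biological."""
    stage1, stage2 = models
    if decision(stage1, x) < 0:
        return CRYSTAL
    return OBLIGATE if decision(stage2, x) >= 0 else NON_OBLIGATE


def loocv_two_stage(X, labels):
    """Hold out each sample once, retrain both stages on the rest, predict it."""
    preds = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = labels[:i] + labels[i + 1:]
        preds.append(predict_two_stage(fit_two_stage(X_train, y_train), X[i]))
    return preds


# Hypothetical toy data: 2-D feature vectors, four samples per class.
X = [[-2.3, -0.3], [-1.7, -0.3], [-2.3, 0.3], [-1.7, 0.3],   # crystal packing
     [1.7, -2.3], [2.3, -2.3], [1.7, -1.7], [2.3, -1.7],     # non-obligate
     [1.7, 1.7], [2.3, 1.7], [1.7, 2.3], [2.3, 2.3]]         # obligate
y = [CRYSTAL] * 4 + [NON_OBLIGATE] * 4 + [OBLIGATE] * 4

if __name__ == "__main__":
    preds = loocv_two_stage(X, y)
    accuracy = sum(p == t for p, t in zip(preds, y)) / len(y)
    print(f"LOOCV accuracy on toy data: {accuracy:.3f}")
```

Both stages are retrained inside every fold, so the held-out interface never influences either model. Fitting the second stage once on the full data set would leak information and could inflate the accuracy estimate. On real data the features would normally be scaled first, and a kernel SVM from an established library would be the more usual choice.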