Improving the Accuracy of an Affinity Prediction Method by Using Statistics on Shape Complementarity between Proteins

To elucidate the partners in protein-protein interactions (PPIs), we previously proposed an affinity prediction method called affinity evaluation and prediction (AEP), which is based on the shape complementarity characteristics between proteins. The structures of the protein complexes obtained in our shape complementarity evaluation were selected by a newly developed clustering method called grouping. Our previous experiments showed that the accuracy of AEP varied with the composition and scale of the data. In this study, we fixed the data scale at 84 x 84 = 7056 protein pairs, including 84 biologically relevant complexes, and designed 225 parameter sets based on four key parameters related to the grouping and to the calculation of affinity scores. Receiver operating characteristic analysis gave 27.4% sensitivity (= recall), 91.0% specificity, 3.5% precision, 90.2% accuracy, a maximum F-measure of 6.3%, and an area under the curve of 0.585. Chiefly through optimization of the grouping, AEP at the maximum F-measure statistically distinguished 23 target complexes among 84 protein pairs. Moreover, the active sites of these complexes were predicted with high accuracy in terms of interface RMSD (e.g., 2.37 angstroms in 1CGI and 2.38 angstroms in 1PPE). To assess the improvement in accuracy, we compared these results with those of AEP on different data sets and with those of tentative methods using ZDOCK 3.0.1 or ZRANK scores.
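The reported metrics all follow from a single confusion matrix over the 7056 pairs. The sketch below is a minimal illustration, not the authors' code: the 84 positives, the 7056 total pairs, and the 23 true positives come from the text, whereas the false-positive count of 627 is an assumption back-calculated so that the rounded metrics agree with the reported percentages. The function name `confusion_metrics` is likewise hypothetical.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return standard binary-classification metrics from confusion counts."""
    sensitivity = tp / (tp + fn)            # recall: fraction of true complexes found
    specificity = tn / (tn + fp)            # fraction of non-complexes rejected
    precision = tp / (tp + fp)              # fraction of predictions that are true
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # F-measure: harmonic mean of precision and recall
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }


# Counts from the text: 84 x 84 pairs, 84 true complexes, 23 of them detected.
total_pairs = 84 * 84
positives = 84
tp = 23
fn = positives - tp

# ASSUMPTION: fp is not stated in the text; 627 is back-calculated so that
# the rounded metrics agree with the reported percentages.
fp = 627
tn = total_pairs - positives - fp

metrics = confusion_metrics(tp, fp, fn, tn)
percentages = {name: round(100 * value, 1) for name, value in metrics.items()}
print(percentages)
```

Under these assumed counts, the rounded values are 27.4% sensitivity, 91.0% specificity, 3.5% precision, 90.2% accuracy, and a 6.3% F-measure. The gap between the high accuracy and the low precision reflects the class imbalance: only 84 of the 7056 pairs are positives, so correctly rejected negatives dominate the accuracy.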