In Silico Protein-Protein Interactions: Avoiding Data and Method Biases Over Sensitivity and Specificity.

The study of protein-protein interactions (PPIs) can help researchers generate new hypotheses about an organism or disease and guide new experiments. Various methods for identifying and analyzing PPIs have been discussed in the literature. These methods are generally categorized as experimental or computational, each with its own advantages and disadvantages. Experimental methods provide insight into the actual state of biological interactions but tend to be time-consuming and costly. Computational methods, on the other hand, can examine thousands of PPIs at very low cost and in much less time; however, the accuracy of such in silico predictions depends heavily on the specific computational approach used. Furthermore, there is no gold standard among these computational methods: a method that predicts one PPI well may perform poorly on another, generating false positives and false negatives. Therefore, all such predictions must be carefully validated, preferably against experimental data. In this paper, we review the existing computational approaches and emphasize the use of biological data as inputs for accurate prediction of PPIs. We also discuss how the choice of input datasets and approaches may influence the sensitivity and specificity of the predicted PPI networks.