Prediction of ProteinProtein Interaction Pocket Using L‐Shaped PLS Approach and Its Visualizations by Generative Topographic Mapping

Proteinprotein interaction (PPI) pockets in a hostguest protein system were predicted using an L‐shaped partial least squares (LPLS) method. LPLS is an extension of standard PLS regression, where, in addition to response vector y and regressor matrix X, an extra data matrix Z is constructed which summarizes the background information on X. The regressor matrix X is a similarity matrix of Tanimoto coefficients of the paired fingerprints of pockets, while the background information Z constitutes eleven physico‐chemical and geometrical parameters for describing a pocket. The Boolean response vector y specifies whether each pocket is PPI or non‐PPI (indicated by 1 and 0, respectively). Constructing two LPLS models, we successfully predicted the PPI pockets of two protein clusters. Clusters 1 and 2 comprised the X‐ray crystal structures of protein‐peptide complexes and protein‐protein complexes, respectively. From the loading plots derived from each model, we could speculate the geometrical constraints of the PPI pockets. These two models are exclusively unique and it was validated by the cross‐prediction simulations. The PPI pockets of cluster 1 were projected onto 2D maps by generative topographic mapping (GTM) and the molecular lipophilic potentials (MLP). Among three examples, the MLP distributions were highly similar because the specimens shared the same p53 guest peptides. Contribution to the Autumn School of Chemoinformatics in Nara, Japan, November 27–28, 2013
