Pattern similarity study of functional sites in protein sequences: lysozymes and cystatins

Background: Although it is generally agreed that topography is more conserved than sequences, proteins sharing the same fold can have different functions, while there are protein families with low sequence similarity. An alternative method for profile analysis of characteristic conserved positions of the motifs within the 3D structures may be needed for functional annotation of protein sequences. Using the approach of quantitative structure-activity relationships (QSAR), we have proposed a new algorithm for postulating functional mechanisms on the basis of pattern similarity and average of property values of side-chains in segments within sequences. This approach was used to search for functional sites of proteins belonging to the lysozyme and cystatin families.

Results: Hydrophobicity and β-turn propensity of reference segments with 3–7 residues were used for the homology similarity search (HSS) for active sites. Hydrogen bonding was used as the side-chain property for searching the binding sites of lysozymes. The profiles of similarity constants and average values of these parameters as functions of their positions in the sequences could identify both active and substrate binding sites of the lysozyme of Streptomyces coelicolor, which has been reported as a new fold enzyme (Cellosyl). The same approach was successfully applied to cystatins, especially for postulating the mechanisms of amyloidosis of human cystatin C as well as human lysozyme.

Conclusion: Pattern similarity and average index values of structure-related properties of side chains in short segments of three residues or longer were, for the first time, successfully applied for predicting functional sites in sequences. This new approach may be applicable to studying functional sites in un-annotated proteins, for which complete 3D structures are not yet available.
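
The segment scan described in the abstract can be illustrated with a minimal sketch. It is not the authors' HSS implementation: the abstract does not define the similarity constant, so the sketch assumes a Pearson correlation between the property pattern of a reference segment and each same-length window of the target sequence, and it assumes the Kyte-Doolittle hydropathy scale as the hydrophobicity index. The names `KD_HYDROPATHY`, `pattern_similarity`, and `scan_sequence` are hypothetical, and the sequences in the usage note are toy strings, not real lysozyme or cystatin data.

```python
from math import sqrt

# Kyte-Doolittle hydropathy index (an assumed stand-in for the
# hydrophobicity scale; the abstract does not say which scale was used).
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def pattern_similarity(x, y):
    """Pearson correlation of two equal-length property patterns.

    Assumption: used here as a proxy for the paper's similarity constant.
    Returns 0.0 when either pattern is flat (correlation undefined).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def scan_sequence(sequence, reference, scale=KD_HYDROPATHY):
    """Slide a reference segment along a sequence.

    Returns a list of (start_position, similarity, average_property)
    tuples, one per window, with 1-based start positions.
    """
    ref_values = [scale[res] for res in reference]
    k = len(ref_values)
    profile = []
    for i in range(len(sequence) - k + 1):
        window = [scale[res] for res in sequence[i:i + k]]
        similarity = pattern_similarity(ref_values, window)
        average = sum(window) / k
        profile.append((i + 1, similarity, average))
    return profile
```

Calling `scan_sequence("AKDEGLV", "DEG")` on this toy input yields one tuple per three-residue window. Plotting the similarity and average columns against position gives the kind of profile the abstract refers to, where windows that combine a high similarity with an average value close to that of the reference segment would be candidate sites. Reference segments of 3 to 7 residues match the range stated in the Results.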
