Identification of the idiosyncratic bacterial protein tyrosine kinase (BY-kinase) family signature

MOTIVATION: Most protein tyrosine kinases found in bacteria have recently been classified in a new family, termed BY-kinases. They share no sequence similarity with eukaryotic protein tyrosine kinases and have no known eukaryotic homologues. They are involved in several biological functions (e.g. capsule biosynthesis, antibiotic resistance, virulence mechanisms) and can therefore be considered interesting therapeutic targets for new drugs against infectious diseases. However, their identification is difficult: structural characterization has progressed slowly, and most known BY-kinases have been identified through biochemical experiments. Moreover, BY-kinase sequences are related to many other bacterial proteins involved in different biological functions (e.g. ParA family proteins). As a result, their annotation in generalist databases, their sequence analysis and their classification remain partial and inhomogeneous, and no bioinformatics resource is dedicated to these proteins.

RESULTS: BY-kinase sequences were identified in the UniProt Knowledgebase by combining similarity searches with sequence-profile alignment, pattern matching, and a sliding-window computation that detects the tyrosine cluster. The results were cross-validated by keyword searches, by pattern matching with several patterns, and by checking motif conservation in multiple sequence alignments. Our pipeline identified 640 sequences as BY-kinases and allowed the definition of a PROSITE pattern that is the signature of the BY-kinases. The sequences identified as BY-kinases share good sequence similarity with BY-kinases that have already been biochemically characterized, and they all bear the characteristic motifs of the catalytic domain, including the three Walker-like motifs followed by a tyrosine cluster.

AVAILABILITY: http://bykdb.ibcp.fr
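The sliding-window detection of the tyrosine cluster mentioned in the results can be illustrated with a minimal Python sketch. It is not the pipeline's actual implementation: the function names, the window length (15 residues), the tyrosine threshold (4) and the size of the C-terminal region scanned (30 residues) are all illustrative assumptions, since the abstract does not state these parameters.

```python
def tyrosine_cluster_windows(sequence, window=15, min_tyr=4):
    """Return (start, end, count) for each window holding >= min_tyr tyrosines.

    window and min_tyr are illustrative defaults, not values from the paper.
    Coordinates are 0-based, end-exclusive.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seq = sequence.upper()
    hits = []
    if len(seq) < window:
        return hits

    # Count tyrosines in the first window, then update incrementally.
    count = seq[:window].count("Y")
    if count >= min_tyr:
        hits.append((0, window, count))

    for start in range(1, len(seq) - window + 1):
        # Slide by one residue: drop the one leaving, add the one entering.
        count -= seq[start - 1] == "Y"
        count += seq[start + window - 1] == "Y"
        if count >= min_tyr:
            hits.append((start, start + window, count))
    return hits


def has_c_terminal_tyrosine_cluster(sequence, tail=30, window=15, min_tyr=4):
    """Check only the last `tail` residues (hypothetical cutoff) for a cluster."""
    return bool(tyrosine_cluster_windows(sequence[-tail:], window, min_tyr))


if __name__ == "__main__":
    # Toy sequence, not a real BY-kinase: a tyrosine-rich C-terminal tail.
    toy = "A" * 40 + "YAYYAYYAYAYYAAA"
    print(has_c_terminal_tyrosine_cluster(toy))
    print(has_c_terminal_tyrosine_cluster("A" * 60))
```

In a pipeline like the one described, a check of this kind would be only one filter among several, applied together with the similarity searches and pattern matching, because a tyrosine-rich tail alone does not distinguish BY-kinases from related proteins such as the ParA family.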
