Prediction of nuclear export signals using weighted regular expressions (Wregex)

MOTIVATION: Leucine-rich nuclear export signals (NESs) are short amino acid motifs that mediate binding of cargo proteins to the nuclear export receptor CRM1, and thus help regulate the localization and function of many cellular proteins. Computational prediction of NES motifs is of great interest, but remains a significant challenge.

RESULTS: We have developed a novel approach for amino acid motif searching that can be used for NES prediction. This approach, termed Wregex (weighted regular expression), combines regular expressions with a position-specific scoring matrix (PSSM), and has been implemented as a freely available, web-based software tool. The PSSM allows Wregex to assign each candidate motif a score, which can be used to prioritize candidates for experimental testing. Key features of Wregex include its flexibility, which makes it useful for searching other types of protein motifs, and its fast execution time, which makes it suitable for large-scale analysis. In comparative tests with previously available prediction tools, Wregex recovers a good proportion of true-positive motifs while returning fewer potential candidates.
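The general idea of pairing a regular expression with a PSSM can be illustrated with a minimal Python sketch. It is not the Wregex implementation: the pattern, the weight values, and the names `PATTERN`, `WEIGHTS` and `score_candidates` are all hypothetical and chosen only for illustration. The regular expression locates candidate motifs, each capturing group marks one scored position, and a per-position weight table turns the captured residues into a score used for ranking.

```python
import re

# Illustrative NES-like pattern (an assumption, not the pattern used by Wregex):
# four hydrophobic positions separated by short spacers. The outer lookahead
# lets candidates that overlap in the sequence be reported separately.
PATTERN = re.compile(
    r"(?=(([LIVFM]).{2,3}([LIVFM]).{2,3}([LIVFM]).([LIVFM])))"
)

# Hypothetical position-specific weights, one table per capturing group.
# Real weights would be derived from a training set of validated motifs.
WEIGHTS = [
    {"L": 3.0, "I": 2.0, "V": 1.0, "F": 1.0, "M": 1.0},
    {"L": 2.0, "I": 2.0, "V": 1.0, "F": 1.0, "M": 1.0},
    {"L": 3.0, "I": 1.0, "V": 1.0, "F": 2.0, "M": 1.0},
    {"L": 2.0, "I": 2.0, "V": 2.0, "F": 1.0, "M": 1.0},
]


def score_candidates(sequence):
    """Return (1-based start, motif, score) tuples, highest score first."""
    candidates = []
    for match in PATTERN.finditer(sequence):
        motif = match.group(1)
        # Groups 2..5 hold the residues at the four scored positions.
        residues = match.groups()[1:]
        score = sum(table[res] for table, res in zip(WEIGHTS, residues))
        candidates.append((match.start() + 1, motif, score))
    return sorted(candidates, key=lambda c: c[2], reverse=True)


if __name__ == "__main__":
    for start, motif, score in score_candidates("GGLAAALAAALALGG"):
        print(start, motif, score)
```

In this sketch the score is a plain sum of weights. A practical tool would probably normalize scores so that motifs of different lengths are comparable, but that step is left out here.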
