Consistency checks for characterizing protein forms

Proteomics reverses the chronological order of the gene-to-protein dogma: it takes amino acid sequences, rather than genes, as the starting point for investigating function. With this approach, proteomics data can confirm the presence of multiple forms of a protein. Notwithstanding variations attributed to specific individual features of organisms and tissues, from two to more than ten protein forms can be identified in a given sample. The present work describes some guidelines for tracking the origin of alternative protein forms and attempts to tag the details of sequence data in the literature. Working from these guidelines, we have uncovered a third alternative form of the Pim subfamily of oncogenes. The term "form" is here combined with the qualifier "alternative" to describe any product of a given gene, including closely related paralogs. This paper also emphasizes the need for consistency checks in annotation processes, such as gene clustering, to avoid losing important details that describe alternative protein forms. By identifying alternative protein forms, we illustrate that rationalizing protein function through the identification of protein-protein interactions should in reality be a matter of identifying (alternative) form-form interactions.
