Bioinformatic Requirements for Protein Database Searching Using Predicted Epitopes from Disease-associated Antibodies

We describe a new approach to identifying proteins involved in disease pathogenesis. The technology, Epitope-Mediated Antigen Prediction (E-MAP), leverages the specificity of patients’ immune responses to disease-relevant targets and requires no prior knowledge of the target protein. E-MAP links pathologic antibodies of unknown specificity, isolated from patient sera, to their cognate antigens in the protein database. The process first reconstructs a predicted epitope using a combinatorial peptide library and then searches the protein database for closely matching amino acid sequences. Previously published attempts to identify unknown antibody targets in this manner have largely been unsuccessful for two reasons: 1) short predicted epitopes yield too many irrelevant matches from a database search and 2) the predicted epitopes may not represent the native antigen with sufficient fidelity. Using an in silico model, we demonstrate the critical threshold requirements for epitope length and epitope fidelity. We find that epitopes generally need to be at least seven amino acids long, with an overall accuracy of >70% relative to the native protein, in order to correctly identify the protein in a nonredundant protein database search. We then confirmed these findings experimentally, using the predicted epitopes for four monoclonal antibodies. Because predicted epitopes often fail to reach the seven-amino-acid threshold, we also demonstrate the efficacy of paired epitope searches. This is the first systematic analysis of the computational requirements needed to make this approach viable, coupled with experimental validation.
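The effect of epitope length and the benefit of paired searches can be illustrated with a minimal sketch. This is not the authors' E-MAP implementation: the scoring rule (ungapped, position-by-position identity over a sliding window), the function names, and the three toy sequences are all hypothetical, chosen only to show how two short epitopes that each match several proteins can jointly single out one. The >70% cutoff is taken from the abstract; a real search would run against a nonredundant protein database with a dedicated alignment tool.

```python
from typing import Dict, List


def best_window_identity(epitope: str, protein: str) -> float:
    """Highest fraction of identical positions over all ungapped windows.

    Assumption: fidelity is scored as simple positional identity, which is
    a stand-in for whatever scoring a real database search would use.
    """
    k = len(epitope)
    if k == 0 or len(protein) < k:
        return 0.0
    best = 0
    for start in range(len(protein) - k + 1):
        window = protein[start:start + k]
        matches = sum(a == b for a, b in zip(epitope, window))
        best = max(best, matches)
    return best / k


def search(epitope: str, database: Dict[str, str],
           min_identity: float = 0.7) -> List[str]:
    """Names of proteins with a window matching at more than min_identity."""
    return sorted(
        name for name, seq in database.items()
        if best_window_identity(epitope, seq) > min_identity
    )


def paired_search(epitopes: List[str], database: Dict[str, str],
                  min_identity: float = 0.7) -> List[str]:
    """Proteins matched by every epitope in the list (set intersection)."""
    hit_sets = [set(search(e, database, min_identity)) for e in epitopes]
    return sorted(set.intersection(*hit_sets)) if hit_sets else []


# Hypothetical toy database; the sequences are invented for illustration.
TOY_DB = {
    "protA": "MKTAYIAKQRQISFVKSHFSRQ",
    "protB": "MSLQISWVKAAGGHFTRQLL",
    "protC": "GGGAKQRAAAAWWWWAAAA",
}

single_1 = search("AKQR", TOY_DB)                  # short epitope, two hits
single_2 = search("HFSR", TOY_DB)                  # short epitope, two hits
paired = paired_search(["AKQR", "HFSR"], TOY_DB)   # only one protein has both
```

In this toy setting each four-residue epitope alone matches two proteins, while requiring both epitopes in the same protein leaves only `protA`. That mirrors, in a deliberately simplified way, why pairing short epitopes can compensate for falling below the length threshold.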
