EyeSite: a semi-automated database of protein families in the eye

The EyeSite is a web-based database of families of proteins that function in the eye, together with their homologous sequences. The resource clusters proteins at different levels of homology to facilitate functional annotation of sequences and modelling of proteins from structural homologues. Eye proteins are organized by the tissue type in which they function and are clustered into homologous families using a novel protocol that employs the TribeMCL algorithm. Homologous families are further subdivided into sequence clusters, for which multiple sequence alignments are generated. Structural annotations from the CATH domain database are provided for nearly 90% of the sequences, and protein family annotations from the Pfam database for approximately 86%. Homology models have also been generated where appropriate. The EyeSite is stored in a relational database and is extensively linked to other online bioinformatics resources, which helps relate allelic variants, annotations and clinical details to the derived data in the database. The EyeSite is available for online searching and for retrieval of sequence information and models at http://eyesite.cryst.bbk.ac.uk/.
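
The family-clustering step can be illustrated with a small sketch of Markov clustering (MCL), the graph-clustering procedure on which TribeMCL is built. This is not the EyeSite protocol itself, whose details are not given here: the function name `mcl_clusters`, the inflation value, the thresholds and the toy similarity matrix are all assumptions made for illustration. In practice the edge weights would come from pairwise sequence similarity scores.

```python
def _normalise_columns(matrix):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / sums[j] for j in range(n)] for i in range(n)]


def mcl_clusters(similarity, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster nodes of a symmetric similarity matrix with basic MCL.

    similarity[i][j] is a non-negative score between sequences i and j
    (hypothetical input; 0 means no detectable similarity).
    """
    n = len(similarity)
    m = [[float(v) for v in row] for row in similarity]
    # Self-loops keep every column non-empty and damp oscillation.
    for i in range(n):
        if m[i][i] == 0.0:
            m[i][i] = 1.0
    m = _normalise_columns(m)

    for _ in range(max_iter):
        # Expansion: one step of random-walk flow (matrix squaring).
        expanded = [
            [sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
        # Inflation: strengthen strong flows, weaken weak ones, renormalise.
        inflated = _normalise_columns(
            [[v ** inflation for v in row] for row in expanded]
        )
        change = max(
            abs(inflated[i][j] - m[i][j]) for i in range(n) for j in range(n)
        )
        m = inflated
        if change < tol:
            break

    # Rows with a non-zero diagonal are attractors; the non-zero entries
    # of an attractor row are the members of its cluster.
    clusters = set()
    for i in range(n):
        if m[i][i] > 1e-6:
            clusters.add(frozenset(j for j in range(n) if m[i][j] > 1e-6))
    return sorted(sorted(c) for c in clusters)


if __name__ == "__main__":
    # Toy graph: sequences 0-2 are mutually similar, as are sequences 3-4.
    toy = [
        [0, 1, 1, 0, 0],
        [1, 0, 1, 0, 0],
        [1, 1, 0, 0, 0],
        [0, 0, 0, 0, 1],
        [0, 0, 0, 1, 0],
    ]
    print(mcl_clusters(toy))
```

Raising the inflation parameter generally yields smaller, tighter clusters. That is one way a single similarity graph could be cut at different levels of granularity, although whether the EyeSite derives its sequence clusters this way is not stated here.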
