Software engineering challenges in bioinformatics

Data from biological research is proliferating rapidly, and advanced data storage and analysis methods are required to manage it. We introduce the main sources of biological data available and outline some of the domain-specific problems associated with automated analysis. We discuss two major areas in which we are likely to experience software engineering challenges over the next ten years: data integration and presentation.
