An Application for Downloading and Integrating Molecular Biology Data

INTRODUCTION AND BACKGROUND

Integrating large volumes of data from diverse sources is a formidable challenge for many investigators in molecular biology, and developing efficient methods for accessing and integrating these data is a major focus of research in bioinformatics. In early 2003, the Hereditary Genomics division of the Department of Medical and Molecular Genetics at IUPUI recognized the need for a software application that would automate many of the manual processes then being used to obtain data for its research. The project had two primary objectives: 1) an application that would provide large-scale, integrated output tables to help answer questions that frequently arose in the course of the division's research, and 2) a graphical user interface (GUI) that would minimize or eliminate the need for end-users to have technical expertise in computer programming or database operations.

Also in early 2003, Indiana University (IU), IBM, and the Indiana Genomics Initiative (INGEN) introduced a new resource called Centralized Life Sciences Data Services (CLSD). CLSD is a centralized data repository that provides programmatic access to biological data collected and integrated from multiple public, online databases.