Finishing a whole-genome shotgun: Release 3 of the Drosophila melanogaster euchromatic genome sequence

Background: The Drosophila melanogaster genome was the first metazoan genome to have been sequenced by the whole-genome shotgun (WGS) method. Two issues relating to this achievement were widely debated in the genomics community: how correct is the sequence with respect to base-pair (bp) accuracy and frequency of assembly errors? And, how difficult is it to bring a WGS sequence to the accepted standard for finished sequence? We are now in a position to answer these questions.

Results: Our finishing process was designed to close gaps, improve sequence quality and validate the assembly. Sequence traces derived from the WGS and draft sequencing of individual bacterial artificial chromosomes (BACs) were assembled into BAC-sized segments. These segments were brought to high quality, and then joined to constitute the sequence of each chromosome arm. Overall assembly was verified by comparison to a physical map of fingerprinted BAC clones. In the current version of the 116.9 Mb euchromatic genome, called Release 3, the six euchromatic chromosome arms are represented by 13 scaffolds with a total of 37 sequence gaps. We compared Release 3 to Release 2; in autosomal regions of unique sequence, the error rate of Release 2 was one in 20,000 bp.

Conclusions: The WGS strategy can efficiently produce a high-quality sequence of a metazoan genome while generating the reagents required for sequence finishing. However, the initial method of repeat assembly was flawed. The sequence we report here, Release 3, is a reliable resource for molecular genetic experimentation and computational analysis.
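To put the reported Release 2 error rate in more familiar terms, the sketch below converts one error in 20,000 bp to a Phred-scaled quality value using the standard definition Q = -10 log10(p). The function name `phred_quality` is illustrative and not taken from the paper, and the figure applies only to autosomal regions of unique sequence, as stated in the abstract.

```python
import math


def phred_quality(error_rate: float) -> float:
    """Convert a per-base error probability to a Phred-scaled quality value."""
    if not 0.0 < error_rate <= 1.0:
        raise ValueError("error_rate must be in (0, 1]")
    return -10.0 * math.log10(error_rate)


# Error rate reported for Release 2 in autosomal regions of unique sequence.
release2_error_rate = 1 / 20_000

print(f"Q = {phred_quality(release2_error_rate):.1f}")  # prints "Q = 43.0"
```

On this scale, one error in 20,000 bp corresponds to a quality value of about 43.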
