ORIGINAL ARTICLE

Major feedstock sources for future biofuel production are likely to be high-biomass-producing plant species such as poplar, pine, switchgrass, sorghum and maize. One active area of research in these species is genome-enabled improvement of lignocellulosic biofuel feedstock quality and yield. To facilitate genome-based investigations in these species, we developed the Biofuel Feedstock Genomic Resource (BFGR), a database and web portal that provides high-quality, uniform and integrated functional annotation of gene and transcript assembly sequences from species of interest to lignocellulosic biofuel feedstock researchers. The BFGR includes sequence data from 54 species and permits researchers to view, analyze and obtain annotation at the gene, transcript, protein and genome levels. Annotation of biochemical pathways permits the identification of key genes and transcripts central to the improvement of lignocellulosic properties in these species. Because the BFGR integrates uniform annotation methods, orthologous/paralogous relationships and linkage to seven species with complete genome sequences, it allows comparative analyses for biofuel feedstock species with limited sequence resources.

Database URL: http://bfgr.plantbiology.msu.edu
