PosMed-plus: an intelligent search engine that inferentially integrates cross-species information resources for molecular breeding of plants.

Molecular breeding of crops is an efficient way to upgrade plant functions that are useful to humankind. A key step is forward genetics, or positional cloning, to identify the genes that confer those functions. To accelerate this research process, we have developed PosMed-plus (Positional Medline for plant upgrading science), an integrated database system powered by an intelligent data-retrieval engine that prioritizes highly promising candidate genes within one or more given chromosomal intervals of Arabidopsis thaliana and rice (Oryza sativa). By inferentially integrating cross-species information resources, including genomes, transcriptomes, proteomes, localizomes, phenomes and literature, the system compares a user's query, such as phenotypic or functional keywords, with the literature associated with the relevant genes located within the interval. By exploiting orthologous and paralogous correspondences, PosMed-plus integrates cross-species information so that rice candidate genes can be ranked on the basis of evidence from other model species such as Arabidopsis. PosMed-plus is a plant science version of the PosMed system widely used by mammalian researchers, and provides both a powerful integrative search function and a rich display of the integrated databases. PosMed-plus is the first cross-species integrated database that inferentially prioritizes candidate genes for forward genetics approaches in plant science, and will be expanded for wider use in plant upgrading in many species.
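The ranking idea described in the abstract can be illustrated with a toy sketch. This is not the PosMed-plus algorithm, whose scoring is not specified here. It only assumes that each gene has a list of associated document texts, that a keyword hit is a simple substring match, and that evidence borrowed from orthologs is down-weighted by a hypothetical factor `ortholog_weight`. All gene identifiers, coordinates and document strings are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Gene:
    gene_id: str
    chrom: str
    start: int
    end: int
    docs: List[str] = field(default_factory=list)  # literature linked to this gene


def count_hits(keyword: str, docs: List[str]) -> int:
    """Number of documents that mention the keyword (case-insensitive substring match)."""
    kw = keyword.lower()
    return sum(1 for d in docs if kw in d.lower())


def rank_candidates(
    keyword: str,
    chrom: str,
    start: int,
    end: int,
    genes: List[Gene],
    orthologs: Dict[str, List[str]],      # gene_id -> ortholog ids in another species
    ortholog_docs: Dict[str, List[str]],  # ortholog id -> its literature
    ortholog_weight: float = 0.5,         # assumed down-weighting of inferred evidence
) -> List[Tuple[str, float]]:
    """Score every gene overlapping the interval; highest score first."""
    ranked = []
    for g in genes:
        # Skip genes that do not overlap the queried interval.
        if g.chrom != chrom or g.end < start or g.start > end:
            continue
        direct = count_hits(keyword, g.docs)
        inferred = sum(
            count_hits(keyword, ortholog_docs.get(o, []))
            for o in orthologs.get(g.gene_id, [])
        )
        ranked.append((g.gene_id, direct + ortholog_weight * inferred))
    # Sort by descending score, breaking ties by gene id for determinism.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


# Invented example data: gene B has no literature of its own but has a
# well-studied ortholog; gene C lies outside the queried interval.
genes = [
    Gene("rice_gene_A", "chr6", 100, 200, ["Mutant shows delayed flowering under short days."]),
    Gene("rice_gene_B", "chr6", 300, 400, []),
    Gene("rice_gene_C", "chr6", 900, 1000, ["Flowering time regulator."]),
]
orthologs = {"rice_gene_B": ["ath_gene_X"]}
ortholog_docs = {
    "ath_gene_X": [
        "Controls flowering time.",
        "Late flowering phenotype in mutants.",
        "Flowering is delayed.",
    ]
}

ranked = rank_candidates("flowering", "chr6", 50, 500, genes, orthologs, ortholog_docs)
print(ranked)
```

In this toy data the rice gene with no literature of its own ranks first because of its ortholog's documents, which is the kind of cross-species inference the abstract describes.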
