TXTGate: profiling gene groups with text-based information

We implemented a framework called TXTGate that combines literature indices of selected public biological resources in a flexible text-mining system designed for the analysis of groups of genes. By means of tailored vocabularies, TXTGate offers term-centric as well as gene-centric views on selected textual fields and MEDLINE abstracts used in LocusLink and the Saccharomyces Genome Database. Subclustering and links to external resources allow for in-depth analysis of the resulting term profiles.
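To make the idea of a term profile for a gene group concrete, the following is a minimal sketch and not TXTGate's actual implementation. It assumes each gene is represented by a TF-IDF vector over a restricted vocabulary and that the group profile is the mean of the unit-length gene vectors. The gene texts, the vocabulary, and all function names (`tokenize`, `gene_profiles`, `group_profile`, `top_terms`) are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text, vocabulary):
    """Lowercase the text, split on non-letters, keep only vocabulary terms."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w in vocabulary]


def gene_profiles(gene_texts, vocabulary):
    """Build one unit-length TF-IDF vector (as a dict) per gene."""
    counts = {g: Counter(tokenize(t, vocabulary)) for g, t in gene_texts.items()}
    n_genes = len(counts)
    # Document frequency: number of genes whose text mentions the term.
    doc_freq = Counter(term for c in counts.values() for term in c)
    profiles = {}
    for gene, c in counts.items():
        vec = {t: tf * math.log(n_genes / doc_freq[t]) for t, tf in c.items()}
        norm = math.sqrt(sum(v * v for v in vec.values()))
        profiles[gene] = {t: v / norm for t, v in vec.items()} if norm else {}
    return profiles


def group_profile(profiles, genes):
    """Average the gene vectors of a group into a single term profile."""
    total = Counter()
    for gene in genes:
        for term, weight in profiles[gene].items():
            total[term] += weight
    return {term: weight / len(genes) for term, weight in total.items()}


def top_terms(profile, k=3):
    """Return the k highest-weighted terms (ties broken alphabetically)."""
    return sorted(profile, key=lambda t: (-profile[t], t))[:k]


# Hypothetical toy annotations; real input would come from indexed
# database fields and MEDLINE abstracts.
gene_texts = {
    "CLN1": "cyclin regulates cell cycle progression",
    "CLN2": "cyclin involved in cell cycle control",
    "HXT1": "glucose transporter in the membrane",
}
vocabulary = {"cyclin", "cell", "cycle", "glucose", "transporter", "membrane"}

profiles = gene_profiles(gene_texts, vocabulary)
group = group_profile(profiles, ["CLN1", "CLN2"])
print(top_terms(group))
```

Restricting tokens to a vocabulary mirrors the role of tailored vocabularies described above: swapping the vocabulary changes which aspects of the literature the profile emphasizes, without re-indexing the underlying texts.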
