Technologies and Solutions for Trend Detection in Public Literature for Biomarker Discovery

Data sets arising in biomedicine and bioinformatics are often very large and combine quite different types of data (e.g., sequence data and microarray measurements). Consequently, standard machine learning techniques usually cannot be applied directly. In this talk, I will describe an algorithm called affinity propagation and discuss why it offers flexibility in analyzing the kinds of data sets arising in bioinformatics and biomedicine. I'll describe applications in the areas of whole-genome transcript detection using microarrays, image segmentation, text analysis, and motif discovery. Affinity propagation can be implemented in a couple dozen lines of MATLAB or C and is suitable for distributed computing environments, making it attractive for high-throughput computations.
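
As a rough illustration of how compact the method is, the following Python sketch implements the standard responsibility and availability message-passing updates of affinity propagation. It is not the speaker's MATLAB or C code: the function name, the damping factor, the iteration count, the toy one-dimensional data, and the use of the median similarity as the shared preference are all assumptions made for this example.

```python
from statistics import median


def affinity_propagation(S, damping=0.5, iterations=200):
    """Cluster n points given an n x n similarity matrix S (list of lists).

    S[k][k] holds the "preference" of point k, i.e. how strongly it is
    favored as an exemplar. Returns, for each point, the index of the
    exemplar it selects. Requires n >= 2.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iterations):
        # r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=scores.__getitem__)
            first = scores[best]
            second = max(scores[k] for k in range(n) if k != best)
            for k in range(n):
                competitor = second if k == best else first
                new = S[i][k] - competitor
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        # a(i,k) = min(0, r(k,k) + sum over i' not in {i,k} of max(0, r(i',k)))
        # a(k,k) = sum over i' != k of max(0, r(i',k))
        for k in range(n):
            positive = [max(0.0, R[i][k]) for i in range(n)]
            positive[k] = R[k][k]  # self-responsibility enters unclipped
            total = sum(positive)
            for i in range(n):
                new = total - positive[i]
                if i != k:
                    new = min(0.0, new)
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    # Each point picks the candidate k that maximizes a(i,k) + r(i,k).
    return [max(range(n), key=lambda k: A[i][k] + R[i][k]) for i in range(n)]


# Toy data (assumed for illustration): two well-separated groups on a line.
points = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
n = len(points)
# Similarity = negative squared distance.
S = [[-(a - b) ** 2 for b in points] for a in points]
# Shared preference = median of the off-diagonal similarities (an assumption).
pref = median(S[i][k] for i in range(n) for k in range(n) if i != k)
for i in range(n):
    S[i][i] = pref
labels = affinity_propagation(S)
print(labels)  # exemplar index chosen by each point
```

Because the algorithm only consumes a matrix of pairwise similarities, the same routine could in principle be applied to sequences, expression profiles, or text, provided a suitable similarity measure is supplied. This pure-Python version costs O(n^2) time and memory per iteration and is meant only to show the update rules, not to handle large data sets.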
