A mixture model-based discriminate analysis for identifying ordered transcription factor binding site pairs in gene promoters directly regulated by estrogen receptor-alpha

MOTIVATION: To detect and select patterns of transcription factor binding sites (TFBSs) that distinguish genes directly regulated by estrogen receptor-alpha (ERalpha), we developed a mixture model-based discriminate analysis for identifying ordered TFBS pairs.

RESULTS: Biologically, the proposed algorithm indicates that TFBSs are not randomly distributed within ERalpha target promoters (P-value < 0.001). Up-regulated ERalpha target genes significantly (P-value < 0.01) possess the TFBS pairs (DBP, MYC), (DBP, MYC/MAX heterodimer), (DBP, USF2) and (DBP, MYOGENIN); down-regulated ERalpha target genes significantly (P-value < 0.01) possess TFBS pairs such as (DBP, c-ETS1-68), (DBP, USF2) and (DBP, MYOGENIN). Statistically, the mixture model-based discriminate analysis performs TFBS pattern recognition, TFBS pattern selection and target class prediction simultaneously; current methods do not offer this combination.

AVAILABILITY: The software is available on request from the authors.

CONTACT: lali@iupui.edu

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
