cTAP: A Machine Learning Framework for Predicting Target Genes of a Transcription Factor using a Cohort of Gene Expression Data Sets

Identifying the target genes of a transcription factor is crucial in biomedical research. Thanks to ChIP-seq technology, scientists can estimate the potential genome-wide target genes of a transcription factor. However, finding the targets that a transcription factor consistently regulates Up or Down in a given biological context is difficult because it requires analyzing a large number of studies conducted under the same or a comparable context. We present a transcription factor target prediction method called the Cohort-based TF target prediction system (cTAP). The method assumes that the pathway involving the transcription factor of interest features multiple functional groups of marker genes pertaining to the biological process of interest. It uses the notions of gene-presence and gene-absence, in addition to log2 ratios of gene expression values, for prediction. Targets are predicted by applying multiple machine-learning models that learn the patterns of gene-presence and gene-absence from the log2 ratios and four types of Z scores derived from the cohort's normalized gene expression data. The learned patterns are then associated with the putative targets of the transcription factor to elicit genes that exhibit Up/Down regulation patterns “consistently” within the cohort. A total of 11 publicly available GEO data sets related to osteoclastogenesis are used in our experiment. The models learned using gene-presence and gene-absence produce target genes, such as CASP1, BID, and IRF5, that differ from those obtained using only log2 ratios. Our literature survey reveals that all of these predicted targets have known roles in bone remodeling, specifically in relation to the immune system and osteoclasts, which lends confidence to our method and suggests potential merit in wet-lab validation.
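The abstract names log2 ratios, Z scores, and "consistent" Up/Down behavior within a cohort as inputs to cTAP without defining them. The following is a minimal sketch of how such quantities could be computed. It is not the cTAP implementation: the function names, the pseudocount, the single generic Z score (the abstract mentions four types but does not specify them), the simple voting rule, and the gene labels `GENE_A` and `GENE_B` are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def log2_ratios(treated, control, pseudocount=1.0):
    """Per-gene log2(treated / control) for one data set.

    The pseudocount is an assumption; it avoids log(0) on zero expression.
    """
    return {
        gene: math.log2((treated[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in treated
        if gene in control
    }


def z_scores(values):
    """Standardize a gene -> value mapping within one data set.

    This is one generic Z score, not one of the four types the abstract mentions.
    """
    mu = mean(values.values())
    sigma = pstdev(values.values())
    if sigma == 0:
        return {gene: 0.0 for gene in values}
    return {gene: (v - mu) / sigma for gene, v in values.items()}


def consistent_direction(per_dataset_ratios, gene, threshold=1.0, min_fraction=0.8):
    """Hypothetical voting rule: call a gene 'Up' or 'Down' if at least
    min_fraction of the data sets that measure it pass the log2 threshold
    in the same direction; otherwise return None.
    """
    observed = [ratios[gene] for ratios in per_dataset_ratios if gene in ratios]
    if not observed:
        return None
    up = sum(1 for v in observed if v >= threshold)
    down = sum(1 for v in observed if v <= -threshold)
    if up / len(observed) >= min_fraction:
        return "Up"
    if down / len(observed) >= min_fraction:
        return "Down"
    return None


# Toy cohort of three identical data sets with made-up expression values.
treated = {"GENE_A": 7.0, "GENE_B": 1.0}
control = {"GENE_A": 1.0, "GENE_B": 7.0}
ratios = log2_ratios(treated, control)
cohort = [ratios, ratios, ratios]
print(consistent_direction(cohort, "GENE_A"))
print(consistent_direction(cohort, "GENE_B"))
```

In cTAP the consistency call is made by machine-learning models rather than by a fixed threshold vote; the vote here only stands in for the idea of agreement across a cohort.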
