Modeling Biological Data Through Dynamic Bayesian Networks for Oral Squamous Cell Carcinoma Classification

We propose a computational approach for modeling the progression of Oral Squamous Cell Carcinoma (OSCC) with Dynamic Bayesian Network (DBN) models. RNA-Seq transcriptomics data, available from public functional genomics data repositories, are used to identify genes related to disease progression (i.e., recurrence or no recurrence). Our primary aim is to perform a computational analysis based on the identified differentially expressed genes. More specifically, we search for putative transcription factor binding sites (TFBSs) in the promoters of the input gene set and analyze the pathways of the suggested transcription factors. The activities of transcription factors that are regulated by upstream signaling cascades are then identified. These activities converge on certain nodes, which represent molecules that are potential regulators of OSCC progression. The resulting gene list is then used to infer causal relationships among the genes and to classify the disease using DBN models. The structure and parameters of the models are subsequently determined, revealing how gene-gene interactions change with respect to disease recurrence after surgery. The objectives of the proposed methodology are to: (i) accurately estimate OSCC progression, and (ii) provide better insights into the regulatory mechanisms of the disease. Moreover, the inferred network models allow us to conjecture about the interactions among genes. The proposed approach suggests that the resulting regulatory molecules, along with the extracted differentially expressed genes, can be considered new targets and are candidates for further experimental and in silico validation.
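To make the classification step concrete, the following is a minimal sketch of how a DBN could separate recurrence from non-recurrence cases. It is not the implementation used in this work. It assumes binarized expression states (0 = low, 1 = high), a fixed first-order structure in which each gene depends only on chosen parent genes in the previous time slice, and one model per class; structure learning is omitted. The data, the `parents` map, and all function names are hypothetical and serve only as illustration.

```python
import math
from collections import defaultdict


def fit_transition_cpt(sequences, parents, alpha=1.0):
    """Estimate P(gene_i[t] = 1 | parent states at t-1) with Laplace smoothing.

    sequences: list of time series; each is a list of tuples of 0/1 gene states.
    parents:   dict mapping gene index -> tuple of parent indices (previous slice).
    """
    counts = {i: defaultdict(lambda: [0, 0]) for i in parents}
    for seq in sequences:
        for prev, curr in zip(seq, seq[1:]):
            for i, pa in parents.items():
                key = tuple(prev[p] for p in pa)
                counts[i][key][curr[i]] += 1
    return {
        i: {key: (c[1] + alpha) / (c[0] + c[1] + 2 * alpha) for key, c in table.items()}
        for i, table in counts.items()
    }


def log_likelihood(seq, parents, cpt):
    """Log-likelihood of the transitions in one sequence under a fitted model."""
    ll = 0.0
    for prev, curr in zip(seq, seq[1:]):
        for i, pa in parents.items():
            key = tuple(prev[p] for p in pa)
            p_on = cpt[i].get(key, 0.5)  # unseen parent configuration: assume uniform
            ll += math.log(p_on if curr[i] == 1 else 1.0 - p_on)
    return ll


def classify(seq, parents, models):
    """Return the class label whose model gives the sequence the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(seq, parents, models[label]))


# Hypothetical structure: gene 0 depends on itself; gene 1 depends on genes 0 and 1.
parents = {0: (0,), 1: (0, 1)}

# Synthetic toy time series (three time points, two genes); not real OSCC data.
recurrence_train = [[(1, 0), (1, 1), (1, 1)], [(1, 0), (1, 1), (1, 1)]]
no_recurrence_train = [[(0, 1), (0, 0), (0, 0)], [(0, 0), (0, 0), (0, 0)]]

models = {
    "recurrence": fit_transition_cpt(recurrence_train, parents),
    "no_recurrence": fit_transition_cpt(no_recurrence_train, parents),
}

pred_a = classify([(1, 0), (1, 1), (1, 1)], parents, models)
pred_b = classify([(0, 0), (0, 0), (0, 0)], parents, models)
print(pred_a, pred_b)
```

Fitting one model per outcome and comparing likelihoods is only one possible design; it has the side benefit that differences between the two sets of conditional probabilities point to gene-gene interactions that change with recurrence.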