Finding Influential Genes Using Gene Expression Data and Boolean Models of Metabolic Networks

Selecting influential genes from gene expression data of normal and disease samples is an important topic in bioinformatics. In this paper, we propose a novel computational method for this problem that combines gene expression patterns from normal and disease samples with a mathematical model of metabolic networks. The method seeks a set of k genes whose knockout drives the state of the metabolic network towards the state observed in the disease samples. We adopt a Boolean model of metabolic networks and formulate the problem as a maximization problem within an integer linear programming framework. We applied the proposed method to the selection of influential genes using gene expression data from normal samples and disease (head and neck cancer) samples. The results suggest that the proposed method selects more biologically relevant genes than an existing P-value-based ranking method.
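To make the knockout-selection idea concrete, the following is a minimal sketch on a toy network, not the formulation used in the paper. It assumes a simple Boolean rule (a reaction is active when its gene is not knocked out and all of its substrates are present; a compound is present when it is a source or is produced by an active reaction) and replaces the integer linear program with exhaustive search over all sets of k genes, which is feasible only for very small examples. The network, the gene and reaction names, and the functions `simulate` and `best_knockout` are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy network: reaction -> (gene, substrates, products).
REACTIONS = {
    "r1": ("g1", {"A"}, {"B"}),
    "r2": ("g2", {"A"}, {"C"}),
    "r3": ("g3", {"B"}, {"D"}),
    "r4": ("g4", {"C"}, {"D"}),
    "r5": ("g5", {"D"}, {"E"}),
}
SOURCES = {"A"}  # compounds assumed to be always available


def simulate(knocked_out):
    """Return the set of active reactions after knocking out the given genes.

    Assumed rule: a reaction fires if its gene is intact and every substrate
    is present; states are propagated from the sources until nothing changes.
    """
    present = set(SOURCES)
    active = set()
    changed = True
    while changed:
        changed = False
        for name, (gene, substrates, products) in REACTIONS.items():
            if name in active or gene in knocked_out:
                continue
            if substrates <= present:
                active.add(name)
                present |= products
                changed = True
    return active


def best_knockout(k, target_active):
    """Exhaustively find k genes whose knockout best matches the target state.

    The score counts reactions whose Boolean state (active or inactive)
    agrees with the target pattern, standing in for the disease samples.
    """
    genes = sorted({gene for gene, _, _ in REACTIONS.values()})

    def score(knockout):
        active = simulate(set(knockout))
        return sum((r in active) == (r in target_active) for r in REACTIONS)

    best = max(combinations(genes, k), key=score)
    return best, score(best)


# Assumed disease pattern: only r2, r4 and r5 are active.
print(best_knockout(1, {"r2", "r4", "r5"}))  # (('g1',), 5)
```

In this toy case, knocking out `g1` removes compound B, which silences `r3` as well, so a single knockout reproduces the assumed disease pattern on all five reactions. An integer linear program would encode the same Boolean rules as linear constraints so that the search scales beyond what enumeration allows.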
