The OncoFinder algorithm for minimizing the errors introduced by the high-throughput methods of transcriptome analysis

The diversity of installed sequencing and microarray equipment makes it increasingly difficult to compare and analyze gene expression datasets obtained by different methods. Many applications that require high-quality data and low error rates cannot use the available data with traditional analytical approaches. Recently, we proposed OncoFinder, a new concept for signalome-wide analysis of functional changes in intracellular pathways and a bioinformatic tool for quantitative estimation of signaling pathway activation (SPA). We also developed methods to compare gene expression data obtained on multiple platforms and to minimize error rates by mapping the gene expression data onto known and custom signaling pathways. This technique for the first time makes it possible to analyze the functional features of intracellular regulation on a mathematical basis. In this study, we show that the OncoFinder method significantly reduces the errors introduced by transcriptome-wide experimental techniques. We compared gene expression data obtained for the same biological samples by both next-generation sequencing (NGS) and microarray methods. Across these techniques, there is virtually no correlation between the gene expression values for all datasets analyzed (R² < 0.1). In contrast, when the OncoFinder algorithm is applied to the data, we observe clear-cut correlations between the NGS and microarray datasets. The SPA profiles obtained using NGS and microarray techniques were almost identical for the same biological samples, allowing for platform-agnostic analytical applications. We conclude that this feature of OncoFinder makes it possible to characterize the functional states of transcriptomes and interactomes more accurately than before, which makes OncoFinder a method of choice for many applications, including genetics, physiology, biomedicine, and molecular diagnostics.
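
The contrast between gene-level and pathway-level agreement can be illustrated with a toy simulation. The sketch below is not the OncoFinder algorithm and uses no real data: it assumes, purely for illustration, that the genes of a pathway shift coherently with a hidden pathway activation, that each gene acts as an activator (+1) or repressor (-1), and that two platforms measure the same values with independent noise. The functions `r_squared`, `pathway_scores`, and `simulate`, and all parameter values, are hypothetical.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def pathway_scores(gene_values, pathways):
    """Average the role-signed gene values of each pathway (a toy score)."""
    return [
        sum(role * gene_values[gene] for gene, role in members) / len(members)
        for members in pathways
    ]


def simulate(n_pathways=60, genes_per_pathway=100, noise_sd=3.0, seed=0):
    """Return (gene-level R^2, pathway-level R^2) for two simulated platforms."""
    rng = random.Random(seed)
    truth, pathways = [], []
    for _ in range(n_pathways):
        activation = rng.gauss(0.0, 1.0)  # assumed hidden pathway-level shift
        members = []
        for _ in range(genes_per_pathway):
            role = rng.choice((1, -1))  # activator (+1) or repressor (-1)
            members.append((len(truth), role))
            truth.append(role * activation)
        pathways.append(members)

    # Each platform sees the same underlying values plus its own noise.
    platform_a = [t + rng.gauss(0.0, noise_sd) for t in truth]
    platform_b = [t + rng.gauss(0.0, noise_sd) for t in truth]

    gene_r2 = r_squared(platform_a, platform_b)
    pathway_r2 = r_squared(
        pathway_scores(platform_a, pathways),
        pathway_scores(platform_b, pathways),
    )
    return gene_r2, pathway_r2


gene_r2, pathway_r2 = simulate()
print(f"gene-level R^2:    {gene_r2:.3f}")
print(f"pathway-level R^2: {pathway_r2:.3f}")
```

Under these assumptions, averaging over many pathway members cancels much of the platform-specific noise, so the pathway-level scores agree far better across platforms than the individual gene values do. The effect depends on the coherence assumption: if pathway members varied independently, aggregation would not improve the correlation.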
