Cancer computational biology

Editorial

The introduction of high-throughput measurement technologies, combined with a growing scientific knowledge base on cellular and biological processes, has established computer and information science as a fundamental component of modern biology. High-throughput measurement technologies, such as microarray-based profiling, mass spectrometry screens, and high-throughput sequencing, give rise to several computational challenges. On one hand, they require a rigorous approach to assay design. Scientists and technology developers work on optimizing assay components so as to maximize the information obtained through the measurement. On the other hand, high-throughput measurement produces large quantities of data that need to be pre-processed and analyzed to obtain meaningful knowledge. This processing and analysis is performed on various levels, from pre-processing the raw data, such as images from microarrays or raw sequence reads, to analyzing the data and discovering biomarkers or other biologically meaningful characteristics. Measurement technology addresses several aspects of cellular processes, such as DNA, RNA, proteomics, metabolomics, epigenetics and pathways. The growth of the scientific knowledge base also gives a central role to data analysis and modeling, strongly grounded in computational methods. Systems biology or integrative biology approaches and network analysis are of specific importance in this context.

These needs are even more pronounced in cancer research. Samples are complex and heterogeneous, and cancer-related mechanisms involve many layers of the process that leads from the genome to cellular function.
One example of a specific need in cancer research is the study of large-scale aberrations in the genome. Copy number variations (CNVs) were recently recognized as abundant in normal cell populations and as related to many other disease types, but they remain a hallmark of cancer [1,2]. Genomes in cancer cells often have a structure that allows them to bypass the cellular processes that control growth. Regions coding for tumor suppressor genes are often deleted, and regions harboring oncogenes may be amplified. This is the case, for example, for p16 and myc, respectively [3-5]. Rearrangements, such as inversions and translocations, give rise to tumor-driving fusion products, as in the case of BCR-Abl and the Philadelphia chromosome, as well as in more recent findings implicating fusion structures in solid tumors. Cancer research therefore makes use of data analysis methods and tools that address the interpretation of copy number data and the effect of genome changes on transcriptome-level as well as proteome-level profiles of tumors. Other specific computational needs of cancer research relate to epigenetic changes, somatic evolution, the definition of gene sets in the context of specific cancer types, and to drugs and data that measure the effects of drugs.

Computational biologists focusing on cancer develop methods for the genome-scale characterization of tumors on various levels of the molecular process. Data analysis methods often rely on high-throughput measurement data, and they provide an understanding of the relationship between various molecular characteristics of cells. For example, how do genome structural aberrations and changes in copy number, which result from increased genome instability in cancer, affect the expression of genes and other functional elements such as miRNA, and how do the latter changes affect the function of related proteins?
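As a rough illustration of the copy number and expression question raised above, the following minimal sketch computes, for each gene, the Pearson correlation between copy number and expression across tumor samples. The gene names, the toy values, and the helper names `pearson` and `copy_number_expression_correlation` are hypothetical and chosen only for illustration; this is not a method taken from the editorial or from the cited work, and a real analysis would also need normalization, multiple-testing correction, and possibly non-linear dependency measures.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when either profile is constant
    return sxy / sqrt(sxx * syy)


def copy_number_expression_correlation(copy_number, expression):
    """Map each gene present in both inputs to its copy number/expression correlation."""
    shared = copy_number.keys() & expression.keys()
    return {g: pearson(copy_number[g], expression[g]) for g in sorted(shared)}


# Hypothetical toy data: one value per tumor sample, samples in the same order.
copy_number = {
    "GENE_A": [-1.0, -0.5, 0.0, 0.5, 1.0],  # assumed log2 copy number ratios
    "GENE_B": [0.0, 0.1, -0.1, 0.0, 0.1],
}
expression = {
    "GENE_A": [2.0, 3.0, 4.0, 5.0, 6.0],  # assumed log2 expression values
    "GENE_B": [5.0, 4.0, 5.5, 4.5, 5.2],
}

correlations = copy_number_expression_correlation(copy_number, expression)
for gene, r in correlations.items():
    print(f"{gene}: r = {r:.2f}")
```

In this toy input, GENE_A's expression tracks its copy number exactly, so its coefficient is 1; genes with a high positive coefficient in real data would be candidates for dosage-driven expression changes, under the assumption that the relationship is roughly linear.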
Understanding the association between genomic characteristics and clinical properties of primary tumor samples, xenografts or cell lines contributes to personalized cancer medicine through the development of predictive biomarkers of drug efficacy. Many research projects therefore aim to discover biomarkers, at the genome, transcriptome or proteome level, that are prognostic of cancer progression or predictive of response to specific therapeutic agents [6,7]. Cancer computational biology also focuses on analyzing molecules and

[1] P. Shannon, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. 2003, Genome Research.

[2] Igor Jurisica, et al. Prognostic and predictive gene signature for adjuvant chemotherapy in resected non-small-cell lung cancer. 2010, Journal of Clinical Oncology.

[3] Igor Jurisica, et al. NAViGaTOR: Network Analysis, Visualization and Graphing Toronto. 2009, Bioinform.

[4] I. Jurisica, et al. Unequal evolutionary conservation of human protein interactions in interologous networks. 2007, Genome Biology.

[5] Alfonso Valencia, et al. Implementing the iHOP concept for navigation of biomedical literature. 2005, ECCB/JBI.

[6] Susumu Goto, et al. The KEGG databases at GenomeNet. 2002, Nucleic Acids Res.

[7] Pablo Tamayo, et al. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. 2005, Proceedings of the National Academy of Sciences of the United States of America.

[8] Tsviya Olender, et al. Human Gene-Centric Databases at the Weizmann Institute of Science: GeneCards, UDB, CroW 21 and HORDE. 2003, Nucleic Acids Res.

[9] Genome instability: Chrombling into pieces. 2011, Nature Reviews Cancer.

[10] Christian von Mering, et al. STRING: a database of predicted functional associations between proteins. 2003, Nucleic Acids Res.

[11] Israel Steinfeld, et al. BMC Bioinformatics BioMed Central. 2008.

[12] I. Jurisica, et al. NAViGaTing the Micronome – Using Multiple MicroRNA Prediction Databases to Identify Signalling Pathway-Associated MicroRNAs. 2011, PLoS ONE.

[13] Alfonso Valencia, et al. Extending pathways and processes using molecular interaction networks to analyse cancer genome data. 2010, BMC Bioinformatics.

[14] J. Squire, et al. Cause and Consequences of Genetic and Epigenetic Alterations in Human Cancer. 2008, Current Genomics.

[15] Hiroko K. Solvang, et al. Linear and non-linear dependencies between copy number aberrations and mRNA expression reveal distinct molecular pathways in breast cancer. 2011, BMC Bioinformatics.

[16] H. Hermeking, et al. The MYC oncogene as a cancer drug target. 2003, Current Cancer Drug Targets.

[17] Igor Jurisica, et al. Prognostic gene signatures for non-small-cell lung cancer. 2009, Proceedings of the National Academy of Sciences.

[18] Gary D. Bader, et al. cPath: open source software for collecting, storing, and querying biological pathways. 2006, BMC Bioinformatics.

[19] Benjamin J. Raphael, et al. Detection of recurrent rearrangement breakpoints from copy number data. 2011, BMC Bioinformatics.

[20] John D. Storey, et al. A network-based analysis of systemic inflammation in humans. 2005, Nature.

[21] Henning Hermjakob, et al. Submit Your Interaction Data the IMEx Way. 2007, Proteomics.

[22] John N. Weinstein, et al. Framework for Identifying Common Aberrations in DNA Copy Number Data. 2007, RECOMB.

[23] S. Fröhling, et al. Chromosomal abnormalities in cancer. 2008, The New England Journal of Medicine.

[24] Vishal N. Patel, et al. PETALS: Proteomic Evaluation and Topological Analysis of a mutated Locus' Signaling. 2010, BMC Bioinformatics.