CellExpress: a comprehensive microarray-based cancer cell line and clinical sample gene expression analysis online system

Abstract: With advances in high-throughput technologies, gene expression profiles of cell lines and clinical samples are widely available in the public domain. However, systematic and comprehensive analysis across independent datasets remains challenging. To address this issue, we developed CellExpress, a web-based system for analyzing gene expression levels in more than 4000 cancer cell lines and clinical samples obtained from public datasets and user-submitted data. First, a normalization algorithm can be applied to reduce systematic biases across independent datasets. Next, the similarity of gene expression profiles can be assessed through a dynamic dot plot, along with a distance matrix obtained from principal component analysis. Differentially expressed genes can then be visualized using hierarchical clustering. Several statistical tests and analytical algorithms are implemented in the system for dissecting gene expression changes based on user-defined groupings. Lastly, users can upload their own microarray and/or next-generation sequencing data to compare gene expression patterns, which can help classify user samples, such as stem cells, into different tissue types. In conclusion, CellExpress is a user-friendly tool that provides a comprehensive analysis of gene expression levels in both cell lines and clinical samples. The website is freely available at http://cellexpress.cgm.ntu.edu.tw/. Source code is available at https://github.com/LeeYiFang/Carkinos under the MIT License. Database URL: http://cellexpress.cgm.ntu.edu.tw/
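The normalization and similarity steps described above can be illustrated with a minimal sketch. The abstract does not say which normalization algorithm CellExpress uses, so quantile normalization is assumed here purely for illustration, and the distance matrix is computed directly on the normalized profiles rather than on principal components. The function names `quantile_normalize` and `distance_matrix` and the toy values are hypothetical and are not taken from the CellExpress source code.

```python
from math import sqrt


def quantile_normalize(samples):
    """Give every sample the same value distribution (assumed method).

    `samples` is a list of expression profiles, each a list of values
    over the same genes in the same order. Ties are broken by gene
    position, which is a simplification.
    """
    n_genes = len(samples[0])
    sorted_profiles = [sorted(s) for s in samples]
    # Mean expression at each rank, taken across all samples.
    rank_means = [
        sum(p[i] for p in sorted_profiles) / len(samples)
        for i in range(n_genes)
    ]
    normalized = []
    for s in samples:
        # Gene indices ordered from lowest to highest expression.
        order = sorted(range(n_genes), key=lambda g: s[g])
        out = [0.0] * n_genes
        for rank, gene in enumerate(order):
            out[gene] = rank_means[rank]
        normalized.append(out)
    return normalized


def distance_matrix(samples):
    """Pairwise Euclidean distances between expression profiles."""
    return [
        [sqrt(sum((a - b) ** 2 for a, b in zip(x, y))) for y in samples]
        for x in samples
    ]


# Toy data: profiles 0 and 1 rank the genes identically but sit on
# different intensity scales, as two independent datasets might.
profiles = [
    [2.0, 4.0, 6.0],
    [20.0, 40.0, 60.0],
    [9.0, 3.0, 1.0],
]
normalized = quantile_normalize(profiles)
distances = distance_matrix(normalized)
```

After normalization, the first two profiles become identical because they differ only in scale, so their distance is zero, while the third profile, whose gene ranking is reversed, stays distant from both.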
