A novel strategy of integrated microarray analysis identifies CENPA, CDK1 and CDC20 as a cluster of diagnostic biomarkers in lung adenocarcinoma.

Lung adenocarcinoma (LAC) is the most lethal cancer and the leading cause of cancer-related death worldwide. The identification of meaningful clusters of co-expressed genes or representative biomarkers may help improve the accuracy of LAC diagnosis. Public databases, such as the Gene Expression Omnibus (GEO), provide rich resources of valuable information for clinics; however, the integration of multiple microarray datasets from various platforms and institutes remains a challenge. To determine potential indicators of LAC, we sequentially applied genome-wide relative significance (GWRS), genome-wide global significance (GWGS) and support vector machine (SVM) analyses to identify robust gene biomarker signatures from 5 different microarray datasets comprising 330 samples. The top 200 genes with robust signatures were selected for integrative analysis according to "guilt-by-association" methods, including protein-protein interaction (PPI) analysis and gene co-expression analysis. Of these 200 genes, only 10 showed both extensive PPI network connections and high gene co-expression correlation (r > 0.8). IPA of this regulatory network suggested that the cell cycle process is a crucial determinant of LAC. CENPA and two linked hub genes, CDK1 and CDC20, were identified as potential indicators of LAC. Immunohistochemical staining showed that CENPA, CDK1 and CDC20 were highly expressed in LAC tissue in a co-expressed pattern. A Cox regression model indicated that LAC patients who were CENPA+/CDK1+ or CENPA+/CDC20+ were at high risk in terms of overall survival. In conclusion, our integrated microarray analysis demonstrated that CENPA, CDK1 and CDC20 might serve as a novel cluster of prognostic biomarkers for LAC, and that this cooperative unit of three genes provides a technically simple approach for identifying LAC patients.
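
The co-expression criterion (r > 0.8) can be illustrated with a minimal sketch. It assumes that r denotes the Pearson correlation between the expression profiles of two genes across samples, which the abstract does not state explicitly. The function name `coexpressed_pairs`, the placeholder gene labels and the toy expression values are hypothetical and are not taken from the study's data or code.

```python
import numpy as np


def coexpressed_pairs(expr, genes, threshold=0.8):
    """Return (gene_a, gene_b, r) for every gene pair with Pearson r > threshold.

    expr  : 2-D array, one row per gene, one column per sample.
    genes : list of gene labels, in the same order as the rows of expr.
    """
    r = np.corrcoef(expr)  # gene-by-gene Pearson correlation matrix
    pairs = []
    for i in range(len(genes)):
        for j in range(i + 1, len(genes)):  # upper triangle only, no self-pairs
            if r[i, j] > threshold:
                pairs.append((genes[i], genes[j], float(r[i, j])))
    return pairs


# Toy expression values (hypothetical): GENE_B tracks GENE_A, GENE_C does not.
genes = ["GENE_A", "GENE_B", "GENE_C"]
expr = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.1, 3.9, 6.2, 8.1, 9.8, 12.0],
    [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
])

pairs = coexpressed_pairs(expr, genes)
for gene_a, gene_b, r_value in pairs:
    print(f"{gene_a} - {gene_b}: r = {r_value:.3f}")
```

In this toy input only the GENE_A/GENE_B pair passes the threshold. In an analysis like the one described, such a filter would be combined with the PPI evidence rather than used alone.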
