Topological Features in Cancer Gene Expression Data

We present a new method for exploring cancer gene expression data using tools from algebraic topology. The method selects a small, relevant subset from tens of thousands of genes while simultaneously identifying nontrivial higher-order topological features, i.e., holes, in the data. We first circumvent the problem of high dimensionality by dualizing the data, i.e., by treating genes as points in the sample space. We then select a small subset of the genes as landmarks and use them to construct topological structures that capture persistent, i.e., topologically significant, features in the first homology group of the data set; these features are loops. Furthermore, we show that many of the genes forming these loops have been implicated in cancer biogenesis in the scientific literature. We illustrate the method on five data sets covering brain, breast, leukemia, and ovarian cancers.
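
The first two steps, dualizing the data and choosing landmark genes, can be illustrated with a minimal sketch. The abstract does not say how landmarks are chosen, so the greedy maxmin rule below is an assumption (it is a common choice for landmark-based complexes), and the function names `dualize` and `maxmin_landmarks` and the toy matrix are hypothetical. The persistent homology computation itself is not shown.

```python
import math


def dualize(expression):
    """Transpose a samples-by-genes matrix so that each gene becomes a point
    whose coordinates are its expression levels across the samples."""
    return [list(column) for column in zip(*expression)]


def maxmin_landmarks(points, k, seed_index=0):
    """Greedy maxmin selection (an assumed strategy, not taken from the source):
    repeatedly add the point that is farthest from the landmarks chosen so far."""
    landmarks = [seed_index]
    # Distance from every point to its nearest landmark so far.
    nearest = [math.dist(p, points[seed_index]) for p in points]
    while len(landmarks) < min(k, len(points)):
        candidate = max(range(len(points)), key=nearest.__getitem__)
        landmarks.append(candidate)
        nearest = [
            min(d, math.dist(p, points[candidate]))
            for d, p in zip(nearest, points)
        ]
    return landmarks


# Toy data: 3 samples (rows) by 5 genes (columns); the values are made up.
expression = [
    [0.0, 1.0, 0.1, 5.0, 5.1],
    [0.0, 1.0, 0.0, 5.0, 5.0],
    [0.0, 1.0, 0.1, 5.0, 4.9],
]
genes = dualize(expression)               # 5 points in 3-dimensional sample space
landmarks = maxmin_landmarks(genes, k=3)  # indices of the chosen landmark genes
```

After dualization the ambient dimension equals the number of samples rather than the number of genes, which is what makes the point cloud tractable. The selected landmark indices would then serve as the vertex set of whatever complex is built for the homology computation.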
