Class cover catch digraphs for latent class discovery in gene expression monitoring by DNA microarrays

This article introduces a data visualization technique for class cover catch digraphs that allows for the discovery of latent subclasses. We illustrate the technique with a pedagogical example and with applications to data sets from artificial nose chemical sensing and from gene expression monitoring by DNA microarrays. Of particular interest are the latent subclasses discovered in each application: subclasses representing chemical concentration in the artificial nose data, and two subtypes of acute lymphoblastic leukemia in the gene expression data. These discoveries lead to conjectures about the geometry of the subclasses in their respective high-dimensional observation spaces.
