Detecting and Visualizing Profile Correlation in Subspace

In this paper, we propose a novel method for detecting and visualizing profile correlation in subspace. The proposed method can (1) detect shifting-and-scaling correlated profiles in subspace, where the correlation can be either positive or negative, (2) summarize and visualize the shifting-and-scaling correlation in subspace, and (3) allow users to interactively explore the correlation subspaces they are interested in. Initially, shifting and scaling were regarded as two distinct correlation patterns, in which profiles in a subspace can be made to overlap by a single shift or a single scaling, respectively. Later work studied a much more general pattern, shifting-and-scaling correlation, of which shifting correlation and scaling correlation are just two special cases. Shifting-and-scaling correlation ensures that subspace profiles are coherent not only in tendency, as tendency-based methods do, but also in the proportion of value change within the subspace. However, no work has yet focused on visualizing subspace shifting-and-scaling correlation. Our work is the first to enable interactive exploration and visualization of subspace shifting-and-scaling correlation.
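To make the notion of shifting-and-scaling correlation concrete, the following is a minimal sketch, not the method proposed in the paper. It assumes that two profiles x and y are shifting-and-scaling correlated on a set of dimensions when y[d] = scale * x[d] + shift holds for every dimension d in that set, with a positive scale giving positive correlation and a negative scale giving negative correlation. The function name `fit_shift_scale`, the tolerance `tol`, and the example profiles are hypothetical and chosen only for illustration.

```python
from typing import Optional, Sequence, Tuple


def fit_shift_scale(
    x: Sequence[float],
    y: Sequence[float],
    dims: Sequence[int],
    tol: float = 1e-9,  # assumed tolerance; real data would need a noise-aware threshold
) -> Optional[Tuple[float, float]]:
    """Return (scale, shift) if y[d] == scale * x[d] + shift (within tol)
    for every d in dims, otherwise None."""
    xs = [x[d] for d in dims]
    ys = [y[d] for d in dims]
    n = len(xs)
    if n < 2:
        return None  # a single dimension cannot determine both parameters
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((a - mean_x) ** 2 for a in xs)
    if sxx == 0:
        return None  # x is constant on this subspace, so the scale is undefined
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    scale = sxy / sxx                 # least-squares slope
    shift = mean_y - scale * mean_x   # least-squares intercept
    if scale == 0:
        return None  # a flat y carries no shifting-and-scaling relation to x
    max_err = max(abs(scale * a + shift - b) for a, b in zip(xs, ys))
    return (scale, shift) if max_err <= tol else None


# Hypothetical profiles over five dimensions.
x = [1.0, 2.0, 3.0, 10.0, 4.0]
y = [5.0, 7.0, 9.0, -3.0, 11.0]   # y = 2x + 3 on dimensions 0, 1, 2, 4 only
z = [-1.0, -3.0, -5.0, 0.0, -7.0]  # z = -2x + 1 on dimensions 0, 1, 2, 4 only

subspace = [0, 1, 2, 4]
positive = fit_shift_scale(x, y, subspace)           # positively correlated
negative = fit_shift_scale(x, z, subspace)           # negatively correlated
full_space = fit_shift_scale(x, y, [0, 1, 2, 3, 4])  # pattern breaks in full space
```

In this sketch, pure shifting corresponds to a scale of 1 and pure scaling to a shift of 0, which is why both are special cases of the combined pattern. The example also shows why the search must be done per subspace: the relation between x and y holds on four of the five dimensions but not on all five.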
