The History of the Cluster Heat Map

The cluster heat map is an ingenious display that simultaneously reveals row and column hierarchical cluster structure in a data matrix. It consists of a rectangular tiling, with each tile shaded on a color scale to represent the value of the corresponding element of the data matrix. The rows (columns) of the tiling are ordered such that similar rows (columns) are near each other. On the vertical and horizontal margins of the tiling are hierarchical cluster trees. This cluster heat map is a synthesis of several different graphic displays developed by statisticians over more than a century. We locate the earliest sources of this display in late-19th-century publications, and trace a diverse 20th-century statistical literature that provided a foundation for this most widely used of all bioinformatics displays.
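
The ordering step described above can be illustrated with a minimal sketch. It assumes average linkage and Euclidean distance, neither of which is specified here, and the names `leaf_order` and `cluster_heatmap_matrix` are hypothetical. It computes only the row and column orderings and the reordered matrix; the color shading and the marginal trees are left out.

```python
from itertools import combinations
from math import dist


def leaf_order(vectors):
    """Cluster a non-empty list of vectors and return leaf indices in tree order.

    Assumption: average linkage with Euclidean distance (an illustrative choice).
    """
    # Each cluster is stored as the list of its leaf indices, in tree order.
    clusters = [[i] for i in range(len(vectors))]

    def avg_distance(a, b):
        # Mean distance over all cross-cluster pairs of leaves.
        total = sum(dist(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        a, b = min(combinations(clusters, 2), key=lambda pair: avg_distance(*pair))
        clusters.remove(a)
        clusters.remove(b)
        # Concatenating keeps the members of each subtree contiguous.
        clusters.append(a + b)
    return clusters[0]


def cluster_heatmap_matrix(data):
    """Return (row_order, col_order, reordered matrix) for a list of rows."""
    row_order = leaf_order(data)
    col_order = leaf_order([list(col) for col in zip(*data)])
    reordered = [[data[r][c] for c in col_order] for r in row_order]
    return row_order, col_order, reordered


# Toy matrix: rows 0 and 2 are alike, as are rows 1 and 3 (same for columns).
data = [
    [1.0, 9.0, 1.2, 9.1],
    [8.0, 2.0, 8.1, 2.2],
    [1.1, 9.2, 1.0, 9.0],
    [8.2, 2.1, 8.0, 2.0],
]
row_order, col_order, reordered = cluster_heatmap_matrix(data)
for row in reordered:
    print(row)
```

In the reordered matrix, similar rows and similar columns sit next to each other, so the high and low values form contiguous blocks. Shading each entry on a color scale and drawing the two trees in the margins would complete the display.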
