Targeted Projection Pursuit for Interactive Exploration of High-Dimensional Data Sets

High-dimensional data is, by its nature, difficult to visualise. Many current techniques reduce the dimensionality of the data, which results in a loss of information. Targeted Projection Pursuit is a novel method for visualising high-dimensional datasets that allows the user to interactively explore the space of possible views to find those that meet their requirements. A prototype tool that uses this method is introduced, and is shown to allow users to explore data through an interface that is transparent and efficient. The tool and underlying technique are general purpose (applicable to any high-dimensional numeric data, and supporting a wide range of exploratory data analysis activities), but are evaluated here on three particular tasks using gene expression data: identifying discriminatory genes, visualising diagnostic classes, and detecting misdiagnosed samples. On these tasks the approach is found to perform well in comparison with standard techniques.
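
The idea of searching for a view that meets a user's requirements can be illustrated with a minimal sketch. The abstract does not specify the search procedure, so the code below makes its own assumptions: the "requirement" is expressed as a hypothetical 2-D target position for every sample, and the view is the linear projection that best matches that target in the least-squares sense. The function name `targeted_projection`, the synthetic data, and the triangular target layout are all illustrative and are not taken from the tool described above.

```python
import numpy as np

def targeted_projection(X, T):
    """Return a (d x 2) matrix P such that the centred data times P
    approximates the centred target view T in the least-squares sense.
    This is an assumed stand-in for the search step, not the tool's method."""
    Xc = X - X.mean(axis=0)          # centre the data
    Tc = T - T.mean(axis=0)          # centre the target view
    P, *_ = np.linalg.lstsq(Xc, Tc, rcond=None)
    return P

# Synthetic stand-in for numeric data: 60 samples, 10 features, 3 classes.
rng = np.random.default_rng(0)
n, d = 60, 10
labels = np.repeat([0, 1, 2], n // 3)
X = rng.normal(size=(n, d))
X[:, 0] += 3.0 * (labels == 1)       # feature 0 separates class 1
X[:, 1] += 3.0 * (labels == 2)       # feature 1 separates class 2

# Hypothetical target view: each class sits at one vertex of a triangle.
vertices = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])
T = vertices[labels]

P = targeted_projection(X, T)
view = (X - X.mean(axis=0)) @ P      # 2-D coordinates to plot
```

In this sketch the rows of `P` with the largest norms correspond to the features that contribute most to the resulting view, which is one plausible way to read off candidate discriminatory features. When there are many more features than samples, as is typical for gene expression data, an unregularised least-squares fit can match the target exactly, so a practical version would likely need some form of regularisation or feature selection.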
