Interpretable projection pursuit

The goal of this thesis is to modify projection pursuit by trading accuracy for interpretability. The modification produces a more parsimonious and understandable model without sacrificing the structure that projection pursuit seeks, retaining the method's nonlinear versatility while clarifying its results.

Following an introduction that outlines the dissertation, the first and second chapters develop the technique as applied to exploratory projection pursuit and projection pursuit regression, respectively. The interpretability of a description is measured as the simplicity of the coefficients that define its linear projections. Several interpretability indices for a set of vectors are defined, based on ideas from rotation in factor analysis and from entropy; the two methods require slightly different indices because of their contrary goals. A roughness-penalty weighting approach, with interpretability taking the place of smoothness, is used to search for a more parsimonious description. The computational algorithms for both interpretable exploratory projection pursuit and interpretable projection pursuit regression are described: the former requires a rotationally invariant projection index, which is defined, while the latter requires alterations to the original algorithm. Examples with real data are considered in each case.

The third chapter deals with connections between the proposed modification and other ideas that seek to produce more interpretable models. The modification as applied to linear regression is shown to be analogous to a nonlinear, continuous method of variable selection; it is compared with other variable selection techniques and analyzed in a Bayesian context. Possible extensions to other data analysis methods are cited and avenues for future research are identified. The conclusion addresses the issue of sacrificing accuracy for parsimony in general. An example of calculating the tradeoff between accuracy and interpretability for a common simplifying action, rounding the binwidth of a histogram, illustrates the applicability of the approach.
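To make the weighting idea concrete, the sketch below combines a projection index with an interpretability index in a single penalized criterion. It is a minimal illustration only, not the thesis's definitions: the function names, the varimax-style and entropy-based measures, the stand-in skewness index, and the weight `lam` are assumptions made for exposition.

```python
# Minimal sketch of an interpretability-weighted projection pursuit
# criterion. All names and the specific index definitions below are
# illustrative assumptions, not the indices defined in the thesis.
import numpy as np

def varimax_simplicity(a):
    """Simplicity of a unit projection vector, in the spirit of the varimax
    rotation criterion from factor analysis: the variance of the squared
    coefficients, which is largest when a few coefficients dominate."""
    a = a / np.linalg.norm(a)
    return np.var(a ** 2)

def entropy_interpretability(a):
    """Entropy-based index: the squared coefficients of a unit vector sum
    to one and can be read as a distribution over variables; low entropy
    means the projection loads on few variables and is easier to describe."""
    a = a / np.linalg.norm(a)
    p = a ** 2 + 1e-12              # avoid log(0)
    return -np.sum(p * np.log(p))   # smaller = more interpretable

def penalized_criterion(a, X, projection_index, lam=0.1):
    """Trade structure for interpretability: the projection index measures
    structure in the one-dimensional projection X @ a, and the entropy term
    acts as the penalty, analogous to a roughness-penalty weighting."""
    a = a / np.linalg.norm(a)
    return projection_index(X @ a) - lam * entropy_interpretability(a)

def abs_skewness(z):
    """Crude moment-based projection index (absolute skewness of the
    projected data), standing in for a proper projection pursuit index."""
    z = (z - z.mean()) / z.std()
    return abs(np.mean(z ** 3))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    X[:, 0] = np.exp(X[:, 0])       # give one variable non-Gaussian structure
    a = rng.normal(size=5)
    print("simplicity:", varimax_simplicity(a))
    print("criterion: ", penalized_criterion(a, X, abs_skewness, lam=0.1))
```

In this sketch, maximizing the criterion over the projection direction would favor directions that reveal structure while keeping the coefficient vector simple; increasing `lam` shifts the balance further toward interpretability.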
