Statistical Theory of Generalization (Abstract)

Abstract and (approximate) plan of lectures: Nearly 50 years ago, Charles Stein discovered a startling mathematical fact relating to the most common of all statistical procedures. For simultaneously estimating several means based on sampled data, the intuitively appealing and universally recommended statistical estimator was to use each sample mean as the estimate of its corresponding population mean. Stein discovered a way to do better. His estimator and its many modifications and generalizations are often referred to as statistical shrinkage procedures, because they improve on the usual procedures by shrinking the sample means toward a common value. In the decades since, this discovery has led to new understanding of the nature of statistical procedures, to better understanding of the properties of high-dimensional spaces and of probability distributions on those spaces, and to the creation of entirely new classes of statistical procedures for a variety of problems. This series of lectures is designed to explore the mathematical bases for shrinkage estimation and the variety of statistical contexts in which this idea plays a key role in contemporary statistical methodology. The concept of shrinkage began as a minimax solution to the ordinary problem of estimating several Normal means. The lectures therefore begin with an exploration of this problem and then go on to study how the idea appears in several statistical directions. These include Empirical Bayes and Hierarchical Bayes procedures, Gaussian and non-Gaussian random effects and latent variable models, and nonparametric function estimation, including recently developed adaptive procedures involving wavelet or other orthogonal basis systems. The emphasis in the lectures will be on intuition and heuristic connections among the various manifestations of shrinkage, but attention will also be paid to a careful overall description of the mathematical results supporting the general theory.
Some proofs will be presented in detail, but more technical ones will only be sketched or described in general outline. Several extended examples involving data applications will be included. These include treatments related to baseball batting averages, analysis of housing prices, call center data, stock market volumes and the classical regression data gathered by F.
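
As a rough illustration of the claim that one can do better than the individual sample means, the following Python sketch compares the total squared error of the usual estimator with that of a shrinkage estimator in a small simulation. It is only a sketch under stated assumptions: observations are independent Normal with known unit variance, shrinkage is toward the origin, and the positive-part variant of the James-Stein formula is used, which is a later modification rather than Stein's original form. The vector of true means, the function names, and the number of trials are hypothetical choices made for illustration and do not come from the lectures.

```python
import random


def james_stein(x, sigma2=1.0):
    """Positive-part James-Stein estimate, shrinking x toward the origin.

    Assumes each x[i] ~ Normal(theta[i], sigma2) independently, with
    sigma2 known. The improvement over x itself requires len(x) >= 3.
    """
    p = len(x)
    norm2 = sum(v * v for v in x)
    if norm2 == 0.0:
        return list(x)  # nothing to shrink
    # Shrinkage factor, clipped at zero (the "positive-part" modification).
    factor = max(0.0, 1.0 - (p - 2) * sigma2 / norm2)
    return [factor * v for v in x]


def squared_error(estimate, theta):
    """Total squared error summed over all coordinates."""
    return sum((e - t) ** 2 for e, t in zip(estimate, theta))


def compare_risks(theta, trials=2000, seed=0):
    """Monte Carlo estimate of the risk of the usual and shrinkage estimators."""
    rng = random.Random(seed)
    usual_total = 0.0
    shrink_total = 0.0
    for _ in range(trials):
        # One noisy observation per mean, with unit variance.
        x = [rng.gauss(t, 1.0) for t in theta]
        usual_total += squared_error(x, theta)
        shrink_total += squared_error(james_stein(x), theta)
    return usual_total / trials, shrink_total / trials


# Hypothetical true means, chosen only for illustration.
theta = [1.0, -0.5, 0.0, 0.5, 2.0, -1.0, 0.3, -0.2, 1.5, -1.5]
usual_risk, shrink_risk = compare_risks(theta)
print(f"usual estimator risk:     {usual_risk:.2f}")
print(f"shrinkage estimator risk: {shrink_risk:.2f}")
```

Under these assumptions, the usual estimator's risk should come out close to the number of means (10 here), and the shrinkage estimator's simulated risk should be smaller. The size of the gain depends on how far the true means lie from the shrinkage target.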