Data Analysis and Graphics Using R: An Example-Based Approach

Chapter 4 discusses the Horvitz–Thompson estimator for the population total. To avoid depending on the second-order inclusion probabilities when estimating the variance of the estimated population total, a random permutation of the sample elements is introduced, leading to ρ_z, the sampling autocorrelation between two sample elements after permuting. Chapter 5 focuses on the use of ρ_z to verify properties of common variance estimates for simple random sampling, stratified sampling, ratio and product estimators, and other designs. Chapter 6 focuses on using ρ_z to develop estimating equations for the population total under multistage sampling and subsampling designs. The subject of Chapter 7 is systematic sampling; ρ_z plays a less prominent role here. The emphasis is on either deriving second-order inclusion probabilities via the Brewer or Sunter method or else working with directly specified second-order probabilities. Given a particular realization of a systematic sampling design, ρ_z helps characterize its statistical properties. Chapter 8 turns to the problem of directly estimating ρ_z. The book examines a strategy based on an estimator that is the sum of a function of the powers of the inclusion probabilities and uncorrelated residuals. When the variance of the residuals depends on how far apart the sample elements appear in permuted order, a heteroscedastic condition termed "gray noise" exists. In this case the estimator of ρ_z is a ratio of an unweighted sum of squares to a weighted sum of squares. The weights are the first-order inclusion probabilities. Chapter 9 discusses the conditions under which the sampling autocorrelation coefficient can be estimated well without knowledge of second-order inclusion probabilities. Under such conditions, ways open up to simplify calculation of the variance of the Horvitz–Thompson estimator of the population total.
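The summary above does not reproduce the book's formulas. As a reminder of the starting point, the standard Horvitz–Thompson estimator of a total weights each sampled value by the inverse of its first-order inclusion probability. The Python sketch below is only an illustration under stated assumptions: the function name and the three-unit sample are hypothetical and are not taken from the book, and it does not attempt the ρ_z-based variance estimators the book develops.

```python
from typing import Sequence


def horvitz_thompson_total(y: Sequence[float], pi: Sequence[float]) -> float:
    """Estimate a population total as the sum of y_i / pi_i over the sample.

    y  : observed values for the sampled units
    pi : first-order inclusion probabilities of those same units
    """
    if len(y) != len(pi):
        raise ValueError("y and pi must have the same length")
    if any(not (0.0 < p <= 1.0) for p in pi):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    return sum(yi / p for yi, p in zip(y, pi))


# Hypothetical data (an assumption for illustration): n = 3 units drawn by
# simple random sampling without replacement from N = 12, so pi_i = 3/12.
sample_y = [10.0, 20.0, 30.0]
sample_pi = [0.25, 0.25, 0.25]
estimated_total = horvitz_thompson_total(sample_y, sample_pi)
print(estimated_total)  # 240.0
```

With equal inclusion probabilities the estimate reduces to N times the sample mean (12 × 20 = 240). Unequal probabilities, as in the probability-proportional-to-size design of Chapter 10, only change the entries of `sample_pi`.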
Chapter 10 reports the results of a simulation exercise using systematic probability proportional to size sampling to estimate the total municipal expenditures of 34 Dutch communities. Several alternative estimates of the variance of the Horvitz–Thompson estimator are compared. The comparisons do an excellent job of illustrating how simplifying assumptions can put estimation efficiency at risk. The book provides a clear and succinct explanation of the reasoning used to compute the number of simulation cycles required to estimate ρ_z. The book's last three chapters take a different turn, covering minimum variance regression estimators, general restriction estimators, and weighting procedures. The highlight of Chapter 11 is a Pythagorean–Bayesian interpretation of the minimum variance regression estimator. Chapter 12 compares the Lagrangian and Pythagorean approaches to the development of general restriction estimators, explaining how the regression estimator is a partitioned restriction estimator. Chapter 13, on weighting procedures, comments briefly on how a Pythagorean-style decomposition of the regression estimator in a vector space supports a useful recursive formula for sample weights. A comprehensive index is well keyed to the important concepts and authors mentioned in the book. The reference list is well balanced. References from the past decade are plentiful, but the book does not neglect the many useful survey sampling titles from earlier decades. The book acquits itself well in its subject matter and would be a fine addition to the bookshelf of any Technometrics reader using sample survey techniques. The book is more theoretical than applications oriented. Readers desiring a more hands-on, how-to style should seek other books to supplement this one.