A General Algorithm for Univariate Stratification

This paper presents a general algorithm for constructing strata in a population using X, a univariate stratification variable known for all the units in the population. Stratum h consists of all the units with an X value in the interval [b_{h-1}, b_h). The stratum boundaries {b_h} are obtained by minimizing the anticipated sample size for estimating the population total of a survey variable Y with a given level of precision. The stratification criterion allows for a take-none stratum and a take-all stratum. The sample is allocated to the strata using a general rule that includes proportional allocation, Neyman allocation, and power allocation as special cases. The optimization can take into account a stratum-specific anticipated non-response and a model for the relationship between the stratification variable X and the survey variable Y. A loglinear model with stratum-specific mortality for Y given X is presented in detail. Two numerical algorithms for determining the optimal stratum boundaries, attributable to Sethi and Kozak, are compared in a numerical study. Several examples illustrate the stratified designs that can be constructed with the proposed methodology. All the calculations presented in this paper were carried out with stratification, an R package that will be available on CRAN (Comprehensive R Archive Network). Copyright (c) 2009 The Authors. Journal compilation (c) 2009 International Statistical Institute.
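To make the criterion concrete, the following Python sketch evaluates the anticipated sample size for a fixed set of boundaries. It is an illustration under stated assumptions, not the code of the stratification R package: it takes Y = X (no model, no non-response, no take-none stratum), uses the standard sample-size formula for a target coefficient of variation, and parametrizes the allocation rule by three exponents (q1, q2, q3), which is an assumed form. The function names split_strata and anticipated_n are hypothetical.

```python
from bisect import bisect_right
from statistics import mean, variance


def split_strata(x, boundaries):
    """Assign each value to stratum h, defined by b_{h-1} <= x < b_h.

    `boundaries` holds the interior cut points in increasing order.
    """
    strata = [[] for _ in range(len(boundaries) + 1)]
    for value in x:
        strata[bisect_right(boundaries, value)].append(value)
    return strata


def anticipated_n(x, boundaries, cv, q=(0.5, 0.0, 0.5), take_all=True):
    """Sample size needed to reach a target CV for the total of Y = X.

    Assumed allocation rule: a_h proportional to
    N_h**(2*q1) * mean_h**(2*q2) * S_h**(2*q3), so that
    q = (0.5, 0, 0) is proportional and q = (0.5, 0, 0.5) is Neyman.
    Values of x are assumed positive.
    """
    strata = split_strata(x, boundaries)
    if any(len(s) == 0 for s in strata):
        raise ValueError("every stratum must contain at least one unit")
    total = sum(x)
    # The last stratum is a census (take-all) stratum when requested.
    sampled = strata[:-1] if take_all else strata
    n_census = len(strata[-1]) if take_all else 0

    sizes = [len(s) for s in sampled]
    means = [mean(s) for s in sampled]
    variances = [variance(s) if len(s) > 1 else 0.0 for s in sampled]

    q1, q2, q3 = q
    gamma = [n_h ** (2 * q1) * m_h ** (2 * q2) * s2_h ** q3
             for n_h, m_h, s2_h in zip(sizes, means, variances)]
    g_sum = sum(gamma)

    # Strata with zero variance contribute nothing to the numerator.
    numerator = sum(n_h ** 2 * s2_h * g_sum / g_h
                    for n_h, s2_h, g_h in zip(sizes, variances, gamma)
                    if s2_h > 0)
    denominator = (cv * total) ** 2 + sum(
        n_h * s2_h for n_h, s2_h in zip(sizes, variances))
    return n_census + numerator / denominator
```

A boundary search, such as the Sethi or Kozak approach named in the abstract, would call a function of this kind repeatedly while moving the cut points; that search is not sketched here.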

[1] J. Horgan et al. Improving the Lavallée and Hidiroglou Algorithm for Stratification of Skewed Populations, 2007.

[2] Marcin Kozak et al. Optimal Stratification Using Random Search Method in Agricultural Surveys, 2004.

[3] R. G. Cornell et al. Quantifying Gains from Stratification for Optimum and Approximately Optimum Strata Using a Bivariate Normal Model, 1976.

[4] M. Kozak et al. Geometric Versus Optimization Approach to Stratification: A Comparison of Efficiency, 2006.

[5] S. Dayal. Allocation of Sample Using Values of Auxiliary Characteristic, 1985.

[6] K. P. Srinath et al. Problems Associated with Designing Subannual Business Surveys, 1993.

[7] J. L. Hodges et al. Minimum Variance Stratification, 1959.

[8] C. Trivisano et al. Efficient Stratification Based on Nonparametric Regression Methods, 2007.

[9] Michael D. Bankier et al. Power Allocations: Determining Sample Sizes for Subnational Areas, 1988.

[10] G. J. Glasser. On the Complete Coverage of Large Units in a Statistical Study, 1962.

[11] Louis-Paul Rivest et al. A Generalization of the Lavallée and Hidiroglou Algorithm for Stratification in Business Surveys, 2002.

[12] N. L. Johnson et al. Continuous Univariate Distributions, 1995.

[13] Carl-Erik Särndal et al. Model Assisted Survey Sampling, 1997.

[14] V. K. Sethi et al. A Note on Optimum Stratification of Populations for Estimating the Population Means, 1963.

[15] Jane M. Horgan et al. Stratification of Skewed Populations: A Review, 2006.

[16] M. Hidiroglou. The Construction of a Self-Representing Stratum of Large Units in Survey Design, 1986.

[17] Michael A. Hidiroglou et al. Sampling and Estimation Issues for Annual and Sub-annual Canadian Business Surveys, 2001.