Impact of Non-Weighting in the Analysis of Data Obtained from Complex Samples

Objective: To compare estimates obtained with and without accounting for the sampling weights. Material and Methods: Secondary data from the Oral Health Survey of the State of Sao Paulo (SBSP2015) were used to calculate means, standard errors of the mean and confidence intervals (CI) for the DMFT index and its components (decayed, missing and filled) in the 35-44 age group. Multiple logistic regression models were fitted with and without the weights from the sampling plan (p<0.05). Results: The estimates of the DMFT index and of the decayed component varied little between the weighted and unweighted analyses (1.1% and 2.0%, respectively). The missing and filled components, however, showed larger differences between the means. Overall, the means differed by up to 6.7%, in either direction, between the weighted and unweighted analyses. The unweighted analysis underestimated the standard error, and the confidence intervals differed between the two analyses. The regression models obtained from the weighted and unweighted analyses also differed. Conclusion: Although the weighted and unweighted analyses differed by less than 10% in the estimates of the mean, the confidence intervals and the statistical inferences were different. Weighting should therefore be applied when analyzing population-based data collected with complex sampling designs.
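The comparison described above can be illustrated with a minimal sketch. It is not the authors' analysis: the records are hypothetical (not SBSP2015 data), the function names are illustrative, and the design-based standard error assumes a single stratum with clusters treated as sampled with replacement (a Taylor-linearization approximation). Real surveys would normally be analyzed with dedicated survey software that also handles stratification.

```python
import math
from collections import defaultdict

# Hypothetical records: (cluster id, sampling weight, DMFT score).
# These values are made up for illustration only.
records = [
    ("A", 1.0, 10), ("A", 1.0, 12),
    ("B", 3.0, 20), ("B", 3.0, 22),
    ("C", 2.0, 15), ("C", 2.0, 17),
]

def unweighted_mean_se(rows):
    """Naive mean and standard error, ignoring weights and clustering."""
    ys = [y for _, _, y in rows]
    n = len(ys)
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / (n - 1)
    return mean, math.sqrt(var / n)

def weighted_mean_se(rows):
    """Weighted mean with a linearized, cluster-based standard error.

    Assumes one stratum and with-replacement sampling of clusters.
    """
    total_w = sum(w for _, w, _ in rows)
    mean = sum(w * y for _, w, y in rows) / total_w
    # Sum the linearized residuals within each cluster.
    cluster_sums = defaultdict(float)
    for cluster, w, y in rows:
        cluster_sums[cluster] += w * (y - mean) / total_w
    k = len(cluster_sums)
    # The residuals sum to zero overall, so no centering term is needed.
    var = k / (k - 1) * sum(z ** 2 for z in cluster_sums.values())
    return mean, math.sqrt(var)

if __name__ == "__main__":
    naive_mean, naive_se = unweighted_mean_se(records)
    design_mean, design_se = weighted_mean_se(records)
    print(f"unweighted: mean={naive_mean:.3f}, SE={naive_se:.3f}")
    print(f"weighted:   mean={design_mean:.3f}, SE={design_se:.3f}")
```

In this toy data set the unweighted standard error is smaller than the design-based one, which is the same direction of bias the abstract reports; the size of the gap depends entirely on the made-up weights and clusters.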
