Modelling Complex Survey Data Using R, SAS, SPSS and Stata: A Comparison Using CLSA Datasets.

The R software has become popular among researchers because of its flexibility and open-source nature. However, researchers in public health and epidemiology are more accustomed to commercial statistical software such as SAS, SPSS and Stata. This paper provides a comprehensive comparison of the analysis of health survey data using the R survey package, SAS, SPSS and Stata. We describe detailed R code, together with the corresponding procedures in the other software packages, for commonly encountered statistical analyses, such as estimation of population means and regression analysis, using datasets from the Canadian Longitudinal Study on Aging (CLSA). We hope that the paper stimulates interest among health science researchers in carrying out data analysis using R, and that it also serves as a cookbook for statistical analysis across different software packages.