Measurement invariance of the SF-12 among different demographic groups: The HELIUS study

Aim To investigate whether the items of the SF-12, which is widely used to assess health outcomes in clinical practice and public health research, provide unbiased measurements of the underlying constructs across demographic groups defined by gender, age, educational level and ethnicity.

Methods We included 23,146 men and women aged 18–70 of Dutch, South-Asian Surinamese, African Surinamese, Ghanaian, Turkish, or Moroccan origin from the HELIUS study. To establish the comparability of SF-12 items across demographic groups, we conducted both multiple group confirmatory factor analyses (MGCFA) with increasingly stringent model constraints (i.e. assessing configural, metric, strong and strict measurement invariance (MI)) and regression analyses.

Results MI regarding gender, age and education was tested in the ethnic Dutch group (N = 4,615). In each subsequent step of MI testing, the change in goodness-of-fit measures did not exceed 0.010 (RMSEA) or 0.004 (CFI). Moreover, goodness-of-fit indices showed good fit for the strict invariance models: RMSEA < 0.055; CFI > 0.97. Regarding ethnicity, RMSEA values of the metric and subsequent models exceeded 0.055, indicating violation of measurement invariance in factor loadings, thresholds and residual variances. Regression analysis revealed possible age-, education- and ethnicity-related differential item functioning (DIF). Adjustment for this DIF had little impact on the magnitude of age and educational differences in physical and mental health, but ethnic inequalities in physical health, and to a lesser extent in mental health, were reduced after DIF adjustment.

Conclusions We found no evidence of violation of measurement invariance of the SF-12 regarding gender, age and educational level. Even if minor DIF remained undetected in our MGCFA, we showed that it would have a negligible effect on the magnitude of demographic health inequalities. Regarding ethnicity, the SF-12 was not measurement invariant. After accounting for DIF, we observed a reduction of ethnic inequalities in health, in particular in physical health. Caution is warranted when comparing SF-12 scores across population groups with various ethnic backgrounds.
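
The stepwise logic of MI testing described above (configural, then metric, strong and strict models, each judged on its fit and on the change in fit relative to the previous step) can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The `Fit` class and `highest_invariance_level` function are hypothetical names, the default thresholds are simply the figures quoted in the abstract used as illustrative cutoffs, and the fit values in the usage lines are made up rather than taken from the study.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Fit:
    """Fit indices of one MGCFA model (hypothetical container)."""
    name: str      # e.g. "configural", "metric", "strong", "strict"
    rmsea: float
    cfi: float


def highest_invariance_level(
    fits: Sequence[Fit],
    max_rmsea: float = 0.055,    # assumed absolute-fit cutoff (from the abstract)
    min_cfi: float = 0.97,       # assumed absolute-fit cutoff (from the abstract)
    max_d_rmsea: float = 0.010,  # assumed tolerated increase in RMSEA per step
    max_d_cfi: float = 0.004,    # assumed tolerated decrease in CFI per step
) -> Optional[str]:
    """Return the most constrained model that is still acceptable.

    `fits` must be ordered from least to most constrained. The first
    model is judged on absolute fit only; every later model must also
    stay within the tolerated change relative to the previous model.
    Returns None if even the first model fits poorly.
    """
    supported = None
    previous = None
    for fit in fits:
        absolute_ok = fit.rmsea < max_rmsea and fit.cfi > min_cfi
        change_ok = previous is None or (
            fit.rmsea - previous.rmsea <= max_d_rmsea
            and previous.cfi - fit.cfi <= max_d_cfi
        )
        if not (absolute_ok and change_ok):
            break  # more constrained models are not evaluated further
        supported = fit.name
        previous = fit
    return supported


# Made-up fit values, for illustration only (not results from the study).
invariant_case = [
    Fit("configural", 0.040, 0.990),
    Fit("metric", 0.043, 0.989),
    Fit("strong", 0.046, 0.987),
    Fit("strict", 0.050, 0.985),
]
non_invariant_case = [
    Fit("configural", 0.050, 0.985),
    Fit("metric", 0.060, 0.980),
]

print(highest_invariance_level(invariant_case))
print(highest_invariance_level(non_invariant_case))
```

The sketch stops at the first model that fails, because each invariance level presupposes the one before it. In practice the fit indices would come from SEM software, and the cutoffs should be chosen to suit the estimator and the categorical nature of the items.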
