Propensity score methods to adjust for confounding in assessing treatment effects: bias and precision

Interest in propensity score (PS) methods for confounding control is increasing. In pharmacoepidemiological studies, adjusted treatment effects are generally estimated in one of three ways: 1) stratification on the PS, 2) matching on the PS, and 3) including the PS as a covariate. To assess the bias and precision of these methods, we conducted simulations in three scenarios: 1) treatment had no effect but the crude estimate showed a protective effect; 2) treatment was protective and the crude estimate was more extreme; and 3) treatment increased the risk but the crude estimate showed a protective effect. With every method, adjusting for confounders shifted the effect estimates toward the true values. Adjusted odds ratios from PS stratification and from using the PS as a covariate were biased because of either residual confounding or over-adjustment. Matching on the PS produced less biased average estimates than the other methods, but its effect estimates were less precise.
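The comparison described above can be illustrated with a small simulation of the first scenario (no true effect, protective crude estimate). This is only a sketch under stated assumptions, not the authors' simulation design: the two confounders, all coefficients, the sample size, the five PS strata, the Mantel-Haenszel pooling, and nearest-neighbour matching with replacement are hypothetical choices made for illustration, and the helper names (`simulate`, `compare`, and so on) are invented here.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; X must already hold an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, X.T @ (y - p))
    return beta

def simulate(n, true_log_or, rng):
    """Assumed data-generating model: two confounders raise the chance of
    treatment and lower the outcome risk, so the crude estimate looks protective."""
    c = rng.normal(size=(n, 2))
    t = rng.binomial(1, expit(-0.5 + 0.8 * c[:, 0] + 0.8 * c[:, 1]))
    y = rng.binomial(1, expit(-1.0 + true_log_or * t - 0.8 * c[:, 0] - 0.8 * c[:, 1]))
    return c, t, y

def or_stratified(t, y, ps, n_strata=5):
    """Mantel-Haenszel odds ratio pooled over quantile strata of the PS."""
    edges = np.quantile(ps, np.linspace(0, 1, n_strata + 1)[1:-1])
    stratum = np.searchsorted(edges, ps)
    num = den = 0.0
    for s in range(n_strata):
        m = stratum == s
        ts, ys, n_s = t[m], y[m], m.sum()
        num += np.sum(ts * ys) * np.sum((1 - ts) * (1 - ys)) / n_s
        den += np.sum(ts * (1 - ys)) * np.sum((1 - ts) * ys) / n_s
    return num / den

def or_matched(t, y, ps):
    """1:1 nearest-neighbour matching on the PS, with replacement (a simplification)."""
    treated = np.flatnonzero(t == 1)
    control = np.flatnonzero(t == 0)
    order = control[np.argsort(ps[control])]
    sorted_ps = ps[order]
    pos = np.searchsorted(sorted_ps, ps[treated])
    left = np.clip(pos - 1, 0, len(order) - 1)
    right = np.clip(pos, 0, len(order) - 1)
    use_right = np.abs(sorted_ps[right] - ps[treated]) < np.abs(sorted_ps[left] - ps[treated])
    match = order[np.where(use_right, right, left)]
    y1, y0 = y[treated], y[match]
    return (y1.sum() * (len(y0) - y0.sum())) / ((len(y1) - y1.sum()) * y0.sum())

def compare(n=5000, true_log_or=0.0, seed=0):
    """Return crude and PS-adjusted odds ratios for one simulated data set."""
    rng = np.random.default_rng(seed)
    c, t, y = simulate(n, true_log_or, rng)
    ones = np.ones(n)
    design = np.column_stack([ones, c])
    ps = expit(design @ fit_logistic(design, t))  # estimated propensity score
    crude = np.exp(fit_logistic(np.column_stack([ones, t]), y)[1])
    covariate = np.exp(fit_logistic(np.column_stack([ones, t, ps]), y)[1])
    return {
        "crude": crude,
        "stratified": or_stratified(t, y, ps),
        "matched": or_matched(t, y, ps),
        "covariate": covariate,
    }

if __name__ == "__main__":
    for name, value in compare().items():
        print(f"{name:>10s}: OR = {value:.2f}")
```

Calling `compare` over many seeds and summarising the log odds ratios (mean for bias, standard deviation for precision) mirrors the kind of comparison the abstract reports. Passing a non-zero `true_log_or` mimics the second and third scenarios, with the caveat that this value is conditional on the confounders, so the matched estimate then targets a somewhat different (marginal) odds ratio.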
