Linear mixed models with endogenous covariates: modeling sequential treatment effects with application to a mobile health study.

Mobile health is a rapidly developing field in which behavioral treatments are delivered to individuals via wearables or smartphones to facilitate health-related behavior change. Micro-randomized trials (MRTs) are an experimental design for developing mobile health interventions. In an MRT, the treatments are randomized numerous times for each individual over the course of the trial. Along with assessing treatment effects, behavioral scientists aim to understand between-person heterogeneity in the treatment effect. A natural approach is the familiar linear mixed model. However, directly applying linear mixed models is problematic because potential moderators of the treatment effect are frequently endogenous; that is, they may depend on prior treatment. We discuss model interpretation and the biases that arise in the absence of additional assumptions when endogenous covariates are included in a linear mixed model. In particular, when there are endogenous covariates, the coefficients no longer have the customary marginal interpretation. However, these coefficients still have a conditional-on-the-random-effect interpretation. We provide an additional assumption that, if true, allows scientists to use standard software to fit linear mixed models with endogenous covariates and to obtain person-specific predictions of effects. As an illustration, we assess the effect of activity suggestions in the HeartSteps MRT and analyze the between-person heterogeneity in the treatment effect.
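
To make the distinction between the marginal and the conditional-on-the-random-effect interpretation concrete, the following is a minimal sketch of one linear mixed model of the kind described above. The notation is an assumption introduced here for illustration and is not taken from the paper: $Y_{i,t+1}$ is the outcome for person $i$ after decision point $t$, $A_{it}$ is the randomized treatment indicator, $X_{it}$ is a possibly endogenous moderator, and $b_i = (b_{i0}, b_{i1})$ are person-specific random effects.

```latex
% Illustrative model (assumed notation, not necessarily the paper's specification)
\begin{align*}
Y_{i,t+1} &= \alpha_0 + \alpha_1 X_{it} + b_{i0}
             + A_{it}\,(\beta_0 + \beta_1 X_{it} + b_{i1}) + \epsilon_{i,t+1}, \\
% Conditional on the random effect, the treatment effect for person i is
\text{effect}_i(X_{it}) &= \beta_0 + \beta_1 X_{it} + b_{i1}. \\
% Averaging over b_i given the observed covariate and treatment:
E\!\left[Y_{i,t+1} \mid X_{it}, A_{it}\right]
  &= \alpha_0 + \alpha_1 X_{it} + A_{it}\,(\beta_0 + \beta_1 X_{it})
     + E\!\left[b_{i0} + A_{it}\, b_{i1} \mid X_{it}, A_{it}\right].
\end{align*}
```

When $X_{it}$ is exogenous and independent of $b_i$, the last expectation is zero and $\beta_0 + \beta_1 X_{it}$ is also the marginal treatment effect. When $X_{it}$ depends on earlier outcomes or treatments, and hence on $b_i$, that expectation is generally nonzero, which is one way to see why the coefficients keep only their conditional interpretation.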
