Fused lasso with the adaptation of parameter ordering in combining multiple studies with repeated measurements

Combining multiple studies is frequently undertaken in biomedical research to increase sample sizes and thereby improve statistical power. We consider the marginal model for the regression analysis of repeated measurements collected in several similar studies with potentially different variances and correlation structures. It is of great importance to examine whether there exist common parameters across study‐specific marginal models, so that simpler models, sensible interpretations, and meaningful efficiency gains can be obtained. Combining multiple studies via classical hypothesis testing involves a large number of simultaneous tests for all possible subsets of common regression parameters, which results in unduly large degrees of freedom and low statistical power. We develop a new method, the fused lasso with the adaptation of parameter ordering (FLAPO), which scrutinizes only adjacent‐pair parameter differences, leading to a substantial reduction in the number of constraints involved. Our method enjoys the oracle properties, as does the full fused lasso based on all pairwise parameter differences. We show that FLAPO gives estimators with smaller error bounds and better finite‐sample performance than the full fused lasso. We also establish a regularized inference procedure based on bias‐corrected FLAPO. We illustrate our method through both simulation studies and an analysis of HIV surveillance data collected over five geographic regions in China, in which the presence or absence of common covariate effects reflects the relative effectiveness of regional policies on HIV control and prevention.
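To make the contrast between adjacent‐pair and all‐pairwise differences concrete, the following is a minimal Python sketch, not the estimator developed in the paper. It assumes that a single covariate has one coefficient per study, that the ordering is taken from some initial study‐specific estimates, and that the penalty is an unweighted L1 norm on differences. The function names `adjacent_pair_penalty`, `full_pairwise_penalty`, and `n_constraints`, and the tuning parameter `lam`, are hypothetical names introduced here for illustration.

```python
import numpy as np
from itertools import combinations


def adjacent_pair_penalty(beta_init, beta, lam=1.0):
    """L1 penalty on adjacent differences of beta, after ordering the
    studies by their initial estimates beta_init (illustrative assumption)."""
    beta_init = np.asarray(beta_init, dtype=float)
    beta = np.asarray(beta, dtype=float)
    order = np.argsort(beta_init)      # ordering taken from initial estimates
    diffs = np.diff(beta[order])       # K - 1 adjacent differences
    return lam * float(np.sum(np.abs(diffs)))


def full_pairwise_penalty(beta, lam=1.0):
    """L1 penalty on all K(K-1)/2 pairwise differences, for comparison."""
    beta = np.asarray(beta, dtype=float)
    return lam * float(sum(abs(beta[i] - beta[j])
                           for i, j in combinations(range(len(beta)), 2)))


def n_constraints(K):
    """Number of penalized differences per covariate for K studies."""
    return {"adjacent": K - 1, "all_pairs": K * (K - 1) // 2}


if __name__ == "__main__":
    # Five studies, as in the five-region application: 4 versus 10 differences.
    print(n_constraints(5))
```

With five studies, the adjacent‐pair scheme penalizes 4 differences per covariate and the all‐pairwise scheme penalizes 10. The gap grows quadratically with the number of studies.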
