Measurement error correction in the least absolute shrinkage and selection operator model when validation data are available

Serum biomarker measurements from multiplex assays may be more variable than those from single-biomarker assays. Measurement error in these data may bias parameter estimates in regression analysis, which could mask true associations between serum biomarkers and an outcome. The Least Absolute Shrinkage and Selection Operator (LASSO) can be used for variable selection in these high-dimensional data. Furthermore, when the distribution of the measurement error is assumed to be known or is estimated from replication data, a simple measurement error correction method can be applied to the LASSO. In practice, however, the distribution of the measurement error is unknown, and estimating it through replication is expensive, both in monetary cost and in the larger amount of sample required, which is often limited in quantity. We adapt an existing bias correction approach by estimating the measurement error from validation data in which a subset of serum biomarkers is re-measured on a random subset of the study sample. We evaluate this method using simulated data and data from the Tucson Epidemiological Study of Airway Obstructive Disease (TESAOD). We show that the bias in parameter estimation is reduced and variable selection is improved.
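The general idea of correcting the LASSO with an error variance estimated from validation data can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes additive, independent measurement error, treats the re-measurement as error-free, re-measures every biomarker on the validation subset (the abstract describes only a subset of biomarkers), and subtracts the estimated error variance from the diagonal of the Gram matrix before running coordinate descent. The function names, the penalty value `lam`, and all simulation settings are hypothetical.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used in LASSO coordinate descent."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def corrected_lasso(W, y, sigma_u2, lam, n_iter=200):
    """Coordinate descent on 0.5*b'Gb - r'b + lam*||b||_1,
    where G = W'W/n - diag(sigma_u2) is a bias-corrected Gram matrix.
    Passing sigma_u2 = 0 gives the naive (uncorrected) LASSO."""
    n, p = W.shape
    G = W.T @ W / n - np.diag(sigma_u2)
    if np.any(np.diag(G) <= 0):
        raise ValueError("corrected Gram matrix has a non-positive diagonal")
    r = W.T @ y / n
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # correlation with the partial residual that excludes coordinate j
            rho = r[j] - G[j] @ beta + G[j, j] * beta[j]
            beta[j] = soft_threshold(rho, lam) / G[j, j]
    return beta


# --- simulated illustration (all settings are assumptions) ---
rng = np.random.default_rng(0)
n, p = 2000, 5
beta_true = np.array([1.0, 0.5, 0.0, 0.0, 0.0])
X = rng.normal(size=(n, p))                    # true biomarker levels
y = X @ beta_true + rng.normal(scale=0.5, size=n)
W = X + rng.normal(scale=0.5, size=(n, p))     # error-prone measurements

# Validation subset: a random subset of subjects is re-measured,
# and the re-measurement is treated as error-free in this sketch.
val_idx = rng.choice(n, size=300, replace=False)
sigma_u2_hat = np.var(W[val_idx] - X[val_idx], axis=0, ddof=1)

lam = 0.05
beta_naive = corrected_lasso(W, y, np.zeros(p), lam)
beta_corr = corrected_lasso(W, y, sigma_u2_hat, lam)
```

In this simulated setting, the naive estimate of the first coefficient is attenuated toward zero, and the corrected estimate lies closer to the true value. The corrected objective is only convex while the corrected Gram matrix stays positive definite, which is why the sketch uses many more subjects than biomarkers; a genuinely high-dimensional problem would need an additional constraint or a different solver.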
