Modeling measurement error in tumor characterization studies

Background: Etiologic studies of cancer increasingly use molecular features such as gene expression, DNA methylation and sequence mutation to subclassify the cancer type. In large population-based studies, the tumor tissues available for study are archival specimens that provide variable amounts of amplifiable DNA for molecular analysis. As molecular features measured from small amounts of tumor DNA are inherently noisy, we propose a novel approach to improve statistical efficiency when comparing groups of samples. We illustrate the phenomenon using the MethyLight technology, applying our proposed analysis to compare MLH1 DNA methylation levels in males and females studied in the Colon Cancer Family Registry.

Results: We introduce two methods for computing empirical weights to model heteroscedasticity that is caused by sampling variable quantities of DNA for molecular analysis. In a simulation study, we show that using these weights in a linear regression model is more powerful for identifying differentially methylated loci than standard regression analysis. The increase in power depends on the underlying relationship between variation in outcome measure and input DNA quantity in the study samples.

Conclusions: Tumor characteristics measured from small amounts of tumor DNA are inherently noisy. We propose a statistical analysis that accounts for the measurement error due to sampling variation of the molecular feature and show how it can improve the power to detect differential characteristics between patient groups.
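To make the idea of weighting a linear regression by input DNA quantity concrete, the following is a minimal sketch of weighted least squares in Python with NumPy. It is not the authors' method: the abstract does not describe how the two empirical weights are computed, so the sketch simply assumes, for illustration, that the outcome variance is inversely proportional to the DNA quantity and uses the quantity itself as the weight. The names `weighted_least_squares`, `simulate`, `dna`, and all simulation settings are hypothetical.

```python
import numpy as np


def weighted_least_squares(X, y, w):
    """Minimize sum_i w_i * (y_i - X_i b)^2; return (coefficients, standard errors)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    # Scaling rows by sqrt(w) turns the weighted problem into ordinary least squares.
    sw = np.sqrt(w)
    Xw = X * sw[:, None]
    yw = y * sw
    beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    resid = yw - Xw @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(Xw.T @ Xw)
    return beta, np.sqrt(np.diag(cov))


def simulate(n=200, effect=0.5, seed=0):
    """Simulate a two-group comparison with noise that grows as DNA input shrinks."""
    rng = np.random.default_rng(seed)
    group = rng.integers(0, 2, size=n)        # e.g. 0 = female, 1 = male
    dna = rng.uniform(0.05, 1.0, size=n)      # hypothetical input DNA quantity
    noise_sd = 1.0 / np.sqrt(dna)             # ASSUMPTION: variance proportional to 1/dna
    y = effect * group + rng.normal(0.0, noise_sd)
    X = np.column_stack([np.ones(n), group])  # intercept + group indicator
    return X, y, dna


X, y, dna = simulate()
# Unit weights reproduce standard (unweighted) regression.
beta_ols, se_ols = weighted_least_squares(X, y, np.ones_like(dna))
# ASSUMPTION: weights proportional to DNA quantity (inverse of the assumed variance).
beta_wls, se_wls = weighted_least_squares(X, y, dna)
print("unweighted group effect:", beta_ols[1], "SE:", se_ols[1])
print("weighted group effect:  ", beta_wls[1], "SE:", se_wls[1])
```

Under this assumed variance model, samples with more input DNA contribute more to the group comparison. How much the weighted fit gains over the unweighted one depends on how strongly the noise actually varies with DNA quantity, which is the dependence the abstract describes.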
