Considerations for observational research using large data sets in radiation oncology.

The radiation oncology community has witnessed growing interest in observational research conducted using large-scale data sources such as registries and claims-based data sets. With the growing emphasis on observational analyses in health care, the radiation oncology community must possess a sophisticated understanding of the methodological considerations of such studies in order to evaluate evidence appropriately to guide practice and policy. Because observational research has unique features that distinguish it from clinical trials and other forms of traditional radiation oncology research, the International Journal of Radiation Oncology, Biology, Physics assembled a panel of experts in health services research to provide a concise and well-referenced review, intended to be informative for the lay reader as well as for scholars who wish to embark on such research without prior experience. This review begins by discussing the types of research questions relevant to radiation oncology that large-scale databases may help illuminate. It then describes major potential data sources for such endeavors, including information regarding access and insights regarding the strengths and limitations of each. Finally, it provides guidance regarding the analytical challenges that observational studies must confront, along with discussion of the techniques that have been developed to help minimize the impact of certain common analytical issues in observational analysis. Features characterizing a well-designed observational study include clearly defined research questions, careful selection of an appropriate data source, consultation with investigators with relevant methodological expertise, inclusion of sensitivity analyses, caution not to overinterpret small but statistically significant differences, and recognition of limitations when trying to evaluate causality. This review concludes that carefully designed and executed observational studies that possess these qualities hold substantial promise for advancing our understanding of many unanswered questions of importance to the field of radiation oncology.
