Reporting to Improve Reproducibility and Facilitate Validity Assessment for Healthcare Database Studies V1.0.

PURPOSE Defining a study population and creating an analytic dataset from longitudinal healthcare databases involves many decisions. Our objective was to catalogue the scientific decisions underpinning study execution that should be reported to facilitate replication and enable assessment of the validity of studies conducted in large healthcare databases.

METHODS We reviewed the key investigator decisions required to operate a sample of macros and software tools designed to create and analyze analytic cohorts from longitudinal streams of healthcare data. A panel of academic, regulatory, and industry experts in healthcare database analytics discussed and added to this list.

CONCLUSION Evidence generated from large healthcare encounter and reimbursement databases is increasingly being sought by decision-makers. Varied terminology is used around the world for the same concepts. Agreeing on terminology, and on which parameters from a large catalogue are most essential to report for replicable research, would improve transparency and facilitate assessment of validity. At a minimum, reporting for a database study should give clear operational definitions for key temporal anchors and their relation to each other when creating the analytic dataset, accompanied by an attrition table and a design diagram. Greater transparency about the operational study parameters used to create analytic datasets from longitudinal healthcare databases could substantially improve the reproducibility, rigor, and confidence in real-world evidence generated from those databases.
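As an illustration of the minimum reporting described above, the following is a minimal sketch of how an attrition table might be derived while applying inclusion criteria anchored on a cohort entry (index) date. It is not taken from the paper: the patient records, the field names (`enroll_start`, `index_date`, `prior_use`), the 180-day baseline window, and the function `attrition_table` are all hypothetical assumptions chosen for the example.

```python
from datetime import date, timedelta

# Hypothetical patient records; every field name here is an illustrative assumption.
patients = [
    {"id": 1, "enroll_start": date(2015, 1, 1), "index_date": date(2016, 3, 1), "prior_use": False},
    {"id": 2, "enroll_start": date(2016, 1, 15), "index_date": date(2016, 4, 1), "prior_use": False},
    {"id": 3, "enroll_start": date(2014, 6, 1), "index_date": date(2016, 5, 1), "prior_use": True},
    {"id": 4, "enroll_start": date(2013, 1, 1), "index_date": date(2016, 6, 1), "prior_use": False},
]

# Assumed length of the baseline window, measured backward from the index date.
BASELINE_DAYS = 180

# Each criterion is a (label, predicate) pair; criteria are applied in this order,
# and the order itself is one of the decisions worth reporting.
criteria = [
    ("Enrolled >= 180 days before index date",
     lambda p: p["index_date"] - p["enroll_start"] >= timedelta(days=BASELINE_DAYS)),
    ("No use of study drug during baseline window",
     lambda p: not p["prior_use"]),
]


def attrition_table(cohort, criteria):
    """Return rows of (step label, patients remaining, patients excluded at this step)."""
    rows = [("Initial population", len(cohort), 0)]
    for label, keep in criteria:
        remaining = [p for p in cohort if keep(p)]
        rows.append((label, len(remaining), len(cohort) - len(remaining)))
        cohort = remaining
    return rows


if __name__ == "__main__":
    for label, remaining, excluded in attrition_table(patients, criteria):
        print(f"{label:<50} remaining={remaining} excluded={excluded}")
```

Writing each criterion as an explicit, labeled step makes the temporal anchor (here the index date) and the window defined relative to it visible in the report, which is the kind of operational detail the abstract asks authors to state.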
