A Method for Analyzing Commonalities in Clinical Trial Target Populations

ClinicalTrials.gov offers an opportunity to analyze commonalities in clinical trial target populations, both to facilitate knowledge reuse when designing eligibility criteria for future trials and to reveal potential systematic biases in the selection of population subgroups for clinical research. Toward this goal, this paper presents a novel data resource that enables such analyses. Our method has two parts: (1) parsing and indexing eligibility criteria text; and (2) mining common eligibility features and the attributes of common numeric features (e.g., A1c). We designed and built a database called "Commonalities in Target Populations of Clinical Trials" (COMPACT), which stores structured eligibility criteria and trial metadata in a readily computable format. We illustrate its use with an example analytic module, CONECT, which uses COMPACT as its backend. Using type 2 diabetes as an example, we analyze commonalities in the target populations of 4,493 clinical trials on this disease.
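To make the two-part method concrete, the following is a minimal sketch, not the COMPACT implementation. It assumes a tiny hypothetical keyword vocabulary (`FEATURE_PATTERNS`), a simple regular expression for A1c thresholds (`NUMERIC_A1C`), and three invented criteria strings (`TRIALS`); none of these names or data come from the paper, whose actual parsing and indexing pipeline is not described in this abstract.

```python
import re
from collections import Counter

# Hypothetical toy vocabulary of eligibility features (an assumption for
# illustration; the paper's real feature index is not shown here).
FEATURE_PATTERNS = {
    "a1c": re.compile(r"\b(?:hb)?a1c\b", re.IGNORECASE),
    "bmi": re.compile(r"\bbmi\b", re.IGNORECASE),
    "pregnancy": re.compile(r"\bpregnan(?:t|cy)\b", re.IGNORECASE),
}

# Matches simple comparisons such as "A1c >= 7.0" or "HbA1c < 10".
NUMERIC_A1C = re.compile(
    r"\b(?:hb)?a1c\s*(<=|>=|<|>)\s*(\d+(?:\.\d+)?)", re.IGNORECASE
)


def feature_frequencies(trials):
    """Count, for each feature, how many trials mention it at least once."""
    counts = Counter()
    for text in trials.values():
        for name, pattern in FEATURE_PATTERNS.items():
            if pattern.search(text):
                counts[name] += 1
    return counts


def a1c_bounds(trials):
    """Collect (operator, value) pairs for A1c thresholds, keyed by trial."""
    bounds_by_trial = {}
    for trial_id, text in trials.items():
        bounds = [(op, float(val)) for op, val in NUMERIC_A1C.findall(text)]
        if bounds:
            bounds_by_trial[trial_id] = bounds
    return bounds_by_trial


# Invented example criteria text, keyed by made-up trial identifiers.
TRIALS = {
    "T1": "Inclusion: HbA1c >= 7.0 and HbA1c <= 10.0. Exclusion: pregnant women.",
    "T2": "Inclusion: BMI 25 to 40; A1c > 6.5.",
    "T3": "Exclusion: pregnancy or lactation.",
}

if __name__ == "__main__":
    print(feature_frequencies(TRIALS))
    print(a1c_bounds(TRIALS))
```

In this sketch, step (1) is reduced to keyword matching over free text and step (2) to counting trials per feature and collecting numeric thresholds. A real pipeline would need concept normalization and more robust handling of ranges, units, and negation.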
