Since 1998, the US Food and Drug Administration (FDA) has been exploring new automated and rapid Bayesian data mining techniques. These techniques have been used to systematically screen the FDA's huge MedWatch database of voluntary reports of adverse drug events for possible events of concern.

The data mining method currently being used is the Multi-Item Gamma Poisson Shrinker (MGPS) program, which replaced the Gamma Poisson Shrinker (GPS) program we originally used with the legacy database. The MGPS algorithm, the technical aspects of which are summarised in this paper, computes signal scores for pairs and for higher-order (e.g. triplet, quadruplet) combinations of drugs and events that are significantly more frequent than their pair-wise associations would predict. MGPS generates consistent, redundant, and replicable signals while minimising random patterns. Signals are generated without using external exposure data, adverse event background information, or medical information on adverse drug reactions. The MGPS interface streamlines multiple input-output processes that previously had been manually integrated. The system, however, cannot distinguish between already-known associations and new associations, so the reviewers must filter these events.

In addition to detecting possible serious single-drug adverse event problems, MGPS is currently being evaluated to detect possible synergistic interactions between drugs (drug interactions) and adverse events (syndromes), and to detect differences among subgroups defined by gender and by age, such as paediatrics and geriatrics.

In the current data, only 3.4% of all 1.2 million drug-event pairs ever reported (with frequencies ≥ 1) generate signals, defined as a lower 95% confidence interval limit of the adjusted ratio of observed to expected (O/E) counts (denoted EB05) of ≥ 2.
The total frequency count that contributed to signals comprised 23% (2.4 million) of the 10.4 million drug-event pairs reported in total, which greatly facilitates a more focused follow-up and evaluation.

The algorithm provides an objective, systematic view of the data, alerting reviewers to critically important new safety signals. The study of signals detected by current methods, of signals stored in the Center for Drug Evaluation and Research's Monitoring Adverse Reports Tracking System, and of the signals regarding cerivastatin, a cholesterol-lowering drug voluntarily withdrawn from the market in August 2001, exemplifies the potential of data mining to improve early signal detection. The operating characteristics of data mining in detecting early safety signals, exemplified by studying a drug recently well characterised by large clinical trials, confirm our experience that the signals generated by data mining have high enough specificity to deserve further investigation. The application of these tools may ultimately improve usage recommendations.
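
To make the EB05 ≥ 2 criterion concrete, the following is a minimal sketch of the screening idea and not the MGPS implementation. It assumes a single gamma prior with fixed, hypothetical hyperparameters (`alpha`, `beta`) in place of the empirically fitted prior used by MGPS, computes expected counts from the drug and event marginal totals under independence, and approximates the lower 5% posterior bound of the O/E ratio by Monte Carlo sampling. The function names, drug and event labels, and report counts are all invented for illustration.

```python
import random
from collections import Counter


def observed_and_expected(reports):
    """Map each (drug, event) pair to (observed count, expected count).

    The expected count assumes drug and event are reported independently:
    E = (reports naming the drug) * (reports naming the event) / (all reports).
    """
    pair_n = Counter(reports)
    drug_n = Counter(drug for drug, _ in reports)
    event_n = Counter(event for _, event in reports)
    total = len(reports)
    return {
        (drug, event): (n, drug_n[drug] * event_n[event] / total)
        for (drug, event), n in pair_n.items()
    }


def lower_bound_05(n, e, alpha=1.0, beta=1.0, draws=20000, seed=0):
    """Approximate 5th percentile of the posterior of the true O/E ratio.

    Assumed model (a simplification): N ~ Poisson(lambda * E) with
    lambda ~ Gamma(shape=alpha, rate=beta), so the posterior is
    Gamma(shape=alpha + N, rate=beta + E).
    """
    rng = random.Random(seed)
    shape, scale = alpha + n, 1.0 / (beta + e)  # gammavariate takes a scale
    samples = sorted(rng.gammavariate(shape, scale) for _ in range(draws))
    return samples[int(0.05 * draws)]


# Hypothetical toy data: one (drug, event) tuple per report.
reports = (
    [("drug_a", "event_x")] * 40
    + [("drug_a", "event_y")] * 10
    + [("drug_b", "event_x")] * 10
    + [("drug_b", "event_y")] * 140
    + [("drug_c", "event_z")] * 1
)

table = observed_and_expected(reports)
scores = {
    pair: (n / e, lower_bound_05(n, e)) for pair, (n, e) in table.items()
}
# Flag a pair when the shrunken lower bound, not the raw ratio, reaches 2.
signals = sorted(pair for pair, (_, lb) in scores.items() if lb >= 2)

for pair, (ratio, lb) in sorted(scores.items()):
    print(pair, f"raw O/E={ratio:.2f}", f"lower bound={lb:.2f}")
print("signals:", signals)
```

In this toy data the single report of `drug_c` with `event_z` has a very large raw O/E ratio because its expected count is tiny, yet its lower bound stays below 2, so it is not flagged. Only the well-supported `drug_a` and `event_x` pair clears the threshold, which illustrates how shrinkage suppresses random patterns from sparse counts.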