Today's recommendations

2003 - Journal of Clinical Epidemiology

External validation is necessary in prediction research: a clinical example.

BACKGROUND AND OBJECTIVES: Prediction models tend to perform better on data on which the model was constructed than on new data. This difference in performance is an indication of the optimism in the apparent performance in the derivation set. For internal model validation, bootstrapping methods are recommended to provide bias-corrected estimates of model performance. Results are often accepted without sufficient regard to the importance of external validation. This report illustrates the limitations of internal validation to determine generalizability of a diagnostic prediction model to future settings.
METHODS: A prediction model for the presence of serious bacterial infections in children with fever without source was derived and validated internally using bootstrap resampling techniques. Subsequently, the model was validated externally.
RESULTS: In the derivation set (n=376), nine predictors were identified. The apparent area under the receiver operating characteristic curve (95% confidence interval) of the model was 0.83 (0.78-0.87) and 0.76 (0.67-0.85) after bootstrap correction. In the validation set (n=179) the performance was 0.57 (0.47-0.67).
CONCLUSION: For relatively small data sets, internal validation of prediction models by bootstrap techniques may not be sufficient and indicative for the model's performance in future patients. External validation is essential before implementing prediction models in clinical practice.
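The bootstrap correction mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common optimism-correction scheme (refit on each bootstrap sample, then subtract the average gap between bootstrap-sample and original-sample performance); the abstract does not say which variant the authors used. The data are simulated, and `fit` is a hypothetical stand-in (class-mean differences as weights), not the paper's logistic regression model.

```python
import random

def auc(scores, labels):
    """c-statistic: chance that a random event outranks a random nonevent (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit(X, y):
    """Hypothetical stand-in model: weight of each predictor = difference of class means."""
    n1 = sum(y)
    n0 = len(y) - n1
    weights = []
    for j in range(len(X[0])):
        mean1 = sum(x[j] for x, yi in zip(X, y) if yi == 1) / n1
        mean0 = sum(x[j] for x, yi in zip(X, y) if yi == 0) / n0
        weights.append(mean1 - mean0)
    return weights

def predict(weights, X):
    """Linear score for each row of X."""
    return [sum(w * v for w, v in zip(weights, x)) for x in X]

def optimism_corrected_auc(X, y, n_boot=200, seed=0):
    """Return (apparent AUC, bootstrap optimism-corrected AUC)."""
    rng = random.Random(seed)
    n = len(y)
    apparent = auc(predict(fit(X, y), X), y)
    optimism = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        if sum(yb) in (0, n):  # skip resamples that contain only one class
            continue
        w = fit(Xb, yb)
        # optimism = performance on the bootstrap sample minus performance on the original data
        optimism.append(auc(predict(w, Xb), yb) - auc(predict(w, X), y))
    return apparent, apparent - sum(optimism) / len(optimism)

# Simulated derivation set (assumption): 9 predictors, only the first is informative.
rng = random.Random(42)
X = [[rng.gauss(0, 1) for _ in range(9)] for _ in range(120)]
y = [1 if x[0] + rng.gauss(0, 2) > 1 else 0 for x in X]
apparent, corrected = optimism_corrected_auc(X, y)
print(f"apparent AUC = {apparent:.2f}, optimism-corrected AUC = {corrected:.2f}")
```

Note that such a correction only estimates performance in new samples from the same population; it cannot anticipate differences in case mix or measurement in another setting, which is the limitation the abstract reports.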

2005 - Journal of Clinical Epidemiology

Substantial effective sample sizes were required for external validation studies of predictive logistic regression models.

BACKGROUND AND OBJECTIVES: The performance of a prediction model is usually worse in external validation data compared to the development data. We aimed to determine at which effective sample sizes (i.e., number of events) relevant differences in model performance can be detected with adequate power.
METHODS: We used a logistic regression model to predict the probability that residual masses of patients treated for metastatic testicular cancer contained only benign tissue. We performed standard power calculations and Monte Carlo simulations to estimate the numbers of events that are required to detect several types of model invalidity with 80% power at the 5% significance level.
RESULTS: A validation sample with 111 events was required to detect that a model predicted too high probabilities, when predictions were on average 1.5 times too high on the odds scale. A decrease in discriminative ability of the model, indicated by a decrease in the c-statistic from 0.83 to 0.73, required 81 to 106 events, depending on the specific scenario.
CONCLUSION: We suggest a minimum of 100 events and 100 nonevents for external validation samples. Specific hypotheses may, however, require substantially higher effective sample sizes to obtain adequate power.
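The Monte Carlo approach described in this abstract can be sketched for one type of invalidity: predictions that are systematically too high by a factor of 1.5 on the odds scale. This is an illustrative sketch, not the authors' procedure. The distribution of the linear predictor (standard normal on the logit scale) and the test (a two-sided z-test of observed versus expected events) are assumptions, so the output is not expected to reproduce the 111 events reported in the paper.

```python
import math
import random

def simulate_power(n, odds_ratio=1.5, n_sim=300, seed=1):
    """Estimate power to detect overprediction in a validation sample of size n.

    Returns (estimated power, mean number of events per simulated sample).
    """
    rng = random.Random(seed)
    rejections = 0
    total_events = 0
    for _ in range(n_sim):
        observed = expected = variance = 0.0
        for _ in range(n):
            lp = rng.gauss(0.0, 1.0)               # hypothetical linear predictor
            p_pred = 1.0 / (1.0 + math.exp(-lp))    # probability predicted by the model
            # true odds are the predicted odds divided by odds_ratio
            odds_true = (p_pred / (1.0 - p_pred)) / odds_ratio
            p_true = odds_true / (1.0 + odds_true)
            observed += 1 if rng.random() < p_true else 0
            expected += p_pred
            variance += p_pred * (1.0 - p_pred)     # variance of the event count if the model were valid
        z = (observed - expected) / math.sqrt(variance)
        rejections += abs(z) > 1.96                 # two-sided test at the 5% level
        total_events += observed
    return rejections / n_sim, total_events / n_sim

for n in (50, 100, 200, 400):
    power, events = simulate_power(n)
    print(f"n = {n:3d}  mean events = {events:6.1f}  power = {power:.2f}")
```

Increasing `n` until the estimated power passes 0.80 and reading off the mean number of events gives an effective sample size in the sense used by the abstract, under the assumptions stated above.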
