Test–retest errors and the apparent heterogeneity of training response