Analysis of Statistical Tests to Compare Visual Analog Scale Measurements among Groups

Background: A common type of study performed by anesthesiologists determines the effect of an intervention on pain reported by groups of patients. The goal of this study was to evaluate the effectiveness of the t, analysis of variance (ANOVA), Mann-Whitney, and Kruskal-Wallis tests in comparing visual analog scale (VAS) measurements between two or among three groups of patients. These results may be particularly helpful during the design of studies that measure pain with a VAS.

Methods: One VAS measurement was obtained from each of 480 nulliparous women in labor who were receiving oxytocin (n = 149), nalbuphine (n = 159), or epidural bupivacaine (n = 172). Multiple simulated samples were then drawn from these data and used in computer simulations of clinical trials comparing VAS measurements among groups. The t and ANOVA tests were performed before and after an arcsin transformation was applied to make the data closer to a normal distribution. VAS measurements were also compared after they were divided into five ranked categories.

Results: The statistical distributions of VAS measurements were not normal (P < 10⁻⁷). Arcsin transformation made the distributions closer to normal. Nevertheless, no statistical test incorrectly suggested that a difference existed among groups, when there was no difference, more often than the expected rate. The t and ANOVA tests had slightly greater statistical power than the other tests to detect differences among groups. Because arcsin transformation decreased the differences among means but reduced the variance to a lesser extent, it decreased the power to detect differences among groups. Statistical power to detect differences among groups was not less for a five-category VAS than for a continuous VAS.

Conclusions: The t and ANOVA tests, without an accompanying arcsin transformation, are good tests to find differences in VAS measurements among groups.
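The resampling approach described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual procedure or data: the pools are synthetic placeholder values, all function names are hypothetical, the arcsine square-root form of the transformation and the equal-width five-category split are assumed, and a large-sample normal approximation stands in for the t test so that the sketch needs only the Python standard library.

```python
import math
import random
from statistics import NormalDist, mean, variance


def arcsin_transform(vas, vas_max=100.0):
    """Arcsine square-root transform of a VAS score scaled to [0, 1] (assumed form)."""
    return math.asin(math.sqrt(vas / vas_max))


def to_five_categories(vas, vas_max=100.0):
    """Map a VAS score to one of five equal-width ranked categories, 1..5 (assumed split)."""
    return min(5, int(vas / vas_max * 5) + 1)


def approx_two_sample_pvalue(a, b):
    """Two-sided p-value for a difference in means.

    Uses a normal approximation with unpooled variances as a stand-in
    for the t test; reasonable only for moderately large samples.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 1.0  # no variation at all: no evidence of a difference
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def rejection_rate(pool_a, pool_b, n_per_group, n_trials,
                   alpha=0.05, transform=None, seed=0):
    """Fraction of simulated two-group trials with p < alpha.

    Each trial draws n_per_group values with replacement from each pool.
    If both pools are the same, this estimates the false-positive rate;
    if they differ, it estimates statistical power.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        a = rng.choices(pool_a, k=n_per_group)
        b = rng.choices(pool_b, k=n_per_group)
        if transform is not None:
            a = [transform(x) for x in a]
            b = [transform(x) for x in b]
        if approx_two_sample_pvalue(a, b) < alpha:
            rejections += 1
    return rejections / n_trials


def make_placeholder_pool(n, shape_a, shape_b, seed):
    """Synthetic, non-normal VAS-like scores on 0..100 (NOT the study data)."""
    rng = random.Random(seed)
    return [round(100 * rng.betavariate(shape_a, shape_b)) for _ in range(n)]


if __name__ == "__main__":
    pool_high = make_placeholder_pool(150, 2, 1, seed=1)  # skewed toward high pain
    pool_mid = make_placeholder_pool(150, 2, 2, seed=2)   # centered near 50

    for label, transform in [("raw", None),
                             ("arcsin", arcsin_transform),
                             ("five-category", to_five_categories)]:
        false_pos = rejection_rate(pool_high, pool_high, 50, 500, transform=transform)
        power = rejection_rate(pool_high, pool_mid, 50, 500, transform=transform)
        print(f"{label:>13}: false-positive rate {false_pos:.3f}, power {power:.3f}")
```

Drawing both groups from the same pool estimates how often a test falsely reports a difference, while drawing from different pools estimates power. The rates printed for the placeholder pools say nothing about the study's reported results; they only show the mechanics of the comparison.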