Testing related samples with missing values: a permutation approach

A common design in experimental studies of animal behaviour is based on related samples or repeated measures. A number, N, of subjects is tested, each several times with the same set of k treatments. The advantage of such a design is that each subject can serve as its own control. Intraindividual variability in behaviour is often as large as interindividual variability. While one subject may generally react at a low level, although more strongly under a particular experimental treatment, another subject may generally react at a higher level, and even more strongly under the same treatment. Under such circumstances, the only way to estimate differences between the effects of different experimental conditions may be to test each subject under each condition and use the respective statistical tests (e.g. Wilcoxon signed-ranks test, Friedman one-way ANOVA, t test for paired comparisons, repeated measures ANOVA, or two-way ANOVA without replications; Sokal & Rohlf 1987; Siegel & Castellan 1988; Bortz et al. 1990). In field work, however, it is often impossible to test a number of subjects, each under a certain set of treatments. Subjects tend to disappear, or, if present, sometimes do not react at all. A lack of reaction does not necessarily result from an experimental treatment, but can also be caused by, for instance, weather conditions or the subject being engaged in other activities. Accordingly, it is usually treated not as a result of the experiment, but as a missing data point. As a result, experimental fieldwork often does not yield a complete table with a figure in each cell, but one that contains empty cells denoting missing values. Common statistical procedures for related samples, however, require complete data sets. Hence, a table with missing values raises the question of how to perform the statistical analysis. One correct way is to omit all subjects or treatments with missing values and perform the analysis on the remaining data set.
However, doing so is often not very satisfactory since one loses information obtained from the experiment. Since the power of tests decreases with reduced sample sizes, the resulting data set might not reveal any significance. In the worst case, the data set might even become too small for any statistical analysis at all. Alternatively, one might think about testing the treatments as if no animal was tested more than once, for instance by using a one-way ANOVA or the Kruskal–Wallis H test. However, these tests require independent data and thus are not suitable when some subjects occur in more than one sample. In addition, these tests may be unable to detect any effects of treatments when intraindividual variability is relatively high. One might also consider replacing missing data by calculated values (‘imputation’), for which several techniques have been described (overviews in e.g. Little & Rubin 1987, 1989; Rubin & Schenker 1991; Gornbein et al. 1992). However, imputation of missing data means changing the samples tested and usually leads to underestimated variances, since the calculated values are derived from the ones actually present (Gornbein et al. 1992). Underestimated variances in turn lead to reduced P values, and this effect grows as the number of missing values increases and the sample size decreases. An appropriate application of imputation techniques to small samples is therefore questionable, and imputation techniques are only rarely used in behavioural science. Another approach, based on ranked data, was proposed by Meddis (1984). This test might be a good solution under certain circumstances, but its reliability is also likely to be considerably reduced by, for instance, a biased distribution of missing values. The assumptions and limitations of this test are rather unclear, and its usefulness for analysing tables with missing values needs further evaluation under different circumstances.
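The point about underestimated variances can be illustrated with a small sketch. It assumes the simplest imputation scheme, replacing each missing value by the mean of the observed values, which is only one of the techniques covered in the cited overviews; the sample itself is simulated and purely hypothetical.

```python
import random
import statistics

def mean_impute(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = statistics.mean(observed)
    return [fill if v is None else v for v in values]

# Hypothetical sample: 12 simulated measurements, 4 of which go missing.
rng = random.Random(1)
complete = [rng.gauss(10.0, 2.0) for _ in range(12)]
missing_idx = set(rng.sample(range(12), 4))
with_gaps = [None if i in missing_idx else v for i, v in enumerate(complete)]

observed = [v for v in with_gaps if v is not None]
imputed = mean_impute(with_gaps)

# Sample variance (n - 1 in the denominator) before and after imputation.
var_observed = statistics.variance(observed)
var_imputed = statistics.variance(imputed)

print(f"variance of the {len(observed)} observed values: {var_observed:.3f}")
print(f"variance after mean imputation (n = {len(imputed)}): {var_imputed:.3f}")
```

The imputed values add nothing to the sum of squared deviations but do enlarge n, so in this sketch the variance shrinks by the factor (n_observed - 1) / (n_total - 1), here 7/11. Any test computed from the imputed sample would then treat the data as less variable than they were measured to be.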
Several other approaches to cope with missing values have been developed, based, for example, on maximum likelihood methods or derivatives of ANOVA models (e.g. Little & Rubin 1987, 1989; Gornbein et al. 1992). However, these methods usually require normality of the data or large samples, assumptions often not met in behavioural research. In addition, they involve rather complex mathematics and so are not easily understood and applied by biologists. To cope with such problems, permutation tests can provide excellent solutions. Here, I describe such a test for related samples with missing values, and show under what circumstances it can be successfully applied.

Correspondence: R. Mundry, Institut für Biologie–Verhaltensbiologie, Freie Universität Berlin, Haderslebener Strasse 9, 12163 Berlin, Germany (email: rmundry@biologie.fu-berlin.de).
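To make the idea of a permutation test for related samples with missing values concrete, the following is a minimal sketch and not necessarily the procedure described in the paper. It assumes that, under the null hypothesis, the values observed for a subject are exchangeable among the treatments in which that subject was actually tested, so values are shuffled within subjects while empty cells stay in place. The test statistic (sum of squared deviations of treatment means from the grand mean), the function names and the example table are all hypothetical choices made for illustration.

```python
import random

def treatment_spread(table):
    """Sum of squared deviations of treatment means from the grand mean."""
    k = len(table[0])
    columns = [[row[j] for row in table if row[j] is not None] for j in range(k)]
    observed = [v for col in columns for v in col]
    grand_mean = sum(observed) / len(observed)
    return sum((sum(col) / len(col) - grand_mean) ** 2 for col in columns if col)

def shuffle_within_subject(row, rng):
    """Shuffle a subject's observed values among its non-missing cells."""
    idx = [j for j, v in enumerate(row) if v is not None]
    values = [row[j] for j in idx]
    rng.shuffle(values)
    new_row = list(row)
    for j, v in zip(idx, values):
        new_row[j] = v
    return new_row

def permutation_p_value(table, n_perm=2000, seed=0):
    """Approximate P value from n_perm random within-subject permutations."""
    rng = random.Random(seed)
    observed_stat = treatment_spread(table)
    at_least_as_extreme = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        permuted = [shuffle_within_subject(row, rng) for row in table]
        # Small tolerance so that ties with the observed value are counted.
        if treatment_spread(permuted) >= observed_stat - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / (n_perm + 1)

# Hypothetical data: 6 subjects (rows) x 3 treatments (columns), None = missing.
table = [
    [2.0, 3.1, 5.0],
    [1.5, None, 4.2],
    [3.0, 3.9, None],
    [None, 2.8, 4.9],
    [2.2, 3.0, 5.5],
    [1.8, 2.9, 4.4],
]
p = permutation_p_value(table)
print(f"approximate permutation P value: {p:.4f}")
```

Because only within-subject shuffles are used, every subject still serves as its own control, and subjects with missing cells contribute whatever information they have instead of being dropped. For small tables, all possible arrangements could be enumerated instead of sampled, giving an exact P value.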

[1] R. Sokal, et al. Introduction to biostatistics, 1973.

[2] Julia Fischer, et al. Use of statistical programs for nonparametric tests of small samples often leads to incorrect P values: examples from Animal Behaviour, 1998, Animal Behaviour.

[3] D G Pitt, et al. Applications of computer-intensive statistical methods to environmental research, 1998, Ecotoxicology and environmental safety.

[4] D B Rubin, et al. Multiple imputation in health-care databases: an overview and some applications, 1991, Statistics in medicine.

[5] Roderick J. A. Little, et al. The Analysis of Social Science Data with Missing Values, 1989.

[6] R. Little, et al. Incomplete data in repeated measures analysis, 1992, Statistical methods in medical research.

[7] J. Hooton, et al. Randomization tests: statistics for experimenters, 1991, Computer methods and programs in biomedicine.

[8] J. Bortz, et al. Verteilungsfreie Methoden in der Biostatistik, 1982.

[9] S. Siegel, et al. Nonparametric Statistics for the Behavioral Sciences, 2022, The SAGE Encyclopedia of Research Design.

[10] L. Bejder, et al. A method for testing association patterns of social animals, 1998, Animal Behaviour.

[11] D. Rubin, et al. Statistical Analysis with Missing Data, 1989.

[12] D. Adams, et al. Using randomization techniques to analyse behavioural data, 1996, Animal Behaviour.

[13] A. Hayes. Permutation test is not distribution-free: Testing H₀: ρ = 0, 1996.

[14] Derek A. Roff, et al. Distribution-free and robust statistical methods: viable alternatives to parametric statistics?, 1993.

[15] Philip H. Crowley, et al. Resampling methods for computation-intensive data analysis in ecology and evolution, 1992.

[16] J. Godin, et al. Phenotypic Variability within and between Fish Shoals, 1996.

[17] P. Good, et al. Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses, 1995.

[18] F. Juanes, et al. The importance of statistical power analysis: an example from Animal Behaviour, 1996, Animal Behaviour.

[19] Wulfert P. van den Brink, et al. A comparison of the power of the t test, Wilcoxon's test, and the approximate permutation test for the two‐sample location problem, 1989.