Resampling (statistics) - Permutation Tests

Permutation Tests

A permutation test (also called a randomization test, re-randomization test, or an exact test) is a type of statistical significance test in which the distribution of the test statistic under the null hypothesis is obtained by calculating all possible values of the test statistic under rearrangements of the labels on the observed data points. In other words, the method by which treatments are allocated to subjects in an experimental design is mirrored in the analysis of that design. If the labels are exchangeable under the null hypothesis, then the resulting tests yield exact significance levels; see also exchangeability. Confidence intervals can then be derived from the tests. The theory has evolved from the works of R.A. Fisher and E.J.G. Pitman in the 1930s.

To illustrate the basic idea of a permutation test, suppose we have two groups and whose sample means are and, and that we want to test, at 5% significance level, whether they come from the same distribution. Let and be the sample size corresponding to each group. The permutation test is designed to determine whether the observed difference between the sample means is large enough to reject the null hypothesis H that the two groups have identical probability distribution.

The test proceeds as follows. First, the difference in means between the two samples is calculated: this is the observed value of the test statistic, T(obs). Then the observations of groups and are pooled.

Next, the difference in sample means is calculated and recorded for every possible way of dividing these pooled values into two groups of size and (i.e., for every permutation of the group labels A and B). The set of these calculated differences is the exact distribution of possible differences under the null hypothesis that group label does not matter.

The one-sided p-value of the test is calculated as the proportion of sampled permutations where the difference in means was greater than or equal to T(obs). The two-sided p-value of the test is calculated as the proportion of sampled permutations where the absolute difference was greater than or equal to ABS(T(obs)).

If the only purpose of the test is reject or not reject the null hypothesis, we can as an alternative sort the recorded differences, and then observe if T(obs) is contained within the middle 95% of them. If it is not, we reject the hypothesis of identical probability curves at the 5% significance level.

Read more about this topic: Resampling (statistics)

Famous quotes containing the word tests:

“The cinema is going to form the mind of England. The national conscience, the national ideals and tests of conduct, will be those of the film.”
—George Bernard Shaw (1856–1950)