If the paired differences to be analyzed by a Wilcoxon paired signed rank test come from a population whose distribution violates the assumption of symmetry, or if outliers are present, then the paired signed rank test on the original data may provide misleading results, or may not be the most powerful test available. Transforming the data to promote normality and then performing a paired t test, or using another nonparametric test, may provide a better analysis.
- Transformations (a single function applied to each data value) can be applied to correct problems of nonnormality. For example, taking logarithms of sample values can reduce skewness to the right. If such a transformation can be found, the transformed data may be suitable for use with a paired t test. The resulting test might be more powerful than the original signed rank test, although this is not very likely. The signed rank test is nearly as powerful as the t test even when the data do in fact come from a normal distribution. The same transformation should be applied to both samples. Unless scientific theory suggests a specific transformation a priori, transformations are usually chosen from the "power family" of transformations, where each value is replaced by x**p, where p is an integer or half-integer, usually one of:
- -2 (reciprocal square)
- -1 (reciprocal)
- -0.5 (reciprocal square root)
- 0 (log transformation)
- 0.5 (square root)
- 1 (leaving the data untransformed)
- 2 (square)
For p = -0.5 (reciprocal square root), 0, or 0.5 (square root), the data values must all be positive. To use these transformations when there are negative and positive values, a constant can be added to all the data values such that the smallest is greater than 0 (say, such that the smallest value is 1). (If all the data values are negative, the data can instead be multiplied by -1, but note that in this situation, data suggesting skewness to the right would now become data suggesting skewness to the left.) Note that if you transform the paired differences so that those that originally had value 0 no longer do, the effective sample size of the data set will be changed. You can, of course, preserve the same sample size by only including in the transformation the non-zero paired differences, and making sure that none of the transformed paired differences become 0. To preserve the order of the original data in the transformed data, if the value of p is negative, the transformed data are multiplied by -1.0; e.g., for p = -1, the data are transformed as x --> -1.0/x. Taking logs or square roots tends to "pull in" values greater than 1 relative to values less than 1, which is useful in correcting skewness to the right. Transformation involves changing the metric in which the data are analyzed, which may make interpretation of the results difficult if the transformation is complicated. If you are unfamiliar with transformations, you may wish to consult a statistician before proceeding.
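The power-family rules described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `shift_to_positive` and `power_transform` are hypothetical, p = 0 is treated as the natural log, and results for negative p are multiplied by -1 so that the order of the data is preserved.

```python
import math


def shift_to_positive(values, smallest=1.0):
    """Add a constant so the smallest value equals `smallest` (assumed > 0)."""
    offset = smallest - min(values)
    return [v + offset for v in values]


def power_transform(values, p):
    """Apply the power-family transformation x**p to each value.

    p == 0 means the log transformation. For negative p the result is
    multiplied by -1.0, as described in the text.
    """
    if p in (-0.5, 0, 0.5) and min(values) <= 0:
        raise ValueError("all values must be positive for p = -0.5, 0, or 0.5")
    if p < 0 and any(v == 0 for v in values):
        raise ValueError("zero values cannot be used with a negative power")
    if p == 0:
        return [math.log(v) for v in values]
    sign = -1.0 if p < 0 else 1.0
    return [sign * v ** p for v in values]


# Hypothetical data with negative and positive values:
raw = [-3.0, 0.0, 2.0]
shifted = shift_to_positive(raw)       # smallest value becomes 1
logged = power_transform(shifted, 0)   # log transformation
```

Note that shifting changes which values are zero, so any zero paired differences need the separate handling described above.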
- Other nonparametric tests:
- Although the Wilcoxon paired signed rank test is the most commonly used nonparametric alternative to the paired two-sample t test, it is not the only one. However, all tests assume that the paired differences are independent. The paired sign test can be calculated for paired differences even when only the direction (+ or -) of the difference is known. This means that it can be applied in situations when the paired signed rank test, which requires at least knowledge of the relative ranks and directions (signs) of the paired differences, cannot be used. The sign test does not assume symmetry of the population distribution for the paired differences, but is likely to be less powerful than the paired signed rank test when that distribution is in fact symmetric. If the distribution is extremely heavy-tailed, the sign test may be more powerful than either the paired signed rank test or the paired t test.
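Because the paired sign test uses only the direction of each difference, it can be computed from a binomial distribution with success probability 0.5. The sketch below is one possible pure-Python version. The function name `sign_test` is hypothetical, and it assumes a two-sided exact test in which zero differences are dropped and the smaller tail probability is doubled.

```python
from math import comb


def sign_test(differences):
    """Two-sided exact sign test on paired differences.

    Zero differences carry no direction and are dropped (an assumption of
    this sketch). Returns (number positive, number nonzero, p-value).
    """
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    if n == 0:
        raise ValueError("no nonzero differences to test")
    k = sum(1 for d in nonzero if d > 0)
    tail = min(k, n - k)
    # Probability of a count at least as extreme as `tail` in one direction,
    # under Binomial(n, 0.5).
    one_sided = sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return k, n, min(1.0, 2 * one_sided)


# Only the signs matter, so +1 / -1 codes are enough as input:
n_pos, n_used, p_value = sign_test([+1, +1, +1, +1, +1, -1, 0])
```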
- Paired two-sample t test:
- If the sampled paired differences do indeed come from a population with a normal distribution, then the paired two-sample t test is the most powerful test of the equality of the two means, meaning that no other test is more likely to detect an actual difference between the two means. If the population distribution for the paired differences is not normal, however, the signed rank test is likely to be more powerful at detecting differences between the sample medians. And it is nearly as powerful as the paired t test even when the paired differences do come from a normal distribution. If applying a transformation promotes normality, the paired two-sample t test may be a more powerful test than the paired signed rank test for the transformed data.
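For reference, the paired t statistic is the mean of the paired differences divided by its standard error, with n - 1 degrees of freedom. The sketch below computes only the statistic and the degrees of freedom, leaving the p-value to a t table or a statistics package. The function name `paired_t_statistic` and the argument names are hypothetical.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for paired samples.

    The differences are taken as second - first, so a positive t means
    the second measurements tend to be larger.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 divisor)
    if sd == 0:
        raise ValueError("differences are constant; t is undefined")
    t = statistics.fmean(diffs) / (sd / math.sqrt(n))
    return t, n - 1
```

If the data were transformed first, the same function would be applied to the transformed values of both samples.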