# Examining rank sum test results to detect assumption violations

All the following results are provided as part of a rank sum test analysis.

#### Results for sample values:

• Normality tests:
• Although normality is not assumed for the rank sum test, departures from normality can suggest the presence of outliers in the data, or of dissimilar distributional shape. Conversely, if the populations from which the samples were drawn are in fact normally distributed, the unpaired two-sample t test may be a more powerful alternative to the rank sum test. The normality test will give an indication of whether the populations from which the samples were drawn appear to be normally distributed, but will not indicate the cause(s) of the nonnormality. The smaller the sample size, the less likely the normality test will be able to detect nonnormality. If the sample sizes are large enough for the normality test to correctly detect normality or nonnormality, differing results for the normality test when applied to the two samples (i.e., normality is rejected for only one of the samples) may indicate that the samples do not come from populations that differ only in location. In that situation, the possibility of dissimilar distributional shapes should be considered.
• Histograms:
• The histogram for each sample has a reference normal distribution curve for a normal distribution with the same mean and variance as the sample. This provides a reference for detecting gross nonnormality when the sample sizes are large. It may also help in judging whether the two histograms could come from distributions (normal or not) with the same shape and dispersion. If the histograms for the two samples are dissimilar, then the possibility of dissimilar distributional shapes should be considered.
• Boxplots:
• Suspected outliers appear in a boxplot as individual points (o or x) outside the box. If these appear on both sides of the box, they suggest the possibility of a heavy-tailed distribution. If they appear on only one side, they also suggest the possibility of a skewed distribution. Skewness is also suggested if the mean (+) does not lie on or near the central line of the boxplot, or if the central line of the boxplot does not evenly divide the box. Examples of these plots will help illustrate the various situations. If the boxplots for the two samples are dissimilar, then the possibility of dissimilar distributional shapes should be considered.
• Normal probability plot:
• For values sampled from a normal distribution, the normal probability plot (normal Q-Q plot) has the points all lying on or near the straight line drawn through the middle half of the points. Scattered points lying away from the line are suspected outliers. Examples of these plots will help illustrate the various situations. If the normal probability plots for the two samples are dissimilar, then the possibility of dissimilar distributional shapes should be considered.
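The boxplot and normal probability plot checks described above can also be examined numerically. The following is a minimal sketch in Python using only the standard library, not the procedure used by any particular package. The function names `boxplot_summary` and `qq_points`, the 1.5 × IQR fence rule for flagging suspected outliers, the `(i - 0.5) / n` plotting positions, and the sample values are all assumptions made for illustration.

```python
from statistics import NormalDist, mean, quantiles


def boxplot_summary(values):
    """Summarize one sample the way a boxplot would display it.

    Suspected outliers are flagged with the common 1.5 * IQR fence rule
    (an assumption; plotting software may use a different rule).
    """
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "mean": mean(values),
        "low_outliers": sorted(v for v in values if v < lower_fence),
        "high_outliers": sorted(v for v in values if v > upper_fence),
    }


def qq_points(values):
    """Pair each sorted value with a theoretical standard normal quantile.

    Uses plotting positions (i - 0.5) / n, one of several conventions
    (an assumption here).
    """
    n = len(values)
    std_normal = NormalDist()
    return [
        (std_normal.inv_cdf((i - 0.5) / n), v)
        for i, v in enumerate(sorted(values), start=1)
    ]


# Hypothetical sample, made up for illustration only.
sample_a = [4.1, 4.8, 5.0, 5.2, 5.5, 5.9, 6.3, 6.4, 7.0, 15.2]

summary_a = boxplot_summary(sample_a)
print("suspected low outliers: ", summary_a["low_outliers"])
print("suspected high outliers:", summary_a["high_outliers"])
print("mean above median:", summary_a["mean"] > summary_a["median"])

# Points to plot: x = theoretical normal quantile, y = observed value.
for theoretical, observed in qq_points(sample_a):
    print(f"{theoretical:7.3f}  {observed:6.2f}")
```

In this made-up sample, a suspected outlier on the high side only, together with a mean lying above the median, would point toward right skewness rather than heavy tails. Running the same summary on the second sample and comparing the two results gives a rough numerical counterpart to comparing the two boxplots or the two normal probability plots by eye.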