Does your data violate Wilcoxon paired signed rank test assumptions?
If the population of paired differences to be analyzed by a
Wilcoxon signed rank test violates one or more of
the signed rank test assumptions, the results of the analysis may be
incorrect or misleading. For example, if the assumption of
independence for the paired differences is violated, then the
Wilcoxon signed rank test is simply not appropriate.
Note that the two values that make up
each paired difference need not be independent, and in fact
are expected to be correlated, such as before and after measurements.
If you treat paired data as coming from two independent samples,
for example by running a Mann-Whitney rank-sum
test instead of a paired signed rank test, then you may sacrifice
power.
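A rough illustration of why pairing matters, using made-up before and after values (the data and variable names are assumptions for this sketch, not results from any study). When the two measurements in each pair are strongly correlated, the paired differences vary far less than the raw samples do, and an unpaired analysis has to work against that extra between-subject spread.

```python
from statistics import variance

# Hypothetical before/after measurements on the same five subjects.
before = [10.0, 20.0, 30.0, 40.0, 50.0]
after = [12.0, 21.0, 33.0, 41.0, 53.0]

# A paired analysis works with the within-pair differences.
differences = [a - b for a, b in zip(after, before)]
paired_variance = variance(differences)

# An unpaired analysis ignores the pairing, so the (large) spread
# between subjects stays in the comparison.
unpaired_variance = variance(before) + variance(after)

print(f"variance of paired differences:  {paired_variance:.1f}")
print(f"sum of the two sample variances: {unpaired_variance:.1f}")
```

The rank-based tests do not use these variances directly, so this is only a heuristic picture of where the power goes.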
Often, the effect of an assumption violation on the signed rank test
result depends on the extent of the violation (such as how
skewed
the distribution of the paired differences is). Some small violations may have little practical effect
on the analysis, while other violations may render the signed rank test
result uselessly incorrect or uninterpretable.
In particular, small
sample sizes can increase vulnerability to assumption violations.
A lack of independence
within a sample is often caused by
the existence of an implicit factor in the data. For example,
values collected over time may be serially
correlated
(here time is the implicit factor). If the data are in a
particular order, consider the possibility of dependence.
(If the row order of the data reflects the order in which
the data were collected,
an index plot of the data [data
value plotted against row number] can reveal patterns
that suggest possible time effects.)
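As a numerical complement to the index plot (not a replacement for it), one can compute the lag-1 autocorrelation of the paired differences taken in collection order. The sketch below assumes the list order is the collection order; the function name and the example data are hypothetical. Values near 0 are consistent with no serial dependence, while values far from 0 suggest neighboring observations are related.

```python
from statistics import fmean


def lag1_autocorrelation(values):
    """Lag-1 sample autocorrelation of values listed in collection order."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    mean = fmean(values)
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        raise ValueError("values are constant")
    # Sum of products of each deviation with the next one in time.
    numer = sum(
        (values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1)
    )
    return numer / denom


# Hypothetical paired differences that drift upward over time.
drifting = [0.1, 0.3, 0.2, 0.5, 0.6, 0.8, 0.7, 1.0, 1.1, 1.3]
r1 = lag1_autocorrelation(drifting)
print(f"lag-1 autocorrelation: {r1:.2f}")
```

A clearly positive value such as the one produced by the drifting series is a reason to look more closely at time as an implicit factor before trusting the test.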
Values may not be identically distributed because of the
presence of outliers.
Outliers are anomalous values in the
data. They may be due to recording errors, which may be
correctable, or they may be due to the sample not being
entirely from the same population. Apparent outliers
may also be due to the values being from the same, but
skewed or
heavy-tailed
population.
Outliers may lead to an incorrect conclusion
that the distribution of the paired differences is skewed.
Because the statistic for the signed rank test is
resistant,
it will not be substantially affected by the
presence of outliers unless the number of
outliers becomes large relative to the sample size.
The signed rank test generally does well with paired differences with
outliers, or when the paired differences come from
heavy-tailed (but symmetric) distributions.
If the population from which the paired differences
were sampled is skewed, then the signed rank test
may incorrectly reject the null hypothesis that the
median of the paired differences is 0 even when
it is true. The
paired sign test does not rely on symmetry,
and may be an appropriate alternative test.
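A minimal sketch of the paired sign test, assuming a two-sided alternative and that zero differences are dropped. The function name and the data are hypothetical. Under the null hypothesis each non-zero difference is positive with probability 1/2, so the exact P value comes from a binomial distribution.

```python
from math import comb


def sign_test_p_value(differences):
    """Exact two-sided sign test P value for median difference = 0."""
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    if n == 0:
        raise ValueError("all differences are zero")
    n_pos = sum(1 for d in nonzero if d > 0)
    k = min(n_pos, n - n_pos)
    # Probability of a split at least this lopsided in one direction,
    # doubled for the two-sided test and capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical, right-skewed paired differences (one zero is dropped).
differences = [1.2, 0.4, 2.5, 0.9, -0.3, 3.1, 0.7, 5.8, 1.1, 0.0]
p_value = sign_test_p_value(differences)
print(f"sign test P value: {p_value:.4f}")
```

Only the signs are used, which is why the shape of the distribution does not matter here, at the cost of some power relative to the signed rank test when symmetry does hold.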
Paired differences are often symmetric even when
the two populations producing the values that
make up the paired differences are both asymmetric,
provided that those two populations have similar skewness.
For example, two very positively skewed distributions that
differ only by location
will produce a set of paired differences
that are symmetric about 0, and perfectly suitable
for the signed rank test. This is often the case
with before and after measurements.
Whether or not the population of the paired differences
is skewed can be assessed either informally (including
graphically),
or by examining the sample skewness statistic or conducting a
test for skewness.
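The sample skewness statistic can be sketched as follows. This uses the simple moment-based definition (third central moment divided by the 3/2 power of the second), which is one common convention among several; the function name and data are illustrative assumptions.

```python
from statistics import fmean


def sample_skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("values are constant")
    return m3 / m2 ** 1.5


# Hypothetical paired differences: one symmetric set, one right-skewed set.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
right_skewed = [0.0, 0.0, 0.0, 0.0, 10.0]
print(sample_skewness(symmetric))
print(sample_skewness(right_skewed))
```

A value near 0 is consistent with symmetry; a formal skewness test would additionally need a reference distribution for this statistic, which is not shown here.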
If outliers or skewness are present, employing a
transformation may
resolve both problems at once, and also promote normality.
In this case, it may be preferable to perform a
paired t test
on the transformed data, as the t test has slightly more
power than the signed rank test
if the assumption of normality holds. (The signed rank test
has about 95% efficiency compared to the paired t test
if the assumption of normality is in fact correct.)
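A minimal sketch of this approach, assuming a log transformation is suitable (all values strictly positive) and that it is applied to the original measurements before differencing. The function name and the data are hypothetical. The sketch computes only the paired t statistic and its degrees of freedom; the P value would come from a t distribution with n - 1 degrees of freedom.

```python
from math import log, sqrt
from statistics import fmean, stdev


def paired_t_on_logs(before, after):
    """Return (t statistic, degrees of freedom) for log(after) - log(before)."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    log_diffs = [log(a) - log(b) for a, b in zip(after, before)]
    n = len(log_diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    standard_error = stdev(log_diffs) / sqrt(n)
    if standard_error == 0:
        raise ValueError("log differences are constant")
    return fmean(log_diffs) / standard_error, n - 1


# Hypothetical positively skewed before/after measurements.
before = [12.0, 30.0, 55.0, 110.0, 240.0]
after = [15.0, 33.0, 80.0, 130.0, 400.0]
t_stat, df = paired_t_on_logs(before, after)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```

Note that after a log transformation the hypothesis concerns ratios (after / before) rather than raw differences, so the interpretation of the result changes accordingly.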
The usual measure of skewness is
not resistant
to outliers, so one should consider the possibility that
apparent skewness is in fact due to one or more outliers.
A lack of power due to small sample sizes may also
make it hard to detect skewness.
Outliers
may appear as anomalous points in a graph of the paired differences
against their median.
A boxplot or
normal probability plot
of the paired differences can
also reveal lack of symmetry
and suspected outliers.
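What a boxplot flags as a suspected outlier can also be computed directly. The sketch below assumes the common 1.5 x IQR whisker rule and the "inclusive" quartile method from Python's statistics module; other software may use slightly different quartile definitions. The function name and the data are hypothetical.

```python
from statistics import quantiles


def boxplot_outliers(values, whisker=1.5):
    """Values lying beyond whisker * IQR from the quartiles."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker * iqr
    upper_fence = q3 + whisker * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


# Hypothetical paired differences with one anomalous value.
differences = [0.5, 0.7, 0.9, 1.0, 1.1, 1.3, 1.4, 1.6, 9.5]
suspects = boxplot_outliers(differences)
print("suspected outliers:", suspects)
```

Flagged values are candidates for checking against the original records, not values to be deleted automatically.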
If the number of non-zero paired differences is small, assumption
violations such as skewness
are difficult to detect even when they are present. Small samples
also offer less resistance to outliers and less protection
against violations of assumptions.
Even if none of the test
assumptions are violated, a signed rank test with a small sample
size may not have sufficient
power
to detect a significant
difference between the median of the paired differences and 0, even if
the median is in fact different from 0.
Power decreases as the significance
level is decreased (i.e., as the test is made
more stringent), and increases as the sample size
increases. With very small samples, even paired differences whose
population median is far from 0 may not produce
a significant signed rank test statistic.
If a statistical significance test with small sample sizes
produces a surprisingly non-significant
P value, then a lack of power may be the reason.
The best time to avoid such problems is in the
design stage of an experiment, when appropriate
minimum sample sizes can be determined, perhaps in consultation
with a statistician, before data collection begins.
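The limitation of very small samples can be made concrete. Assuming an exact two-sided signed rank test with no zero differences and no ties, the most extreme possible outcome (all differences with the same sign) has P value 2 / 2^n, because each of the 2^n sign patterns is equally likely under the null hypothesis. The function names below are hypothetical.

```python
def min_two_sided_p(n):
    """Smallest attainable exact two-sided P value with n non-zero differences."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return min(1.0, 2 / 2 ** n)


def smallest_n_for(alpha):
    """Smallest n at which the test can reject at level alpha at all."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    n = 1
    while min_two_sided_p(n) > alpha:
        n += 1
    return n


for n in (4, 5, 6, 8):
    print(n, min_two_sided_p(n))
print("smallest n for alpha = 0.05:", smallest_n_for(0.05))
```

Under these assumptions, five non-zero differences can never give a two-sided P value below 0.0625, however large the true effect, so the power at the 0.05 level is zero.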
Because paired differences equal to 0 are ignored (omitted from the
analysis), having a relatively large number of paired differences equal to 0
can drastically reduce the effective sample size.
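A sketch of how the zeros drop out, assuming the conventional procedure in which zero differences are discarded before ranking and tied absolute values receive the average of their ranks. The function name and the data are hypothetical.

```python
def signed_rank_statistic(differences):
    """Return (effective n, W+, W-) after discarding zero differences."""
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    # Positions of the non-zero differences sorted by absolute value.
    order = sorted(range(n), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j over a run of tied absolute values.
        j = i
        while j + 1 < n and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]]):
            j += 1
        midrank = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, nonzero) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, nonzero) if d < 0)
    return n, w_plus, w_minus


# Hypothetical data: ten pairs, but six of the differences are zero.
differences = [0, 3, 0, -1, 0, 2, 0, 4, 0, 0]
n_eff, w_plus, w_minus = signed_rank_statistic(differences)
print(f"effective n = {n_eff}, W+ = {w_plus}, W- = {w_minus}")
```

Here ten pairs yield an effective sample size of only four, which is too few for an exact two-sided test to reach the 0.05 level.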
If there are many tied values in the data, the assumption
of continuity for the distribution of the paired differences
may be called into question. A correction for tied values
is made in performing the signed rank test; however, the number
of ties must be quite large relative to the total
sample size before the correction makes a substantial
difference in the test results. The effect of ties
depends not only on the number of ties, but also on how many
observed paired differences are tied at a single value.
A cluster of paired differences tied at the same value will lead to a
bigger correction than the same number of ties scattered
at different values. Such a situation also raises questions
about the assumption of independence
for the paired differences as well as whether they come from
a continuous distribution.
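The size of the tie correction can be sketched using the standard large-sample variance of the signed rank statistic, n(n+1)(2n+1)/24, reduced by the sum of (t^3 - t)/48 over each group of t tied absolute differences. The sketch assumes the normal approximation is being used; the function name and the data are hypothetical.

```python
from collections import Counter


def signed_rank_variance(differences):
    """Return (uncorrected, tie-corrected) variance of the signed rank statistic."""
    abs_nonzero = [abs(d) for d in differences if d != 0]
    n = len(abs_nonzero)
    uncorrected = n * (n + 1) * (2 * n + 1) / 24
    # Sizes of groups of tied absolute differences.
    tie_sizes = [t for t in Counter(abs_nonzero).values() if t > 1]
    correction = sum(t ** 3 - t for t in tie_sizes) / 48
    return uncorrected, uncorrected - correction


# Six tied observations in both cases, but arranged differently.
clustered = [1, 1, 1, 1, 1, 1, 2, 3, 4, 5]  # one cluster of six
scattered = [1, 1, 2, 2, 3, 3, 4, 5, 6, 7]  # three separate pairs
print(signed_rank_variance(clustered))
print(signed_rank_variance(scattered))
```

With ten differences the uncorrected variance is 96.25. The single cluster of six reduces it by 4.375, while three scattered pairs reduce it by only 0.375, which is the pattern described above.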