If the
populations
from which
data to be analyzed by an F test were sampled
violate one or more of
the F test assumptions, the results of the analysis may be
incorrect or misleading. For example, if the assumption of
independence
is violated, then the F test is simply
not appropriate, although another test (perhaps a
chi-square test for variance)
on the paired differences might be appropriate. If the assumption
of normality is violated,
or outliers are present,
then the F test may not be the most
powerful
test available, and this could mean the difference
between detecting a true difference and missing it.
A nonparametric test
or more robust test
may be more powerful.
Often, the effect of
an assumption violation on the F test result depends
on the extent of the violation
(such as how
skewed
or heavy-tailed
one or the other population
distribution
is).
Some very small violations may have little practical effect
on the analysis, while other violations may render
the F test result uselessly incorrect or uninterpretable.
In particular, small
sample sizes can increase vulnerability to assumption violations.
The bad news is that the F test is strongly affected and often
rendered invalid by violation of the normality assumption.
In fact, if your reason for performing an
F test is to judge whether the assumption of equal
variances is valid for
a two-sample unpaired t test,
note that the t test is usually
much less affected by nonnormality than the F test is.
If you have reason to suspect that the population variances
are not equal, you may be best off simply using a
Welch-Satterthwaite t test
or transforming the
data to be analyzed by the t test.
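
As a rough illustration of the Welch-Satterthwaite alternative mentioned above, the following sketch uses SciPy's `ttest_ind` with `equal_var=False`. The helper name `welch_t_test` and the two small data sets are made up for this example and are not from any real study.

```python
import numpy as np
from scipy import stats

def welch_t_test(sample_a, sample_b):
    """Two-sample t test that does not assume equal variances (Welch)."""
    a = np.asarray(sample_a, dtype=float)
    b = np.asarray(sample_b, dtype=float)
    # equal_var=False requests the Welch-Satterthwaite version of the test
    result = stats.ttest_ind(a, b, equal_var=False)
    return float(result.statistic), float(result.pvalue)

# Illustrative data only (hypothetical values)
group_a = [20.1, 19.8, 20.4, 20.0, 19.9, 20.2]
group_b = [21.5, 18.2, 23.0, 19.1, 22.4, 17.9]

t_stat, p_value = welch_t_test(group_a, group_b)
print(f"Welch t = {t_stat:.3f}, P = {p_value:.3f}")
```

Because this version never pools the two variances, no preliminary F test is needed before running it.
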
The good news is that the "other" F tests, the ones calculated for
analysis of variance,
are F tests for location instead of F tests for
dispersion and, like the t test, are reasonably
robust to nonnormality if the sample sizes are
not too small.
A lack of independence
within a sample is often caused by
the existence of an implicit factor in the data. For example,
values collected over time may be serially
correlated
(here time is the implicit factor). If the data are in a
particular order, consider the possibility of dependence.
(If the row order of the data reflects the order in which
the data were collected,
an index plot of the data [data
value plotted against row number] can reveal patterns in
the plot that could suggest possible time effects.)
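
A numerical companion to the index plot is the lag-1 autocorrelation of the values taken in row order. The sketch below is one simple way to compute it, assuming the rows are in collection order; the function name `lag1_autocorrelation` and the example series are hypothetical.

```python
def lag1_autocorrelation(values):
    """Correlation between each value and the next one, in row order."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    denominator = sum(d * d for d in deviations)
    if denominator == 0:
        raise ValueError("all values are identical")
    numerator = sum(deviations[i] * deviations[i + 1] for i in range(n - 1))
    return numerator / denominator

# A steady upward drift gives a clearly positive value
drifting = [1, 2, 3, 4, 5, 6, 7, 8]
# A strictly alternating series gives a clearly negative value
alternating = [1, -1, 1, -1, 1, -1]

print(lag1_autocorrelation(drifting))
print(lag1_autocorrelation(alternating))
```

Values far from zero in either direction suggest that neighboring observations are related, which is a reason to doubt independence within the sample.
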
Whether the two samples are
independent
of each other is generally
determined by the structure of the experiment from which
they arise. Obviously correlated samples, such as a
set of pre- and post-test observations on the same subjects,
are not independent, and such data would be more appropriately
tested by a one-sample test on the paired differences.
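
Which one-sample test suits the paired differences depends on the question being asked. As one possibility, the sketch below applies the chi-square test for variance named earlier to the differences. The benchmark variance `sigma0_sq`, the function name, and the pre/post values are all assumptions made for illustration, and this test is itself sensitive to nonnormality.

```python
import numpy as np
from scipy import stats

def chi_square_variance_test(differences, sigma0_sq):
    """Two-sided chi-square test that the variance of the differences
    equals a hypothesized benchmark value sigma0_sq (an assumption)."""
    d = np.asarray(differences, dtype=float)
    df = d.size - 1
    statistic = df * d.var(ddof=1) / sigma0_sq
    # Double the smaller tail area, capped at 1, for a two-sided P value
    tail = min(stats.chi2.cdf(statistic, df), stats.chi2.sf(statistic, df))
    return float(statistic), float(min(1.0, 2.0 * tail))

# Hypothetical pre- and post-test scores for the same eight subjects
pre = np.array([72.0, 65.0, 80.0, 58.0, 77.0, 69.0, 74.0, 61.0])
post = np.array([75.0, 70.0, 79.0, 66.0, 80.0, 68.0, 79.0, 67.0])
paired_differences = post - pre

chi2_stat, chi2_p = chi_square_variance_test(paired_differences, sigma0_sq=4.0)
print(f"chi-square = {chi2_stat:.2f}, P = {chi2_p:.3f}")
```
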
If you are unsure whether
your samples are independent, you may wish to consult
a statistician or someone who is knowledgeable
about the data collection scheme you are using.
Values may not be identically distributed because of the
presence of outliers.
Outliers are anomalous values in the
data. Outliers tend to increase the estimate of a sample
variance. This can make the F statistic, which is
a ratio of the two sample variances, very different
from what it would be without the outlier(s), and
thus render the F test meaningless. In particular,
one or more outliers in a single sample will
tend to make the F statistic too extreme (too large when that
sample's variance is in the numerator), thus
increasing the chance of incorrectly concluding
that the population variances differ.
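
To make the effect concrete, here is a minimal sketch of the variance-ratio F test, run once on two tidy samples and once after a single value in the first sample is replaced by an outlier. The function name `f_test_equal_variances` and all data values are invented for the example. The two-sided P value is taken as twice the smaller tail area, which is one common convention.

```python
import numpy as np
from scipy import stats

def f_test_equal_variances(sample_a, sample_b):
    """Two-sided F test for equal variances (first sample in the numerator)."""
    a = np.asarray(sample_a, dtype=float)
    b = np.asarray(sample_b, dtype=float)
    f_stat = a.var(ddof=1) / b.var(ddof=1)
    dfn, dfd = a.size - 1, b.size - 1
    tail = min(stats.f.cdf(f_stat, dfn, dfd), stats.f.sf(f_stat, dfn, dfd))
    return float(f_stat), float(min(1.0, 2.0 * tail))

# Hypothetical measurements with similar spread
clean_a = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
clean_b = [10.0, 9.9, 10.2, 9.8, 10.1, 10.3, 9.7, 10.0]
# Same first sample, but the last value is replaced by an outlier
outlier_a = clean_a[:-1] + [14.0]

f_clean, p_clean = f_test_equal_variances(clean_a, clean_b)
f_outlier, p_outlier = f_test_equal_variances(outlier_a, clean_b)
print(f"without outlier: F = {f_clean:.2f}, P = {p_clean:.4f}")
print(f"with outlier:    F = {f_outlier:.2f}, P = {p_outlier:.4f}")
```
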
Outliers may be due to recording errors, which may be
correctable, or they may be due to the sample not being
entirely from the same population. Apparent outliers
may also be due to the values being from the same, but
nonnormal,
population.
The boxplot
and normal probability plot
(normal Q-Q plot) may suggest the presence of outliers in the data.
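
Many boxplots flag points lying more than 1.5 interquartile ranges beyond the quartiles. The sketch below applies that rule directly, assuming NumPy's default (linear interpolation) quartile definition. Other software may compute quartiles slightly differently, and the function name and data are hypothetical.

```python
import numpy as np

def flag_outliers_iqr(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    data = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(data, [25, 75])
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [float(v) for v in data if v < lower_fence or v > upper_fence]

# Hypothetical sample with one suspicious value
sample = [10, 11, 12, 13, 14, 15, 16, 50]
print(flag_outliers_iqr(sample))
```

A flagged value is only a candidate outlier; whether it is an error or a legitimate observation still has to be judged from the data collection.
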
The F statistic is based on
the two sample variances, both of which
are sensitive to outliers.
(In other words,
the sample variance is not
resistant
to outliers, and thus, neither is the F statistic.)
A nonparametric test
may be a more powerful test in such a situation.
If you find outliers in your data that
are not due to correctable errors, you may wish to consult
a statistician as to how to proceed.
The values in a sample may indeed be from the same
population, but not from a normal one. Signs of
nonnormality
are
skewness
(lack of symmetry) or
light-tailedness or
heavy-tailedness.
The
boxplot,
histogram,
and normal probability plot
(normal Q-Q plot), along with the normality test,
can provide information on the normality of the
population distribution. However, if there are only a small number
of data points, nonnormality can be hard to detect.
If there are a great many data points, the
normality test may detect statistically significant
but trivial departures from normality that will
have no real effect on the F statistic,
although the F test is more sensitive to
even small departures from normality than,
say, the t test.
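
The following sketch collects a few numerical indicators of nonnormality in one place. It uses the Shapiro-Wilk test as the normality test, which is an assumption, since the text does not say which test is meant. The function name `normality_summary` and the simulated sample are made up for illustration.

```python
import numpy as np
from scipy import stats

def normality_summary(values):
    """Skewness, excess kurtosis, and Shapiro-Wilk P value for one sample."""
    data = np.asarray(values, dtype=float)
    shapiro_result = stats.shapiro(data)
    return {
        "skewness": float(stats.skew(data)),             # 0 for a symmetric distribution
        "excess_kurtosis": float(stats.kurtosis(data)),  # 0 for a normal distribution
        "shapiro_p": float(shapiro_result.pvalue),
    }

# Simulated, strongly right-skewed data (exponential), for illustration only
rng = np.random.default_rng(1)
skewed_sample = rng.exponential(size=500)

for name, value in normality_summary(skewed_sample).items():
    print(f"{name}: {value:.4f}")
```

Positive excess kurtosis points toward heavy tails and negative excess kurtosis toward light tails, which are the departures that matter most for the F test.
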
For data sampled from a normal distribution, normal
probability plots should approximate straight lines,
and boxplots should be symmetric (median and mean together,
in the middle of the box) with no
outliers.
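
How close a normal probability plot is to a straight line can be summarized by the correlation between the ordered data and the theoretical normal quantiles. This sketch reads that correlation from SciPy's `probplot`; the helper name and the simulated samples are assumptions for the example.

```python
import numpy as np
from scipy import stats

def probability_plot_correlation(values):
    """Correlation of the points on a normal probability plot (1 = straight line)."""
    data = np.asarray(values, dtype=float)
    # probplot returns the plotted points and a least-squares line fit
    (_, _), (_, _, r) = stats.probplot(data, dist="norm")
    return float(r)

# Simulated data for illustration only
rng = np.random.default_rng(2)
normal_sample = rng.normal(size=300)
skewed_sample_qq = rng.exponential(size=300)

r_normal = probability_plot_correlation(normal_sample)
r_skewed = probability_plot_correlation(skewed_sample_qq)
print(f"normal data: r = {r_normal:.4f}")
print(f"skewed data: r = {r_skewed:.4f}")
```
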
Any departures from normality can render the results of
the F test invalid, although the worst effects come
when the distributions are either heavy-tailed or light-tailed,
rather than when the distributions are simply skewed.
For data from distributions that are heavy-tailed,
the reported P value tends to be too small, so the actual
significance level is much higher than the nominal one:
the F test is much more
likely to incorrectly reject the null hypothesis of equal variances
when it is true. Conversely, for data from distributions
that are light-tailed, such as the uniform distribution,
the reported P value tends to be too large, so the actual
significance level is much lower than the nominal one and
the F test is much less
likely to detect a real difference between the population
variances.
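
A small simulation can show these effects. In the sketch below both samples always come from the same distribution, so the null hypothesis of equal variances is true and every rejection is a false alarm. The sample size, number of repetitions, and the choice of the Laplace distribution as the heavy-tailed example are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

def null_rejection_rate(sampler, n=25, reps=4000, alpha=0.05, seed=0):
    """Fraction of two-sided F tests that reject when the variances are equal."""
    rng = np.random.default_rng(seed)
    a = sampler(rng, (reps, n))  # one simulated sample per row
    b = sampler(rng, (reps, n))
    f_stat = a.var(axis=1, ddof=1) / b.var(axis=1, ddof=1)
    cdf = stats.f.cdf(f_stat, n - 1, n - 1)
    p_values = 2.0 * np.minimum(cdf, 1.0 - cdf)
    return float(np.mean(p_values < alpha))

samplers = {
    "normal": lambda rng, size: rng.normal(size=size),
    "heavy-tailed (Laplace)": lambda rng, size: rng.laplace(size=size),
    "light-tailed (uniform)": lambda rng, size: rng.uniform(size=size),
}

rejection_rates = {name: null_rejection_rate(s) for name, s in samplers.items()}
for name, rate in rejection_rates.items():
    print(f"{name}: rejects in {rate:.1%} of simulations (nominal 5%)")
```

With normal data the rate should sit near the nominal 5 percent, while the heavy-tailed case rejects far too often and the light-tailed case far too rarely.
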
Robust
statistical tests operate well across a wide
variety of distributions.
The F test for comparing two variances is not a robust test
against nonnormality,
although it is the most
powerful
test available when its test assumptions are met.
In the case of nonnormality,
a nonparametric test
may result in a more powerful test.
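
The text does not name a specific alternative. Two commonly used options available in SciPy are the Brown-Forsythe test (Levene's test centered on the median), which is a robust test, and the Fligner-Killeen test, which is nonparametric. The sketch below runs both on simulated heavy-tailed data. The function name and the data are assumptions for illustration, not a recommendation for any particular data set.

```python
import numpy as np
from scipy import stats

def robust_variance_tests(sample_a, sample_b):
    """P values from two tests of equal spread that tolerate nonnormality."""
    a = np.asarray(sample_a, dtype=float)
    b = np.asarray(sample_b, dtype=float)
    brown_forsythe = stats.levene(a, b, center="median")
    fligner_killeen = stats.fligner(a, b)
    return {
        "brown_forsythe_p": float(brown_forsythe.pvalue),
        "fligner_killeen_p": float(fligner_killeen.pvalue),
    }

# Simulated heavy-tailed samples whose spreads really do differ
rng = np.random.default_rng(0)
narrow = rng.laplace(scale=1.0, size=100)
wide = rng.laplace(scale=5.0, size=100)

robust_results = robust_variance_tests(narrow, wide)
for name, p in robust_results.items():
    print(f"{name}: {p:.2e}")
```
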
If one or both of the sample sizes is small, it may be difficult
to detect assumption violations. With small samples,
nonnormality
is difficult to detect. Also, with
small sample size(s) the individual sample variances that
make up the F statistic are themselves less reliable.
Even if none of the test
assumptions are violated, an F test with small sample
sizes may not have sufficient
power
to detect a significant
difference between the two samples, even if the variances
are in fact different.
If a statistical
significance test with small sample sizes
produces a surprisingly non-significant
P value, then lack of power may be the reason.
The best time to avoid such problems is in the
design stage of an experiment, when appropriate
minimum sample sizes can be determined, perhaps in consultation
with a statistician, before data collection begins.
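
One way to explore sample sizes at the design stage is to estimate power by simulation. The sketch below assumes normal data, equal sample sizes, and a true variance ratio chosen by the analyst. The function name and the specific ratio and sample sizes are hypothetical inputs, not guidance for a real study.

```python
import numpy as np
from scipy import stats

def estimated_power(n, variance_ratio, reps=4000, alpha=0.05, seed=0):
    """Simulated power of the two-sided F test with n observations per sample."""
    rng = np.random.default_rng(seed)
    # First sample has variance `variance_ratio`, second has variance 1
    a = rng.normal(scale=np.sqrt(variance_ratio), size=(reps, n))
    b = rng.normal(scale=1.0, size=(reps, n))
    f_stat = a.var(axis=1, ddof=1) / b.var(axis=1, ddof=1)
    cdf = stats.f.cdf(f_stat, n - 1, n - 1)
    p_values = 2.0 * np.minimum(cdf, 1.0 - cdf)
    return float(np.mean(p_values < alpha))

# How often would a true variance ratio of 2 be detected?
power_by_n = {n: estimated_power(n, variance_ratio=2.0) for n in (10, 25, 50, 100)}
for n, power in power_by_n.items():
    print(f"n = {n:3d} per sample: estimated power = {power:.2f}")
```

Because the simulation draws normal data, the estimates describe the best case; with nonnormal data the F test's error rates change, as discussed above.
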