Does your data violate goodness of fit (chi-square) test assumptions?
If the
population
from which
data to be analyzed by a goodness of fit (chi-square) test were sampled
violates one or more of
the goodness of fit (chi-square) test assumptions, the results of the analysis may be
incorrect or misleading. For example, if the assumption of
independence
is violated, then the goodness of fit (chi-square) test is simply
not appropriate.
If the total sample size is small,
then the expected values may be too small for
the approximation involved in the chi-square
test to be valid.
If it is not possible to
cleanly assign each observation to exactly one
cell
(category) of the table, or if an ad hoc
scheme is used to divide a continuous variable into
discrete categories, then the results of the
goodness of fit chi-square test may vary
greatly depending on the exact apportionment
of observations into cells of the table.
If the categories are ordered
instead of nominal, especially
if the classification variable
is actually continuous rather than discrete,
then a chi-square goodness of fit test may not be the most
powerful
test available, and this could mean the difference
between detecting a true difference and missing it.
Generally speaking, if you are testing against a
well-known distribution like the
normal distribution,
there is likely to be a more powerful test tailored to
that specific distribution, one that may not require
you to completely specify the distribution function
beforehand.
Often, the effect of an assumption violation on the
test result depends on the extent of the violation.
Whether the observations are
independent
of each other is generally
determined by the structure of the experiment from which
they arise.
A lack of independence within a sample is often caused by
the existence of an implicit factor in the data.
For example,
values collected over time may be serially
correlated
(here time is the implicit factor). If the data are in a
particular order, consider the possibility of dependence.
(If the row order of the data reflects the order in which
the data were collected, an index plot
of the data [data
value plotted against row number] can reveal patterns in
the plot that could suggest possible time effects.)
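As a numeric companion to the index plot, the lag-1 sample autocorrelation
of the values in row order gives a rough indication of serial correlation.
The sketch below is illustrative only: the function name
lag1_autocorrelation and the example values are assumptions made for this
example, and a value far from 0 is a prompt for a closer look at the data,
not a formal test of independence.

```python
def lag1_autocorrelation(values):
    """Lag-1 sample autocorrelation of values taken in row (collection) order."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    denominator = sum(d * d for d in deviations)
    if denominator == 0:
        raise ValueError("all values are identical")
    # Products of each deviation with the deviation of the next row.
    numerator = sum(deviations[i] * deviations[i + 1] for i in range(n - 1))
    return numerator / denominator


# Hypothetical data: a steady upward drift over the collection period.
drifting = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(round(lag1_autocorrelation(drifting), 3))
```

Values near +1 suggest a drift or slow cycle over time, and values near -1
suggest alternation between high and low values.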
An implicit factor may also separate the data into different
distributions of the same "family" (say, several different
normal distributions). Each subsample
would follow a distribution from the family, but
the combined data would not fit a distribution from
the family.
For example, measurements
for females may follow a normal distribution, and measurements
for males may also follow a normal distribution, but the measurements
for the entire population of both males and females may not
follow a normal distribution. Depending on the relative
proportions of sampled data from each underlying normal distribution,
and on the means and variances of each distribution, the
composite mixture
distribution may appear to be
skewed,
or to have nonnormal kurtosis,
or both. Separating the data into different subsamples
based on the value of the implicit factor may reveal
that, conditional on the value of the implicit factor
(e.g., gender), the data are sampled from a normal distribution,
even if it is a different distribution for each value of
the implicit factor.
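The shape of such a mixture can be worked out directly from the component
means, variances and proportions. The following is a minimal sketch using
the standard central-moment formulas for a mixture of normal distributions.
The function name mixture_shape and all numbers in the example are
hypothetical and chosen only for illustration.

```python
def mixture_shape(weights, means, sds):
    """Skewness and excess kurtosis of a finite mixture of normal distributions."""
    total = sum(weights)
    w = [x / total for x in weights]  # normalize the mixing proportions
    mu = sum(wi * mi for wi, mi in zip(w, means))
    m2 = m3 = m4 = 0.0
    for wi, mi, si in zip(w, means, sds):
        d = mi - mu  # offset of this component's mean from the overall mean
        m2 += wi * (si**2 + d**2)
        m3 += wi * (d**3 + 3 * d * si**2)
        m4 += wi * (d**4 + 6 * d**2 * si**2 + 3 * si**4)
    skewness = m3 / m2**1.5
    excess_kurtosis = m4 / m2**2 - 3
    return skewness, excess_kurtosis


# Hypothetical subsamples: 80% from N(0, 1) and 20% from N(3, 1).
skew, kurt = mixture_shape([0.8, 0.2], [0.0, 3.0], [1.0, 1.0])
print(skew, kurt)
```

A single normal distribution gives 0 for both quantities, so nonzero
values here come entirely from combining the two subsamples.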
Of course, an implicit factor may also separate the data
into different distributions that do not all come from
the same family. And
if one or more of the subsamples has a
small sample size,
the test on the subsample may fail to detect a difference
from the hypothesized distribution due to a lack
of power.
The chi-square statistic may be large due to the
presence of outliers.
Outliers are anomalous values in the
data.
They may be due to recording errors, which may be
correctable, or they may be due to the sample not being
entirely from the same population.
If you find outliers in your data that
are not due to correctable errors, you may wish to consult
a statistician as to how to proceed.
As long as the probability of falling into category i
is non-zero, the expected value for that cell of the
table will be greater than 0.
If the total sample size is small, or if there are
many cells in the table, then
it may happen that no observations are recorded
for a particular cell. These zero values in
a table are sampling zeroes.
However, the actual process
that creates the observations may produce cells
in the table in which observations
can never occur. The zero values that must
occur in these cells are structural zeroes.
The goodness of fit chi-square test
is not designed for tables with structural zeroes.
If you find structural zeroes in your data,
you may wish to consult
a statistician as to how to proceed.
The chi-square test involves using the chi-square
distribution
to approximate the underlying exact distribution.
The approximation becomes better as the
expected cell
frequencies grow larger, and may be inappropriate
for tables with very small expected cell frequencies.
For tables with expected cell frequencies less than 5,
the chi-square approximation may not
be reliable. A standard (and conservative)
rule of thumb (due to Cochran) is to avoid using
the chi-square test for tables with expected
cell frequencies less than 1, or when more than 20% of
the table cells have expected cell frequencies
less than 5.
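Cochran's rule as stated above is easy to check mechanically once the
expected cell frequencies are known. The sketch below is one possible
reading of it; the function name cochran_rule_ok and the example
frequencies are assumptions made for illustration.

```python
def cochran_rule_ok(expected):
    """True if no expected frequency is below 1 and at most 20% are below 5."""
    if not expected:
        raise ValueError("need at least one expected frequency")
    if min(expected) < 1:
        return False
    share_below_5 = sum(1 for e in expected if e < 5) / len(expected)
    return share_below_5 <= 0.20


# Hypothetical expected frequencies for a five-category table.
print(cochran_rule_ok([10, 10, 10, 10, 4]))  # one cell in five is below 5
print(cochran_rule_ok([10, 10, 4, 4]))       # half of the cells are below 5
```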
Another rule of thumb (due to
Roscoe and Byars) is that
the average expected cell frequency should
be at least 1 when the expected cell frequencies
are close to equal, and 2 when they are not.
(If the chosen
significance level
is 0.01 instead of 0.05, then double these numbers.)
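A sketch of the Roscoe and Byars rule follows. Whether the expected
frequencies count as "close to equal" is left to the analyst and passed in
as a flag, and treating any significance level of 0.01 or smaller as the
"double" case is an assumption of this example. The function name
roscoe_byars_ok is hypothetical.

```python
def roscoe_byars_ok(expected, roughly_equal, alpha=0.05):
    """Check the average expected cell frequency against the rule of thumb."""
    if not expected:
        raise ValueError("need at least one expected frequency")
    threshold = 1.0 if roughly_equal else 2.0
    if alpha <= 0.01:  # assumption: stricter levels use the doubled threshold
        threshold *= 2
    average = sum(expected) / len(expected)
    return average >= threshold


# Hypothetical table with equal expected frequencies of 1.5 per cell.
print(roscoe_byars_ok([1.5, 1.5, 1.5], roughly_equal=True))
print(roscoe_byars_ok([1.5, 1.5, 1.5], roughly_equal=True, alpha=0.01))
```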
Koehler and Larntz suggest that if the total number
of observations is at least 10, the number of categories
is at least 3, and the square of the total number
of observations is at least 10 times the number of
categories, then the chi-square approximation
should be reasonable.
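The three Koehler and Larntz conditions can be combined into a single
check. This is a minimal sketch of the conditions exactly as listed above;
the function name koehler_larntz_ok is an assumption made for this example.

```python
def koehler_larntz_ok(n_observations, n_categories):
    """True if all three conditions on sample size and category count hold."""
    return (
        n_observations >= 10
        and n_categories >= 3
        and n_observations**2 >= 10 * n_categories
    )


# Hypothetical cases: 10 observations spread over 3 and over 11 categories.
print(koehler_larntz_ok(10, 3))
print(koehler_larntz_ok(10, 11))
```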
Care should be taken
when cell categories are combined (collapsed together)
to fix problems of small expected cell frequencies.
Collapsing can destroy evidence of
a lack of fit, so a failure to reject the
null hypothesis for the collapsed table does
not rule out the possibility of a lack of fit
in the original table.
As with most statistical tests, the
power
of the chi-square test increases with a larger number
of observations. If there are too few observations,
it may be impossible to reject the null
hypothesis even if it is false.
The goodness of fit chi-square test is specifically designed for
observations classified into nominal categories.
If the original data variable is actually continuous,
then the variable must be divided into intervals
to construct the table. The interval
boundaries should be decided beforehand on the basis of
theory or custom. If the intervals are determined
by the particular data being analyzed, then the
test statistic and corresponding P value may not
be generalizable.
Ideally,
the categories should be chosen so that
the expected cell frequencies are as equal to each
other as possible. With equal expected cell frequencies,
the chi-square test is unbiased,
and the chi-square distribution is
a closer approximation to the actual distribution
of the calculated chi-square statistic.
A rough rule of thumb, due to Mann and Wald,
suggests that squaring the total number of
values, taking the fifth root, and then
doubling that, gives a reasonable number
of categories to use, when the expected
cell frequencies are equal.
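The rule of thumb above amounts to using about 2 * n^(2/5) categories for
n values. The sketch below computes that number and, for a normal
distribution whose mean and standard deviation are specified beforehand,
the interval boundaries that give equal expected cell frequencies. The
function names and the example parameters are hypothetical; NormalDist is
from the Python standard library (version 3.8 or later).

```python
from statistics import NormalDist


def mann_wald_categories(n_values):
    """Suggested number of categories: twice the fifth root of n squared."""
    return 2 * n_values ** (2 / 5)


def equiprobable_boundaries(mean, sd, n_categories):
    """Interior boundaries that split a normal distribution into equal-probability cells."""
    dist = NormalDist(mean, sd)
    return [dist.inv_cdf(i / n_categories) for i in range(1, n_categories)]


# Hypothetical case: 100 values tested against a pre-specified N(50, 10).
k = round(mann_wald_categories(100))
print(k)
print(equiprobable_boundaries(50, 10, k))
```

The result of mann_wald_categories is generally not a whole number, so it
is rounded here; the rule is only a rough guide.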
The chi-square test ignores any possible
ordering of the variable categories.
If the variable
is continuous, then an alternative test
to the chi-square may be preferable.
The goodness of fit chi-square test assumes that
the expected frequencies have been calculated
without reference to the observed data.
For example, if we are testing whether the observed
data come from a normal distribution,
then we specify
beforehand what the mean and variance of that normal
distribution are, and use those values to calculate
the expected frequencies.
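As an illustration of this calculation, the sketch below computes the
expected frequencies for a normal distribution whose mean and standard
deviation are fixed in advance, and then the usual chi-square statistic,
the sum over cells of (observed - expected)^2 / expected. The function
names, the interval boundaries and the observed counts are all
hypothetical.

```python
from statistics import NormalDist


def expected_frequencies(boundaries, mean, sd, n_observations):
    """Expected counts for the cells (-inf, b1], (b1, b2], ..., (bk, +inf)."""
    dist = NormalDist(mean, sd)
    cdf_values = [0.0] + [dist.cdf(b) for b in boundaries] + [1.0]
    return [
        n_observations * (upper - lower)
        for lower, upper in zip(cdf_values, cdf_values[1:])
    ]


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothesized beforehand: N(100, 15), with cells split at 85, 100 and 115.
observed = [30, 72, 66, 32]  # hypothetical counts, 200 observations in total
expected = expected_frequencies([85, 100, 115], 100, 15, sum(observed))
print(expected)
print(chi_square_statistic(observed, expected))
```

Note that the mean, standard deviation and boundaries are all set without
looking at the observed counts.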
If you use the observed data to calculate the expected
frequencies, say using the observed data to find the
mean and variance and then using those estimates
to calculate the expected frequencies, then
the goodness of fit chi-square test is not valid
because the hypothesized distribution has already
been adapted to the data to be tested.
This makes the test less likely to reject the
null hypothesis, even when it is false.
In some cases where parameters for the hypothesized distribution function
are estimated from the observed data, the chi-square test may be adjusted by
subtracting 1 degree of freedom for every parameter estimated. However,
the parameters must be estimated from the data in a certain way.
Conover discusses this adjustment.
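The arithmetic of the adjustment is simple: a fully specified hypothesis
with k categories has k - 1 degrees of freedom, and one more is subtracted
for each estimated parameter. The sketch below shows only that arithmetic;
it does not check whether the parameters were estimated in a way that
makes the adjustment valid. The function name adjusted_degrees_of_freedom
is hypothetical.

```python
def adjusted_degrees_of_freedom(n_categories, n_estimated_parameters=0):
    """Degrees of freedom: categories minus 1, minus the parameters estimated."""
    df = n_categories - 1 - n_estimated_parameters
    if df < 1:
        raise ValueError("too few categories for this many estimated parameters")
    return df


# Hypothetical case: 10 categories, mean and variance both estimated.
print(adjusted_degrees_of_freedom(10, 2))
```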