Possible alternatives if your data violate goodness of fit (chi-square) test assumptions
If the populations from which
data for a goodness of fit (chi-square) test
were sampled
violate one or more of
the chi-square test assumptions, the results of the analysis may be
incorrect or misleading.
If there are factors
unaccounted for in the analysis,
then the
chi-square test may not give useful results.
In such cases, stratification
may provide a better analysis.
Alternatively, you can use a test
specifically tailored
to the family of your hypothesized distribution, such as
tests for normality.
Data from a continuous distribution might
better match a specific hypothesized distribution
after transformation.
All of these alternatives require that you have access to the
original individual data values.
Alternative procedures:
Stratification: Dividing
the sample into homogeneous subsamples
Stratification involves dividing a sample into
subsamples based on one or more characteristics
of the population. For example, a sample may
be stratified by gender.
If the distribution function is different for
the different strata, then the characteristic
used for stratification may be an
implicit factor,
and a separate analysis
for each individual subsample may be more
informative than an analysis of the entire sample.
A potential drawback with stratification is that one or
more of the subsamples may be small in size, leading to
problems
with the reliability of the test results.
Also, the results for each subsample can be generalized
only to the corresponding part of the sampled population.
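The effect of an implicit factor can be illustrated with a minimal sketch. The counts below are hypothetical, chosen only to show how two strata that depart from the hypothesized distribution in opposite directions can cancel out in the pooled sample. The helper names (chi_square_statistic, uniform_expected) are illustrative and do not come from any library, and the hypothesized distribution is assumed to be uniform over four categories.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def uniform_expected(observed):
    """Expected counts when every category is hypothesized to be equally likely."""
    total = sum(observed)
    return [total / len(observed)] * len(observed)


# Hypothetical counts in four categories for two strata (illustrative only).
strata = {
    "stratum A": [35, 25, 15, 5],
    "stratum B": [5, 15, 25, 35],
}

# Upper 5% critical value of chi-square with 4 - 1 = 3 degrees of freedom.
CRITICAL_5PCT_DF3 = 7.815

# Pool the strata by adding the counts category by category.
pooled = [sum(counts) for counts in zip(*strata.values())]
pooled_stat = chi_square_statistic(pooled, uniform_expected(pooled))

# Analyze each stratum separately.
stratum_stats = {
    name: chi_square_statistic(counts, uniform_expected(counts))
    for name, counts in strata.items()
}

print("pooled:", pooled, "statistic =", pooled_stat)
for name, stat in stratum_stats.items():
    verdict = "reject" if stat > CRITICAL_5PCT_DF3 else "do not reject"
    print(name, "statistic =", stat, "->", verdict, "uniformity at the 5% level")
```

In this constructed case the pooled sample fits the uniform hypothesis perfectly, while each stratum taken alone departs from it strongly. Note that each stratum has half as many observations as the pooled sample, which is the sample-size drawback described above.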
Tailored tests: Choosing a test specific to the hypothesized distribution
The goodness of fit chi-square test is extremely versatile:
If you can determine what the expected frequencies should
be to correspond with the observed frequencies, then you
can calculate the test.
However, because the test is so general, it is usually not
the most powerful test available for a specific distribution,
particularly if the distribution
is continuous. With a continuous distribution, there is the added
problem of deciding how to divide the data into discrete
categories before applying the test.
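The binning problem for continuous data can be sketched as follows. This is one possible approach, not the only one: the data are cut at quantiles of the hypothesized distribution so that every bin has the same expected frequency. The sample is simulated, the hypothesized distribution is assumed to be a fully specified standard normal, and the function name equal_probability_counts is hypothetical.

```python
import bisect
import random
from statistics import NormalDist


def equal_probability_counts(values, dist, n_bins):
    """Count values in n_bins bins that each have probability 1/n_bins under dist."""
    # Interior bin edges are the quantiles at 1/k, 2/k, ..., (k-1)/k.
    edges = [dist.inv_cdf(i / n_bins) for i in range(1, n_bins)]
    counts = [0] * n_bins
    for v in values:
        # bisect_right returns the index of the bin that contains v.
        counts[bisect.bisect_right(edges, v)] += 1
    return counts


random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(200)]  # simulated data
hypothesized = NormalDist(mu=0.0, sigma=1.0)           # assumed fully specified

# The statistic depends on how many bins are chosen.
for n_bins in (4, 8):
    observed = equal_probability_counts(sample, hypothesized, n_bins)
    expected = len(sample) / n_bins  # the same for every bin by construction
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    print(n_bins, "bins:", observed, "chi-square =", round(statistic, 2))
```

Running the sketch with different bin counts gives different statistics (and different degrees of freedom) for the same data, which is the arbitrariness that a test designed for continuous distributions avoids.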
One alternative to using the chi-square test is to choose
a test specifically tailored to the distribution of interest.
The Kolmogorov-Smirnov test is
commonly used to test whether the population distribution follows a
specified continuous distribution, such as the uniform or normal.
When the hypothesized distribution is a normal distribution,
there are a number of
tests for normality available.
Some of these tests, such as the Shapiro-Wilk test,
have the added advantage that you need not specify
the mean and variance of the hypothesized
normal distribution beforehand.
In general, if there is a test available that is tailored
to your hypothesized distribution, you should prefer
that to using the chi-square goodness of fit test.
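As an illustration of a test that works on the individual data values without binning, the one-sample Kolmogorov-Smirnov statistic can be computed directly from its definition. The sketch below is hand-rolled with simulated data and assumes the hypothesized distribution (a standard normal) is fully specified in advance. The critical value 1.36/sqrt(n) is the usual large-sample approximation at the 5% level. In practice a statistics library would normally be used instead, for example scipy.stats.kstest and scipy.stats.shapiro in SciPy.

```python
import math
import random
from statistics import NormalDist


def ks_statistic(values, cdf):
    """One-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDF of the values and the hypothesized cdf."""
    xs = sorted(values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


random.seed(1)
n = 200
normal_sample = [random.gauss(0.0, 1.0) for _ in range(n)]    # simulated
skewed_sample = [random.expovariate(1.0) for _ in range(n)]   # simulated

hypothesized_cdf = NormalDist(mu=0.0, sigma=1.0).cdf
critical_5pct = 1.36 / math.sqrt(n)  # large-sample approximation

for name, data in (("normal", normal_sample), ("exponential", skewed_sample)):
    d = ks_statistic(data, hypothesized_cdf)
    print(name, "D =", round(d, 3), "critical value =", round(critical_5pct, 3))
```

The approximate critical value applies only when the parameters of the hypothesized distribution are specified beforehand rather than estimated from the same data.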
Transformations: Changing the metric in which the data are analyzed
A transformation of the data may create
a data set that more closely approximates that
from the hypothesized distribution.
Or theory may suggest that transformed data
should follow a hypothesized distribution that is easier
to work with (say, for calculating the expected
frequencies) than the hypothesized distribution
for the original data.
Transformations (a single function applied to each
data value) are often applied to correct problems of
skewness
or heavy tails.
For example, taking logarithms of sample values
can reduce
skewness
to the right.
Unless scientific
theory suggests a specific transformation a priori,
transformations are usually chosen from the "power family"
of transformations, where each value is replaced by
x**p, where p is an integer or half-integer, usually
one of:
-2 (reciprocal square)
-1 (reciprocal)
-0.5 (reciprocal square root)
0 (log transformation)
0.5 (square root)
1 (leaving the data untransformed)
2 (square)
For p = -0.5 (reciprocal square root),
0, or 0.5 (square root), the data values must all be
positive. To use these transformations when there
are negative and positive values,
a constant can be added to all the data values
such that the smallest is greater than 0 (say,
such that the smallest value is 1). (If all
the data values are negative, the data can
instead be multiplied by -1, but note that
in this situation, data suggesting
skewness
to the right
would now become data suggesting skewness to the left.)
To preserve the order of the original data
in the transformed data, if the value of p is
negative, the transformed data are
multiplied by -1.0; e.g., for p = -1,
the data are transformed as x --> -1.0/x.
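The rules above can be collected into a short sketch. The function names power_transform and shift_to_positive are hypothetical, and the code simply follows the conventions described in the text: p = 0 means the log transformation, negative powers are multiplied by -1 to preserve order, and a constant can be added so that the smallest value becomes 1.

```python
import math


def power_transform(values, p):
    """Apply the power-family transformation with exponent p to each value."""
    if p in (-0.5, 0, 0.5) and min(values) <= 0:
        raise ValueError("p = -0.5, 0, and 0.5 require strictly positive data")
    if p == 0:
        # By convention, p = 0 stands for the log transformation.
        return [math.log(x) for x in values]
    # Multiply by -1 for negative p so the original order is preserved.
    sign = -1.0 if p < 0 else 1.0
    return [sign * x ** p for x in values]


def shift_to_positive(values, smallest=1.0):
    """Add a constant to every value so that the minimum equals `smallest`."""
    shift = smallest - min(values)
    return [x + shift for x in values]


data = [-3.0, 0.0, 2.0, 12.0]        # illustrative values with mixed signs
shifted = shift_to_positive(data)    # the smallest value is now 1.0
print(shifted)
print(power_transform(shifted, 0.5))  # square root
print(power_transform(shifted, 0))    # log
print(power_transform(shifted, -1))   # x --> -1.0/x, order preserved
```

For positive data every transformation in this sketch is monotone increasing, so the ranking of the observations is unchanged and only the spacing between them is altered.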
Taking logs or square roots tends to "pull in"
values greater than 1 relative to values less
than 1, which is useful in correcting skewness
to the right. Transformation involves changing
the metric in which the data are analyzed, which
may make interpretation of the results difficult if the
transformation is complicated. If you are unfamiliar
with transformations, you may wish to consult a
statistician before proceeding.
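The "pulling in" effect of the log transformation on right-skewed data can be checked numerically. The sketch below uses simulated lognormal data (an assumption made purely for illustration) and a moment-based skewness coefficient. The function name sample_skewness is hypothetical.

```python
import math
import random


def sample_skewness(values):
    """Moment-based sample skewness m3 / m2**1.5.
    Positive values indicate a longer right tail."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


random.seed(42)
raw = [random.lognormvariate(0.0, 1.0) for _ in range(1000)]  # right-skewed
logged = [math.log(x) for x in raw]                            # p = 0 transformation

print("skewness before log:", round(sample_skewness(raw), 2))
print("skewness after log: ", round(sample_skewness(logged), 2))
```

Because the logarithm of a lognormal variable is normally distributed, the skewness of the transformed values should be close to zero, while the raw values show strong skewness to the right.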