Does your data violate one-sample t test assumptions?
If the population
from which the data to be analyzed by a one-sample t test were sampled violates
one or more of the one-sample t test assumptions, the results of the analysis
may be incorrect or misleading. For example, if the assumption of independence
for the sample values is violated, then the one-sample t test is simply not
appropriate.
If the assumption of normality
is violated, or outliers
are present, then the one-sample t test may not be the most powerful
test available, and this could mean the difference between detecting a true
difference or not. Using a nonparametric
test or employing a transformation
may result in a more powerful test. For example, if the population
distribution is not symmetric, a transformation may produce symmetry.
Often, the effect of an assumption violation on the one-sample t test result
depends on the extent of the violation (such as how skewed
the distribution of the population is). Some small violations may have little
practical effect on the analysis, while other violations may render the
one-sample t test result uselessly incorrect or uninterpretable. In particular,
a small
sample size may increase vulnerability to assumption violations.
A lack of independence
within a sample is often caused by the existence of an implicit factor in the
data. For example, values collected over time may be serially correlated
(here time is the implicit factor). If the data are in a particular order,
consider the possibility of dependence. (If the row order of the data reflects
the order in which the data were collected, an index
plot of the data [data value plotted against row number] can reveal
patterns that suggest possible time effects.)
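As a numerical companion to the index plot, the lag-1 sample autocorrelation summarizes how strongly each value resembles the one collected just before it. The sketch below is a minimal illustration using only the Python standard library; the function name and the data are hypothetical, and a value far from zero is only a hint of serial dependence, not a formal test.

```python
from statistics import fmean

def lag1_autocorrelation(values):
    """Lag-1 sample autocorrelation of values taken in row (collection) order."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    mean = fmean(values)
    denom = sum((x - mean) ** 2 for x in values)
    if denom == 0:
        raise ValueError("values are constant")
    # Sum of products of deviations for each value and its successor
    numer = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
    return numer / denom

# Hypothetical measurements that drift steadily upward over time
trending = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
r_trending = lag1_autocorrelation(trending)
print(f"lag-1 autocorrelation: {r_trending:.3f}")
```

A clearly positive value, as for this drifting series, indicates that neighboring rows tend to be alike, which is the kind of pattern an index plot would show as a trend.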
Values may not be identically distributed because of the presence of outliers.
Outliers are anomalous values in the data. Outliers tend to increase the
estimate of sample variance, thus decreasing the calculated t statistic and
lowering the chance of rejecting the null
hypothesis. They may be due to recording errors, which may be correctable,
or they may be due to the sample not being entirely from the same population.
Apparent outliers may also be due to the values being from the same, but nonnormal,
population. The boxplot
and normal
probability plot (normal Q-Q plot) may suggest the presence of outliers in
the data.
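The boxplot's usual rule for flagging points can be written out directly. The sketch below assumes the common convention of fences at 1.5 times the interquartile range beyond the quartiles; that multiplier, the quartile method (the default of Python's `statistics.quantiles`), the function name, and the data are all assumptions for illustration, and other software may compute quartiles slightly differently.

```python
from statistics import quantiles

def boxplot_outliers(values, k=1.5):
    """Return values lying outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = quantiles(values, n=4)  # default 'exclusive' quartile method
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < lower or x > upper]

# Hypothetical sample with one suspiciously large value
sample = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8, 5.4, 9.0]
flagged = boxplot_outliers(sample)
print(flagged)
```

A flagged value is only a candidate outlier; whether it is a recording error, a value from a different population, or a legitimate value from a nonnormal population still has to be judged from context.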
The one-sample t statistic is based on the sample mean and the sample
variance of the sample values, both of which are sensitive to outliers.
(In other words, neither the sample mean nor the sample variance is resistant
to outliers, and thus, neither is the t statistic.) In particular, a large
outlier can inflate the sample variance, decreasing the t statistic and thus
perhaps eliminating a significant difference. A nonparametric
test may be a more powerful test in such a situation. If you find outliers
in your data that are not due to correctable errors, you may wish to consult a
statistician as to how to proceed.
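The effect of a single large outlier on the t statistic can be seen with a small calculation. The sketch below uses made-up data and a hypothetical helper `one_sample_t`; it computes t = (sample mean - hypothesized mean) / (s / sqrt(n)) for a clean sample and for the same sample with one value replaced by an outlier.

```python
from math import sqrt
from statistics import fmean, stdev

def one_sample_t(values, mu0):
    """One-sample t statistic: (mean - mu0) / (s / sqrt(n))."""
    n = len(values)
    return (fmean(values) - mu0) / (stdev(values) / sqrt(n))

# Hypothetical measurements; the hypothesized mean is 4.8
clean = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8, 5.4, 5.1]
with_outlier = clean[:-1] + [9.0]  # last value replaced by a large outlier

t_clean = one_sample_t(clean, mu0=4.8)
t_outlier = one_sample_t(with_outlier, mu0=4.8)
print(f"t without outlier: {t_clean:.2f}")
print(f"t with outlier:    {t_outlier:.2f}")
```

In this example the outlier moves the sample mean further from the hypothesized value, yet the t statistic drops from above the two-sided 5% critical value for 7 degrees of freedom (about 2.365) to below it, because the sample standard deviation grows much faster than the mean shifts.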
If the population from which the data were sampled is skewed, then the
one-sample t test may incorrectly reject the null hypothesis that the
population mean is the hypothesized value even when it is true. The one-sample
signed rank test also assumes symmetry, and may not be an appropriate
alternative in this case. The one-sample
sign test does not rely on symmetry, and may be an appropriate alternative
test. Unless the skewness is severe, or the sample size very small, the t
test may perform adequately.
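For reference, the sign test is simple enough to compute by hand. The sketch below is a minimal exact two-sided version using only the standard library; the function name and data are hypothetical. Note that the sign test addresses the population median rather than the mean, which matters when the distribution is skewed.

```python
from math import comb

def sign_test_p_value(values, m0):
    """Exact two-sided sign test of H0: population median equals m0."""
    above = sum(1 for x in values if x > m0)
    below = sum(1 for x in values if x < m0)
    n = above + below  # values exactly equal to m0 are dropped
    if n == 0:
        raise ValueError("all values equal the hypothesized median")
    k = min(above, below)
    # Under H0 the count of values above m0 is Binomial(n, 1/2)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical right-skewed sample
skewed = [0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 1.1, 1.9, 3.5, 9.0]
p_low = sign_test_p_value(skewed, m0=0.1)   # every value lies above 0.1
p_mid = sign_test_p_value(skewed, m0=0.7)   # five values on each side
print(p_low, p_mid)
```

Only the signs of the differences from the hypothesized value are used, so neither the extreme value 9.0 nor the asymmetry of the sample affects the result.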
Whether or not the population is skewed can be assessed either informally
(including graphically),
or by examining the sample skewness statistic or conducting a test for
skewness.
If outliers or skewness are present, employing a transformation
may resolve both problems at once, and also promote normality. In this case,
it may be preferable to perform a one-sample t test on the transformed data.
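As an illustration, a log transformation is one common choice for positive, right-skewed data. The sketch below uses hypothetical data and helper names, and assumes the hypothesized value is also moved to the log scale. One caveat: the t test on logged data concerns the mean of the logged values, not the mean on the original scale.

```python
from math import log, sqrt
from statistics import fmean, stdev

def one_sample_t(values, mu0):
    """One-sample t statistic: (mean - mu0) / (s / sqrt(n))."""
    n = len(values)
    return (fmean(values) - mu0) / (stdev(values) / sqrt(n))

def sample_skewness(values):
    """Moment-based skewness m3 / m2**1.5."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical positive, right-skewed measurements
raw = [1.2, 1.5, 1.9, 2.4, 3.1, 4.0, 5.6, 8.3, 14.0, 31.0]
logged = [log(x) for x in raw]

skew_raw = sample_skewness(raw)
skew_log = sample_skewness(logged)

# The hypothesized value (2.0 here, chosen arbitrarily) is logged as well
t_log = one_sample_t(logged, mu0=log(2.0))
print(f"skewness raw: {skew_raw:.2f}, logged: {skew_log:.2f}, t on log scale: {t_log:.2f}")
```

For these data the transformation reduces the sample skewness substantially, although it does not remove it entirely.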
The usual measure of skewness is not resistant
to outliers, so one should consider the possibility that apparent skewness
is in fact due to one or more outliers. A lack of power due to small sample
sizes may also make it hard to detect skewness.
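The sensitivity of the skewness statistic to a single value is easy to demonstrate. The sketch below assumes the moment-based definition (third central moment divided by the 1.5 power of the second); the function name and data are hypothetical, and statistical packages often apply a small-sample correction that this version omits.

```python
from statistics import fmean

def sample_skewness(values):
    """Moment-based skewness m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical data: a symmetric sample, then the same sample plus one outlier
symmetric = [2.0, 3.0, 4.0, 5.0, 6.0]
with_outlier = symmetric + [20.0]

g_symmetric = sample_skewness(symmetric)
g_outlier = sample_skewness(with_outlier)
print(f"symmetric: {g_symmetric:.2f}, with one outlier: {g_outlier:.2f}")
```

A single added value turns a skewness of zero into a clearly positive one, so a large skewness statistic alone does not establish that the population itself is skewed.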
The values in a sample may indeed be from the same population, but not
from a normal one. Signs of nonnormality
are skewness
(lack of symmetry) or light-tailedness
or heavy-tailedness.
The boxplot,
histogram,
and normal
probability plot (normal Q-Q plot), along with the normality test, can
provide information on the normality of the population distribution. However,
if there are only a small number of data points, nonnormality can be hard to
detect. If there are a great many data points, the normality test may detect
statistically significant but trivial departures from normality that will have
no real effect on the t statistic (since, for large samples from a population
with finite variance, the t statistic is approximately standard normal by the
central limit theorem).
For data sampled from a normal distribution, normal probability plots
should approximate straight lines, and boxplots should be symmetric (median
and mean together, in the middle of the box) with no outliers.
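The straightness of a normal probability plot can be summarized by the correlation between the sorted data and the corresponding normal quantiles. The sketch below is one possible way to do this with the standard library; the plotting positions (i + 0.5) / n are an assumed convention (several are in use), and the function names and data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean

def normal_scores(n):
    """Standard normal quantiles at plotting positions (i + 0.5) / n, i = 0..n-1."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]

def qq_correlation(values):
    """Correlation between the sorted data and their normal scores."""
    ordered = sorted(values)
    scores = normal_scores(len(ordered))
    mean_x, mean_y = fmean(scores), fmean(ordered)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, ordered))
    sxx = sum((x - mean_x) ** 2 for x in scores)
    syy = sum((y - mean_y) ** 2 for y in ordered)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: values placed exactly on normal quantiles, and a right-skewed set
normal_like = [50 + 10 * z for z in normal_scores(12)]
skewed = [1.2, 1.5, 1.9, 2.4, 3.1, 4.0, 5.6, 8.3, 14.0, 31.0]

r_normal = qq_correlation(normal_like)
r_skewed = qq_correlation(skewed)
print(f"normal-like: {r_normal:.3f}, skewed: {r_skewed:.3f}")
```

A correlation near 1 corresponds to a plot that is close to a straight line; the skewed sample gives a visibly lower value. With few data points this summary, like the plot itself, has limited ability to reveal nonnormality.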
If the sample size is not too small, then the t statistic will not be much
affected even if the population distribution is skewed,
although skewness will increase the chance that an incorrectly small P value will be
reported (i.e., that the null
hypothesis will be rejected when it is in fact true).
Unless the sample size is small (less than 10), light-tailedness
or heavy-tailedness
will have little effect on the t statistic. Light-tailedness will tend to
increase the chance that an incorrectly small P value will be reported (i.e.,
that the null hypothesis will be rejected when it is in fact true).
Heavy-tailedness will tend to increase the chance that an incorrectly large P
value will be reported (i.e., that the null hypothesis will not be rejected
when it is in fact false), making the test conservative.
Robust
statistical tests operate well across a wide variety of distributions.
A test can be robust for validity, meaning that it provides P values close to
the true ones in the presence of (slight) departures from its assumptions. It
may also be robust for efficiency, meaning that it maintains its statistical
power
(the probability that a true violation of the null
hypothesis will be detected by the test) in the presence of those
departures. The t test is fairly robust for validity against nonnormality, but
it may not be the most powerful test available for a given nonnormal
distribution, although it is the most powerful
test available when its test assumptions are met. In the case of nonnormality,
a nonparametric
test or employing a transformation
may result in a more powerful test.
If the sample size is small, assumption violations such as nonnormality
are difficult to detect even when they are present. Also, with a small sample
size there is less resistance to outliers, and less protection against
violation of assumptions.
Even if none of the test assumptions are violated, a t test with a small
sample size may not have sufficient power
to detect a departure of the population mean from the hypothesized value, even if
such a departure in fact exists. The power curve presented in the results of the t
test indicates how likely the test would be to detect an actual difference
between the hypothesized mean and the population mean. The
shallower the power curve, the bigger the actual difference would have to be
before the t test would detect it. The power depends on variance, the selected
significance (alpha-) level of the test, and the sample size. Power decreases
as the variance increases, decreases as the significance level is decreased
(i.e., as the test is made more stringent), and increases as the sample size
increases. A very small sample from a population with a mean very different
from the hypothesized value may not result in a significant t test statistic
unless the sample variance is small. If a statistical significance test with
small sample sizes produces a surprisingly non-significant P
value, then a lack of power may be the reason. The best time to avoid such
problems is in the design stage of an experiment, when appropriate minimum
sample sizes can be determined, perhaps in consultation with a statistician,
before data collection begins.
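The dependence of power on the variance, the significance level, and the sample size can be explored with a rough calculation at the design stage. The sketch below uses a normal approximation to the power of a two-sided test with known standard deviation, which somewhat overstates the power of the t test for small samples; the function names and the planning values (a difference of 0.5 and a standard deviation of 1.0) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(delta, sigma, n, alpha=0.05):
    """Normal-approximation power of a two-sided one-sample test.

    delta: true difference between population mean and hypothesized mean
    sigma: population standard deviation (assumed known for this sketch)
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(delta) * sqrt(n) / sigma
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

def min_sample_size(delta, sigma, target_power=0.8, alpha=0.05, n_max=100000):
    """Smallest n whose approximate power reaches target_power."""
    for n in range(2, n_max + 1):
        if approx_power(delta, sigma, n, alpha) >= target_power:
            return n
    raise ValueError("target power not reached within n_max")

# Hypothetical planning values
power_n10 = approx_power(delta=0.5, sigma=1.0, n=10)
power_n40 = approx_power(delta=0.5, sigma=1.0, n=40)
n_needed = min_sample_size(delta=0.5, sigma=1.0)
print(f"power at n=10: {power_n10:.2f}, at n=40: {power_n40:.2f}, n for 80% power: {n_needed}")
```

Because the approximation ignores the extra uncertainty from estimating the standard deviation, the sample size it suggests should be treated as a lower bound for the t test.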