Transformations (a single function applied to each X or each
Y data value) are used to correct problems of
nonnormality
or unequal variances.
For example, taking logarithms of sample values
can reduce
skewness
to the right. Transforming the Y values
to remedy nonnormality often results in correcting
heteroscedasticity (unequal variances).
Occasionally, both the X and Y variables are
transformed.
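
As a rough illustration of how a log transformation can reduce skewness to the right, the sketch below computes a simple moment-based skewness coefficient before and after taking logarithms. The sample values are made up for illustration, and the helper name skewness is hypothetical, not part of any particular package.

```python
import math

def skewness(values):
    """Moment-based skewness: third central moment divided by the
    second central moment raised to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed sample (all values positive, long upper tail).
sample = [1.0, 1.2, 1.5, 2.0, 2.4, 3.1, 4.0, 6.5, 12.0, 30.0]
logged = [math.log(v) for v in sample]

skew_raw = skewness(sample)
skew_log = skewness(logged)
print(f"skewness before log: {skew_raw:.2f}")
print(f"skewness after log:  {skew_log:.2f}")
```

A positive coefficient indicates skewness to the right; for this sample the coefficient is smaller after the log transformation, though it does not reach zero.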
Unless scientific
theory suggests a specific transformation a priori,
transformations are usually chosen from the "power family"
of transformations, where each value is replaced by
x^p, where p is an integer or half-integer, usually
one of:
- -2 (reciprocal square)
- -1 (reciprocal)
- -0.5 (reciprocal square root)
- 0 (log transformation)
- 0.5 (square root)
- 1 (leaving the data untransformed)
- 2 (square)
For p = -0.5 (reciprocal square root),
0, or 0.5 (square root), the data values must all be
positive. To use these transformations when there
are negative and positive values,
a constant can be added to all the data values
such that the smallest is greater than 0 (say,
such that the smallest value is 1). (If all
the data values are negative, the data can
instead be multiplied by -1, but note that
in this situation, data suggesting
skewness
to the right
would now become data suggesting skewness to the left.)
To preserve the order of the original data
in the transformed data, if the value of p is
negative, the transformed data are
multiplied by -1.0; e.g., for p = -1,
the data are transformed as x --> -1.0/x.
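
The rules above (p = 0 meaning the log transformation, adding a constant when some values are not positive, and multiplying by -1.0 when p is negative) can be sketched as follows. The function names shift_to_positive and power_transform, and the default target minimum of 1, are assumptions made for this illustration only.

```python
import math

def shift_to_positive(values, new_min=1.0):
    """Add a constant so the smallest value equals new_min.
    Data that are already all positive are returned unchanged."""
    smallest = min(values)
    if smallest > 0:
        return list(values)
    return [v + (new_min - smallest) for v in values]

def power_transform(values, p):
    """Power-family transformation x -> x^p, with p == 0 taken to mean
    the natural logarithm. For negative p the result is multiplied by
    -1.0 so that the order of the original data is preserved.
    The caller is expected to pass positive values when p is
    -0.5, 0, or 0.5 (for example via shift_to_positive)."""
    if p == 0:
        return [math.log(v) for v in values]
    sign = -1.0 if p < 0 else 1.0
    return [sign * v ** p for v in values]

# Hypothetical data with negative, zero, and positive values.
data = [-3.0, 0.0, 1.0, 5.0]
positive = shift_to_positive(data)        # [1.0, 4.0, 5.0, 9.0]
recip = power_transform(positive, -1)     # x --> -1.0/x
roots = power_transform(positive, 0.5)    # square roots
print(positive, recip, roots)
```

Because of the sign change, the reciprocal-transformed values come out in the same order as the original data rather than reversed.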
Taking logs or square roots tends to "pull in"
values greater than 1 relative to values less
than 1, which is useful in correcting skewness
to the right.
Another common transformation is the
antilogarithm (exp(x)), which has effects
similar to but more extreme than squaring:
"drawing out" values greater than 1 relative
to values less than 1.
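
One way to see the "pulling in" and "drawing out" effects is to compare the distance from 1 up to a larger value with the distance from 1 down to a smaller value, before and after each transformation. The three values and the helper gap_ratio below are hypothetical, chosen only to make the comparison concrete.

```python
import math

# Hypothetical values: one below 1, the value 1 itself, and one above 1.
values = [0.25, 1.0, 16.0]

def gap_ratio(f):
    """Ratio of the upper gap (f(16) - f(1)) to the lower gap
    (f(1) - f(0.25)) after applying the transformation f."""
    lo, mid, hi = (f(v) for v in values)
    return (hi - mid) / (mid - lo)

ratios = {
    "identity": gap_ratio(lambda v: v),
    "sqrt": gap_ratio(math.sqrt),
    "log": gap_ratio(math.log),
    "square": gap_ratio(lambda v: v ** 2),
    "exp": gap_ratio(math.exp),
}
for name, r in ratios.items():
    print(f"{name:>8}: {r:.3g}")
```

A ratio smaller than that of the untransformed data means the large value has been pulled in relative to the small one (square root, log); a larger ratio means it has been drawn out (square, and more strongly exp).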
Generally speaking, transformations of X are
used to correct for non-linearity, and
transformations of Y to correct for
nonconstant variance of Y or nonnormality
of the error terms. A transformation of Y
to correct nonconstant variance or
nonnormality of the error terms
may also increase linearity.
Conversely, if the error distribution was normal
to begin with, transforming Y may make it
nonnormal.
A transformation of Y involves changing
the metric in which the fitted values are analyzed, which
may make interpretation of the results difficult if the
transformation is complicated. If you are unfamiliar
with transformations, you may wish to consult a
statistician before proceeding.
The graph of the X-Y data may suggest an
appropriate transformation of X if the plot shows
nonlinearity but constant error variance
(that is, the general shape of the plot is not linear,
but the vertical deviation in the data values
appears constant over the range of X values).
If the X-Y plot suggests an arc from upper left to
lower right so that data points either very low or very high in X lie
above the straight line suggested by the data,
while the data points with middling X values
lie on or below that straight line, taking reciprocals or
reciprocals of the antilogarithms (exp(-x)) of the X values may promote linearity.
If the X-Y plot suggests an arc from lower left to
upper right so that data points either very low or very high in X lie
above the straight line suggested by the data,
while the data points with middling X values
lie on or below that straight line, taking squares or
antilogarithms of the X values may promote linearity.
If the X-Y plot suggests an arc from upper left to
lower right so that data points either very low or very high in X lie
below the straight line suggested by the data,
while the data points with middling X values
lie on or above that straight line, taking squares or
antilogarithms of the X values may promote linearity.
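
As a sketch of the first case described above (an arc from upper left to lower right, with the extreme X values lying above the line), the example below compares the squared correlation between Y and X with that between Y and the order-preserving reciprocal -1.0/X. The data are invented to follow roughly Y = 2 + 3/X, and pearson_r is a hypothetical helper written for this illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: Y falls steeply at low X and levels off at high X.
xs = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 12.0]
ys = [5.1, 3.4, 3.0, 2.8, 2.5, 2.3, 2.3]

# Reciprocal transformation of X, multiplied by -1.0 to preserve order.
xs_recip = [-1.0 / x for x in xs]

r2_raw = pearson_r(xs, ys) ** 2
r2_recip = pearson_r(xs_recip, ys) ** 2
print(f"r^2 with X:      {r2_raw:.3f}")
print(f"r^2 with -1.0/X: {r2_recip:.3f}")
```

A squared correlation closer to 1 after the transformation suggests a more nearly linear relationship, but the transformed X-Y plot should still be inspected rather than relying on this single number.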
The choice of a transformation of Y may be suggested
by examining the plot of residuals against X or fitted values.
If the relationship appears linear, but the variance of the residuals
increases as X increases, suggesting a wedge or megaphone shape,
then taking square roots, logarithms, or reciprocals
of the Y values may promote homogeneity of variance.
If the plot of residuals against X or fitted values
is a convex arc from lower left to upper right,
and the variance of the residuals
increases as X increases, then taking square roots
of the Y values may promote homogeneity of variance.
If the plot of residuals against X or fitted values
is a concave arc from upper left to lower right,
and the variance of the residuals
decreases as X increases, then taking logarithms
of the Y values may promote homogeneity of variance.
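
The effect of a Y transformation on unequal variances can be sketched with replicate Y values at a few X levels, where the spread grows with the level of Y (the wedge or megaphone pattern). The data below are invented so that the standard deviation is proportional to the mean, and sd_ratio is a hypothetical helper; real data will rarely behave this cleanly.

```python
import math
import statistics

# Hypothetical replicate Y values at three X levels; the spread of Y
# grows in proportion to its mean.
groups = {
    1: [9.0, 10.0, 11.0],
    2: [18.0, 20.0, 22.0],
    3: [27.0, 30.0, 33.0],
}

def sd_ratio(transform):
    """Largest within-level standard deviation divided by the smallest,
    computed after applying the transformation to every Y value."""
    sds = [statistics.stdev([transform(y) for y in ys])
           for ys in groups.values()]
    return max(sds) / min(sds)

ratio_raw = sd_ratio(lambda y: y)
ratio_sqrt = sd_ratio(math.sqrt)
ratio_log = sd_ratio(math.log)
print(f"untransformed: {ratio_raw:.2f}")
print(f"square root:   {ratio_sqrt:.2f}")
print(f"logarithm:     {ratio_log:.2f}")
```

A ratio near 1 indicates roughly equal spread across X levels. In this constructed example the log transformation equalizes the spread and the square root only partly does so; which transformation works best depends on how the variance changes with the mean.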
When a transformation of Y is indicated, a simultaneous
transformation of X may also improve linearity of
the fit with the transformed Y.
Transformations should be used with caution. After
the data have been transformed, you should
examine the X-Y scatterplot of the data
to ascertain whether the assumptions of parallel straight regression lines
and equal variances for the Y are correct, and whether there
is any apparent effect of treatment level on X for the transformed data.
You should also check the ANCOVA results for signs
of nonnormality of the residuals, or of unequal residual
variances across treatment groups.
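
A simple descriptive check along these lines is to fit a separate straight line within each treatment group on the transformed data and compare the slopes and residual variances. The sketch below is not a formal test; the data, the group labels, and the helper fit_line are all hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares straight line for one group.
    Returns (slope, intercept, residual variance on n - 2 df)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    resid_var = sum(r * r for r in residuals) / (n - 2)
    return slope, intercept, resid_var

# Hypothetical transformed data for two treatment groups.
treatment_data = {
    "A": ([1.0, 2.0, 3.0, 4.0, 5.0], [2.1, 2.9, 4.2, 4.8, 6.0]),
    "B": ([1.0, 2.0, 3.0, 4.0, 5.0], [3.0, 4.1, 4.9, 6.2, 6.8]),
}

fits = {name: fit_line(xs, ys) for name, (xs, ys) in treatment_data.items()}
for name, (slope, intercept, resid_var) in fits.items():
    print(f"group {name}: slope={slope:.3f}, "
          f"intercept={intercept:.3f}, residual variance={resid_var:.4f}")
```

Similar slopes across groups are consistent with parallel regression lines, and similar residual variances with equal variances; clearly different values suggest that the transformation has not resolved the problem.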