Possible alternatives if your data violate ANCOVA assumptions



If the ANCOVA model is incorrect, if the Y values do not have constant variance, if the Y data for one or more of the treatment (group) regression lines come from a population whose distribution violates the assumption of normality, or if outliers are present, then the analysis of covariance on the original data may provide misleading results, or may not be the best test available. In such cases, fitting a different linear model or a nonlinear model, or transforming the X or Y data, may provide a better analysis.

Alternative procedures include:

  • Different linear model: fitting a linear model with additional explanatory variable(s)
  • Nonlinear model: fitting a nonlinear model when the linear model is inappropriate
  • Transformations: correcting nonnormality, nonlinearity, or unequal variances by transforming all the data values for X and/or Y
  • Removing outliers: refitting the linear model after removing outliers

Different linear model:

Y may actually be best modeled by a linear function that includes other variables besides X and treatment level. If a graph of the residuals against a prospective additional explanatory variable suggests a linear trend, then adding one or more additional explanatory variables may provide a better model. These explanatory variables may be either continuous (like the original X variable) or discrete (like the original grouping variable). Such a model might also include interaction terms between one or more of the explanatory variables. It may even turn out that the best model does not include the original X variable.

A number of statistical packages provide facilities for fitting generalized linear models, including linear models with various continuous and discrete explanatory variables. You can also use multiple linear regression to fit such models, if you construct dummy variables to replace the discrete explanatory variables. See Neter et al. for more details.
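
As a minimal sketch of the dummy-variable construction mentioned above, the following Python builds the rows of a design matrix (intercept, X, and one 0/1 indicator per non-reference treatment level) that could then be passed to any multiple linear regression routine. The function names, the choice of the first sorted level as the reference, and the example data are illustrative assumptions, not part of any particular statistical package.

```python
def dummy_code(groups, reference=None):
    """Return (kept_levels, rows) of 0/1 indicators for every level except the reference."""
    levels = sorted(set(groups))
    if reference is None:
        reference = levels[0]  # assumption: first level in sorted order is the baseline
    kept = [lv for lv in levels if lv != reference]
    rows = [[1.0 if g == lv else 0.0 for lv in kept] for g in groups]
    return kept, rows


def design_matrix(x, groups, reference=None):
    """Return (column_names, rows) with rows of [intercept, x, dummy_1, ..., dummy_(k-1)]."""
    kept, dummies = dummy_code(groups, reference)
    columns = ["intercept", "x"] + [f"group[{lv}]" for lv in kept]
    rows = [[1.0, float(xi)] + d for xi, d in zip(x, dummies)]
    return columns, rows


# Hypothetical data: one covariate and three treatment levels.
x = [1.0, 2.0, 3.0, 4.0]
groups = ["control", "drug", "control", "placebo"]
columns, rows = design_matrix(x, groups)
for row in rows:
    print(row)
```

With k treatment levels only k - 1 indicator columns are created, because the intercept already represents the reference level; the coefficient of each indicator is then the difference in intercept between that level and the reference.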

Nonlinear model:

If Y is actually best modeled by parallel nonlinear functions of X, especially if a nonlinear model is suggested on theoretical grounds, then a nonlinear multiple regression can be used to provide the best fit to the data. The shape of the X-Y plot may suggest an appropriate function to use, such as an exponential model.

If the purpose of the analysis of covariance is to study the relationship between X and Y for each treatment, then using a nonlinear regression to fit parallel curves to each treatment group may be of interest.

Transformations can also be used to deal with nonlinearity, but they involve changing the metric (and possibly the normality) of X or Y. A nonlinear model, on the other hand, is usually more complex (has more parameters) than a transformed linear model. If there are many parameters to fit and not very many data points, the precision of the fitted parameters for a more complex model may not be very good.

Transformations:

Transformations (a single function applied to each X or each Y data value) are applied to correct problems of nonnormality or unequal variances. For example, taking logarithms of sample values can reduce skewness to the right. Transforming the Y values to remedy nonnormality often results in correcting heteroscedasticity (unequal variances). Occasionally, both the X and Y variables are transformed.

Unless scientific theory suggests a specific transformation a priori, transformations are usually chosen from the "power family" of transformations, where each value x is replaced by x^p, where p is an integer or half-integer, usually one of:

  • -2 (reciprocal square)
  • -1 (reciprocal)
  • -0.5 (reciprocal square root)
  • 0 (log transformation)
  • 0.5 (square root)
  • 1 (leaving the data untransformed)
  • 2 (square)

For p = -0.5 (reciprocal square root), 0, or 0.5 (square root), the data values must all be positive. To use these transformations when there are negative and positive values, a constant can be added to all the data values such that the smallest is greater than 0 (say, such that the smallest value is 1). (If all the data values are negative, the data can instead be multiplied by -1, but note that in this situation, data suggesting skewness to the right would now become data suggesting skewness to the left.) To preserve the order of the original data in the transformed data, if the value of p is negative, the transformed data are multiplied by -1.0; e.g., for p = -1, the data are transformed as x --> -1.0/x. Taking logs or square roots tends to "pull in" values greater than 1 relative to values less than 1, which is useful in correcting skewness to the right.
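
The rules in the preceding paragraph can be sketched in a few lines of Python. This is only an illustration: the function names and the optional `shift` argument are assumptions made here, and the positivity check follows the statement above for p = -0.5, 0, and 0.5.

```python
import math


def power_transform(values, p, shift=0.0):
    """Apply x -> x**p (log for p == 0) after adding `shift` to every value.

    For negative p the result is multiplied by -1.0 so that the
    transformed values keep the ordering of the original data.
    """
    shifted = [v + shift for v in values]
    if p in (-0.5, 0, 0.5) and min(shifted) <= 0:
        raise ValueError("all (shifted) values must be positive for this p")
    if p == 0:
        return [math.log(v) for v in shifted]
    sign = -1.0 if p < 0 else 1.0
    return [sign * v ** p for v in shifted]


def shift_for_min_one(values):
    """Constant to add so that the smallest shifted value equals 1."""
    return 1.0 - min(values)


# Hypothetical data containing negative and positive values.
data = [-3.0, 0.0, 5.0]
c = shift_for_min_one(data)                   # 4.0, so shifted data are 1, 4, 9
print(power_transform(data, 0.5, shift=c))    # square roots of the shifted data
print(power_transform([1.0, 2.0, 4.0], -1))   # negated reciprocals, order preserved
```

For p = -1 or -2 a zero value is also unusable (division by zero), although the check above only enforces positivity for the three cases named in the text.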

Another common transformation is the antilogarithm (exp(x)), which has effects similar to but more extreme than squaring: "drawing out" values greater than 1 relative to values less than 1.

Generally speaking, transformations of X are used to correct for nonlinearity, and transformations of Y to correct for nonconstant variance of Y or nonnormality of the error terms. A transformation of Y to correct nonconstant variance or nonnormality of the error terms may also increase linearity. Conversely, if the error distribution was normal to begin with, transforming Y may make it nonnormal.

A transformation of Y involves changing the metric in which the fitted values are analyzed, which may make interpretation of the results difficult if the transformation is complicated. If you are unfamiliar with transformations, you may wish to consult a statistician before proceeding.

The graph of the X-Y data may suggest an appropriate transformation of X if the plot shows nonlinearity but constant error variance (that is, the general shape of the plot is not linear, but the vertical deviation in the data values appears constant over the range of X values).

If the X-Y plot suggests an arc from upper left to lower right so that data points either very low or very high in X lie above the straight line suggested by the data, while the data points with middling X values lie on or below that straight line, taking reciprocals or reciprocals of the antilogarithms of the X values may promote linearity.

If the X-Y plot suggests an arc from lower left to upper right so that data points either very low or very high in X lie above the straight line suggested by the data, while the data points with middling X values lie on or below that straight line, taking squares or antilogarithms of the X values may promote linearity.

If the X-Y plot suggests an arc from upper left to lower right so that data points either very low or very high in X lie below the straight line suggested by the data, while the data points with middling X values lie on or above that straight line, taking squares or antilogarithms of the X values may promote linearity.
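
One rough numerical companion to inspecting the plot is to compare the correlation between X and Y before and after transforming X. The sketch below uses made-up data for a single group that rise in a convex arc from lower left to upper right (the second case above) and shows that squaring X makes the relationship exactly linear. The data and the helper `pearson_r` are assumptions for illustration; a correlation coefficient is only a crude summary and does not replace looking at the scatterplot for each treatment group.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Synthetic, noise-free data: Y is linear in X squared, not in X.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0 + 0.5 * xi ** 2 for xi in x]

r_raw = pearson_r(x, y)                              # below 1: arc on the raw scale
r_squared_x = pearson_r([xi ** 2 for xi in x], y)    # essentially 1 after squaring X

print(f"r with raw X:     {r_raw:.4f}")
print(f"r with squared X: {r_squared_x:.4f}")
```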

The choice of a transformation of Y may be suggested by examining the plot of residuals against X or fitted values. If this appears linear, but the variance of the residuals increases as X increases, suggesting a wedge or megaphone shape, then taking square roots, logarithms, or reciprocals of the Y values may promote homogeneity of variance.

If the plot of residuals against X or fitted values is a convex arc from lower left to upper right, and the variance of the residuals increases as X increases, then taking square roots of the Y values may promote homogeneity of variance.

If the plot of residuals against X or fitted values is a concave arc from upper left to lower right, and the variance of the residuals decreases as X increases, then taking logarithms of the Y values may promote homogeneity of variance.
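
The wedge (megaphone) pattern and its removal by a log transformation can be illustrated with a small synthetic example. In the sketch below, the data are constructed (as an assumption, for one treatment group) so that the scatter of Y is proportional to its level; the spread of Y at each X value, which within one group equals the spread of the residuals at that X, grows steadily on the raw scale but is constant after taking logarithms.

```python
import math


def spread_by_x(x, y):
    """Range (max - min) of y at each distinct x value, in increasing x order."""
    by_x = {}
    for xi, yi in zip(x, y):
        by_x.setdefault(xi, []).append(yi)
    return [max(v) - min(v) for _, v in sorted(by_x.items())]


# Synthetic data: two observations at each X, 20% below and 20% above 10*X.
x = [xi for xi in range(1, 9) for _ in range(2)]
y = [10.0 * xi * f for xi in range(1, 9) for f in (0.8, 1.2)]

raw_spread = spread_by_x(x, y)                           # grows with X (wedge shape)
log_spread = spread_by_x(x, [math.log(v) for v in y])    # constant, log(1.5) at every X

print([round(s, 2) for s in raw_spread])
print([round(s, 4) for s in log_spread])
```

Real data will not be this tidy; the point is only that when the scatter is roughly proportional to the level of Y, a logarithm tends to equalize it.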

When a transformation of Y is indicated, a simultaneous transformation of X may also improve linearity of the fit with the transformed Y.

Transformations should be used with caution. After the data have been transformed, you should examine the X-Y scatterplot of the transformed data to ascertain whether the assumptions of parallel straight regression lines and equal variances for Y hold, and whether there is any apparent effect of treatment level on X. You should also check the ANCOVA results for signs of nonnormality of the residuals, or inequality of residual variances across treatment groups.

Removing outliers:

A common method of dealing with apparent outliers is to remove the outliers and then refit the ANCOVA model to the remaining points. If the fitted ANCOVA estimates are not substantially changed by the removal, then the fit to the remaining points will be improved without misrepresenting the data. However, if the outliers are due to a nonnormal distribution for the Y sample population, or to the underlying model being nonlinear, more can be learned by fitting a better model to the entire data set (for example, a different linear model or a nonlinear model) than by ignoring valid data values. And while removing a point that has a large residual may lead to a smaller residual variance for the new fitted model, it will not necessarily lead to a smaller P value for the F test of equality of intercepts (treatment means).
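
To judge whether the estimates are "substantially changed", the parallel-lines model can be fitted with and without the suspect points and the slope and intercepts compared. The sketch below uses the standard least-squares solution for a common slope (pooled within-group sums of products divided by pooled within-group sums of squares of X) with one intercept per group. The function names and the data, including the deliberately planted outlier, are hypothetical.

```python
def fit_parallel_lines(x, y, groups):
    """Least-squares fit of y = a_g + b*x with a common slope b.

    Returns (b, {group: a_g}).
    """
    means = {}
    sxx = sxy = 0.0
    for g in sorted(set(groups)):
        xs = [xi for xi, gi in zip(x, groups) if gi == g]
        ys = [yi for yi, gi in zip(y, groups) if gi == g]
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        sxx += sum((xi - mx) ** 2 for xi in xs)            # pooled within-group SS of X
        sxy += sum((xi - mx) * (yi - my) for xi, yi in zip(xs, ys))
        means[g] = (mx, my)
    b = sxy / sxx
    return b, {g: my - b * mx for g, (mx, my) in means.items()}


def drop_points(x, y, groups, drop):
    """Return copies of the data without the observations whose indices are in `drop`."""
    keep = [i for i in range(len(x)) if i not in drop]
    return [x[i] for i in keep], [y[i] for i in keep], [groups[i] for i in keep]


# Hypothetical data: two parallel lines with slope 2; the last point is an outlier.
x = [1.0, 2.0, 3.0, 4.0, 5.0] * 2
groups = ["A"] * 5 + ["B"] * 5
y = [3.0, 5.0, 7.0, 9.0, 11.0, 5.0, 7.0, 9.0, 11.0, 23.0]

b_all, a_all = fit_parallel_lines(x, y, groups)
b_trim, a_trim = fit_parallel_lines(*drop_points(x, y, groups, {9}))

print("all points:      slope", b_all, "intercepts", a_all)
print("outlier removed: slope", b_trim, "intercepts", a_trim)
```

Here the slope and both intercepts change noticeably when the point is dropped, which is the situation in which the reason for the outlier deserves investigation before it is discarded.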

