Possible alternatives if your data violate normality test assumptions

If the populations from which data for a normality test were sampled violate one or more of the assumptions for the test, the results of the analysis may be incorrect or misleading. If there are factors unaccounted for in the analysis, then the normality test may not give useful results. In such cases, stratification may provide a better analysis.

Although the various normality tests available in Prophet are similar, they are not identical. In some situations, one of the tests may be preferable to the others. There are also other normality tests.

Alternative procedures:


  • Stratification:
  • Stratification involves dividing a sample into subsamples based on one or more characteristics of the population. For example, a sample may be stratified by gender. If the distribution function differs between strata, then the characteristic used for stratification may be an implicit factor, and a separate analysis for each individual subsample may be more informative than an analysis of the entire sample. A potential drawback of stratification is that one or more of the subsamples may be small, leading to problems with the reliability of the test results. Also, the results for each subsample are generalizable only to the corresponding part of the sampled population. (A sample stratified analysis appears at the end of this page.)
  • Choosing a particular normality test:
  • The Kolmogorov-Smirnov test is generally less powerful than the tests specifically designed to test for normality. This is particularly the case when the mean and variance are not specified in advance for the Kolmogorov-Smirnov test, which then becomes conservative. The Shapiro-Wilk and D'Agostino-Pearson omnibus tests are both robust in the sense of retaining good power across a range of nonnormal distributions. D'Agostino's test for skewness and the Anscombe-Glynn test for kurtosis are good at detecting nonnormality caused by asymmetry or by nonnormal tail heaviness, respectively. If a distribution is symmetric but heavy-tailed (positive kurtosis), the test for kurtosis may be more powerful than the Shapiro-Wilk test, especially if the heavy-tailedness is not extreme. If a distribution has normal kurtosis but is skewed, the test for skewness may be more powerful than the Shapiro-Wilk test, especially if the skewness is not extreme. Generally speaking, either the Shapiro-Wilk or the D'Agostino-Pearson test is a powerful overall test for normality; D'Agostino's skewness test is particularly powerful for detecting nonnormality due to asymmetry, and the Anscombe-Glynn test is particularly powerful for detecting nonnormality due to nonnormal kurtosis. (A sample calculation with these tests appears at the end of this page.)
  • Other normality tests:
  • A number of normality tests have been proposed over the years. No single test is uniformly most powerful, so some tests are better than others in particular situations. These tests include the following (a sample Lilliefors calculation appears at the end of this page):
      • The Lilliefors test for normality adjusts the Kolmogorov-Smirnov test specifically for testing for normality when the mean and variance are unknown.
      • D'Agostino's D is a powerful overall test for normality, like the Shapiro-Wilk or D'Agostino-Pearson tests, and may be more powerful in detecting heavy-tailedness. It is not as powerful as the Shapiro-Wilk test at detecting skewness when the population distribution has normal kurtosis.
      • Spiegelhalter's T' is designed to test for normality against other symmetric alternative distributions. Like the Anscombe-Glynn test, it is powerful for detecting nonnormal kurtosis (although not as powerful as the Anscombe-Glynn test), but it has little power to detect nonnormality when the population distribution is skewed.
      • The Martin-Iglewicz I is designed to test for normality against other heavy-tailed alternative distributions. Like the Anscombe-Glynn test, it is powerful for detecting nonnormal kurtosis (although not as powerful as the Anscombe-Glynn test), but it has little power to detect nonnormality when the population distribution is skewed.
      • The chi-square goodness-of-fit test can be used to test whether the population distribution matches a hypothesized distribution, but it is not a very powerful test for normality. Like the Kolmogorov-Smirnov test, it requires that the mean and variance of the hypothesized distribution be specified in advance. Moreover, the test requires that the data be divided into categories. While this may be appropriate with discrete data, which can take on only a small number of values, it is at best an arbitrary process when the values come from a continuous distribution. Since the results of the chi-square test can vary with how the data are divided, this test is not a good alternative when dealing with continuous population distributions.
      • Gupta's test is a nonparametric test for symmetry (as opposed to normality, which includes symmetry). The Wilcoxon one-sample signed rank test is sometimes described as a test for symmetry, but it actually assumes the symmetry of the population distribution.
    D'Agostino and Stephens discuss and compare various normality tests.
  • Testing against other distributions:
  • One alternative to testing against the null hypothesis of normality is to test against the null hypothesis that the population distribution is some other, nonnormal distribution, such as the uniform distribution. The Kolmogorov-Smirnov test is commonly used to test whether the population distribution follows a specified continuous distribution (a sample calculation appears at the end of this page).
  • Transformations:
  • A transformation of the data may produce a data set that more closely approximates a sample from a normal distribution. A transformation (a single function applied to each data value) is applied to correct problems of nonnormality; for example, taking logarithms of the sample values can reduce skewness to the right. Unless scientific theory suggests a specific transformation a priori, transformations are usually chosen from the "power family" of transformations, in which each value x is replaced by x**p, where p is an integer or half-integer, usually one of:
  • -2 (reciprocal square)
  • -1 (reciprocal)
  • -0.5 (reciprocal square root)
  • 0 (log transformation)
  • 0.5 (square root)
  • 1 (leaving the data untransformed)
  • 2 (square)

For p = -0.5 (reciprocal square root), 0, or 0.5 (square root), the data values must all be positive. To use these transformations when there are negative and positive values, a constant can be added to all the data values such that the smallest is greater than 0 (say, such that the smallest value is 1). (If all the data values are negative, the data can instead be multiplied by -1, but note that in this situation, data suggesting skewness to the right would now become data suggesting skewness to the left.) To preserve the order of the original data in the transformed data, if the value of p is negative, the transformed data are multiplied by -1.0; e.g., for p = -1, the data are transformed as x --> -1.0/x. Taking logs or square roots tends to "pull in" values greater than 1 relative to values less than 1, which is useful in correcting skewness to the right. Transformation involves changing the metric in which the data are analyzed, which may make interpretation of the results difficult if the transformation is complicated. If you are unfamiliar with transformations, you may wish to consult a statistician before proceeding.
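As a rough illustration of the power family, here is a minimal Python sketch (not part of Prophet; the helper name power_transform, the data values, and the choice of exponent are hypothetical, and the shift to positive values is applied whenever any value is nonpositive):

    import numpy as np

    def power_transform(x, p):
        """Apply x**p (natural log when p == 0), negating when p < 0 to preserve order."""
        x = np.asarray(x, dtype=float)
        if x.min() <= 0:
            # Shift the data so that the smallest value becomes 1 before transforming.
            x = x - x.min() + 1.0
        if p == 0:
            return np.log(x)
        y = x ** p
        return -y if p < 0 else y

    data = np.array([0.5, 1.2, 3.8, 9.5, 27.0])   # values skewed to the right
    print(power_transform(data, 0))     # log transformation
    print(power_transform(data, -1))    # reciprocal, order preserved by the sign change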
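For the stratified analysis described under "Stratification" above, one possible approach (sketched here with pandas and SciPy, which are not part of Prophet; the gender variable and the data are hypothetical) is to run a separate normality test within each stratum:

    import numpy as np
    import pandas as pd
    from scipy import stats

    rng = np.random.default_rng(3)
    df = pd.DataFrame({
        "gender": np.repeat(["F", "M"], 30),
        "response": np.concatenate([rng.normal(10.0, 2.0, 30),      # roughly normal stratum
                                    rng.lognormal(2.0, 0.5, 30)]),  # right-skewed stratum
    })

    # Test each stratum separately rather than the pooled sample.
    for gender, group in df.groupby("gender"):
        w, p = stats.shapiro(group["response"])
        print(f"{gender}: n = {len(group)}, Shapiro-Wilk W = {w:.3f}, p = {p:.4f}")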
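For the tests discussed under "Choosing a particular normality test" above, a minimal sketch using SciPy (not part of Prophet; the sample data are hypothetical) is shown below. SciPy's normaltest is the D'Agostino-Pearson omnibus test, skewtest is D'Agostino's skewness test, and kurtosistest follows Anscombe and Glynn:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    x = rng.lognormal(size=50)          # a right-skewed (nonnormal) sample

    # Overall tests for normality.
    w, p = stats.shapiro(x)
    print(f"Shapiro-Wilk:       W  = {w:.3f}, p = {p:.4f}")
    k2, p = stats.normaltest(x)
    print(f"D'Agostino-Pearson: K2 = {k2:.3f}, p = {p:.4f}")

    # Directional tests: skewness (D'Agostino) and kurtosis (Anscombe-Glynn).
    z, p = stats.skewtest(x)
    print(f"Skewness test:      z  = {z:.3f}, p = {p:.4f}")
    z, p = stats.kurtosistest(x)
    print(f"Kurtosis test:      z  = {z:.3f}, p = {p:.4f}")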
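For the Lilliefors adjustment mentioned under "Other normality tests" above, one implementation outside Prophet is in statsmodels; a minimal sketch (the sample data are hypothetical) is:

    import numpy as np
    from scipy import stats
    from statsmodels.stats.diagnostic import lilliefors

    rng = np.random.default_rng(1)
    x = rng.normal(loc=10.0, scale=2.0, size=40)

    # Lilliefors: the mean and variance need not be specified in advance.
    d_stat, p_value = lilliefors(x, dist="norm")
    print(f"Lilliefors: D = {d_stat:.3f}, p = {p_value:.4f}")

    # The plain Kolmogorov-Smirnov test requires a fully specified null distribution.
    print(stats.kstest(x, "norm", args=(10.0, 2.0)))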
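For "Testing against other distributions" above, the Kolmogorov-Smirnov test can be run against, for example, a uniform distribution on a specified interval; a minimal SciPy sketch (the data and the interval are hypothetical) is:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    x = rng.uniform(low=0.0, high=5.0, size=60)

    # SciPy parameterizes the uniform distribution by loc (lower endpoint) and
    # scale (width), so a uniform distribution on [0, 5] is args=(0.0, 5.0).
    result = stats.kstest(x, "uniform", args=(0.0, 5.0))
    print(f"KS vs. uniform(0, 5): D = {result.statistic:.3f}, p = {result.pvalue:.4f}")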

