If the data to be analyzed by a one-sample t test come from a population whose distribution violates the assumption of normality, or if outliers are present, then the t test on the original data may provide misleading results, or may not be the most powerful test available. In such cases, transforming the data or using a nonparametric test may provide a better analysis.
- Transformations (a single function applied to each data value) are applied to correct problems of nonnormality. For example, taking logarithms of sample values can reduce skewness to the right. Unless scientific theory suggests a specific transformation a priori, transformations are usually chosen from the "power family," in which each value x is replaced by x**p, where p is an integer or half-integer, usually one of:
- -2 (reciprocal square)
- -1 (reciprocal)
- -0.5 (reciprocal square root)
- 0 (log transformation)
- 0.5 (square root)
- 1 (leaving the data untransformed)
- 2 (square)
For p = -0.5 (reciprocal square root), 0 (log), or 0.5 (square root), the data values must all be positive. To use these transformations when both negative and positive values are present, a constant can be added to all the data values so that the smallest is greater than 0 (say, so that the smallest value is 1). (If all the data values are negative, the data can instead be multiplied by -1, but note that skewness to the right in the original data then becomes skewness to the left.) To preserve the order of the original data in the transformed data, the transformed values are multiplied by -1.0 when p is negative; e.g., for p = -1, the data are transformed as x --> -1.0/x. Taking logs or square roots tends to "pull in" values greater than 1 relative to values less than 1, which is useful in correcting skewness to the right. Transformation changes the metric in which the data are analyzed, which may make the results difficult to interpret if the transformation is complicated. If you are unfamiliar with transformations, you may wish to consult a statistician before proceeding.
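The rules above (log for p = 0, shifting so the minimum is 1, and negating for negative p to preserve order) can be sketched in Python as follows. This is a minimal illustration, not code from any particular statistics package; the function name and sample values are ours.

```python
import numpy as np

def power_transform(x, p):
    """Power-family transformation x -> x**p, sign-preserving for negative p.

    Values must all be positive for p <= 0.5; shift the data first if not.
    """
    x = np.asarray(x, dtype=float)
    if p == 0:
        return np.log(x)           # p = 0 is defined as the log transform
    y = x ** p
    return -y if p < 0 else y      # negate for negative p to preserve order

# A right-skewed sample with all values positive
data = np.array([1.2, 1.5, 2.0, 2.4, 3.1, 9.8])

# If a sample contained zero or negative values, shift it first so the
# smallest value becomes 1:
shifted = data - data.min() + 1.0
```

For example, `power_transform(data, -1)` returns -1.0/x for each value, so larger original values still map to larger transformed values.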
- Nonparametric tests:
- Nonparametric tests are tests that do not make the usual distributional assumptions of the normal-theory-based tests. For the one-sample t test, the most common nonparametric alternatives are the Wilcoxon one-sample signed rank test and the one-sample sign test. Although the signed rank test does not assume that the sampled population is normal, it does assume that its distribution is symmetric; thus the signed rank test will not address the problem of skewness. As with the one-sample t test, both tests assume that the values in the sample are independent of each other. The one-sample sign test can be calculated even when only the direction (+ or -) of each difference from the hypothesized value is known. It can therefore be applied in situations where the signed rank test, which requires at least the relative ranks and directions (signs) of the differences between each data value and the hypothesized value, cannot be used. Unlike the signed rank test, the sign test does not assume symmetry of the population distribution, but it is likely to be less powerful than the signed rank test when that distribution is in fact symmetric. If the distribution is extremely heavy-tailed, the sign test may be more powerful than either the signed rank test or the one-sample t test. If the data do indeed come from a population with a normal distribution, then the t test is the most powerful test of the equality of the population mean and the hypothesized value, meaning that no other test is more likely to detect an actual departure. (If a distribution is symmetric, its mean and median are both equal to the center of symmetry. Since the normal distribution is symmetric, the t test can also be viewed as testing whether the population median differs from the hypothesized value, provided the normality assumption holds.) If the population distribution is not normal, however, the signed rank test may be more powerful at detecting differences between the population median and the hypothesized value. Because the signed rank test is nearly as powerful as the one-sample t test in the case of data from a normal distribution, and may be substantially more powerful in the case of nonnormality, the one-sample signed rank test is well suited to analyzing data when outliers are suspected, even if the underlying distribution is close to normal.
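The three tests discussed above can be run side by side in Python with scipy. This is a sketch under the assumption that scipy is available; the sample values and the hypothesized value mu0 are invented for illustration. The sign test has no dedicated scipy function, so it is computed here directly as a binomial test on the count of positive differences.

```python
import numpy as np
from scipy import stats

# Hypothetical right-skewed sample; test H0: center = 5.0
data = np.array([3.1, 4.2, 4.8, 5.2, 5.6, 6.3, 7.1, 14.9])
mu0 = 5.0

# One-sample t test (assumes normality)
t_stat, t_p = stats.ttest_1samp(data, mu0)

# Wilcoxon one-sample signed rank test (assumes symmetry, not normality);
# it is applied to the differences from the hypothesized value
w_stat, w_p = stats.wilcoxon(data - mu0)

# Sign test: count values above mu0 and compare against Binomial(n, 0.5),
# discarding any values exactly equal to the hypothesized value
diffs = data - mu0
nonzero = diffs[diffs != 0]
n_pos = int(np.sum(nonzero > 0))
sign_p = stats.binomtest(n_pos, n=len(nonzero), p=0.5).pvalue
```

Note that the sign test uses only the directions of the differences, the signed rank test uses their directions and ranks, and the t test uses the full numeric values; this ordering mirrors the power comparisons described above.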