Examining t test results to detect assumption violations



All the following results are provided as part of a two-sample (unpaired) t test analysis.

Results for sample values:

Results for residuals:


  • Normality tests:
  • If the assumptions for the t test hold, the values from each sample should come from a normal distribution. Departures from normality can suggest the presence of outliers in the data, or of a nonnormal distribution in one or more of the samples. The normality test will give an indication of whether the populations from which the samples were drawn appear to be normally distributed, but will not indicate the cause(s) of the nonnormality. The smaller the sample size, the less likely the normality test is to detect nonnormality.
  • Histograms:
  • The histogram for each sample is overlaid with a reference normal curve that has the same mean and variance as the sample. This provides a reference for detecting gross nonnormality when the sample sizes are large.
  • Boxplots:
  • Suspected outliers appear in a boxplot as individual points o or x outside the box. If these appear on both sides of the box, they also suggest the possibility of a heavy-tailed distribution. If they appear on only one side, they also suggest the possibility of a skewed distribution. Skewness is also suggested if the mean (+) does not lie on or near the central line of the boxplot, or if the central line of the boxplot does not evenly divide the box. Examples of these plots will help illustrate the various situations.
  • Normal probability plot:
  • For values sampled from a normal distribution, the normal probability plot (normal Q-Q plot) has all the points lying on or near the straight line drawn through the middle half of the points. Scattered points lying away from the line are suspected outliers. Examples of these plots will help illustrate the various situations.
  • Normality test for residuals:
  • If the assumptions for the t test hold, all the residuals (from both samples) should come from the same normal distribution with mean 0. Departures from normality can suggest the presence of outliers in the data, or of a nonnormal distribution in one or more of the populations from which the samples were drawn. The normality test will give an indication of whether the populations from which the samples were drawn appear to be normally distributed, but will not indicate the cause(s) of the nonnormality. The smaller the sample size, the less likely the normality test is to detect nonnormality.
  • Histogram for residuals:
  • The histogram for residuals is overlaid with a reference normal curve that has the same mean and variance as the residuals. This provides a reference for detecting gross nonnormality when the sample sizes are large.
  • Boxplot for residuals:
  • Suspected outliers appear in a boxplot as individual points o or x outside the box. If these appear on both sides of the box, they also suggest the possibility of a heavy-tailed distribution. If they appear on only one side, they also suggest the possibility of a skewed distribution. Skewness is also suggested if the mean (+) does not lie on or near the central line of the boxplot, or if the central line of the boxplot does not evenly divide the box. Examples of these plots will help illustrate the various situations.
  • Normal probability plot for residuals:
  • For data sampled from a normal distribution, the normal probability plot (normal Q-Q plot) has all the points lying on or near the straight line drawn through the middle half of the points. Scattered points lying away from the line are suspected outliers. Examples of these plots will help illustrate the various situations.
  • Residuals plotted against fitted values:
  • If the fitted model under the assumption of two populations with equal variance is correct, the plot of residuals against fitted values should suggest a horizontal band across the graph. Because there are only two unique fitted values, the mean of each of the two samples, the graph of residuals against fitted values will consist of two vertical "stacks" of residuals; the stacks should be about the same length and at about the same level. Outliers may appear as anomalous points in the graph (although an outlier may not turn up in the residuals plot by virtue of affecting the mean so that its fitted value lies near it).
  • A fan pattern like the profile of a megaphone, with a noticeable flare either to the right or to the left as shown in the picture (one of the "stacks" of residuals is much longer than the other), suggests that the variance of the values increases in the direction in which the fan pattern widens (often to the right), and this in turn suggests that a transformation may be needed. Other systematic patterns in the residuals (like a linear trend) suggest either that there is another factor that should be considered in analyzing the data, or that a transformation is needed.
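
As a rough illustration of the residuals-against-fitted-values check described above, the sketch below computes the two "stacks" by hand using only the Python standard library. The function names and the data are hypothetical, and comparing the stacks by range and standard deviation is just one informal stand-in for judging their lengths by eye on the plot.

```python
from statistics import mean, stdev


def residual_stacks(sample_a, sample_b):
    """Return (fitted value, residuals) for each of the two samples.

    In a two-sample t test the fitted value of every observation is the
    mean of its own sample, so each sample forms one vertical "stack".
    """
    stacks = []
    for sample in (sample_a, sample_b):
        fitted = mean(sample)
        residuals = [x - fitted for x in sample]
        stacks.append((fitted, residuals))
    return stacks


def stack_summary(stacks):
    """Summarise each stack by its length (range) and standard deviation."""
    summary = []
    for fitted, residuals in stacks:
        summary.append({
            "fitted": fitted,
            "length": max(residuals) - min(residuals),
            "sd": stdev(residuals),
        })
    return summary


# Hypothetical data: the second sample is far more spread out than the first.
sample_a = [9.8, 10.1, 10.0, 9.9, 10.2]
sample_b = [14.0, 20.5, 9.5, 25.0, 16.0]

stacks = residual_stacks(sample_a, sample_b)
summary = stack_summary(stacks)
for s in summary:
    print(f"fitted={s['fitted']:.2f}  length={s['length']:.2f}  sd={s['sd']:.2f}")

# A ratio far from 1 corresponds to one stack being much longer than the other.
sd_ratio = max(s["sd"] for s in summary) / min(s["sd"] for s in summary)
print(f"ratio of larger to smaller sd: {sd_ratio:.1f}")
```

With data like these, the stack at the larger fitted value is much longer than the other, which is the fan pattern that hints at unequal variances. How large a ratio should cause concern is a judgment call that this sketch does not settle.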

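The boxplot reading described earlier (outliers on one or both sides, mean compared with the central line) can also be sketched numerically. The code below is a minimal example using only the Python standard library. It assumes that suspected outliers are points more than 1.5 times the interquartile range beyond the box, a common convention that the text above does not specify, and the function names and data are hypothetical.

```python
from statistics import mean, quantiles


def boxplot_checks(values, whisker=1.5):
    """Collect the boxplot features used to judge outliers and skewness.

    Assumption: points farther than `whisker` * IQR from the box are
    treated as suspected outliers (a common convention, not one taken
    from the text).
    """
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "mean": mean(values),
        "low_outliers": [x for x in values if x < low_fence],
        "high_outliers": [x for x in values if x > high_fence],
    }


def describe(checks):
    """Translate the outlier pattern into the reading given in the text."""
    low, high = checks["low_outliers"], checks["high_outliers"]
    if low and high:
        return "outliers on both sides: possible heavy tails"
    if low or high:
        return "outliers on one side: possible skewness"
    return "no suspected outliers"


# Hypothetical sample with a single large value.
sample = [2.1, 2.4, 2.5, 2.7, 2.8, 3.0, 3.1, 3.3, 9.6]
checks = boxplot_checks(sample)

print(describe(checks))
print(f"mean={checks['mean']:.2f}  median={checks['median']:.2f}")
print(f"box from {checks['q1']:.2f} to {checks['q3']:.2f}")
```

A mean that sits well away from the median, or a median that is far from the middle of the box, points toward skewness in the same way as the plotted symbols do.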