If the populations from which the data for a contingency table analysis were sampled violate one or more of the test's assumptions, the results of the analysis may be incorrect or misleading.
For example, if the assumption of independence is violated, then neither the chi-square test nor Fisher's exact test is appropriate, although another test (such as Cochran's Q test or McNemar's test) may be appropriate.
If the total sample size or particular row or column totals are too small, then the expected values may be too small for the approximation involved in the chi-square test to be valid.
If it is not possible to cleanly assign each observation to exactly one cell of the contingency table, or if an ad hoc scheme is used to divide a continuous variable into discrete categories, then the results of the chi-square or Fisher's exact test may vary greatly depending on the exact apportionment of observations into cells of the contingency table.
If one or both of the classification variables are ordered rather than nominal, especially if one or both are actually continuous rather than discrete, then a chi-square or Fisher's exact test may not be the most powerful test available, and this could mean the difference between detecting a true difference and missing it.
Often, the effect of an assumption violation on the test result depends on the extent of the violation.
Potential assumption violations include:
- Interaction: interaction(s) between row and column classifications
- Outliers: anomalous observations
- Lack of independence: observations that are not independent of each other
- Structural zeroes: contingency table cells that must be empty
- Special problems with small expected cell counts for the chi-square test
- Special problems with small observed cell counts for the chi-square test
- Special problems with continuous classification variables
- Fisher's exact test when the marginal totals are not fixed
- Special considerations for unbalanced marginal totals for the chi-square test
- Interaction(s) between row and column classifications:
- Lack of independence within a sample may be caused by interactions between categories of the row and column variables. For example, the probability of death from disease A may be higher in patient category 1 than in patient category 2 because patients in category 1 tend to suffer a worse form of the disease. Such interactions may indicate the existence of an implicit factor in the data. For example, the probability of death from disease A may be higher in patient category 1 than in patient category 2 because patients in category 1 tend to be older than those in category 2. In this case, age would be an implicit factor. Outliers in the table may be due to interactions.
- Lack of independence:
- Whether the observations are independent of each other is generally determined by the structure of the experiment from which they arise. Obviously correlated samples, such as a set of pre- and post-test observations on the same subjects, are not independent, and such data would be more appropriately analyzed with a test such as Cochran's Q test or McNemar's test. If you are unsure whether your samples are independent, you may wish to consult a statistician or someone who is knowledgeable about the data collection scheme you are using.
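For paired yes/no data of the kind described above, the exact form of McNemar's test can be sketched in a few lines. This is a minimal illustration, not a full implementation: the function name `mcnemar_exact` and the counts are hypothetical, and only the two discordant-pair counts enter the calculation.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant-pair counts.

    b: pairs that changed from "yes" to "no"
    c: pairs that changed from "no" to "yes"
    Under the null hypothesis each discordant pair is equally likely to go
    either way, so min(b, c) follows a Binomial(b + c, 0.5) distribution.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of change
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical pre/post results for the same 40 subjects (illustrative only):
# 10 changed one way, 2 changed the other way, 28 did not change.
p_value = mcnemar_exact(10, 2)
print(f"exact McNemar p = {p_value:.4f}")
```

The subjects whose answers did not change carry no information about the direction of change, which is why they do not appear in the calculation.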
- Outliers:
- The model of independence may fit poorly due to the presence of outliers. Outliers are anomalous values in the data. They may be due to recording errors, which may be correctable, or they may be due to the sample not being entirely from the same population, or they may be due to interactions between row and column classifications. If you find outliers in your data that are not due to correctable errors, you may wish to consult a statistician as to how to proceed.
- Structural zeroes:
- As long as the probability of falling into row category i and the probability of falling into column category j are both non-zero, the expected probability of falling into cell (i,j) is also non-zero under the usual two-way contingency table model of independence. If the total sample size is small, or if there are many cells in the table, then it may happen that no observations are recorded for a particular cell. These zero values in a table are sampling zeroes. However, the actual process that creates the observations may produce cells in the contingency table in which observations can never occur. The zero values that must occur in these cells are structural zeroes. A contingency table of cancer incidence by sex and type of cancer must have the value 0 in the cell for males and ovarian cancer, but the expected number of males with ovarian cancer will not be 0 as long as there is at least one male and one ovarian cancer patient among the observations. A contingency table containing one or more structural zeroes is an incomplete table. The chi-square test and Fisher's exact test are not designed for contingency tables with structural zeroes. If you find structural zeroes in your data, you may wish to consult a statistician as to how to proceed, perhaps using an alternative test.
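The mismatch between a structural zero and its expected count can be seen with a small calculation. The sketch below uses made-up counts and a hypothetical helper `expected_counts`; it only applies the usual independence formula (row total times column total, divided by the grand total).

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Hypothetical counts. Rows: male, female. Columns: lung cancer, ovarian cancer.
# The male/ovarian cell is a structural zero: it can never be anything but 0.
observed = [[30, 0],
            [20, 10]]

expected = expected_counts(observed)
print("expected males with ovarian cancer:", expected[0][1])
```

The independence model assigns a positive expected count to a cell that cannot contain observations, which is why the standard tests are not suitable for such tables.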
- Special problems with small expected cell counts for the chi-square test:
- The chi-square test involves using the chi-square distribution to approximate the underlying exact distribution. Although the chi-square approximation can be used in all three sampling schemes, the approximation becomes less accurate when marginal totals are fixed. The best approximation will most likely be in the multinomial sampling scheme. The approximation becomes better as the expected cell frequencies grow larger, and may be inappropriate for contingency tables with very small expected cell frequencies. In the case of a 2x2 contingency table, an adjusted value of the chi-square statistic (the Yates corrected chi-square) is often used to correct for a continuous distribution (chi-square) being used to approximate the very discrete distribution of the values in the 2x2 table.
- For tables with expected cell frequencies less than 5, the chi-square approximation may not be reliable. A standard (and conservative) rule of thumb (due to Cochran) is to avoid using the chi-square test for contingency tables in which any expected cell frequency is less than 1, or in which more than 20% of the cells have expected cell frequencies less than 5.
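As an illustration of the two points above, the following sketch checks Cochran's rule of thumb and compares the uncorrected and Yates corrected statistics for a 2x2 table. The function names and the example counts are hypothetical. The Yates adjustment is capped so that it never moves an observed count past its expected value, which is one common convention and is an assumption here.

```python
from math import erfc, sqrt

def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def passes_cochran_rule(table):
    """True if no expected count is below 1 and at most 20% are below 5."""
    cells = [e for row in expected_counts(table) for e in row]
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    return min(cells) >= 1 and share_below_5 <= 0.20

def chi_square_2x2(table, yates=False):
    """Chi-square statistic and p-value (1 degree of freedom) for a 2x2 table.

    Assumes every row and column total is positive; otherwise an expected
    count is 0 and the statistic cannot be computed.
    """
    stat = 0.0
    for obs_row, exp_row in zip(table, expected_counts(table)):
        for o, e in zip(obs_row, exp_row):
            diff = abs(o - e)
            if yates:
                diff = max(0.0, diff - 0.5)  # continuity correction, capped at 0
            stat += diff ** 2 / e
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    return stat, erfc(sqrt(stat / 2))

table = [[12, 5],
         [3, 10]]  # hypothetical counts
print("Cochran's rule satisfied:", passes_cochran_rule(table))
print("uncorrected:", chi_square_2x2(table))
print("Yates corrected:", chi_square_2x2(table, yates=True))
```

The corrected statistic is never larger than the uncorrected one, so the corrected P value is never smaller.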
- Special problems with small observed cell counts for the chi-square test:
- When no observations appear in a particular row category (row total is 0) or a particular column category (column total is 0), the chi-square statistic cannot be calculated. To proceed, the category must be either eliminated completely or combined with another category.
- When rows or columns are combined (collapsed together) to fix problems of small expected cell frequencies or zero-sum categories, care should be taken to do the collapsing such that the new hypothesis being tested is still of interest. If the null hypothesis of independence of row and column variables is true for all categories of each variable, then combining categories will preserve that property. However, collapsing can destroy evidence of non-independence, so a failure to reject the null hypothesis for the collapsed table does not rule out the possibility of non-independence in the original table.
- As with most statistical tests, the power of the chi-square test increases with a larger number of observations. If there are too few observations, it may be impossible to reject the null hypothesis even if it is false.
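The two remedies for empty or sparse categories, eliminating a category and combining it with another, might be sketched as follows. Both helper names and the counts are hypothetical, and which categories to merge remains a subject-matter decision that code cannot make.

```python
def drop_empty_categories(table):
    """Remove rows and columns whose totals are 0."""
    rows = [row for row in table if sum(row) > 0]
    keep = [j for j, col in enumerate(zip(*rows)) if sum(col) > 0]
    return [[row[j] for j in keep] for row in rows]

def merge_columns(table, j, k):
    """Return a new table in which column k is added into column j."""
    merged = []
    for row in table:
        new_row = list(row)
        new_row[j] += new_row[k]
        del new_row[k]
        merged.append(new_row)
    return merged

# Hypothetical 3x3 table with an empty row and an empty column.
sparse = [[5, 0, 3],
          [0, 0, 0],
          [2, 0, 7]]
print(drop_empty_categories(sparse))

# Hypothetical 2x3 table in which the last two columns are combined.
print(merge_columns([[5, 1, 3], [2, 0, 7]], 1, 2))
```

The hypothesis tested on the collapsed table refers to the merged categories, so a non-significant result there says nothing definite about the original, finer classification.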
- Special problems with continuous classification variables:
- The chi-square test and Fisher's exact test are designed for observations cross-classified by two sets of nominal categories. If either the row or column variable is actually continuous, then the variable must be divided into intervals to construct the contingency table. The interval boundaries should be decided beforehand on the basis of theory or custom. If the intervals are determined by the particular data being analyzed, then the test statistic and corresponding P value may not be generalizable.
- The chi-square test ignores any possible ordering of either the row or column variables. If either or both of the row and column variables are ordinal or continuous, then an alternative test to the chi-square or Fisher's exact test may be preferable, especially if one of the variables is an outcome variable and the other an explanatory variable. If the explanatory variable is nominal and the outcome variable is continuous, an analysis of variance (ANOVA) is an alternative test. If the explanatory variable is continuous and the outcome variable is nominal, then logistic regression is an alternative test. If both the explanatory and outcome variables are continuous, then simple linear regression is an alternative test.
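The sensitivity of the test statistic to the choice of interval boundaries can be illustrated with a small sketch. The age bands and counts below are invented for illustration, and the helper names are hypothetical. The same data are split into "younger" and "older" at three different cut points, and the Pearson chi-square statistic is computed for each resulting 2x2 table.

```python
def pearson_statistic(table):
    """Pearson chi-square statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for row, r in zip(table, row_totals):
        for o, c in zip(row, col_totals):
            e = r * c / n  # expected count under independence
            stat += (o - e) ** 2 / e
    return stat

# Hypothetical (no event, event) counts in four age bands, youngest first.
bands = [("20-29", (18, 2)), ("30-39", (15, 5)),
         ("40-49", (10, 10)), ("50-59", (6, 14))]

def table_for_cut(bands, n_low):
    """2x2 table pooling the first n_low bands against the remaining bands."""
    counts = [c for _, c in bands]
    low = [sum(col) for col in zip(*counts[:n_low])]
    high = [sum(col) for col in zip(*counts[n_low:])]
    return [low, high]

for n_low in (1, 2, 3):
    cut_age = bands[n_low][0].split("-")[0]
    stat = pearson_statistic(table_for_cut(bands, n_low))
    print(f"cut at age {cut_age}: chi-square = {stat:.2f}")
```

Each cut point gives a different statistic from identical observations, which is why choosing the boundary after looking at the data undermines the resulting P value.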
- Fisher's exact test when the marginal totals are not fixed:
- Fisher's exact test assumes that all the row marginal totals and all the column marginal totals are fixed. However, work by Tocher shows that the test can be extended to the case where only one set of marginal totals is fixed.
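To make the role of the fixed margins concrete: once all row and column totals of a 2x2 table are held fixed, the count in the first cell determines the whole table and follows a hypergeometric distribution. The sketch below computes a two-sided P value on that basis, summing the probabilities of all tables no more probable than the observed one. That is one common two-sided convention and is an assumption here; the function name and counts are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    # Range of values the first cell can take with all margins held fixed.
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Unnormalized hypergeometric weights; kept as integers so that the
    # comparison with the observed table is exact.
    weight = {x: comb(r1, x) * comb(r2, c1 - x) for x in range(lo, hi + 1)}
    total = sum(weight.values())
    extreme = sum(w for w in weight.values() if w <= weight[a])
    return extreme / total

# Hypothetical counts with both margins equal to (4, 4).
print(f"p = {fisher_exact_two_sided([[3, 1], [1, 3]]):.4f}")
```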
- Special considerations for unbalanced marginal totals for the chi-square test:
- If a set of marginal totals is fixed, then the more nearly those marginal totals equal each other, the more powerful the chi-square test will be for the same total sample size. For this reason, a study using a sampling scheme in which the row or column marginal totals are fixed and can be set equal (such as a retrospective or prospective study) will tend to be more powerful than a cross-sectional study with the same total sample size.
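The effect of balance on power can be approximated numerically. The sketch below assumes a 2x2 table with fixed row totals and uses the large-sample noncentral chi-square approximation for the test statistic under the alternative; the function name, the outcome probabilities, and the sample size are all hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(n_total, w1, p1, p2, alpha=0.05):
    """Approximate power of the 1-df chi-square test for a 2x2 table.

    n_total: total sample size
    w1: fraction of the sample in row 1 (row totals are fixed)
    p1, p2: true outcome probabilities in rows 1 and 2
    """
    w2 = 1 - w1
    p_bar = w1 * p1 + w2 * p2
    # Noncentrality parameter of the limiting chi-square distribution.
    lam = n_total * w1 * w2 * (p1 - p2) ** 2 / (p_bar * (1 - p_bar))
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)  # its square is the chi-square cutoff
    delta = sqrt(lam)
    # A noncentral chi-square with 1 df is (Z + delta)^2 for standard normal Z.
    return z.cdf(delta - z_crit) + z.cdf(-delta - z_crit)

balanced = approx_power(200, 0.5, 0.3, 0.5)    # row totals 100 and 100
unbalanced = approx_power(200, 0.9, 0.3, 0.5)  # row totals 180 and 20
print(f"balanced: {balanced:.2f}, unbalanced: {unbalanced:.2f}")
```

With the same total sample size and the same true difference, the balanced design has noticeably higher approximate power in this example.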