# Examining Kaplan-Meier results to detect assumption violations

A typical Kaplan-Meier step function plot will have short horizontal runs (the steps) at the beginning, when there are many subjects and relatively short times between deaths, and longer steps in later stages of the experiment, when there are fewer subjects and the wait between noncensored survival times becomes longer.

If there are few or no tied or censored values, the vertical drops (the risers) of the steps will all be about the same height. If a vertical drop is particularly large, there may be tied values or many censored values in that interval. The individual observations can then be examined for signs of lack of independence or lack of uniformity in the censoring. When examining Kaplan-Meier results, you should keep these potential problems in mind, along with the possibility of implicit factors not surfaced in the data. The problems detectable from the Kaplan-Meier results themselves are often related to a lack of data.
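To make the link between ties, censoring, and riser height concrete, here is a minimal sketch of the product-limit calculation in plain Python. The function name `kaplan_meier` and the example data are illustrative assumptions, not part of any particular package. At each distinct death time the survival estimate is multiplied by one minus the ratio of deaths to subjects still at risk, so a tie (more deaths) or earlier censoring (fewer at risk) both change the size of the drop.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate (illustrative sketch).

    times  : observed time for each subject
    events : 1 if the time is a death, 0 if it is censored
    Returns one (time, at_risk, deaths, survival, drop) tuple
    per distinct death time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    rows = []
    survival = 1.0
    for t in sorted(deaths):
        # Subjects still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        d = deaths[t]
        new_survival = survival * (1 - d / at_risk)
        rows.append((t, at_risk, d, new_survival, survival - new_survival))
        survival = new_survival
    return rows


# Made-up data: a tie at time 2, censored observations at times 3 and 5.
example_times = [1, 2, 2, 3, 4, 5]
example_events = [1, 1, 1, 0, 1, 0]

for row in kaplan_meier(example_times, example_events):
    print("t=%g  at risk=%d  deaths=%d  S(t)=%.3f  drop=%.3f" % row)
```

In this made-up data set the drop at the tied time is twice the size of the first drop, and the final drop is also larger than the first because only two subjects remain at risk. The last observation is censored, so the estimate never reaches 0.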

#### Examining results for a Kaplan-Meier calculation:

• Lack of independence of censoring:
• You should be alert to the possibility of systematic patterns in the censoring. For example, if there are many values censored earlier in the experiment rather than later, there may have been a change of conditions during the experiment. (For example, one physician may have withdrawn referred patients early on while other doctors did not.) If there was a relatively large number of censored values in a short span of time, then the censorings may be related. (For example, a physician transfers to another hospital, and all referred patients suddenly leave the study.) A common problem with a survival analysis experiment studying medical treatments is that patients who do not do well on one or more of the treatments must be withdrawn from the study, so that sicker patients may be more likely to have censored survival times.
• Many censored values:
• If there are many censored values, the Kaplan-Meier table estimates become less reliable, and the estimated variances may be considerably smaller than the actual variances. If many subjects are censored at approximately the same time, the possibility of a common cause should be considered. A common cause would violate the assumption of independence of censoring and survival times. If many subjects are left alive at the end of the study, the study may simply not have continued long enough to give reliable estimates. If the last observation is censored, the Kaplan-Meier estimate of survival cannot reach 0.
• Many tied values:
• Kaplan-Meier's product-limit estimator for survival assumes that the intervals between deaths are small enough that it is unlikely that there will be tied survival values. If there are many such tied values, then the survival estimates may be less reliable. Also, tied survival values may point to the presence of implicit factors in the data.
• Small sample sizes:
• Small sample sizes tend to lead to wide intervals (the times between successive noncensored survival times), raising the question of whether the assumption of a constant survival probability within each interval is appropriate. High censoring rates also reduce the effective sample size for subsequent intervals. If the final interval(s) of a study contain only a few subjects, the Kaplan-Meier estimates for those intervals are not reliable, and should not be given much weight.
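The checks in the list above can be roughed out in a few lines of Python. The following is only a sketch: the function name `km_diagnostics`, the returned keys, and the `min_at_risk` threshold of 5 are illustrative assumptions rather than established conventions, and the counts it reports are prompts for inspecting the individual observations, not formal tests.

```python
from collections import Counter


def km_diagnostics(times, events, min_at_risk=5):
    """Summarize features of survival data that may signal problems.

    times       : observed time for each subject
    events      : 1 if the time is a death, 0 if it is censored
    min_at_risk : hypothetical threshold below which a death time
                  is reported as resting on few subjects
    """
    n = len(times)
    death_counts = Counter(t for t, e in zip(times, events) if e)
    censor_counts = Counter(t for t, e in zip(times, events) if not e)
    last_time = max(times)
    return {
        # Share of subjects whose survival time is censored.
        "censored_fraction": sum(censor_counts.values()) / n,
        # Death times shared by more than one subject.
        "tied_death_times": sorted(t for t, c in death_counts.items() if c > 1),
        # Largest number of censorings at one time (possible common cause).
        "max_censorings_at_one_time": max(censor_counts.values(), default=0),
        # If True, the survival estimate cannot reach 0.
        "last_observation_censored": censor_counts[last_time] > 0,
        # Death times where the estimate rests on few subjects.
        "death_times_with_few_at_risk": sorted(
            t for t in death_counts
            if sum(1 for u in times if u >= t) < min_at_risk
        ),
    }


# Made-up data: a tie at time 2 and three censorings at time 3.
diag_times = [1, 2, 2, 3, 3, 3, 4, 5]
diag_events = [1, 1, 1, 0, 0, 0, 1, 0]
report = km_diagnostics(diag_times, diag_events)
for key, value in report.items():
    print(key, "=", value)
```

Only censorings at exactly the same time are grouped here. Detecting censorings that cluster within a short span of time would need a window width chosen for the study at hand.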