Possible alternatives if your data violate Kaplan-Meier assumptions



If the populations from which the data for a Kaplan-Meier estimation were sampled violate one or more of the Kaplan-Meier assumptions, the results of the analysis may be incorrect or misleading. For example, if censoring is not independent of survival time, then the survival estimates may be biased and unreliable. If factors that affect survival and/or censoring times are left out of the analysis, then the Kaplan-Meier method may not give useful estimates of survival. In such cases, stratifying the data or using a parametric method may provide a better analysis.

The best cures for some problems--running an experiment longer or doing more aggressive follow-up to avoid a large proportion of censored values, or using a large enough sample size to lessen the problems of lengthy time intervals between successive noncensored survival times--are outside the scope of statistical analysis per se.

Alternative procedures:


  • Stratification:
  • Stratification divides a sample into subsamples based on one or more characteristics of the population. For example, a sample may be stratified by gender. Each of the resulting subsamples can then be analyzed separately. If the survival function differs among the strata, then the characteristic used for stratification may be an implicit factor, and separate analyses of the subsamples may be more informative than a single analysis of the entire sample. Stratification may also reveal correlations between censoring and strata. A potential drawback of stratification is that one or more of the subsamples may be small, which makes the estimates less reliable. Also, the results for each subsample generalize only to the corresponding part of the sampled population.
  • Parametric methods:
  • If a specific survival distribution can be assumed based on previous knowledge, then that assumption can be used to make survival estimates. A specific functional (parametric) form for the survival distribution function, such as the Weibull distribution or the exponential distribution, can be fitted to the individual data if that distribution makes sense a priori. The Cox proportional hazards model is a related alternative, but it is semiparametric rather than parametric: it specifies how covariates affect the hazard while leaving the baseline hazard unspecified. (If the exponential model is appropriate, the graph of the log of the survival function against time should look like a straight line passing through the origin with a negative slope; equivalently, the cumulative hazard function, which is -log(survival function), should look like a straight line through the origin with a positive slope. If the Weibull distribution is appropriate, a graph of the log of the negative log of the survival function [that is, the log of the cumulative hazard function] against the log of time should look like a straight line.) Elandt-Johnson and Johnson, and also Lawless, discuss methods of fitting parametric survival models to data. Like nonparametric methods, parametric methods make assumptions about the independence of censoring and survival, and can be affected by implicit factors, the presence of many censored values, or small sample sizes. In addition, parametric methods assume that the designated survival function is the correct one.
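
The two alternatives above can be illustrated with a minimal Python sketch. It is not part of the original guide. It assumes NumPy is available, and the function names (kaplan_meier, kaplan_meier_by_stratum, linearity_diagnostics) and the toy data are hypothetical, chosen only for illustration. The sketch computes a product-limit estimate separately within each stratum, then summarizes the two diagnostic plots with least-squares fits. For exponential data, the cumulative hazard against time should be close to a line through the origin. For Weibull data, the log of the cumulative hazard against the log of time should be close to a line whose slope estimates the shape parameter.

```python
import numpy as np


def kaplan_meier(times, events):
    """Product-limit estimate of S(t) at each distinct event time.

    times  : observed times (event or censoring), assumed positive
    events : 1/True if the event was observed, 0/False if censored
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(times[events])  # sorted distinct event times
    surv = np.empty(len(event_times))
    s = 1.0
    for i, t in enumerate(event_times):
        at_risk = np.sum(times >= t)            # still under observation at t
        deaths = np.sum((times == t) & events)  # events occurring exactly at t
        s *= 1.0 - deaths / at_risk
        surv[i] = s
    return event_times, surv


def kaplan_meier_by_stratum(times, events, strata):
    """Run kaplan_meier separately within each stratum label."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    strata = np.asarray(strata)
    return {
        label.item(): kaplan_meier(times[strata == label], events[strata == label])
        for label in np.unique(strata)
    }


def linearity_diagnostics(event_times, surv):
    """Least-squares summaries of the two diagnostic plots.

    Returns (rate, shape). rate is the slope of the cumulative hazard
    against time for a line forced through the origin (exponential check).
    shape is the slope of log cumulative hazard against log time
    (Weibull check). Needs at least two points with 0 < S(t) < 1.
    """
    event_times = np.asarray(event_times, dtype=float)
    surv = np.asarray(surv, dtype=float)
    keep = (surv > 0) & (surv < 1)  # log(-log S) is undefined at S = 0 or 1
    t = event_times[keep]
    cum_hazard = -np.log(surv[keep])
    rate = np.sum(t * cum_hazard) / np.sum(t * t)
    shape, _ = np.polyfit(np.log(t), np.log(cum_hazard), 1)
    return rate, shape


# Toy data, made up for illustration: time, event indicator, stratum label.
times = [1, 2, 3, 4, 5, 2, 4, 6, 8, 9]
events = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
strata = ["a"] * 5 + ["b"] * 5

for label, (t, s) in kaplan_meier_by_stratum(times, events, strata).items():
    rate, shape = linearity_diagnostics(t, s)
    print(label, t, np.round(s, 3), round(rate, 3), round(shape, 3))
```

With strata as small as these, the fitted slopes are only rough indications, which reflects the drawback of small subsamples noted above. The fits are no substitute for inspecting the plots themselves or for a proper maximum-likelihood fit of the chosen distribution.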
