I am giving a talk about using R for survival analysis, and I want to start with the Kaplan-Meier curve and how you might draw it in R.

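I wrote about the Kaplan-Meier curve on a previous webpage, but that was a generic example. Here is how to do it in R. First, you read the data into R and transform it into a survival object. In the code below, an NA in day marks an observation with no recorded death time, and those observations are treated as censored at day 70.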

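library("survival")

# Day of death; NA means no death was recorded
day <- c(37,40,43,44,45,
 47,49,54,56,58,59,60,61,62,68,
 NA,71,NA,NA,75,
 NA,NA,89,NA,96)
death <- is.finite(day)      # TRUE = death observed, FALSE = censored
tim <- day
tim[is.na(day)] <- 70        # censored observations get time 70
fly.surv <- Surv(tim,death)  # combine time and status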

The Surv function also has options for left-censored and interval-censored observations. Read the help file (?Surv) for details.
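Here is a minimal sketch of what those other options look like. The numbers are invented purely for illustration and have nothing to do with the data above; the variable names (left.time, left.event, lower, upper) are my own.

```r
library(survival)

# Hypothetical data, made up for illustration only.
# Left censoring: the event happened at or before the recorded time.
left.time  <- c(5, 8, 12, 3)
left.event <- c(1, 0, 1, 1)   # 1 = exact time, 0 = left censored
left.surv  <- Surv(left.time, left.event, type = "left")
print(left.surv)

# Interval censoring with type = "interval2": each observation is a
# (lower, upper) pair. An NA lower bound means left censored, an NA
# upper bound means right censored, and equal bounds mean an exact time.
lower <- c(2, 4, NA, 6)
upper <- c(3, NA, 5, 6)
int.surv <- Surv(lower, upper, type = "interval2")
print(int.surv)
```

The "interval2" form is often the easiest to type in, because you do not have to supply a separate status code for each observation.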

There is a print method for survival objects.

print(fly.surv)

The survival object prints with a “+” attached to any censored observation.

[1] 37  40  43  44  45  47  49  54  56  58  59  60  61  62  68  70+ 71  70+ 70+ 75  70+ 70+ 89 
[24] 70+ 96

The survfit function creates a new object that summarizes the data in a survival object using a Kaplan-Meier curve or a Cox regression model. The input for survfit is a formula with a survival object on the left side of the formula. A model with "~1" fits a single Kaplan-Meier curve to the entire survival object.

fly.fit <- survfit(fly.surv~1)

There is a print method for survfit objects.

print(fly.fit)

This lists some basic statistics.

Call: survfit(formula = fly.surv ~ 1)

records   n.max n.start  events  median 0.95LCL 0.95UCL 
     25      25      25      19      61      56      NA
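The median here is the first time at which the estimated survival curve drops to 0.5 or below. If you want that number in a variable rather than in printed output, something like the following sketch should work. It rebuilds the data so that it runs on its own, and the name fly.median is my own.

```r
library(survival)

# Same data as above, rebuilt so this block is self-contained
day <- c(37,40,43,44,45,47,49,54,56,58,59,60,61,62,68,
         NA,71,NA,NA,75,NA,NA,89,NA,96)
death <- is.finite(day)
tim <- ifelse(is.na(day), 70, day)
fly.fit <- survfit(Surv(tim, death) ~ 1)

# Median survival time along with its confidence limits
fly.median <- quantile(fly.fit, probs = 0.5)
print(fly.median)

# The same median, read directly off the fitted curve:
# the earliest time where estimated survival is at or below 0.5
min(fly.fit$time[fly.fit$surv <= 0.5])
```

The upper limit comes back as NA because the upper confidence limit for the survival curve never gets down to 0.5 in this data set, so there is no time at which to read it off.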

There is also a summary method

summary(fly.fit)

which produces a more detailed set of statistics.

Call: survfit(formula = fly.surv ~ 1)

records   n.max n.start  events  median 0.95LCL 0.95UCL 
     25      25      25      19      61      56      NA 
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
   37     25       1     0.96  0.0392       0.8862        1.000
   40     24       1     0.92  0.0543       0.8196        1.000
   43     23       1     0.88  0.0650       0.7614        1.000
   44     22       1     0.84  0.0733       0.7079        0.997
   45     21       1     0.80  0.0800       0.6576        0.973
   47     20       1     0.76  0.0854       0.6097        0.947
   49     19       1     0.72  0.0898       0.5639        0.919
   54     18       1     0.68  0.0933       0.5197        0.890
   56     17       1     0.64  0.0960       0.4770        0.859
   58     16       1     0.60  0.0980       0.4357        0.826
   59     15       1     0.56  0.0993       0.3956        0.793
   60     14       1     0.52  0.0999       0.3568        0.758
   61     13       1     0.48  0.0999       0.3192        0.722
   62     12       1     0.44  0.0993       0.2827        0.685
   68     11       1     0.40  0.0980       0.2475        0.646
   71      4       1     0.30  0.1136       0.1428        0.630
   75      3       1     0.20  0.1114       0.0672        0.596
   89      2       1     0.10  0.0900       0.0171        0.584
   96      1       1     0.00     NaN           NA           NA
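If you are curious where the survival column comes from, here is a rough sketch that computes the Kaplan-Meier estimate by hand. At each death time, the estimate is multiplied by (1 - deaths / number at risk). The names event.times, n.risk, n.event, and km are my own, and this is only meant to illustrate the arithmetic, not to replace survfit.

```r
library(survival)

# Same data as above, rebuilt so this block is self-contained
day <- c(37,40,43,44,45,47,49,54,56,58,59,60,61,62,68,
         NA,71,NA,NA,75,NA,NA,89,NA,96)
death <- is.finite(day)
tim <- ifelse(is.na(day), 70, day)

# Distinct times at which at least one death was observed
event.times <- sort(unique(tim[death]))

# Number still at risk just before each death time,
# and number of deaths at that time
n.risk  <- sapply(event.times, function(t) sum(tim >= t))
n.event <- sapply(event.times, function(t) sum(tim == t & death))

# Kaplan-Meier estimate: running product of (1 - deaths / at risk)
km <- cumprod(1 - n.event / n.risk)

data.frame(time = event.times, n.risk, n.event, survival = round(km, 2))
```

Notice how the number at risk drops from 11 at day 68 to 4 at day 71. One of those eleven died at day 68 and six were censored at day 70, which is why the curve takes bigger steps after that point.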

Most importantly, there is a plot method.

plot(fly.fit)

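The graph includes the survival curve (either from a Kaplan-Meier estimate or a Cox regression model) and confidence limits. The graph can also display a “+” at any censored value. Older versions of the survival package drew these marks by default, but in recent versions you may need to ask for them with the mark.time=TRUE argument.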
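Here is a sketch of a slightly dressed-up version of the plot. The axis labels are my own wording, and I am asking for the censoring marks explicitly rather than relying on whatever default your version of the survival package uses.

```r
library(survival)

# Same data as above, rebuilt so this block is self-contained
day <- c(37,40,43,44,45,47,49,54,56,58,59,60,61,62,68,
         NA,71,NA,NA,75,NA,NA,89,NA,96)
death <- is.finite(day)
tim <- ifelse(is.na(day), 70, day)
fly.fit <- survfit(Surv(tim, death) ~ 1)

plot(fly.fit,
     mark.time = TRUE,              # draw a mark at each censored time
     conf.int  = TRUE,              # include the confidence limits
     xlab = "Day",                  # label text is my own choice
     ylab = "Proportion surviving")
```

With this data, all six censored observations sit at day 70, so you should see only a single mark on the curve.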

There’s a lot more on survival models which I hope to cover in another blog entry.

This blog post was added to the website on 2014-10-31 and was last modified on 2020-02-29. You can find similar pages at R software, Survival analysis.

An earlier version of this page appears here.