Summary of Steps and Splus Log in Computing Self-Consistent Estimator of Survival D.f. for Interval-Censored Data ============================================== 10/17/03 The starting point here is a dataset dfram with columns L_i and U_i corresponding to lower and upper points of intervals (L_i,U_i] known to contain the true value X_i for individual i. Note that some of the values U_i may be Inf (+infinity). Here we view the left- and right- censoring points as times of fixed exams (eg periodic followup visits) between which X_i must lie. Note that any missing of exams must be assumed independent of the X_i values, eg due to external events like snowstorms or traffic tie-ups independent of the patient's health. Next create a vector Times consisting of all distinct order left and/or right censoring times. > Times <- sort(unique(c(dfram$L, dfram$U))) > nobs <- nrow(dfram) > ntim <- length(Times) Now define an nobs x ntim matrix of `weights' alpha_{ij} showing whether individual i has interval-observation containing the jth successive interval (tau_{j-1},tau_j] among the ordered times in the vector Times . > alpha <- outer(Times, dfram$L, function(x,y) x > y) * outer(Times, dfram$U, function(x,y) x <= y) Next we start with an arbitrary initial guess for the S(t) = S_X(t) function. The following is the inductive `self-consistency' algorithm of Turnbull (1976). On input of the following function, Svec is supposed to be a survival-function estimator evaluated at all of the entries of Times. > Hfunc <- function(Svec, alpha) { pvec <- -diff(c(1,Svec)) qvec <- 1/c(alpha %*% pvec) dvec <- pvec * c(qvec %*% alpha) Yvec <- rev(cumsum(rev(dvec))) cumprod(1-dvec/Yvec) } ### The algorithm then consists of successively replacing ### Shat [initially equal to Svec] by Hfunc(Shat,alpha). We illustrate this using the book's example of interval-censored data (radiotherapy only) from Section 1.18. > BCdet <- read.table("BCdet.dat", col.names=c("L","U","Trt")) > dfram <- BCdet[,1:2] > Times <- sort(unique(c(dfram$L, dfram$U))) ### length 33 nobs <- nrow(dfram) ntim <- length(Times) > alpha <- t(outer(Times, dfram$L, function(x,y) x > y) * outer(Times, dfram$U, function(x,y) x <= y)) ### This is 46 x 33 matrix > Sinit <- (32:0)/33 > Sold <- Sinit for(i in 1:100) { Snew <- Hfunc(Sold,alpha) cat(i,"th iteration change = ",round(sum(abs(Snew-Sold)),4),"\n") Sold <- Snew } 1 th iteration change = 5.3261 2 th iteration change = 0.8141 3 th iteration change = 0.2635 4 th iteration change = 0.1171 5 th iteration change = 0.0667 6 th iteration change = 0.0478 7 th iteration change = 0.0386 8 th iteration change = 0.0331 9 th iteration change = 0.0292 10 th iteration change = 0.026 ### Not very rapid convergence, but not bad ... ...100 th iteration change = 0.0005 > cbind(Times=Times, Surv=round(Sold,5)) ### Agrees with little ### Table at lower left on page 145. Times Surv [1,] 0 1.00000 [2,] 4 1.00000 [3,] 5 0.95365 [4,] 6 0.95365 [5,] 7 0.92029 [6,] 8 0.83162 [7,] 10 0.83162 [8,] 11 0.83162 [9,] 12 0.76087 [10,] 14 0.76087 [11,] 15 0.76087 [12,] 16 0.76087 [13,] 17 0.76087 [14,] 18 0.76087 [15,] 19 0.76087 [16,] 22 0.76086 [17,] 24 0.76063 [18,] 25 0.66823 [19,] 26 0.66823 [20,] 27 0.66823 [21,] 32 0.66815 [22,] 33 0.66553 [23,] 34 0.58659 [24,] 35 0.58659 [25,] 36 0.58659 [26,] 37 0.58654 [27,] 38 0.58417 [28,] 40 0.46585 [29,] 44 0.46568 [30,] 45 0.46568 [31,] 46 0.46568 [32,] 48 0.00665 [33,] 999 0.00000