The data show the length of remission in weeks for two groups of leukemia patients,
treated and control, and were analyzed by Cox in his original proportional hazards paper.
The data file contains three columns:
  Treatment: coded Treated (drug) or Control (placebo);
  Time: weeks of remission;
  Failure: coded 1 if a failure (relapse), 0 if censored.
Thus, the third and fourth observations, 6 and 6+, corresponding
to a relapse and a censored observation at six weeks, are coded 6, 1 and 6, 0, respectively.
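The coding rule above can be sketched in Python. This is only an illustration of the convention, not part of the distributed files; the function name parse_remission is hypothetical, and it assumes the published notation marks a censored time with a trailing "+".

```python
def parse_remission(entry: str) -> tuple[int, int]:
    """Convert a published entry such as '6' or '6+' into (time, failure).

    Assumption: a trailing '+' marks a censored observation.
    failure is 1 for a relapse and 0 for a censored time.
    """
    entry = entry.strip()
    censored = entry.endswith("+")
    weeks = int(entry.rstrip("+"))
    return weeks, 0 if censored else 1


print(parse_remission("6"))   # (6, 1): relapse at six weeks
print(parse_remission("6+"))  # (6, 0): censored at six weeks
```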
The data are available in the usual two plain-text formats in gehan.dat
and gehan.raw (group codes are 1=control, 2=treated), and as a Stata file in gehan.dta.
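For readers working from the numeric group codes, a minimal Python sketch of the decoding follows. Only the mapping 1=control, 2=treated comes from the description above; the names GROUP_LABELS and label_group are hypothetical, and nothing is assumed here about the column layout of the files.

```python
# Mapping stated in the text: 1 = control, 2 = treated.
GROUP_LABELS = {1: "Control", 2: "Treated"}


def label_group(code: int) -> str:
    """Return the treatment label for a numeric group code."""
    try:
        return GROUP_LABELS[code]
    except KeyError:
        raise ValueError(f"unexpected group code: {code!r}") from None


print(label_group(1))  # Control
print(label_group(2))  # Treated
```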
These data actually come from a matched-pairs design, where patients
were paired according to remission status (partial or complete) and then
randomly assigned to the treated or control group, but most analyses have
ignored this fact. See Andersen et al. (1993), pages 22-23, who give
references to several papers using this dataset.
Reference: Andersen, P. K.; Borgan, Ø.; Gill, R. D. and Keiding, N.
(1993). Statistical Models Based on Counting Processes. Springer-Verlag, New York.