Predictions
stpm3
makes use of frames
for predictions
Predictions are much improved in stpm3
.
We first load the Rotterdam 2 breast cancer data and then use stset
to declare the survival time and event indicator.
. use https://www.pclambert.net/data/rott2b, clear
(Rotterdam breast cancer data (augmented with cause of death))
. stset os, f(osi==1) scale(12) exit(time 120)
Survival-time data settings
Failure event: osi==1
Observed time interval: (0, os]
Exit on or before: time 120
Time for analysis: time/12
--------------------------------------------------------------------------
2,982 total observations
0 exclusions
--------------------------------------------------------------------------
2,982 observations remaining, representing
1,171 failures in single-record/single-failure data
20,002.424 total analysis time at risk and under observation
At risk from t = 0
Earliest observed entry t = 0
Last observed exit t = 10
I will fit a simple proportional hazards model just include hormon
and age
.
. stpm3 i.hormon age, scale(lncumhazard) df(5)
Iteration 0: log likelihood = -2909.4524
Iteration 1: log likelihood = -2908.4666
Iteration 2: log likelihood = -2908.4635
Iteration 3: log likelihood = -2908.4635
Number of obs = 2,982
Wald chi2(2) = 64.07
Log likelihood = -2908.4635 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
xb |
hormon |
yes | .3229905 .0876157 3.69 0.000 .1512669 .4947141
age | .0149851 .002374 6.31 0.000 .0103322 .0196381
-------------+----------------------------------------------------------------
time |
_ns1 | -24.01501 1.921081 -12.50 0.000 -27.78026 -20.24976
_ns2 | 6.695913 1.028047 6.51 0.000 4.680978 8.710849
_ns3 | -1.221354 .0499264 -24.46 0.000 -1.319208 -1.123501
_ns4 | -.8156036 .0389592 -20.93 0.000 -.8919622 -.7392449
_ns5 | -.5046344 .0422296 -11.95 0.000 -.587403 -.4218658
_cons | -1.391522 .1368464 -10.17 0.000 -1.659736 -1.123308
------------------------------------------------------------------------------
Warning: This is a test version of stpm3
We can predict the survival function for a 60 year old women who did not take hormon therapy using,
. predict S, survival ci ///
> at1(age 60 hormon 0) ///
> timevar(0 10, step(0.1))
Predictions are stored in frame - stpm3_pred
The predict
command has requested a survival function with a 95% confidence interval.
The values of the covariates to predict at are given in the at1()
option.
It is possible to have multiple at
options.
The predictions will be at times between 0 and 10 in steps of 0.1, leading to 101 observations.
We can list or plot the results in frame stpm3_pred
.
. frame stpm3_pred: list in 1/21
+-----------------------------------------+
| tt S S_lci S_uci |
|-----------------------------------------|
1. | 0 1 1 1 |
2. | .1 .99987272 .99948675 .99996844 |
3. | .2 .99939144 .99844378 .99976209 |
4. | .3 .99849308 .9969857 .99924694 |
5. | .4 .99715658 .99513109 .99834016 |
|-----------------------------------------|
6. | .5 .99537771 .99288014 .99700048 |
7. | .6 .99316099 .99022634 .9952166 |
8. | .7 .99051598 .98716155 .99299707 |
9. | .8 .98745529 .98367965 .99036174 |
10. | .9 .98399346 .97977945 .98733494 |
|-----------------------------------------|
11. | 1 .98014617 .97546655 .98394053 |
12. | 1.1 .97592979 .9707539 .98019903 |
13. | 1.2 .97136109 .96566103 .9761267 |
14. | 1.3 .96645696 .96021236 .97173595 |
15. | 1.4 .96123425 .95443524 .96703632 |
|-----------------------------------------|
16. | 1.5 .95570969 .94835805 .96203579 |
17. | 1.6 .94989974 .9420086 .95674187 |
18. | 1.7 .94382061 .93541289 .95116257 |
19. | 1.8 .93748813 .92859406 .94530721 |
20. | 1.9 .93091774 .92157156 .93918715 |
|-----------------------------------------|
21. | 2 .92412448 .91436051 .93281647 |
+-----------------------------------------+
stpm3
saves predictions to a new frame by default. The default name for the prediction frame is stpm3_pred
,
but you can choose a frame name with the frame()
option.
. predict S1, survival ci ///
> at1(age 60 hormon 0) ///
> timevar(0 10, step(0.1)) ///
> frame(f1)
Predictions are stored in frame - f1
One feature is that you can merge new predictions to an existing frame.
When you do this the time variable stored in the existing frame will be used.
For example, we could predict for a 70 year old woman and merge with frame f1
.
. predict S2, survival ci ///
> at1(age 70 hormon 0) ///
> frame(f1, merge)
Predictions are stored in frame - f1
Note that you can specify multiple at()
options, which is often easier
than having multiple predict
commands. Below I use frame(f1, replace)
to replace the
existing frame f1
.
. predict S1 S2, survival ci ///
> at1(age 60 hormon 0) ///
> at2(age 70 hormon 0) ///
> timevar(0 10, step(0.1)) ///
> frame(f1, replace)
Predictions are stored in frame - f1
I will cover more details about multiple at()
options when I discuss contrasts.
Using frame(..., merge)
is useful when you require predictions of different types,
for example when requiring predictions of both survival and hazard functions.
The predictions for the hazard functions are merged with the survival predictions below.
. predict h1 h2, hazard ci ///
> at1(age 60 hormon 0) ///
> at2(age 70 hormon 0) ///
> frame(f1, merge)
Predictions are stored in frame - f1
Sometimes you may want to merge the predictions into the active frame rather than create a new frame.
You can do this with the merge
option. This could be useful if you wanted to predict at each
observed value in a dataset, for example at the observed event/censoring times (_t
),
or at a specific value of time.
The code below, predicts 10 year survival for each observation in the dataset.
. predict S10, survival ci ///
> at1(.) ///
> timevar(10) ///
> merge
Note that only one value of time has been specified in the timevar()
optios
The default is to predict
at this time for all individuals.
The at1(.)
option means that predictions will be at observed values of covariates
rather than values specifed by the user.
We could plot a histogram of the 10 year survival probabilities.
. hist S10
(bin=34, start=.2766368, width=.01245414)