Hazard of Standardized Survival Functions

This will be a short tutorial as the ideas are very simple. I have previously discussed standardized survival functions. In survival analysis we know that there is a simple mathematical transformation from hazard to survival function and vice versa. The idea here is to transform to a hazard function from the standardized survival function. Recall that a standardized survival funnction; $S_{s} (t | X = x, Z)$ is estimated by

$S_{s} (t | X = x, Z) = \frac{1}{N} \sum_{i = 1}^{N} S (t | X = x, Z = z_{i})$

If we apply the usual transformation from survival to hazard to function ( $h (t) = \frac{- d}{d t} \log [S (t)]$ ) we get

$h_{s} (t | X = x, Z) = \frac{\sum_{i = 1}^{N} S (t | X = x, Z = z_{i}) h (t | X = x, Z = z_{i})}{\sum_{i = 1}^{N} S (t | X = x, Z = z_{i})}$

This is a weighted average of the $N$ individual hazard functions with weights equal to $S (t | X = x, Z = z_{i})$ , i.e. the predicted survival function for individual $i$ when forced to take a specific value of the exposure variable, $X$ , but their observed values of confounding variables, $Z$ .

This is implemented in standsurv using the hazard option.

Example

I will use the Rotterdam Breast cancer data. The code below loads and stset’s the data and then fits a model using stpm3.

. use https://www.pclambert.net/data/rott3, clear
(Rotterdam breast cancer data (augmented with cause of death))

. stset os, f(osi==1) scale(12) exit(time 120)

Survival-time data settings

         Failure event: osi==1
Observed time interval: (0, os]
     Exit on or before: time 120
     Time for analysis: time/12

--------------------------------------------------------------------------
      2,982  total observations
          0  exclusions
--------------------------------------------------------------------------
      2,982  observations remaining, representing
      1,171  failures in single-record/single-failure data
 20,002.424  total analysis time at risk and under observation
                                                At risk from t =         0
                                     Earliest observed entry t =         0
                                          Last observed exit t =        10

. stpm3 hormon age enodes pr_1, scale(lncumhazard) df(4) eform nolog tvc(hormon) dftvc(3)

                                                        Number of obs =  2,982
                                                        Wald chi2(4)  = 615.95
Log likelihood = -2666.5968                             Prob > chi2   = 0.0000

-------------------------------------------------------------------------------------
                    |     exp(b)   Std. err.      z    P>|z|     [95% conf. interval]
--------------------+----------------------------------------------------------------
xb                  |
             hormon |   .8499802   .0963501    -1.43   0.152     .6806444    1.061445
                age |   1.013249   .0024115     5.53   0.000     1.008534    1.017987
             enodes |   .1132408    .011008   -22.41   0.000     .0935963    .1370084
               pr_1 |   .9061179   .0119267    -7.49   0.000     .8830411    .9297979
--------------------+----------------------------------------------------------------
time                |
               _ns1 |  -27.09524   2.109681   -12.84   0.000    -31.23014   -22.96034
               _ns2 |   8.647725   1.122097     7.71   0.000     6.448455    10.84699
               _ns3 |  -1.072205   .0477674   -22.45   0.000    -1.165827   -.9785823
               _ns4 |  -.6930019   .0518048   -13.38   0.000    -.7945373   -.5914664
                    |
c.hormon#c._ns_tvc1 |   5.425507    3.92237     1.38   0.167    -2.262197    13.11321
                    |
c.hormon#c._ns_tvc2 |  -3.309698   2.096769    -1.58   0.114    -7.419291    .7998943
                    |
c.hormon#c._ns_tvc3 |  -.1256484    .195217    -0.64   0.520    -.5082667      .25697
                    |
              _cons |   .7984459   .1615956     4.94   0.000     .4817244    1.115168
-------------------------------------------------------------------------------------
Note: Estimates are transformed only in the first equation.

I have made the effect of our exposure, hormon, time-dependent using the tvc option.

I first calculate the standardized survival curves where everyone is forced to be exposed and then unexposed.

. range timevar 0 10 100
(2,882 missing values generated)

. standsurv, surv timevar(timevar) ci frame(surv, replace) ///
>            at1(hormon 0) at2(hormon 1) atvar(S0 S1)      ///
>            contrast(difference) contrastvar(Sdiff) 

. 
. frame surv {
.   twoway (rarea S0_lci S0_uci timevar, color(red%25))                     ///
>          (rarea S1_lci S1_uci timevar, color(blue%25))                    ///
>          (line S0 timevar, sort lcolor(red))                              ///
>          (line S1  timevar, sort lcolor(blue))                            ///
>          , legend(order(1 "No hormonal treatment" 2 "Hormonal treatment") ///
>                   ring(0) cols(1) pos(1))                                 ///
>          ylabel(0.5(0.1)1,angle(h) format(%3.1f))                         ///
>          ytitle("S(t)")                                                   ///
>          xtitle("Years from surgery")
. }

If I run standsurv again with the hazard option I get the corresponding hazard functions of the standardized curves. This is the marginal hazard ratio (as a function of time).

. standsurv, hazard  timevar(timevar) ci per(1000) frame(hazard, replace) ///
>            at1(hormon 0) at2(hormon 1) atvar(h0 h1)                     /// 
>            contrast(ratio) contrastvar(hr)

Plot the standardized hazard functions.

. frame hazard {
.   twoway (rarea h0_lci h0_uci timevar, color(red%30))  ///
>          (rarea h1_lci h1_uci timevar, color(blue%30)) ///
>          (line h0 timevar, color(red))                 ///
>          (line h1 timevar, color(blue))                ///
>           , legend(off)                                ///
>           ylabel(,angle(h) format(%3.1f))              ///
>           xtitle("Years from surgery")   
. }

I can’t explain the lower and then higher hazard for those on hormon therapy. Perhaps better adjustment for confounders would change this.

I can also plot the ratio of these two hazard functions with a 95% confidence interval.

. frame hazard {
.   twoway (rarea hr_lci hr_uci timevar, color(red%30))      ///
>          (line hr timevar, color(red))                     ///
>          if timevar>0, yscale(log)                         ///
>          ylabel(0.5 1 2 4 8 20 40, angle(h) format(%3.1f)) ///
>          xtitle("Years from surgery")                      ///
>          legend(off)                                       ///
>          yscale(log) 
. }

If I had used the difference argument of the contrast() option I would have obtained the absolute difference in the standardized hazard functions.

I am still thinking about the usefulness of this - in general I prefer the idea of standardized survival functions rather than the corresponding hazard function. However, it is harder to see how the risk of events changes over follow-up time with a cumulative measure (i.e. standardized survival).