Direct modelling of marginal relative survival models

Background

When using relative survival we may be intested in both estimating relative survival conditional on specific covariate patterns, for example, for a male aged 70 diagnosed in 2018 with localized cancer, or marginal relative survival where we may interested in an average effect in a population or making comparisons where we average over the same covariate (confounder) patterns. My standsurv command can be used to estimate marginal relative survival after fitting a model conditional on covariates.

The estimand of interest, is marginal relative survival. Consider a set of covariates, $\mathbf{X}_i$, for the $i^{th}$ individual that may affect the rate of death from the cancer under study and the rate of death from other causes. The all cause rate of death, $h(t|\mathbf{X}_i)$, can be partitioned into two components,

$$ h(t|\mathbf{X}_i) = h^*(t|\mathbf{X}_i) + \lambda(t|\mathbf{X}_i) $$

where $h^*(t|\mathbf{X}_i)$ is the expected mortaliity rate and $\lambda(t|\mathbf{X}_i)$ is the excess mortality rate for the $i^{th}$ individual. The relative survival for covariate pattern, $\mathbf{X}_i$ is,

$$ R(t|\mathbf{X}_i) = \int_0^t {\lambda(u|\mathbf{X}_i) du} $$

The marginal relative survival involves taking the expectation of $R(t|\mathbf{X}_i)$ over covariate pattern, $\mathbf{X}_i$,

$$ R^m(t|\mathbf{X}) = E_{\mathbf{X}}\left[R(t|\mathbf{X})\right] \tag{Equation 1} $$

Note that in the above for simplicity, I assume the same covariates act on the expected and excess mortality rates, but this is not a requirement

Example

I use the Melanoma data, restricting to those diagnosed in the later calendar perdiod, 1985-1994. I restrict follow-up to 10 years after diagnosis using the exit() option.

. use https://pclambert.net/data/melanoma.dta if year8594 == 1 
(Skin melanoma, diagnosed 1975-94, follow-up to 1995)

. stset surv_mm, failure(status=1,2) id(id) exit(time 120.5) scale(12)

Survival-time data settings

           ID variable: id
         Failure event: status==1 2
Observed time interval: (surv_mm[_n-1], surv_mm]
     Exit on or before: time 120.5
     Time for analysis: time/12

--------------------------------------------------------------------------
      4,744  total observations
          0  exclusions
--------------------------------------------------------------------------
      4,744  observations remaining, representing
      4,744  subjects
      1,401  failures in single-failure-per-subject data
 22,003.417  total analysis time at risk and under observation
                                                At risk from t =         0
                                     Earliest observed entry t =         0
                                          Last observed exit t =  10.04167

I will first estimate the non-parametric estimate of marginal relative survival using stpp, so we have something to compare our model based estimates to.

. stpp R_pp using https://pclambert.net/data/popmort.dta, /// 
>                 agediag(age) datediag(dx) pmother(sex)

. preserve

. rename _t PP_t

. keep R_pp* PP_t

. save PP, replace
file PP.dta saved

. restore                

I have saved the Pohar Perme estimates in a dataset, so I can merge them in after using mrsprep.

I will now fit some relative survival models, but first I need to merge in the expected mortality rates at the event/censoring times.

. // conditional model (no covariates)
. gen _age = floor(min(age + _t,99))

. gen _year = floor(year(dx + _t*365.24))

. merge m:1 _age _year sex using https://pclambert.net/data/popmort.dta, ///
>           keep(match master)        

    Result                      Number of obs
    -----------------------------------------
    Not matched                             0
    Matched                             4,744  (_merge==3)
    -----------------------------------------

Now I will fit a flexible parametric relative survival model with no covariate using stpm2. I will then predict the estimated relative survival.

. stpm2, scale(hazard) df (5) bhazard(rate)

Iteration 0:   log likelihood = -4364.5376  
Iteration 1:   log likelihood = -4279.6772  
Iteration 2:   log likelihood = -4279.5867  
Iteration 3:   log likelihood = -4279.5864  

Log likelihood = -4279.5864                              Number of obs = 4,744

------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
xb           |
       _rcs1 |   .7654828   .0324456    23.59   0.000     .7018906    .8290749
       _rcs2 |   .1834916    .029522     6.22   0.000     .1256295    .2413537
       _rcs3 |   .0572084    .017442     3.28   0.001     .0230227     .091394
       _rcs4 |   .0242512   .0122991     1.97   0.049     .0001453     .048357
       _rcs5 |   .0120989    .008138     1.49   0.137    -.0038512     .028049
       _cons |  -2.037147   .0457164   -44.56   0.000     -2.12675   -1.947545
------------------------------------------------------------------------------

. range tt_cond 0 10 101
(4,643 missing values generated)

. predict s_cond, surv timevar(tt) ci        

I can now compare the model based and the non-parametric estimates.

. twoway (rarea R_pp_lci R_pp_uci _t, sort connect(stairstep) color(${C1ci}))       ///
>        (line R_pp _t, sort connect(stairstep) color(${C1}))                       ///
>        (line s_cond* tt_cond, color(${C2} ${C2} ${C2}) lpattern(solid dash dash)) ///
>        , legend(order(2 "Pohar Perme" 3 "stpm2 model without covariates")         ///
>                 ring(0) cols(1) pos(7))                                           ///
>          ylabel(0.6(0.1)1, format(%3.1f))                                         ///
>          ytitle("Marginal relative survival")                                     ///
>          xtitle("Years from diagnosis")                                           ///
>          name(int_stand, replace)      

It can be seen that there is disagrement between the non-parametric estimate and the model based estimate. This is not good and differs in what we would expect to see in a standard survival model. For example, if I fit an all-cause stpm2 survival model without covariates I get the following graph comparing the model based estmates with a Kaplan Meier estimate.

There is now near perfect agreement. I explain in the next section why there is disagreement between the model based and non-parametric estimate.

Why is there disagreement when a model with no covariates is fitted.

Consider the relative survival model fitted when not including any covariates.

$$ h(t|\mathbf{X}_i) = h^*(t|\mathbf{X}_i) + \lambda(t) $$

In this model the excess mortality is assumed to be exactly the same for each individual. In this model the all cause mortality rate varies between individuals only through variation in expected (other cause) mortality rates and the excess (cancer) mortality rate is assumed to be the same for all individuals. This is different from the definition in Equation 1 where relative survival is allowed to vary between individuals. Assuming that the excess mortality is the same over age, sex etc is a very strong assumption and almost certainly not true.

Regression standardization

Regression standardization can be used in the relative survival framework. This means that we should include all covariates that affect expected mortality rates in the model. In the case of the Melanoma data this is age, sex and calendar year

I will fit a model that uses restricted cubic splines to model the effect of age at diagnosis and also relax the proportional hazards assumption for the effect of age by allowing an interaction with time. The model will include sex and calendar years as these both impact the expected mortality rates. I will allow the effect of sex to be time-dependent (non-proportional), and model the effect of year of diagnosis using restricted cubic splines. A key point here is that various modelling choices need to be made, for example, I have chosen not to include interactions between any of the covariates. Different modelling choices will result in different estimates.

. gen female = sex==2

. rcsgen age , df(3) gen(agercs) orthog
Variables agercs1 to agercs3 were created

. rcsgen yydx , df(3) gen(yearrcs) orthog
Variables yearrcs1 to yearrcs3 were created

. 
. stpm2 agercs* female yearrcs*, scale(hazard) df(5) bhazard(rate) ///
>                           tvc(agercs* female) dftvc(3)   

Iteration 0:   log likelihood = -4291.7959  
Iteration 1:   log likelihood = -4221.2221  
Iteration 2:   log likelihood = -4216.8452  
Iteration 3:   log likelihood =  -4216.714  
Iteration 4:   log likelihood = -4216.7071  
Iteration 5:   log likelihood = -4216.7071  

Log likelihood = -4216.7071                              Number of obs = 4,744

---------------------------------------------------------------------------------
                | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
----------------+----------------------------------------------------------------
xb              |
        agercs1 |   .4014643   .0509245     7.88   0.000     .3016541    .5012744
        agercs2 |  -.0241269   .0519829    -0.46   0.643    -.1260115    .0777577
        agercs3 |   -.077221   .0497178    -1.55   0.120    -.1746662    .0202241
         female |  -.4660131   .0871942    -5.34   0.000    -.6369105   -.2951157
       yearrcs1 |   .0378156   .0466302     0.81   0.417    -.0535778     .129209
       yearrcs2 |  -.0840698   .0434742    -1.93   0.053    -.1692778    .0011382
       yearrcs3 |   .0226007   .0427815     0.53   0.597    -.0612495    .1064508
          _rcs1 |   .8485124   .0520516    16.30   0.000     .7464932    .9505316
          _rcs2 |   .2333051   .0517937     4.50   0.000     .1317914    .3348189
          _rcs3 |   .0467025   .0240512     1.94   0.052    -.0004371     .093842
          _rcs4 |   .0323118   .0137245     2.35   0.019     .0054122    .0592113
          _rcs5 |   .0194916   .0086331     2.26   0.024     .0025711     .036412
  _rcs_agercs11 |   .0222324   .0489672     0.45   0.650    -.0737416    .1182063
  _rcs_agercs12 |   .0379927   .0464296     0.82   0.413    -.0530077     .128993
  _rcs_agercs13 |   .0350472   .0215183     1.63   0.103    -.0071278    .0772223
  _rcs_agercs21 |  -.0939309   .0459412    -2.04   0.041     -.183974   -.0038878
  _rcs_agercs22 |  -.0120687   .0433049    -0.28   0.780    -.0969448    .0728074
  _rcs_agercs23 |   .0225369    .020735     1.09   0.277    -.0181029    .0631767
  _rcs_agercs31 |  -.0494584   .0434172    -1.14   0.255    -.1345546    .0356378
  _rcs_agercs32 |   .0334398   .0398716     0.84   0.402    -.0447071    .1115867
  _rcs_agercs33 |  -.0040932   .0201338    -0.20   0.839    -.0435547    .0353684
   _rcs_female1 |  -.0637989    .067838    -0.94   0.347     -.196759    .0691611
   _rcs_female2 |  -.0909384   .0607976    -1.50   0.135    -.2100995    .0282228
   _rcs_female3 |   .0296021   .0326255     0.91   0.364    -.0343426    .0935468
          _cons |  -1.720422   .0584158   -29.45   0.000    -1.834914   -1.605929
---------------------------------------------------------------------------------

. standsurv, surv timevar(tt) ci atvar(mrs_cond)

After fitting the model I have used standsurv to obtain the estimate of marginal relative survival using regression standardization. This predicts a relative survival function for each individual conditional on their observed covariate pattern and takes the average of these curves. In this case there are 4,744 individuals in the study and so the estimated marginal relative survival is an average of 4,744 different survival curves.

I can now compare the model based estimate, based on regression standardization, and the non-parametric Pohar estimate.

. twoway (rarea R_pp_lci R_pp_uci _t, sort connect(stairstep) color(${C1ci}))             ///
>        (line R_pp _t, sort connect(stairstep) color(${C1}))                             ///
>        (line mrs_cond* tt_cond, color(${C2} ${C2} ${C2}) lpattern(solid dash dash))     ///
>        , legend(order(2 "Pohar Perme" 3 "Regression standardization after stpm2 model") ///
>                 ring(0) cols(1) pos(7))                                                 ///
>          ylabel(0.6(0.1)1, format(%3.1f))                                               ///
>          ytitle("Marginal relative survival")                                           ///
>          xtitle("Years from diagnosis")                                                 ///
>          name(int_stand_standsurv, replace)      

There is now good agreement between the estimate based on regression standardizion and the non-parametric estimate. I will now move on to describing how to model marginal relative survival directly, so we can make fewer modelling decisions if we are only interested in the estimation of marginal relative survival.

Using mrsprep to prepare data for fitting a marginal modelling

In order to directly model marginal relative survival I will run mrsprep. This does two things,

  1. It calculates time-dependent weights which are the inverse of the expected survival. These are needed as the estimand of interest is in the net world, where it is not possible to die from causes other than the cancer under study. However, we have data in the real world and as follow-up time increases we have fewer at risk and fewer deaths than we would see in the net world. The weights are based on the same idea as the weights used in the Pohar Perme non-parametric estimate.
  2. At each event time it calculates the weighted mean mortality (hazard) rate for those still at risk. The weights are based on the inverse of expected survival among those at risk. The weighted mean is needed as the marginal relative survival is of interest. See the paper for more details.

The code for mrsprep is shown below.

. mrsprep using https://pclambert.net/data/popmort.dta   ///
>               , pmother(sex) agediag(age) datediag(dx) ///
>                 breaks(0(0.2)10)                       

mrsprep needs the filename of where the expected mortality rates are stored. It requires the name of the variable for age at diagnosis and the name of the variable for date of diagnosis. It also needs the name of variables other than age and calendar year that the expected mortality rates are stratified by, in this case this is just sex. The final option is breaks(0(0.2)10). This splits the time scale into intervals, each of width 0.2 years. The weights are calculated at the mid-point of each interval. This is an approximation, greater precision can be obtained without narrower intervals, but the expanded dataset becomes larger. See the paper for a sensitivity analysis for different interval widths.

Below is a listing for the first two individuals in the dataset.

. list id age sex tstart tstop wt meanhazard event if inlist(id,51,574), ///
>      noobs sepby(id) abbrev(13)

  +--------------------------------------------------------------------------+
  |  id   age   sex   tstart       tstop          wt   meanhazard_wt   event |
  |--------------------------------------------------------------------------|
  |  51    86     1        0          .2    1.017644             999       0 |
  |  51    86     1       .2          .4   1.0556412             999       0 |
  |  51    86     1       .4          .6   1.0968963             999       0 |
  |  51    86     1       .6          .8   1.1378526             999       0 |
  |  51    86     1       .8           1   1.1783593             999       0 |
  |  51    86     1        1         1.2   1.2222716             999       0 |
  |  51    86     1      1.2       1.375   1.2674326       .02960256       1 |
  |--------------------------------------------------------------------------|
  | 574    69     2        0          .2   1.0039818             999       0 |
  | 574    69     2       .2          .4   1.0119931             999       0 |
  | 574    69     2       .4          .6   1.0206144             999       0 |
  | 574    69     2       .6          .8   1.0298603             999       0 |
  | 574    69     2       .8           1   1.0391899             999       0 |
  | 574    69     2        1         1.2   1.0486292             999       0 |
  | 574    69     2      1.2         1.4   1.0581798             999       0 |
  | 574    69     2      1.4         1.6   1.0673769             999       0 |
  | 574    69     2      1.6         1.8   1.0762101             999       0 |
  | 574    69     2      1.8           2    1.085564             999       0 |
  | 574    69     2        2         2.2   1.0958287             999       0 |
  | 574    69     2      2.2         2.4   1.1065721             999       0 |
  | 574    69     2      2.4         2.6   1.1174208             999       0 |
  | 574    69     2      2.6         2.8   1.1283759             999       0 |
  | 574    69     2      2.8           3   1.1394383             999       0 |
  | 574    69     2        3         3.2   1.1510196             999       0 |
  | 574    69     2      3.2         3.4   1.1631333             999       0 |
  | 574    69     2      3.4         3.6   1.1753744             999       0 |
  | 574    69     2      3.6         3.8   1.1877444             999       0 |
  | 574    69     2      3.8           4   1.2002446             999       0 |
  | 574    69     2        4         4.2   1.2135969             999       0 |
  | 574    69     2      4.2         4.4   1.2278268             999       0 |
  | 574    69     2      4.4         4.6   1.2422236             999       0 |
  | 574    69     2      4.6         4.8   1.2567893             999       0 |
  | 574    69     2      4.8           5   1.2715257             999       0 |
  | 574    69     2        5         5.2   1.2863285             999       0 |
  | 574    69     2      5.2         5.4   1.3011962             999       0 |
  | 574    69     2      5.4         5.6   1.3162357             999       0 |
  | 574    69     2      5.6         5.8    1.331449             999       0 |
  | 574    69     2      5.8           6   1.3468382             999       0 |
  | 574    69     2        6         6.2   1.3633453             999       0 |
  | 574    69     2      6.2         6.4    1.381007             999       0 |
  | 574    69     2      6.4         6.6   1.3988974             999       0 |
  | 574    69     2      6.6         6.8   1.4170196             999       0 |
  | 574    69     2      6.8           7   1.4353766             999       0 |
  | 574    69     2        7         7.2   1.4547646             999       0 |
  | 574    69     2      7.2   7.2916667   1.4696508       .03708329       1 |
  +--------------------------------------------------------------------------+

Individual 51 is 86 years old at diagnosis and male. They have 7 rows of data with each row corresponding to a different time interval. The start of the interval is given by tstart and the end of the interval by tstop. Each time interval is 0.2 years, execpt the last interval in which they die (event==1) at 1.375 years. For each interval there is an associated weight (wt), which is the inverse of the expected survival at the midpoint of the interval. As the expected survival decreases over time, the weights increase over time. The meanhazard_wt gives the weighted mean expected mortality rate at each individuals event time. Note that for any censored time it is set to 999. When fitting relative survival models using stpm2 or strcs the expected mortality rate at the event time is needed, but is not required for any censored times. However, having a missing value would exclude these rows from the analysis and so we feed it a value that is actually not used when we fit the model. Individual 574 is younger than Individual 1 and so the weights are lower at the same time points, e.g. 1.039 vs 1.222 at 1 year.

Having restructured the data we can now use stset where we need to give the end of each interval (tstop), the start of the interval (tstart). The weights are passed using [iweights=wt].

. stset tstop [iweight=wt], enter(tstart) failure(event==1)                                          

Survival-time data settings

         Failure event: event==1
Observed time interval: (0, tstop]
     Enter on or after: time tstart
     Exit on or before: failure
                Weight: [iweight=wt]

--------------------------------------------------------------------------
    112,229  total observations
          0  exclusions
--------------------------------------------------------------------------
    112,229  observations remaining, representing
      1,401  failures in single-record/single-failure data
 21,994.417  total analysis time at risk and under observation
                                                At risk from t =         0
                                     Earliest observed entry t =         0
                                          Last observed exit t =        10

The marginal model van now be fitted using stpm2. As there are time-dependent weights cluster robust standard errors are used using vce(cluster id).

. stpm2, scale(hazard) df(5) bhazard(meanhazard_wt) vce(cluster id)
note: delayed entry models are being fitted

Iteration 0:   log pseudolikelihood = -5670.0926  
Iteration 1:   log pseudolikelihood = -5558.0634  
Iteration 2:   log pseudolikelihood = -5557.3568  
Iteration 3:   log pseudolikelihood = -5557.3542  
Iteration 4:   log pseudolikelihood = -5557.3542  

Log pseudolikelihood = -5557.3542                      Number of obs = 112,229

                                 (Std. err. adjusted for 4,744 clusters in id)
------------------------------------------------------------------------------
             |               Robust
             | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
xb           |
       _rcs1 |    .983882   .0430693    22.84   0.000     .8994676    1.068296
       _rcs2 |   .1594691   .0243798     6.54   0.000     .1116855    .2072526
       _rcs3 |   .0581832   .0161444     3.60   0.000     .0265407    .0898257
       _rcs4 |   .0197788   .0125402     1.58   0.115    -.0047995     .044357
       _rcs5 |   .0184667   .0097193     1.90   0.057    -.0005827    .0375162
       _cons |  -2.208731   .0511667   -43.17   0.000    -2.309016   -2.108446
------------------------------------------------------------------------------

. range tt 0 10 101
(112,128 missing values generated)

. predict s_mrsprep, surv timevar(tt) ci

After fitting the model, the marginal relative survival has been predicted. This can now be compared to the Pohar Perme non-parametric estimate.

. merge 1:1 _n using PP,  nogenerate

    Result                      Number of obs
    -----------------------------------------
    Not matched                       107,485
        from master                   107,485  
        from using                          0  

    Matched                             4,744  
    -----------------------------------------

. twoway (rarea R_pp_lci R_pp_uci PP_t, sort connect(stairstep) color(${C1ci}))   ///
>        (line R_pp PP_t, sort connect(stairstep) color(${C1}))                   ///
>        (line s_mrsprep* tt, color(${C2} ${C2} ${C2}) lpattern(solid dash dash)) ///
>        , legend(order(2 "Pohar Perme" 3 "Marginal stpm2 model")                 ///
>                 ring(0) cols(1) pos(7))                                         ///
>          ylabel(0.6(0.1)1, format(%3.1f))                                       ///
>          ytitle("Marginal relative survival")                                   ///
>          xtitle("Years from diagnosis")                                         ///
>          name(int_stand_standsurv, replace)      

There is now good agreement between the model based and the non parametric Pohar Perme estimtor. Here the marginal estimate obtained from regression standardization and them arginal model are very similar.

The estimate here is an internally standardized estimate, over the observed covariate distribution and thus would not be comparable to another study with a different age/sex distribution or if separate analysis were performed for males and females. See the example of external age standardization and modelling covarites for further extensions.

Note that mrsprep makes uses of frames.

. frame
  (current frame is mrs_data)

. frames dir
* default   4744 x 74; Skin melanoma, diagnosed 1975-94, follow-up to 1995
* mrs_data  112229 x 37

Note: Frames marked with * contain unsaved data.

It is possible to switch to the orginal data using frame change default.

References

Lambert PC, Syriopoulou E, Rutherford MJ. Direct modelling of age standardized marginal relative survival through incorporation of time-dependent weights. BMC Medical Research Methodology 2021;21:84

Professor of Biostatistics