Standardization

Using standsurv

I will not go into details of standsurv here. There are various tutorials here.

There main advantage of using stpm3 rather than stpm2 with standsurv is the support for factor variables. I will ilustrate this with an example.

The code below loads the example colon cancer data set.

. use "https://pclambert.net/data/colon.dta", clear
(Colon carcinoma, diagnosed 1975-94, follow-up to 1995)

. stset surv_mm,f(status=1,2) id(id) scale(12) exit(time 120.5)

Survival-time data settings

           ID variable: id
         Failure event: status==1 2
Observed time interval: (surv_mm[_n-1], surv_mm]
     Exit on or before: time 120.5
     Time for analysis: time/12

--------------------------------------------------------------------------
     15,564  total observations
          0  exclusions
--------------------------------------------------------------------------
     15,564  observations remaining, representing
     15,564  subjects
     10,459  failures in single-failure-per-subject data
 51,685.667  total analysis time at risk and under observation
                                                At risk from t =         0
                                     Earliest observed entry t =         0
                                          Last observed exit t =  10.04167

. gen male = sex==1

To illust rate the prediction options I will fit a model with an interaction between age and sex for both main and time-dependent effects, where the effect of age is modelled using natural splines.

. stpm3 i.male##@ns(age,df(3)),                 ///
>       tvc(i.male##@ns(age,df(3))) dftvc(2)    ///
>       df(5) scale(lncumhazard)

Iteration 0:   log likelihood =   -26505.4  
Iteration 1:   log likelihood = -26040.023  
Iteration 2:   log likelihood = -26005.258  
Iteration 3:   log likelihood = -26004.878  
Iteration 4:   log likelihood = -26004.877  

                                                       Number of obs =  15,564
                                                       Wald chi2(7)  = 1029.65
Log likelihood = -26004.877                            Prob > chi2   =  0.0000

-----------------------------------------------------------------------------------------------
                              | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
------------------------------+----------------------------------------------------------------
xb                            |
                       1.male |   -.241306   .2648958    -0.91   0.362    -.7604922    .2778802
                  _ns_f1_age1 |  -6.206268   .9371473    -6.62   0.000    -8.043043   -4.369493
                  _ns_f1_age2 |  -.4202363   .4447267    -0.94   0.345    -1.291885    .4514121
                  _ns_f1_age3 |  -1.728845   .2531885    -6.83   0.000    -2.225086   -1.232605
                              |
           male#c._ns_f1_age1 |
                           1  |    .684687   1.445988     0.47   0.636    -2.149398    3.518772
                              |
           male#c._ns_f1_age2 |
                           1  |   .2789823   .6314241     0.44   0.659    -.9585862    1.516551
                              |
           male#c._ns_f1_age3 |
                           1  |   .6956899   .4554634     1.53   0.127     -.197002    1.588382
------------------------------+----------------------------------------------------------------
time                          |
                         _ns1 |  -10.15155   .8167647   -12.43   0.000    -11.75238   -8.550724
                         _ns2 |   2.443526   .2234638    10.93   0.000     2.005545    2.881507
                         _ns3 |  -1.284021   .1792894    -7.16   0.000    -1.635422     -.93262
                         _ns4 |   -.836973   .1507723    -5.55   0.000    -1.132481   -.5414646
                         _ns5 |  -.6772498   .1283478    -5.28   0.000    -.9288069   -.4256928
                              |
              male#c._ns_tvc1 |
                           1  |    .408095   .7632089     0.53   0.593    -1.087767    1.903957
                              |
              male#c._ns_tvc2 |
                           1  |   1.358495   .6838178     1.99   0.047     .0182364    2.698753
                              |
     c._ns_f1_age1#c._ns_tvc1 |  -19.04826   6.611939    -2.88   0.004    -32.00742     -6.0891
                              |
     c._ns_f1_age1#c._ns_tvc2 |   2.555648   2.733907     0.93   0.350    -2.802712    7.914008
                              |
     c._ns_f1_age2#c._ns_tvc1 |   3.701191   3.555068     1.04   0.298    -3.266614      10.669
                              |
     c._ns_f1_age2#c._ns_tvc2 |  -.0607427   1.341542    -0.05   0.964    -2.690116    2.568631
                              |
     c._ns_f1_age3#c._ns_tvc1 |  -2.033875   .8956267    -2.27   0.023    -3.789271    -.278479
                              |
     c._ns_f1_age3#c._ns_tvc2 |  -.9184183    .694715    -1.32   0.186    -2.280035    .4431981
                              |
male#c._ns_f1_age1#c._ns_tvc1 |
                           1  |   7.104182   8.667079     0.82   0.412    -9.882981    24.09134
                              |
male#c._ns_f1_age1#c._ns_tvc2 |
                           1  |   -2.53056   3.925283    -0.64   0.519    -10.22397    5.162853
                              |
male#c._ns_f1_age2#c._ns_tvc1 |
                           1  |  -4.308122   4.611764    -0.93   0.350    -13.34701     4.73077
                              |
male#c._ns_f1_age2#c._ns_tvc2 |
                           1  |  -1.032837   1.804608    -0.57   0.567    -4.569803    2.504129
                              |
male#c._ns_f1_age3#c._ns_tvc1 |
                           1  |   .0752164   1.472709     0.05   0.959     -2.81124    2.961673
                              |
male#c._ns_f1_age3#c._ns_tvc2 |
                           1  |  -2.959469   1.198583    -2.47   0.014    -5.308649   -.6102891
                              |
                        _cons |   1.835956   .1428585    12.85   0.000     1.555958    2.115953
-----------------------------------------------------------------------------------------------
Extended functions
 (1) @ns(age, df(3))
Warning: This is a test version of stpm3

To obtain the marginal survival for males and females which is standardized over the combined covariate distribution (just age in this case) we can use standsurv.

. range tt 0 10 101
(15,463 missing values generated)

. standsurv, at1(male 1) at2(male 0)    ///
>            atvar(Sm Sf)               ///
>            survival ci                ///
>            timevar(tt)                ///
>            contrast(difference)       ///
>            contrastvar(Sdiff)

If this was an stpm2 model then the spline variables would need to be calculated and then the interactions with age formed and this information passed to standsurv. To demonstrate the advantages of using stpm3 with factor variables and extended functions I will now fit the same model without using them.

. gensplines age, gen(agens) df(3) type(ns)

. forvalues i = 1/3 {
  2.   gen m_agens`i' = agens`i' * male
  3. }

. stpm3 male agens1 agens2 agens3 m_agens1 m_agens2 m_agens3,                 ///
>       tvc(male agens1 agens2 agens3 m_agens1 m_agens2 m_agens3) dftvc(2)    ///
>       df(5) scale(lncumhazard)

Iteration 0:   log likelihood =   -26505.4  
Iteration 1:   log likelihood = -26040.023  
Iteration 2:   log likelihood = -26005.258  
Iteration 3:   log likelihood = -26004.878  
Iteration 4:   log likelihood = -26004.877  

                                                       Number of obs =  15,564
                                                       Wald chi2(7)  = 1029.65
Log likelihood = -26004.877                            Prob > chi2   =  0.0000

---------------------------------------------------------------------------------------
                      | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
----------------------+----------------------------------------------------------------
xb                    |
                 male |   -.241306   .2648958    -0.91   0.362    -.7604922    .2778802
               agens1 |  -6.206268   .9371473    -6.62   0.000    -8.043043   -4.369493
               agens2 |  -.4202364   .4447267    -0.94   0.345    -1.291885     .451412
               agens3 |  -1.728845   .2531885    -6.83   0.000    -2.225086   -1.232605
             m_agens1 |    .684687   1.445988     0.47   0.636    -2.149398    3.518772
             m_agens2 |   .2789823   .6314241     0.44   0.659    -.9585861    1.516551
             m_agens3 |     .69569   .4554634     1.53   0.127     -.197002    1.588382
----------------------+----------------------------------------------------------------
time                  |
                 _ns1 |  -10.15155   .8167647   -12.43   0.000    -11.75238   -8.550724
                 _ns2 |   2.443526   .2234638    10.93   0.000     2.005545    2.881507
                 _ns3 |  -1.284021   .1792894    -7.16   0.000    -1.635422     -.93262
                 _ns4 |   -.836973   .1507723    -5.55   0.000    -1.132481   -.5414646
                 _ns5 |  -.6772499   .1283478    -5.28   0.000    -.9288069   -.4256928
                      |
    c.male#c._ns_tvc1 |   .4080952   .7632089     0.53   0.593    -1.087767    1.903957
                      |
    c.male#c._ns_tvc2 |   1.358495   .6838178     1.99   0.047     .0182364    2.698753
                      |
  c.agens1#c._ns_tvc1 |  -19.04826   6.611939    -2.88   0.004    -32.00742   -6.089098
                      |
  c.agens1#c._ns_tvc2 |   2.555648   2.733907     0.93   0.350    -2.802712    7.914008
                      |
  c.agens2#c._ns_tvc1 |    3.70119   3.555068     1.04   0.298    -3.266615    10.66899
                      |
  c.agens2#c._ns_tvc2 |  -.0607429   1.341542    -0.05   0.964    -2.690116    2.568631
                      |
  c.agens3#c._ns_tvc1 |  -2.033875   .8956267    -2.27   0.023    -3.789271   -.2784789
                      |
  c.agens3#c._ns_tvc2 |  -.9184183    .694715    -1.32   0.186    -2.280035    .4431981
                      |
c.m_agens1#c._ns_tvc1 |   7.104179   8.667079     0.82   0.412    -9.882983    24.09134
                      |
c.m_agens1#c._ns_tvc2 |   -2.53056   3.925283    -0.64   0.519    -10.22397    5.162853
                      |
c.m_agens2#c._ns_tvc1 |  -4.308121   4.611764    -0.93   0.350    -13.34701    4.730771
                      |
c.m_agens2#c._ns_tvc2 |  -1.032837   1.804608    -0.57   0.567    -4.569803    2.504129
                      |
c.m_agens3#c._ns_tvc1 |    .075216   1.472709     0.05   0.959    -2.811241    2.961673
                      |
c.m_agens3#c._ns_tvc2 |  -2.959469   1.198583    -2.47   0.014    -5.308649   -.6102891
                      |
                _cons |   1.835956   .1428585    12.85   0.000     1.555958    2.115953
---------------------------------------------------------------------------------------
Warning: This is a test version of stpm3

It is necessary to incorporate the interactions into the standsurv call.

. standsurv, at1(male 1 m_agens1 = agens1 m_agens2 = agens2 m_agens3 = agens3) ///
>            at2(male 0 m_agens1 0 m_agens2 0 m_agens3 0)                      ///
>            atvar(Sm2 Sf2)                                                    ///
>            survival ci                                                       ///
>            timevar(tt)                                                       ///
>            contrast(difference)                                              ///
>            contrastvar(Sdiff2)

The standardized estimates are identical, but using factor variables combined with extended functions makes life much easier.

. list Sm Sm2 Sf Sf2 Sdiff Sdiff2 in 1/21

     +-----------------------------------------------------------------------+
     |        Sm         Sm2          Sf         Sf2       Sdiff      Sdiff2 |
     |-----------------------------------------------------------------------|
  1. |         1           1           1           1           0           0 |
  2. | .92529545   .92529545   .94122671   .94122671   .01593126   .01593126 |
  3. | .84974545   .84974545   .87134443   .87134443   .02159898   .02159898 |
  4. | .80156903   .80156903   .82310036   .82310036   .02153133   .02153133 |
  5. | .76810442   .76810442   .78841691   .78841691   .02031248   .02031248 |
     |-----------------------------------------------------------------------|
  6. |  .7411819    .7411819   .76034214   .76034214   .01916024   .01916024 |
  7. | .71729707   .71729707   .73561503   .73561503   .01831796   .01831796 |
  8. | .69488063   .69488063   .71268414   .71268414   .01780351   .01780351 |
  9. | .67372376   .67372376   .69128997   .69128997    .0175662    .0175662 |
 10. | .65393981   .65393981   .67149717   .67149717   .01755737   .01755737 |
     |-----------------------------------------------------------------------|
 11. | .63554187   .63554187   .65327779   .65327779   .01773593   .01773593 |
 12. | .61848448   .61848448    .6365447    .6365447   .01806021   .01806021 |
 13. | .60270023   .60270023   .62119776   .62119776   .01849753   .01849753 |
 14. | .58811288   .58811288   .60713539   .60713539   .01902251   .01902251 |
 15. | .57464343   .57464343   .59425861   .59425861   .01961518   .01961518 |
     |-----------------------------------------------------------------------|
 16. | .56218753   .56218753   .58244777   .58244777   .02026024   .02026024 |
 17. | .55060507   .55060507    .5715515    .5715515   .02094643   .02094643 |
 18. | .53977606   .53977606   .56144057   .56144057   .02166451   .02166451 |
 19. | .52960184   .52960184   .55200868   .55200868   .02240684   .02240684 |
 20. | .52000052   .52000052   .54316761   .54316761   .02316709   .02316709 |
     |-----------------------------------------------------------------------|
 21. | .51090369   .51090369   .53484371   .53484371   .02394001   .02394001 |
     +-----------------------------------------------------------------------+

The marginal estimates can be plotted

. line Sm Sf tt, xtitle("Time since diagnosis")      ///
>                ytitle(S(t))                        ///
>                legend(order(1 "Males" 2 "Females") ///
>                       ring(0) pos(1) cols(1))      ///
>                name(Marginal, replace)

. twoway (rarea Sdiff_lci Sdiff_uci tt, color(red%30)) ///
>          (line Sdiff tt, color(red)),                ///
>          xtitle("Time since diagnosis")              ///
>          ytitle(Difference in marginal survival)     /// 
>          legend(off)                                 ///
>          name(Marginal_diff, replace)

Professor of Biostatistics