# Some comments on using steffects in Stata

## Using stteffects

There is a command in Stata called stteffects which calculates marginal effects for survival-time data. This is the description in the helpfile:

"stteffects estimates average treatment effects, average treatment effects on the treated, and potential-outcome means using observational survival-time data. The available estimators are regression adjustment, inverse-probability weighting, and more efficient methods that combine regression adjustment and inverse-probability weighting."

I will concentrate on regression adjustment as this is essentially what standsurv does. The other estimators are just different methods to estimate the same underlying quantities.

I do not use stteffects as the estimand of interest is based on the mean survival time. In many cases this relies on extrapolation beyond the range of follow-up. In my applications I am not willing to make such strong assumptions. I will illustrate this with an example using the Rotterdam Breast cancer data. The code below loads and stset’s the data. I restrict follow-up time to 5 years as ths highlights some of the extrapolation issues. I would usually use the exit() option of stset to restrict follow-up time, but for some reason sttefects does not allow you to do this.

. use https://www.pclambert.net/data/rott2b if nodes>0, clear
(Rotterdam breast cancer data (augmented with cause of death))

. gen os2  = cond(os<(5*12),os,60)

. gen osi2 = cond(os<(5*12),osi,0)

. stset os2, f(osi2==1) scale(12)

failure event:  osi2 == 1
obs. time interval:  (0, os2]
exit on or before:  failure
t for analysis:  time/12

------------------------------------------------------------------------------
1,546  total observations
0  exclusions
------------------------------------------------------------------------------
1,546  observations remaining, representing
565  failures in single-record/single-failure data
6,343.591  total analysis time at risk and under observation
at risk from t =         0
earliest observed entry t =         0
last observed exit t =         5

.
. sts graph, by(hormon)

failure _d:  osi2 == 1
analysis time _t:  os2/12



I have plotted the Kaplan-Meier curve comparing those who receive and do not receive hormonal treatment, which shows that those taking treatment have slightly better survival. As shown in other examples there is imbalance between the groups for a number of potential confounders. For simplicity, I will just use age at diagnosis and the number of positive lymph nodes as potential confounders.

. tabstat age nodes ,by(hormon)

Summary statistics: mean
by categories of: hormon (Hormonal therapy)

hormon |       age     nodes
---------+--------------------
no |  54.12925  5.094449
yes |  62.54867  5.719764
---------+--------------------
Total |  55.97542  5.231565
------------------------------



Those receiving treatment are older and have slightly more positive lymph nodes.

I will first use stteffects using regression adjustment (ra) with age and enodes as covariates. I will use the potential outcomes framework,

• $T^{x=0}$ is the potential survival time for the non-treated/unexposed.
• $T^{x=1}$ is the potential survival time for the treated/exposed.

An average causal effect could be defined as the difference in the expected values of these two potential outcomes

$$E\left[T^{x=1}\right] - E\left[T^{x=0}\right]$$

I use stteffects to estimate this below,

. stteffects ra (age enodes, weibull) (hormon),

failure _d:  osi2 == 1
analysis time _t:  os2/12

Iteration 0:   EE criterion =  1.789e-18
Iteration 1:   EE criterion =  1.067e-29

Survival treatment-effects estimation           Number of obs     =      1,546
Outcome model  : Weibull
Treatment model: none
Censoring model: none
------------------------------------------------------------------------------
|               Robust
_t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ATE          |
hormon |
(yes vs no)  |   1.067025   1.230018     0.87   0.386    -1.343765    3.477816
-------------+----------------------------------------------------------------
POmean       |
hormon |
no  |   7.841406   .3771598    20.79   0.000     7.102186    8.580625
------------------------------------------------------------------------------



The output has given the mean potential outcome for those not receiving treatment and the average causal difference (labelled ATE). Thus, the estimated mean survival time is 7.84 years for those not taking treatment and estimated to be 1.07 year lower for those taking treatmnet. The 95% confidence interval for this difference spans zero.

What has stteffects done? Well effectively it has fitted a Weibull regression model that includes age and enodes and hormon and the interactions between enodes and hormon and between age and hormon. It has also allowed the shape parameter of the Weibull model to vary by hormon. This model is the same as fitting separate Weibull models for the untreated and treated groups. It has also estimated the mean survival time if all individuals were untreated and the mean difference in survival time between the treated and untreated. It actually simultaneously estimates all this, but as I will show below, it is doing exactly the same as regression standardization using standsurv.

I have assumed a Weibull model in the above code, so it is of interest to see if fitting a different model gives notable differences. Below I run steffects again, but now using a lognormal model.

. stteffects ra (age enodes, lnormal) (hormon),

failure _d:  osi2 == 1
analysis time _t:  os2/12

Iteration 0:   EE criterion =  1.716e-25
Iteration 1:   EE criterion =  9.506e-30

Survival treatment-effects estimation           Number of obs     =      1,546
Outcome model  : lnormal
Treatment model: none
Censoring model: none
------------------------------------------------------------------------------
|               Robust
_t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ATE          |
hormon |
(yes vs no)  |   3.430201   3.669848     0.93   0.350    -3.762568    10.62297
-------------+----------------------------------------------------------------
POmean       |
hormon |
no  |   12.56943   1.027354    12.23   0.000     10.55586    14.58301
------------------------------------------------------------------------------



The mean survival time for the non-treated has increased from 7.84 years to 12.57 and the average causal difference has increased from 1.07 years to 3.43 years. This should be concerning has the estimates appear to be strongly dependent on the choice of parametric distribution.

Let’s explore what is happening here and why the estimates are so different. The code below fits all models available in streg to the non-treated group (hormon=0) and compares the estimated survival functions to the Kaplan-Meier estimate. I am not fitting covariates here as I am just making a point about extrapolating beyond the range of your data.

foreach dist in weibull gompertz loglogistic lognormal ggamma {
streg if hormon == 1, dist(dist')
predict surv_dist', surv
estimates store dist'
}

sts gen s_km = s if hormon == 1
twoway  (line s_km _t, connect(stairstep) sort lcolor(black))   ///
(line surv_* _t, sort),                                 ///
legend(order(1 "Kaplan-Meier" 2 "Weibull" 3 "Gompertz"  ///
4 "LogLogistic" 5 "LogNormal" 6 "Ggamma")  ///
ring(0) cols(1) pos(1))                    ///
xtitle("Years from Surgery")                            ///
ytitle("S(t)")                                          ///
ylabel(,format(%3.1f))


None of the fitted line are perfect (we will do better with stpm2 later), but are in broad agreement. The problems is we have censored survival data and the maximum follow-up time is 5 years. Mean survival can be calculated as the area under the survival function and so the graph above does not show what we want. I now perform predictions up to 80 years. Here were are extrapolating beyond the range of our follow-up.

preserve
gen oldt = _t
drop _t
range _t 0 80
foreach dist in weibull gompertz loglogistic lognormal ggamma {
estimates restore dist'
predict surv80_dist', surv
}

twoway  (line s_km oldt, connect(stairstep) sort lcolor(black)) ///
(line surv80_* _t, sort),                               ///
legend(order(1 "Kaplan-Meier" 2 "Weibull" 3 "Gompertz"  ///
4 "LogLogistic" 5 "LogNormal" 6 "Ggamma")  ///
ring(0) cols(1) pos(1))                               ///
xtitle("Years from Surgery")                            ///
ytitle("S(t)")                                          ///
ylabel(,format(%3.1f))                                  ///
xline(5, lpattern(dash))
restore


Now we can see the difference between the estimated survival functions between the different models. There is very little difference up to 5 years, where we have follow-up information to, but they are very different beyond this point. The mean survival time is the area under each curve and we can see why the Weibull model gives a lower mean than the log-normal model. In fact the Log-logistic and Log-Normal model are still clearly above zero at 80 years.

What does this mean? Well if you have a lot of censoring and the survival function is clearly above zero at the end of your follow-up then you are making strong assumptions about what happens beyond where you have data if you use stteffects.

## Equivalent model using stpm2 and standsurv

I will show that I can get the same estimates as stteffects by fitting an stpm2 model followed by using standsurv. First I create interactions between hormon and age and between hormon and enodes. Then I fit the model.

. gen age_h = age*hormon

. gen enodes_h = enodes*hormon

.
. stpm2 hormon age enodes age_h enodes_h, ///
>       scale(hazard) df(1) eform nolog tvc(hormon) dftvc(1)

Log likelihood = -1329.3579                     Number of obs     =      1,546

--------------------------------------------------------------------------------
|     exp(b)   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
xb             |
hormon |   1.134897   .8353114     0.17   0.863     .2681956    4.802432
age |   1.013167   .0036718     3.61   0.000     1.005996    1.020389
enodes |   .1548263   .0282304   -10.23   0.000     .1083029    .2213345
age_h |   .9973785   .0109109    -0.24   0.810     .9762212    1.018994
enodes_h |   .5988075   .2563495    -1.20   0.231     .2587545    1.385756
_rcs1 |   2.209114   .0764872    22.89   0.000     2.064175     2.36423
_rcs_hormon1 |   1.083153   .0889351     0.97   0.331     .9221461    1.272272
_cons |   .4478668    .105619    -3.41   0.001     .2821063    .7110253
--------------------------------------------------------------------------------
Note: Estimates are transformed only in the first equation.



I have used df(1) as this is equivalent to a Weibull model. By using tvc(hormon) and dftvc(1) I have allowed the shape parameter of the Weibull model to vary between those receiving and not-receiving treatment. I could have fitted separate models by treatment group and the parameter estimates would be the same, but by fitting one model I will be able to form contrasts between the treatment groups using standsurv.

I will now standsurv to estimate the marginal survival functions up to 40 years for the non-treated and treated subjects. As the model has interactions I need to make sure these are all set to zero for the non-treated (hormon=0) and take the values of the relevant covariates for the treated (hormon=1).

. range ttlong 0 40 100
(1,446 missing values generated)

. standsurv, atvar(S_h0b S_h1b) timevar(ttlong) ci        ///
>            at1(hormon 0 age_h 0 enodes_h 0)             ///
>            at2(hormon 1 age_h = age enodes_h = enodes)



The two survival functions can then plotted. A reference line at 5 years has been added to indicate where we have follow-up information to.

. twoway  (line S_h0b ttlong, sort lcolor(red)) ///
>    (line S_h1b ttlong, sort lcolor(blue)) ///
>    , legend(order(1 "No hormonal treatment" 2 "Hormonal treatment") ring(0) cols(1) pos(1)) ///
>    ylabel(0(0.1)1,angle(h) format(%3.1f)) ///
>    ytitle("S(t)") ///
>    xtitle("Years from surgery") ///
>    xline(5, lpattern(dash) lcolor(black%50))



The estimated average potential outcome for the untreated can be approximated by using integ to approximate the area under the survival curve (I will use standsurv to do this better shortly).

. integ S_h0b ttlong

number of points = 100

integral         = 7.8376079



This is similar to the output from stteffects above (7.841406 vs 7.8376079). Using the rmst option of standsurv will do more accurate integration. I will integrate up to 100 years (theoretically the integral is to infinity, but the survival is virtually zero at 100 years). rmst stands for restricted mean survival time, but if $t$ is large enough so that $S(t)\approx 0$ then this is effectively the area under the full survival curve.

. gen tt100 = 100 in 1
(1,545 missing values generated)

. standsurv, rmst ci  timevar(tt100) trans(none) ///
>     at1(hormon 0 age_h 0 enodes_h 0)                ///
>     at2(hormon 1 age_h = age enodes_h = enodes)     ///
>     contrast(difference)                             ///
>     atvars(rmst_h0b rmst_h1b)                        ///
>     contrastvar(rmstdiff100) mestimation

. list rmst_h0b rmst_h1b rmstdiff100* in 1, noobs

+------------------------------------------------------------+
|  rmst_h0b    rmst_h1b   rmstd~100   rmstdi~lci   rmstd~uci |
|------------------------------------------------------------|
| 7.8414209   8.9084369   1.0670161   -1.3445604   3.4785925 |
+------------------------------------------------------------+



I have used the mestimation option to calculate the equivalent of robust standard errors so that these and the confidence intervals are comparable with stteffects. I have also used the trans(none) option as sttefects calculates standard errors on the untransformed survival scale, while the default in standsurv is the log scale.

The estimates are now very similar to stteffects (to 4 or 5 decimal places).

This has shown that I can obtain the same estimates as stteffects using standsurv, but generally I would not be happy in doing so as to obtain the estimated mean of the potential outcomes I have had to extrapolate my two survival functions way beyond the end of follow-up. I am not sure is this is obvious to users of stteffects.

## Using restricted mean survival time (RMST) as an alternative.

I have previous discussed using RMST in another tutorial. As a reminder the restricted mean survival time at time $t^*$ is defined as, $$E\left[min(t,t^*)\right]$$ i.e. it is the mean up to some point $t^*$. The RMST can be estimated by calculating the area under the survival curve between 0 and $t^*$. In an observational study where we need to take account of potential confounders, we can define the RMST of the standardized survival function as

$$RMST(t^*|X=x,Z) = \int_0^{t^*} E\left[S(t|X=x,Z)\right]$$

We used the rmst option of standsurv above, but integrated the survival function to 100 years, to where $S(t)\approx 0$. So if we feed the timevar option a lower value of $t$, then we can calculate RMST to a point within our follow-up period. In the code below I use 5 years.

. gen tt5 = 5 in 1
(1,545 missing values generated)

. standsurv, rmst ci  timevar(tt5) trans(none) ///
>     at1(hormon 0 age_h 0 enodes_h 0)                ///
>     at2(hormon 1 age_h = age enodes_h = enodes)     ///
>     contrast(difference)                             ///
>     atvars(rmst5_h0b rmst5_h1b)                        ///
>     contrastvar(rmstdiff5) mestimation

. list rmst5_h0b rmst5_h1b rmstdiff5* in 1, noobs

+-----------------------------------------------------------+
| rmst5_h0b   rmst5_h1b   rmstdiff5   rms~5_lci   rms~5_uci |
|-----------------------------------------------------------|
| 4.1250222   4.3241716   .19914943   .05495712   .34334175 |
+-----------------------------------------------------------+



We get a difference of 0.199 (95% CI 0.055 to 0.343) years between the two groups. The difference in RMST can still be interpreted as a causal effect but, unlike stteffects`, does not reply on extrapolation. Alternatively, the difference in standardized survival functions could be presented.