Attributable Fraction from Standardized Survival Functions
This example will demonstrate how the attributable fraction (userfunction() option.
The is defined in epidemiology as the proportion of preventable outcomes if all subjects had not been exposed to a particular exposure. i.e.
where
In survival studies the probability of being diseased is a function of time, so we define the
Example
I will use the Rotterdam Breast cancer data. The code below loads and stset’s the data and then fits a model using stpm3.
. clear all
. use https://www.pclambert.net/data/rott3,
(Rotterdam breast cancer data (augmented with cause of death))
. stset os, f(osi==1) scale(12) exit(time 120)
Survival-time data settings
Failure event: osi==1
Observed time interval: (0, os]
Exit on or before: time 120
Time for analysis: time/12
--------------------------------------------------------------------------
2,982 total observations
0 exclusions
--------------------------------------------------------------------------
2,982 observations remaining, representing
1,171 failures in single-record/single-failure data
20,002.424 total analysis time at risk and under observation
At risk from t = 0
Earliest observed entry t = 0
Last observed exit t = 10
. stpm3 i.hormon age enodes pr_1, scale(lncumhazard) df(4) eform nolog
Number of obs = 2,982
Wald chi2(4) = 619.62
Log likelihood = -2668.4925 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| exp(b) Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
xb |
hormon |
yes | .7906432 .0715077 -2.60 0.009 .66221 .9439854
age | 1.013244 .0024119 5.53 0.000 1.008528 1.017983
enodes | .1132534 .0110135 -22.40 0.000 .0935998 .1370337
pr_1 | .9064855 .0119282 -7.46 0.000 .8834055 .9301685
-------------+----------------------------------------------------------------
time |
_ns1 | -25.90082 1.871965 -13.84 0.000 -29.5698 -22.23184
_ns2 | 7.980587 1.003724 7.95 0.000 6.013324 9.947851
_ns3 | -1.091126 .0461407 -23.65 0.000 -1.18156 -1.000691
_ns4 | -.70103 .0504635 -13.89 0.000 -.7999366 -.6021234
_cons | .801967 .161537 4.96 0.000 .4853603 1.118574
------------------------------------------------------------------------------
Note: Estimates are transformed only in the first equation.It is worthwhile commenting what we mean be “exposed” here. Those on hormal treatment will be consided unexposed and those not taking the treatment will be exposed, i.e our unexposed group is when hormon=1.
I will first use the failure option to calculate the standardized failure probabilities in both groups. I also predict the failure probability in the population as a whole. I do this using . within an at option, i.e. using at3(.) in the example below.
. range timevar 0 10 101
(2,881 missing values generated)
. standsurv, failure timevar(timevar) ci frame(Ft, replace) ///
> at1(hormon 0) at2(hormon 1) at3(.) atvar(F_hormon0 F_hormon1 F_all)
. frame Ft {
. twoway (line F_hormon0 timevar) ///
> (line F_hormon1 timevar) ///
> (line F_all timevar), ///
> legend(order(1 "No treatment" 2 "Treatment" 3 "All") pos(11)) ///
> ylabel(, format(%3.1f)) ///
> ytitle("S(t)") ///
> xtitle("Years from surgery")
. }These are just 1 - the standardized survival functions. There are more untreated women (88.6%) which is why the “No Treatment” function is closer to the combined function. The attributable fraction could be calculated using
. frame Ft {
. gen AF_tmp = 1 - F_hormon1/F_all
(1 missing value generated)
. list timevar F_hormon1 F_all AF_tmp if inlist(timevar,1,5,10), noobs
+---------------------------------------------+
| timevar F_hormon1 F_all AF_tmp |
|---------------------------------------------|
| 1 .01685169 .02035349 .17204904 |
| 5 .22362896 .26167585 .14539701 |
| 10 .39250923 .44808119 .12402208 |
+---------------------------------------------+
. }I have listed the userfunction() option comes in. We can calculate a transformation of our standardized estimates with standard errors estimated using the delta method where derivatives are calculated numerically (similar to nlcom and predictnl). I “borrowed” the idea of a userfunction() from Arvid Sjölander’s stdReg R package (Sjölander 2018).
The user function needs to be written in Mata. The function should receive one argument at, which refer to the various at options and can be indexed by at[1], at[2] etc. The code below calculates the AF assuming that at1 is the standardized failure function in the population as a whole and at2 is the standardized failure function assuming everyone is unexposed (takes hormonal treatment). We need to be careful to specify the at() options is this order.
. mata
------------------------------------------------- mata (type end to exit) -----------------------------------------------------------------------
: function calcAF(at) {
> // at2 is F(t|unexposed,Z)
> // at1 is F(t,Z)
> return(1 - at[2]/at[1])
> }
: end
-------------------------------------------------------------------------------------------------------------------------------------------------Having defined the Mata function I just pass this to standsurv using the userfunction() option.
. standsurv, failure timevar(timevar) ci frame(Ft2, replace) ///
> at1(.) at2(hormon 0) atvar(F_hormon1 F_all) ///
> userfunction(calcAF) userfunctionvar(AF) I have specified the userfunctionvar(AF) option so that the new variable is called AF. Without this option the default is _userfunc. I can now plot the AF as a function of follow-up time.
. frame Ft2 {
. twoway (rarea AF_lci AF_uci timevar, color(red%30)) ///
> (line AF timevar, lcolor(red)) ///
> , legend(off) ///
> ylabel(0(0.05)0.3, format(%4.2f)) ///
> ytitle("AF") ///
> xtitle("Years from surgery")
. }I purposely chose for the effect of hormonal treatment to be proportional as this example is illustrative. When I relaxed this assumption, the AF was negative for the first few months.
Samualson (2008) defines alternative based on the hazard function. I am less keen on this than the use of the survival function, but show how this can be estimated using standsurv for completeness.
Samualson defines this is the attributable hazard fraction. The equation is similar to the AF defined above, but we replace the failure function with the hazard function.
This give the proportion of preventable events at time
See the page of The hazard function of the standardized survival curve. for a description of standardized hazard functions.
As I just have to replace the failure probability with the hazard function, I can just use the same Mata function. This means that I just have the change the option failure to hazard in standsurv.
. standsurv, hazard timevar(timevar) ci frame(haz, replace) ///
> at1(.) at2(hormon 0) atvar(F_hormon1 F_all) ///
> userfunction(calcAF) userfunctionvar(AHF) I can now plot the results.
. frame haz {
. twoway (rarea AHF_lci AHF_uci timevar, color(red%30)) ///
> (line AHF timevar, lcolor(red)) ///
> , legend(off) ///
> ylabel(0(0.05)0.3, format(%4.2f)) ///
> ytitle("AHF") ///
> xtitle("Years from surgery")
. }References
Samuelsen S.O., Eide G.E. Attributable fractions with survival data. Statistics in Medicine 2008;27:1447–1467
Sjölander A. Estimation of causal effect measures with the R-package stdReg.European Journal of Epidemiology 2018