Skip to contents

The R package survstan can be used to fit right-censored survival data under independent censoring. The implemented models allow the fitting of survival data in the presence/absence of covariates. All inferential procedures are currently based on the maximum likelihood (ML) approach.

Installation

You can install the released version of survstan from CRAN with:

install.packages("survstan")

You can install the development version of survstan from GitHub with:

# install.packages("devtools")
devtools::install_github("fndemarqui/survstan")

Inference procedures

Let (ti,δi) be the observed survival time and its corresponding failure indicator, i = 1, ⋯, n, and θ be a k × 1 vector of parameters. Then, the likelihood function for right-censored survival data under independent censoring can be expressed as:

$$ L(\boldsymbol{\theta}) = \prod_{i=1}^{n}f(t_{i}|\boldsymbol{\theta})^{\delta_{i}}S(t_{i}|\boldsymbol{\theta})^{1-\delta_{i}}. $$

The maximum likelihood estimate (MLE) of θ is obtained by directly maximization of log (L(θ)) using the rstan::optimizing() function. The function rstan::optimizing() further provides the hessian matrix of log (L(θ)), needed to obtain the observed Fisher information matrix, which is given by:

$$ \mathscr{I}(\hat{\boldsymbol{\theta}}) = -\frac{\partial^2}{\partial \boldsymbol{\theta}\boldsymbol{\theta}'} \log L(\boldsymbol{\theta})\mid_{\boldsymbol{\theta}=\hat{\boldsymbol{\theta}}}, $$

Inferences on θ are then based on the asymptotic properties of the MLE, $\hat{\boldsymbol{\theta}}$, that state that:

$$ \hat{\boldsymbol{\theta}} \asymp N_{k}(\boldsymbol{\theta}, \mathscr{I}^{-1}(\hat{\boldsymbol{\theta}})). $$

Baseline Distributions

Some of the most popular baseline survival distributions are implemented in the R package survstan. Such distributions include:

  • Exponential
  • Weibull
  • Lognormal
  • Loglogistic
  • Gamma
  • Rayleigh
  • Birnbaum-Saunders (fatigue)

The parametrizations adopted in the package survstan are presented next.

Exponential Distribution

If T ∼ Exp(λ), then

f(t|λ) = λexp {−λt}I[0, ∞)(t), where λ > 0 is the rate parameter.

The survival and hazard functions in this case are given by:

S(t|λ) = exp {−λt} and h(t|λ) = λ.

Weibull Distribution

If T ∼ Weibull(α,γ), then

$$ f(t|\alpha, \gamma) = \frac{\alpha}{\gamma^{\alpha}}t^{\alpha-1}\exp\left\{-\left(\frac{t}{\gamma}\right)^{\alpha}\right\}I_{[0, \infty)}(t), $$ where α > 0 and γ > 0 are the shape and scale parameters, respectively.

The survival and hazard functions in this case are given by:

$$ S(t|\alpha, \gamma) = \exp\left\{-\left(\frac{t}{\gamma}\right)^{\alpha}\right\} $$ and $$ h(t|\alpha, \gamma) = \frac{\alpha}{\gamma^{\alpha}}t^{\alpha-1}. $$

Lognormal Distribution

If T ∼ LN(μ,σ), then

$$ f(t|\mu, \sigma) = \frac{1}{\sqrt{2\pi}t\sigma}\exp\left\{-\frac{1}{2}\left(\frac{log(t)-\mu}{\sigma}\right)^2\right\}I_{[0, \infty)}(t), $$ where  − ∞ < μ < ∞ and σ > 0 are the mean and standard deviation in the log scale of T.

The survival and hazard functions in this case are given by:

$$S(t|\mu, \sigma) = \Phi\left(\frac{-log(t)+\mu}{\sigma}\right)$$ and $$h(t|\mu, \sigma) = \frac{f(t|\mu, \sigma)}{S(t|\mu, \sigma)},$$ where Φ(⋅) is the cumulative distribution function of the standard normal distribution.

Loglogistic Distribution

If T ∼ LL(α,γ), then

$$ f(t|\alpha, \gamma) = \frac{\frac{\alpha}{\gamma}\left(\frac{t}{\gamma}\right)^{\alpha-1}}{\left[1 + \left(\frac{t}{\gamma}\right)^{\alpha}\right]^2}I_{[0, \infty)}(t), ~ \alpha>0, \gamma>0, $$

where α > 0 and γ > 0 are the shape and scale parameters, respectively.

The survival and hazard functions in this case are given by:

$$S(t|\alpha, \gamma) = \frac{1}{1+ \left(\frac{t}{\gamma}\right)^{\alpha}}$$ and $$ h(t|\alpha, \gamma) = \frac{\frac{\alpha}{\gamma}\left(\frac{t}{\gamma}\right)^{\alpha-1}}{1 + \left(\frac{t}{\gamma}\right)^{\alpha}}. $$

Gamma Distribution

If T ∼ Gamma(α,γ), then

$$f(t|\alpha, \lambda) = \frac{\lambda^{\alpha}}{\Gamma(\alpha)}t^{\alpha-1}\exp\left\{-\lambda t\right\}I_{[0, \infty)}(t),$$

where Γ(α) = ∫0uα − 1exp { − u}du is the gamma function.

The survival function is given by

$$S(t|\alpha, \lambda) = 1 - \frac{\gamma^{*}(\alpha, \gamma t)}{\Gamma(\alpha)},$$ where γ*(α,γt) is the lower incomplete gamma function, which is available only numerically. Finally, the hazard function is expressed as:

$$h(t|\alpha, \lambda) = \frac{f(t|\alpha, \gamma)}{S(t|\alpha, \gamma)}.$$

Rayleigh Distribution

Let T ∼ rayleigh(σ), where σ > 0 is a scale parameter. Then, the density, survival and hazard functions are respectively given by:

$$f(t|\sigma) = \frac{x}{\sigma^2}\exp\left\{-\frac{x^2}{2\sigma^2}\right\},$$ $$S(t|\sigma) = \exp\left\{-\frac{x^2}{2\sigma^2}\right\}$$ and

$$h(t|\sigma) = \frac{x}{\sigma^2}.$$

Birnbaum-Saunders (fatigue) Distribution

If T ∼ fatigue(α,γ), then

$$ f(t|\alpha, \gamma) = \frac{\sqrt{\frac{t}{\gamma}}+\sqrt{\frac{\gamma}{t}}}{2 \alpha t}\phi\left(\sqrt{\frac{t}{\gamma}}+\sqrt{\frac{\gamma}{t}}\right)(t), ~ \alpha>0, \gamma>0, $$

where ϕ(⋅) is the probability density function of a standard normal distribution, α > 0 and γ > 0 are the shape and scale parameters, respectively.

The survival function in this case is given by:

$$ S(t|\alpha, \gamma) =\Phi\left(\sqrt{\frac{t}{\gamma}}-\sqrt{\frac{\gamma}{t}}\right)(t) $$,

where Φ(⋅) is the cumulative distribution function of a standard normal distribution. The hazard function is given by $$h(t|\mu, \sigma) = \frac{f(t|\alpha, \gamma)}{S(t|\alpha, \gamma)}. $$

Regression models

When covariates are available, it is possible to fit four different regression models with the R package survstan:

  • accelerated failure time (AFT) models;
  • proportional hazards (PH) models;
  • proportional odds (PO) models;
  • accelerated hazard (AH) models.
  • Yang and Prentice (YP) models.

Let x be a 1 × p vector of covariates, β a p × 1 of regression coefficients, and θ a vector of parameters associated with some baseline survival distribution, and denote by Θ = (θ,β)T the full vector of parameters. Here, to ensure identifiability, in all regression structures the linear predictor xβ does not include a intercept term.

The regression survival models implemented in the R package survstan are briefly described in the sequel.

Accelerate Failure Time Models

Accelerated failure time (AFT) models are defined as

T = exp {xβ}ν, where ν follows a baseline distribution with survival function S0(⋅|θ) so that

f(t|Θ,x) = exβf0(texβ|θ) and

S(t|Θ,x) = S0(texβ|θ).

Proportional Hazards Models

Proportional hazards (PH) models are defined as

h(t|Θ,x) = h0(t|θ)exp {xβ}, where h0(t|θ) is a baseline hazard function so that

f(t|Θ,x) = h0(t|θ)exp {xβH0(t|θ)exβ}, and

S(t|Θ,x) = exp {−H0(t|θ)exβ}.

Proportional Odds Models

Proportional Odds (PO) models are defined as

R(t|Θ,x) = R0(t|θ)exp {xβ}, where $\displaystyle R_{0}(t|\boldsymbol{\theta}) = \frac{1-S_{0}(t|\boldsymbol{\theta})}{S_{0}(t|\boldsymbol{\theta})} = \exp\{H_{0}(t|\boldsymbol{\theta})\}-1$ is a baseline odds function so that

$$ f(t|\boldsymbol{\Theta}, \mathbf{x}) = \frac{h_{0}(t|\boldsymbol{\theta})\exp\{\mathbf{x} \boldsymbol{\beta} + H_{0}(t|\boldsymbol{\theta})\}}{[1 + R_{0}(t|\boldsymbol{\theta})e^{\mathbf{x} \boldsymbol{\beta}}]^2}. $$

and

$$ S(t|\boldsymbol{\Theta}, \mathbf{x}) = \frac{1}{1 + R_{0}(t|\boldsymbol{\theta})e^{\mathbf{x} \boldsymbol{\beta}}}. $$

Accelerated Hazard Models

Accelerated hazard (AH) models are defined as

h(t|Θ,x) = h0(texβ|θ)

so that

S(t|Θ,x) = exp {−H0(texβ|θ)exβ} and f(t|θ,x) = h0(texβ|θ)exp {−H0(texβ|θ)exβ}.

Yang and Prentice Models

The survival function of the Yang and Prentice (YP) model is given by:

$$S(t|\boldsymbol{\theta},\boldsymbol{\beta}, \boldsymbol{\phi}) = \left[1+\frac{\kappa_{S}}{\kappa_{L}}R_{0}(t|\boldsymbol{\theta})\right]^{-\kappa_{L}}. $$

The hazard and the probability density functions are then expressed as:

$$h(t|\boldsymbol{\theta},\boldsymbol{\beta}, \boldsymbol{\phi}) = \frac{\kappa_{S}h_{0}(t|\boldsymbol{\theta})\exp\{H_{0}(t|\boldsymbol{\theta})\}}{\left[1+\frac{\kappa_{S}}{\kappa_{L}}R_{0}(t|\boldsymbol{\theta})\right]} $$ and

$$f(t|\boldsymbol{\theta},\boldsymbol{\beta}, \boldsymbol{\phi}) = \kappa_{S}h_{0}(t|\boldsymbol{\theta})\exp\{H_{0}(t|\boldsymbol{\theta})\}\left[1+\frac{\kappa_{S}}{\kappa_{L}}R_{0}(t|\boldsymbol{\theta})\right]^{-(1+\kappa_{L})}, $$

respectively, where κS = exp {xβ} and κL = exp {xϕ}.