The R package survstan can be used to fit right-censored survival data under independent censoring. The implemented models allow the fitting of survival data in the presence/absence of covariates. All inferential procedures are currently based on the maximum likelihood (ML) approach.
Inference procedures
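Let $(t_i, \delta_i)$ be the observed survival time and its corresponding failure indicator, $i = 1, \ldots, n$, and let $\boldsymbol{\theta}$ be a $k \times 1$ vector of parameters. Then, the likelihood function for right-censored survival data under independent censoring can be expressed as:
$$L(\boldsymbol{\theta}) = \prod_{i=1}^{n} f(t_i|\boldsymbol{\theta})^{\delta_i} S(t_i|\boldsymbol{\theta})^{1-\delta_i}.$$
The maximum likelihood estimate (MLE) of $\boldsymbol{\theta}$ is obtained by direct maximization of $\log L(\boldsymbol{\theta})$ using the rstan::optimizing() function. The function rstan::optimizing() further provides the Hessian matrix of $\log L(\boldsymbol{\theta})$, needed to obtain the observed Fisher information matrix, which is given by:
$$\mathcal{I}(\hat{\boldsymbol{\theta}}) = -\left.\frac{\partial^2}{\partial \boldsymbol{\theta}\, \partial \boldsymbol{\theta}^{\top}} \log L(\boldsymbol{\theta})\right|_{\boldsymbol{\theta} = \hat{\boldsymbol{\theta}}}.$$
Inferences on $\boldsymbol{\theta}$ are then based on the asymptotic properties of the MLE, $\hat{\boldsymbol{\theta}}$, which state that:
$$\hat{\boldsymbol{\theta}} \overset{a}{\sim} N_k\left(\boldsymbol{\theta}, \mathcal{I}^{-1}(\hat{\boldsymbol{\theta}})\right).$$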
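As a rough illustration of this procedure, the sketch below fits a Weibull model to simulated right-censored data by maximizing $\sum_i [\delta_i \log f(t_i|\boldsymbol{\theta}) + (1-\delta_i) \log S(t_i|\boldsymbol{\theta})]$ in base R. It uses stats::optim() as a stand-in for rstan::optimizing(); the simulated data, the log-scale parametrization, and all object names are assumptions made for this example and are not taken from survstan.

```r
# Simulate right-censored Weibull data (all values are illustrative).
set.seed(1)
n <- 500
true_shape <- 1.5
true_scale <- 2
t_event <- rweibull(n, shape = true_shape, scale = true_scale)
t_cens <- rexp(n, rate = 0.2)              # independent censoring times
time <- pmin(t_event, t_cens)              # observed time
status <- as.numeric(t_event <= t_cens)    # failure indicator (1 = event)

# Negative log-likelihood for right-censored data.
# theta = (log(shape), log(scale)) keeps the optimization unconstrained.
negloglik <- function(theta, time, status) {
  a <- exp(theta[1])
  g <- exp(theta[2])
  log_f <- dweibull(time, shape = a, scale = g, log = TRUE)
  log_S <- pweibull(time, shape = a, scale = g,
                    lower.tail = FALSE, log.p = TRUE)
  -sum(status * log_f + (1 - status) * log_S)
}

fit <- optim(c(0, 0), negloglik, time = time, status = status,
             method = "BFGS", hessian = TRUE)

# The Hessian of the negative log-likelihood at the optimum is the
# observed information; its inverse estimates the covariance matrix.
vcov_log <- solve(fit$hessian)
se_log <- sqrt(diag(vcov_log))

# Wald intervals on the log scale, back-transformed to (shape, scale).
estimates <- exp(fit$par)
ci_lower <- exp(fit$par - qnorm(0.975) * se_log)
ci_upper <- exp(fit$par + qnorm(0.975) * se_log)
```

Because the Hessian here is taken with respect to the log-parameters, the intervals are built on the log scale and then back-transformed, which keeps them inside the parameter space.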
Baseline Distributions
Some of the most popular baseline survival distributions are implemented in the R package survstan. Such distributions include:
- Exponential
- Weibull
- Lognormal
- Loglogistic
- Gamma
- Generalized Gamma (original Stacy's parametrization)
- Generalized Gamma (alternative Prentice's parametrization)
- Gompertz
- Rayleigh
- Birnbaum-Saunders (fatigue)
The parametrizations adopted in the package survstan are presented next.
Exponential Distribution
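If $T \sim \mathrm{Exp}(\lambda)$, then
$$f(t|\lambda) = \lambda \exp\{-\lambda t\}\, I_{[0, \infty)}(t),$$
where $\lambda > 0$ is the rate parameter.
The survival and hazard functions in this case are given by:
$$S(t|\lambda) = \exp\{-\lambda t\}$$
and
$$h(t|\lambda) = \lambda.$$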
Weibull Distribution
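If $T \sim \mathrm{Weibull}(\alpha, \gamma)$, then
$$f(t|\alpha, \gamma) = \frac{\alpha}{\gamma^{\alpha}}\, t^{\alpha - 1} \exp\left\{-\left(\frac{t}{\gamma}\right)^{\alpha}\right\} I_{[0, \infty)}(t),$$
where $\alpha > 0$ and $\gamma > 0$ are the shape and scale parameters, respectively.
The survival and hazard functions in this case are given by:
$$S(t|\alpha, \gamma) = \exp\left\{-\left(\frac{t}{\gamma}\right)^{\alpha}\right\}$$
and
$$h(t|\alpha, \gamma) = \frac{\alpha}{\gamma^{\alpha}}\, t^{\alpha - 1}.$$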
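The Weibull survival and hazard functions can be checked against R's built-in functions. The sketch assumes the shape/scale parametrization $S(t) = \exp\{-(t/\gamma)^{\alpha}\}$ and $h(t) = (\alpha/\gamma^{\alpha})\, t^{\alpha-1}$, which is the one used by stats::dweibull(); the parameter values are arbitrary.

```r
a <- 1.5                      # shape (alpha)
g <- 2.0                      # scale (gamma)
times <- c(0.5, 1, 2, 4)

# Closed-form survival and hazard functions.
S_formula <- exp(-(times / g)^a)
h_formula <- (a / g^a) * times^(a - 1)

# Same quantities from the built-in distribution functions.
S_builtin <- pweibull(times, shape = a, scale = g, lower.tail = FALSE)
h_builtin <- dweibull(times, shape = a, scale = g) / S_builtin

all.equal(S_formula, S_builtin)
all.equal(h_formula, h_builtin)
```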
Lognormal Distribution
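If $T \sim \mathrm{LN}(\mu, \sigma)$, then
$$f(t|\mu, \sigma) = \frac{1}{\sqrt{2\pi}\, t\, \sigma} \exp\left\{-\frac{1}{2}\left(\frac{\log(t) - \mu}{\sigma}\right)^{2}\right\} I_{(0, \infty)}(t),$$
where $-\infty < \mu < \infty$ and $\sigma > 0$ are the mean and standard deviation in the log scale of $T$.
The survival and hazard functions in this case are given by:
$$S(t|\mu, \sigma) = 1 - \Phi\left(\frac{\log(t) - \mu}{\sigma}\right)$$
and
$$h(t|\mu, \sigma) = \frac{f(t|\mu, \sigma)}{1 - \Phi\left(\frac{\log(t) - \mu}{\sigma}\right)},$$
where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution.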
Loglogistic Distribution
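If $T \sim \mathrm{LL}(\alpha, \gamma)$, then
$$f(t|\alpha, \gamma) = \frac{\frac{\alpha}{\gamma}\left(\frac{t}{\gamma}\right)^{\alpha - 1}}{\left[1 + \left(\frac{t}{\gamma}\right)^{\alpha}\right]^{2}}\, I_{[0, \infty)}(t),$$
where $\alpha > 0$ and $\gamma > 0$ are the shape and scale parameters, respectively.
The survival and hazard functions in this case are given by:
$$S(t|\alpha, \gamma) = \frac{1}{1 + \left(\frac{t}{\gamma}\right)^{\alpha}}$$
and
$$h(t|\alpha, \gamma) = \frac{\frac{\alpha}{\gamma}\left(\frac{t}{\gamma}\right)^{\alpha - 1}}{1 + \left(\frac{t}{\gamma}\right)^{\alpha}}.$$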
Gamma Distribution
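If $T \sim \mathrm{Gamma}(\alpha, \lambda)$, then
$$f(t|\alpha, \lambda) = \frac{\lambda^{\alpha}}{\Gamma(\alpha)}\, t^{\alpha - 1} \exp\{-\lambda t\}\, I_{[0, \infty)}(t),$$
where $\Gamma(\alpha) = \int_{0}^{\infty} u^{\alpha - 1} \exp\{-u\}\, du$ is the gamma function.
The survival function is given by
$$S(t|\alpha, \lambda) = 1 - \frac{\gamma^{*}(\alpha, \lambda t)}{\Gamma(\alpha)},$$
where $\gamma^{*}(\alpha, \lambda t)$ is the lower incomplete gamma function, which is available only numerically. Finally, the hazard function is expressed as:
$$h(t|\alpha, \lambda) = \frac{f(t|\alpha, \lambda)}{S(t|\alpha, \lambda)}.$$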
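Since the gamma survival function has no closed form, it is evaluated numerically in practice. The sketch below assumes a shape/rate parametrization with density $\lambda^{\alpha} t^{\alpha-1} e^{-\lambda t} / \Gamma(\alpha)$ and uses the fact that stats::pgamma() computes the regularized lower incomplete gamma function; the parameter values are arbitrary.

```r
a <- 2.5                      # shape (alpha)
lam <- 0.8                    # rate (lambda)
times <- c(0.5, 1, 3, 6)

# pgamma(x, shape = a) is the lower incomplete gamma function at x
# divided by Gamma(a), so the survival function is one minus that ratio.
S_formula <- 1 - pgamma(lam * times, shape = a)
S_builtin <- pgamma(times, shape = a, rate = lam, lower.tail = FALSE)

# Hazard as the ratio of density to survival.
h <- dgamma(times, shape = a, rate = lam) / S_builtin

all.equal(S_formula, S_builtin)
```

For a shape parameter above 1, the resulting hazard increases with time and stays below the rate parameter.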
Generalized Gamma Distribution (original Stacy's parametrization)
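If $T \sim \mathrm{ggstacy}(\alpha, \gamma, \kappa)$, then
$$f(t|\alpha, \gamma, \kappa) = \frac{\kappa}{\gamma^{\alpha}\, \Gamma(\alpha/\kappa)}\, t^{\alpha - 1} \exp\left\{-\left(\frac{t}{\gamma}\right)^{\kappa}\right\} I_{[0, \infty)}(t),$$
for $\alpha > 0$, $\gamma > 0$, and $\kappa > 0$.
It can be shown that the survival function can be expressed as:
$$S(t|\alpha, \gamma, \kappa) = 1 - F_{G}(x|\nu, 1),$$
where $x = \left(\frac{t}{\gamma}\right)^{\kappa}$, and $F_{G}(\cdot|\nu, 1)$ corresponds to the distribution function of a gamma distribution with shape parameter $\nu = \alpha/\kappa$ and scale parameter equal to 1.
Finally, the hazard function is expressed as:
$$h(t|\alpha, \gamma, \kappa) = \frac{f(t|\alpha, \gamma, \kappa)}{S(t|\alpha, \gamma, \kappa)}.$$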
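The gamma representation of the survival function can be verified numerically. The sketch assumes the Stacy density $f(t) = \kappa\, t^{\alpha-1} \exp\{-(t/\gamma)^{\kappa}\} / [\gamma^{\alpha}\, \Gamma(\alpha/\kappa)]$, under which $(T/\gamma)^{\kappa}$ follows a gamma distribution with shape $\alpha/\kappa$ and unit scale. The function names are hypothetical helpers written for this example, not survstan functions.

```r
# Density under the assumed Stacy parametrization.
dggstacy_sketch <- function(t, a, g, k) {
  k / (g^a * gamma(a / k)) * t^(a - 1) * exp(-(t / g)^k)
}

# Survival via the gamma distribution function evaluated at x = (t / g)^k.
sggstacy_sketch <- function(t, a, g, k) {
  pgamma((t / g)^k, shape = a / k, lower.tail = FALSE)
}

a <- 2; g <- 1.5; k <- 3      # arbitrary parameter values
times <- c(0.5, 1, 2)

S_closed <- sggstacy_sketch(times, a, g, k)

# Same quantity by numerically integrating the density from 0 to t.
S_numeric <- sapply(times, function(u) {
  1 - integrate(dggstacy_sketch, lower = 0, upper = u,
                a = a, g = g, k = k, rel.tol = 1e-10)$value
})

all.equal(S_closed, S_numeric, tolerance = 1e-6)
```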
Generalized Gamma Distribution (alternative Prentice's parametrization)
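If $T \sim \mathrm{ggprentice}(\mu, \sigma, \varphi)$, then
$$f(t|\mu, \sigma, \varphi) = \begin{cases} \dfrac{|\varphi| \left(\varphi^{-2}\right)^{\varphi^{-2}}}{\sigma\, t\, \Gamma(\varphi^{-2})} \exp\left\{\varphi^{-2}\left[\varphi w - \exp(\varphi w)\right]\right\} I_{(0, \infty)}(t), & \varphi \neq 0, \\ \dfrac{1}{\sqrt{2\pi}\, t\, \sigma} \exp\left\{-\dfrac{1}{2} w^{2}\right\} I_{(0, \infty)}(t), & \varphi = 0, \end{cases}$$
where $w = \frac{\log(t) - \mu}{\sigma}$, for $-\infty < \mu < \infty$, $\sigma > 0$, and $-\infty < \varphi < \infty$.
It can be shown that the survival function can be expressed as:
$$S(t|\mu, \sigma, \varphi) = \begin{cases} F_{G}(x|1/\varphi^{2}, 1), & \varphi < 0, \\ 1 - F_{G}(x|1/\varphi^{2}, 1), & \varphi > 0, \\ S_{LN}(t|\mu, \sigma), & \varphi = 0, \end{cases}$$
where $x = \frac{1}{\varphi^{2}} \exp\{\varphi w\}$, $F_{G}(\cdot|\nu, 1)$ is the distribution function of a gamma distribution with shape parameter $\nu$ and scale parameter equal to 1, and $S_{LN}(t|\mu, \sigma)$ corresponds to the survival function of a lognormal distribution with location parameter $\mu$ and scale parameter $\sigma$.
Finally, the hazard function is expressed as:
$$h(t|\mu, \sigma, \varphi) = \frac{f(t|\mu, \sigma, \varphi)}{S(t|\mu, \sigma, \varphi)}.$$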
Gompertz Distribution
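If $T \sim \mathrm{Gompertz}(\alpha, \gamma)$, then
$$f(t|\alpha, \gamma) = \alpha \exp\left\{\gamma t - \frac{\alpha}{\gamma}\left(e^{\gamma t} - 1\right)\right\} I_{[0, \infty)}(t).$$
The survival and hazard functions are given, respectively, by
$$S(t|\alpha, \gamma) = \exp\left\{-\frac{\alpha}{\gamma}\left(e^{\gamma t} - 1\right)\right\}$$
and
$$h(t|\alpha, \gamma) = \alpha \exp\{\gamma t\}.$$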
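The link between the Gompertz hazard and survival functions can be checked numerically through $S(t) = \exp\{-\int_0^t h(u)\, du\}$. The sketch assumes $h(t) = \alpha \exp\{\gamma t\}$ and $S(t) = \exp\{-(\alpha/\gamma)(e^{\gamma t} - 1)\}$; the function names and parameter values are illustrative only.

```r
a <- 0.1                      # alpha
g <- 0.3                      # gamma

h_gompertz <- function(t) a * exp(g * t)
S_gompertz <- function(t) exp(-(a / g) * (exp(g * t) - 1))

times <- c(1, 5, 10)

# Cumulative hazard obtained by numerical integration of the hazard.
H_numeric <- sapply(times, function(u) {
  integrate(h_gompertz, lower = 0, upper = u, rel.tol = 1e-10)$value
})

all.equal(S_gompertz(times), exp(-H_numeric), tolerance = 1e-6)
```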
Rayleigh Distribution
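Let $T \sim \mathrm{Rayleigh}(\sigma)$, where $\sigma > 0$ is a scale parameter. Then, the density, survival and hazard functions are respectively given by:
$$f(t|\sigma) = \frac{t}{\sigma^{2}} \exp\left\{-\frac{t^{2}}{2\sigma^{2}}\right\} I_{[0, \infty)}(t),$$
$$S(t|\sigma) = \exp\left\{-\frac{t^{2}}{2\sigma^{2}}\right\}$$
and
$$h(t|\sigma) = \frac{t}{\sigma^{2}}.$$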
Birnbaum-Saunders (fatigue) Distribution
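If $T \sim \mathrm{fatigue}(\alpha, \gamma)$, then
$$f(t|\alpha, \gamma) = \frac{\sqrt{t/\gamma} + \sqrt{\gamma/t}}{2 \alpha t}\, \phi\left(\frac{\sqrt{t/\gamma} - \sqrt{\gamma/t}}{\alpha}\right) I_{(0, \infty)}(t),$$
where $\phi(\cdot)$ is the probability density function of a standard normal distribution, and $\alpha > 0$ and $\gamma > 0$ are the shape and scale parameters, respectively.
The survival function in this case is given by:
$$S(t|\alpha, \gamma) = \Phi\left(\frac{\sqrt{\gamma/t} - \sqrt{t/\gamma}}{\alpha}\right),$$
where $\Phi(\cdot)$ is the cumulative distribution function of a standard normal distribution. The hazard function is given by
$$h(t|\alpha, \gamma) = \frac{f(t|\alpha, \gamma)}{S(t|\alpha, \gamma)}.$$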
Regression models
When covariates are available, it is possible to fit six different regression models with the R package survstan:
- accelerated failure time (AFT) models;
- proportional hazards (PH) models;
- proportional odds (PO) models;
- accelerated hazard (AH) models;
- Yang and Prentice (YP) models;
- extended hazard (EH) models.
The regression survival models implemented in the R package survstan are briefly described below. Denote by $\mathbf{x}$ a vector of covariates, and let $\boldsymbol{\beta}$ and $\boldsymbol{\phi}$ be vectors of regression coefficients, and $\boldsymbol{\theta}$ a vector of parameters associated with some baseline survival distribution. To prevent identifiability issues, it is assumed that the linear predictors $\mathbf{x}^{\top}\boldsymbol{\beta}$ and $\mathbf{x}^{\top}\boldsymbol{\phi}$ do not include an intercept term.
Accelerated Failure Time Models
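Accelerated failure time (AFT) models are defined as
$$T = \exp\{\mathbf{x}^{\top}\boldsymbol{\beta}\}\, \nu,$$
where $\nu$ follows a baseline distribution with survival function $S_{0}(\cdot|\boldsymbol{\theta})$ and density $f_{0}(\cdot|\boldsymbol{\theta})$ so that
$$f(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \mathbf{x}) = e^{-\mathbf{x}^{\top}\boldsymbol{\beta}} f_{0}(t e^{-\mathbf{x}^{\top}\boldsymbol{\beta}}|\boldsymbol{\theta})$$
and
$$S(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \mathbf{x}) = S_{0}(t e^{-\mathbf{x}^{\top}\boldsymbol{\beta}}|\boldsymbol{\theta}).$$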
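To make the AFT construction concrete, the sketch below takes a Weibull baseline and a single covariate, assuming $S(t|x) = S_0(t e^{-x\beta})$ and $f(t|x) = e^{-x\beta} f_0(t e^{-x\beta})$. With this baseline, multiplying the baseline time by $e^{x\beta}$ simply rescales the Weibull scale parameter, which gives an independent check. All names and values are illustrative.

```r
a <- 1.5; g <- 2              # baseline Weibull shape and scale
b <- 0.7                      # regression coefficient (beta)
x <- 1                        # covariate value
lp <- x * b                   # linear predictor, no intercept
times <- c(0.5, 1, 2, 4)

S0 <- function(t) pweibull(t, shape = a, scale = g, lower.tail = FALSE)
f0 <- function(t) dweibull(t, shape = a, scale = g)

# AFT survival and density built from the baseline functions.
S_aft <- S0(times * exp(-lp))
f_aft <- exp(-lp) * f0(times * exp(-lp))

# Direct route: exp(lp) * nu is Weibull with scale g * exp(lp).
S_direct <- pweibull(times, shape = a, scale = g * exp(lp),
                     lower.tail = FALSE)
f_direct <- dweibull(times, shape = a, scale = g * exp(lp))

all.equal(S_aft, S_direct)
all.equal(f_aft, f_direct)
```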
Proportional Hazards Models
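Proportional hazards (PH) models are defined as
$$h(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \mathbf{x}) = h_{0}(t|\boldsymbol{\theta}) \exp\{\mathbf{x}^{\top}\boldsymbol{\beta}\},$$
where $h_{0}(t|\boldsymbol{\theta})$ is a baseline hazard function, with cumulative hazard $H_{0}(t|\boldsymbol{\theta}) = \int_{0}^{t} h_{0}(u|\boldsymbol{\theta})\, du$, so that
$$f(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \mathbf{x}) = h_{0}(t|\boldsymbol{\theta}) \exp\left\{\mathbf{x}^{\top}\boldsymbol{\beta} - H_{0}(t|\boldsymbol{\theta})\, e^{\mathbf{x}^{\top}\boldsymbol{\beta}}\right\}$$
and
$$S(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \mathbf{x}) = \exp\left\{-H_{0}(t|\boldsymbol{\theta})\, e^{\mathbf{x}^{\top}\boldsymbol{\beta}}\right\}.$$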
Proportional Odds Models
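Proportional odds (PO) models are defined as
$$R(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \mathbf{x}) = R_{0}(t|\boldsymbol{\theta}) \exp\{\mathbf{x}^{\top}\boldsymbol{\beta}\},$$
where $R_{0}(t|\boldsymbol{\theta}) = \frac{1 - S_{0}(t|\boldsymbol{\theta})}{S_{0}(t|\boldsymbol{\theta})} = \exp\{H_{0}(t|\boldsymbol{\theta})\} - 1$ is a baseline odds function so that
$$f(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \mathbf{x}) = \frac{h_{0}(t|\boldsymbol{\theta}) \exp\{\mathbf{x}^{\top}\boldsymbol{\beta} + H_{0}(t|\boldsymbol{\theta})\}}{\left[1 + R_{0}(t|\boldsymbol{\theta}) \exp\{\mathbf{x}^{\top}\boldsymbol{\beta}\}\right]^{2}}$$
and
$$S(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \mathbf{x}) = \frac{1}{1 + R_{0}(t|\boldsymbol{\theta}) \exp\{\mathbf{x}^{\top}\boldsymbol{\beta}\}}.$$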
Accelerated Hazard Models
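Accelerated hazard (AH) models can be defined as
$$h(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \mathbf{x}) = h_{0}\left(t / e^{\mathbf{x}^{\top}\boldsymbol{\beta}}|\boldsymbol{\theta}\right),$$
so that, with $H_{0}$ denoting the baseline cumulative hazard,
$$S(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \mathbf{x}) = \exp\left\{-H_{0}\left(t / e^{\mathbf{x}^{\top}\boldsymbol{\beta}}|\boldsymbol{\theta}\right) e^{\mathbf{x}^{\top}\boldsymbol{\beta}}\right\}$$
and
$$f(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \mathbf{x}) = h_{0}\left(t / e^{\mathbf{x}^{\top}\boldsymbol{\beta}}|\boldsymbol{\theta}\right) \exp\left\{-H_{0}\left(t / e^{\mathbf{x}^{\top}\boldsymbol{\beta}}|\boldsymbol{\theta}\right) e^{\mathbf{x}^{\top}\boldsymbol{\beta}}\right\}.$$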
Extended Hazard Models
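The survival function of the extended hazard (EH) model is given by:
$$S(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \boldsymbol{\phi}, \mathbf{x}) = \exp\left\{-H_{0}\left(t / e^{\mathbf{x}^{\top}\boldsymbol{\beta}}|\boldsymbol{\theta}\right) \exp\{\mathbf{x}^{\top}(\boldsymbol{\beta} + \boldsymbol{\phi})\}\right\}.$$
The hazard and the probability density functions are then expressed as:
$$h(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \boldsymbol{\phi}, \mathbf{x}) = h_{0}\left(t / e^{\mathbf{x}^{\top}\boldsymbol{\beta}}|\boldsymbol{\theta}\right) \exp\{\mathbf{x}^{\top}\boldsymbol{\phi}\}$$
and
$$f(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \boldsymbol{\phi}, \mathbf{x}) = h_{0}\left(t / e^{\mathbf{x}^{\top}\boldsymbol{\beta}}|\boldsymbol{\theta}\right) \exp\{\mathbf{x}^{\top}\boldsymbol{\phi}\} \exp\left\{-H_{0}\left(t / e^{\mathbf{x}^{\top}\boldsymbol{\beta}}|\boldsymbol{\theta}\right) \exp\{\mathbf{x}^{\top}(\boldsymbol{\beta} + \boldsymbol{\phi})\}\right\},$$
respectively.
The EH model includes the AH, AFT and PH models as particular cases when $\boldsymbol{\phi} = \mathbf{0}$, $\boldsymbol{\phi} = -\boldsymbol{\beta}$, and $\boldsymbol{\beta} = \mathbf{0}$, respectively.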
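The three particular cases can be verified numerically. The sketch assumes the EH survival function $S(t|x) = \exp\{-H_0(t e^{-x\beta})\, e^{x(\beta + \phi)}\}$ with a Weibull baseline cumulative hazard $H_0(t) = (t/\gamma)^{\alpha}$ and scalar linear predictors; the function names are hypothetical helpers for this example.

```r
a <- 1.5; g <- 2                      # baseline Weibull shape and scale
H0 <- function(t) (t / g)^a           # baseline cumulative hazard

# EH survival as a function of the two linear predictors.
S_eh <- function(t, lp_beta, lp_phi) {
  exp(-H0(t * exp(-lp_beta)) * exp(lp_beta + lp_phi))
}

# Survival functions of the three nested models.
S_ah  <- function(t, lp) exp(-H0(t * exp(-lp)) * exp(lp))
S_aft <- function(t, lp) exp(-H0(t * exp(-lp)))
S_ph  <- function(t, lp) exp(-H0(t) * exp(lp))

times <- c(0.5, 1, 2, 4)
lp <- 0.7

all.equal(S_eh(times, lp, 0), S_ah(times, lp))      # phi = 0
all.equal(S_eh(times, lp, -lp), S_aft(times, lp))   # phi = -beta
all.equal(S_eh(times, 0, lp), S_ph(times, lp))      # beta = 0
```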
Yang and Prentice Models
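The survival function of the Yang and Prentice (YP) model is given by:
$$S(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \boldsymbol{\phi}, \mathbf{x}) = \left[1 + \frac{\kappa_{S}}{\kappa_{L}} R_{0}(t|\boldsymbol{\theta})\right]^{-\kappa_{L}}.$$
The hazard and the probability density functions are then expressed as:
$$h(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \boldsymbol{\phi}, \mathbf{x}) = \frac{\kappa_{S}\, h_{0}(t|\boldsymbol{\theta}) \exp\{H_{0}(t|\boldsymbol{\theta})\}}{1 + \frac{\kappa_{S}}{\kappa_{L}} R_{0}(t|\boldsymbol{\theta})}$$
and
$$f(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \boldsymbol{\phi}, \mathbf{x}) = \kappa_{S}\, h_{0}(t|\boldsymbol{\theta}) \exp\{H_{0}(t|\boldsymbol{\theta})\} \left[1 + \frac{\kappa_{S}}{\kappa_{L}} R_{0}(t|\boldsymbol{\theta})\right]^{-(1 + \kappa_{L})},$$
respectively, where $\kappa_{S} = \exp\{\mathbf{x}^{\top}\boldsymbol{\beta}\}$, $\kappa_{L} = \exp\{\mathbf{x}^{\top}\boldsymbol{\phi}\}$, and $R_{0}(t|\boldsymbol{\theta}) = \exp\{H_{0}(t|\boldsymbol{\theta})\} - 1$ is the baseline odds function.
The YP model includes the PH and PO models as particular cases when $\boldsymbol{\phi} = \boldsymbol{\beta}$ and $\boldsymbol{\phi} = \mathbf{0}$, respectively.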
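The two particular cases can be checked in the same spirit. The sketch assumes the YP survival function $S(t|x) = [1 + (\kappa_S/\kappa_L) R_0(t)]^{-\kappa_L}$ with $\kappa_S = e^{x\beta}$, $\kappa_L = e^{x\phi}$ and $R_0(t) = e^{H_0(t)} - 1$, again with a Weibull baseline; the function names are hypothetical helpers for this example.

```r
a <- 1.5; g <- 2                      # baseline Weibull shape and scale
H0 <- function(t) (t / g)^a           # baseline cumulative hazard
R0 <- function(t) expm1(H0(t))        # baseline odds, exp(H0) - 1

# YP survival as a function of the two linear predictors.
S_yp <- function(t, lp_beta, lp_phi) {
  k_s <- exp(lp_beta)
  k_l <- exp(lp_phi)
  (1 + (k_s / k_l) * R0(t))^(-k_l)
}

# Survival functions of the two nested models.
S_ph <- function(t, lp) exp(-H0(t) * exp(lp))
S_po <- function(t, lp) 1 / (1 + R0(t) * exp(lp))

times <- c(0.5, 1, 2, 4)
lp <- 0.7

all.equal(S_yp(times, lp, lp), S_ph(times, lp))   # phi = beta
all.equal(S_yp(times, lp, 0), S_po(times, lp))    # phi = 0
```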