A simple Bayesian model for accrual

Steve Simon

2006/11/17

[StATS]: A simple Bayesian model for accrual (created 2006-11-17)

Suppose you are a researcher in charge of a long-term study. You plan to collect data on 120 patients. The goal is to finish your study in ten years.

A Bayesian model of accrual times can help you to discern whether recruitment is behind schedule and project an estimated completion date allowing for uncertainty. The Bayesian model starts with an elicitation of prior beliefs about accrual rates. Then you combine the prior belief with the actual accrual data and compute a posterior distribution of accrual times. You can then use the posterior distribution to make meaningful probability statements about the current accrual rate and the projected completion time.

Elicitation of prior beliefs is difficult in most problems.

You can model the accrual process as a series of waiting times between patients. A simple distribution for waiting times is the exponential distribution. The probability density function for the exponential distribution is quite simple and easy to work with.
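Written out, and assuming that lambda denotes the mean waiting time in days (a parameterization chosen here because it matches the percentiles quoted for lambda = 30), the density is:

```latex
f(t \mid \lambda) = \frac{1}{\lambda} \exp\!\left(-\frac{t}{\lambda}\right),
\qquad t \ge 0,\ \lambda > 0
```

Under this form the mean and the standard deviation of a waiting time both equal lambda.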

The value lambda (λ) of this distribution is of great interest, because it will tell you how quickly the trial will end. If lambda is equal to 30

The 5th and 95th percentiles would be 1.5 days and 90 days respectively (see the gray shading in the picture above). This wide range is quite natural for the exponential distribution.
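The percentiles quoted above can be checked with a few lines of Python. This is a minimal sketch that assumes lambda is the mean waiting time in days; the function name `exponential_quantile` is illustrative only.

```python
import math


def exponential_quantile(p, mean_wait):
    """Quantile of an exponential distribution parameterized by its mean."""
    return -mean_wait * math.log(1.0 - p)


mean_wait = 30.0  # lambda, in days
p05 = exponential_quantile(0.05, mean_wait)
p95 = exponential_quantile(0.95, mean_wait)
print(f"5th percentile: {p05:.1f} days, 95th percentile: {p95:.1f} days")
```

Because the quantile is proportional to the mean, doubling lambda doubles both percentiles.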

An exponential distribution with a larger lambda value would indicate an accrual process that is much slower.

Shown above is a graph of the exponential distribution with lambda=60. For this distribution

Determining the value of lambda

There is a wide range of possible prior distributions for accrual, but the inverse gamma distribution is attractive because of its close relationship to the exponential distribution. The probability density function for the inverse gamma distribution is
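Assuming the usual parameterization with shape alpha and scale beta, the density can be written as:

```latex
f(\lambda \mid \alpha, \beta)
  = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,
    \lambda^{-\alpha-1} \exp\!\left(-\frac{\beta}{\lambda}\right),
\qquad \lambda > 0
```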

The mean and variance of the inverse gamma distribution are
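Under the same shape and scale parameterization (an assumption, as above), the standard results are:

```latex
E[\lambda] = \frac{\beta}{\alpha-1} \quad (\alpha > 1),
\qquad
\mathrm{Var}[\lambda] = \frac{\beta^{2}}{(\alpha-1)^{2}(\alpha-2)} \quad (\alpha > 2),
\qquad
\mathrm{SD}[\lambda] = \frac{\beta}{\alpha-1} \cdot \frac{1}{\sqrt{\alpha-2}}
```

The standard deviation is written as the mean times a second factor so that the relative spread can be read off directly.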

The second factor in the formula for the standard deviation represents the coefficient of variation (CV), also known as the relative standard deviation. Notice that as alpha increases

If you accumulate n waiting times

The posterior distribution is

which is proportional to an inverse gamma distribution. The parameters of the inverse gamma distribution are

The posterior mean is

which can be written as a weighted average of the prior mean and the mean of the data.

with greater weight being given to the mean of the data as the sample size increases. Let’s examine some possible prior distributions that you might use.
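The conjugate update described above can be sketched as follows, assuming waiting times t_1, ..., t_n that are exponential with mean lambda and an inverse gamma(alpha, beta) prior in shape and scale form. The symbols S and \bar{t} are introduced here for the total and the average waiting time.

```latex
L(\lambda) = \prod_{i=1}^{n} \frac{1}{\lambda} e^{-t_i/\lambda}
           = \lambda^{-n} \exp\!\left(-\frac{S}{\lambda}\right),
\qquad S = \sum_{i=1}^{n} t_i

p(\lambda \mid t_1,\dots,t_n)
  \propto \lambda^{-n} e^{-S/\lambda} \cdot \lambda^{-\alpha-1} e^{-\beta/\lambda}
  = \lambda^{-(\alpha+n)-1} \exp\!\left(-\frac{\beta+S}{\lambda}\right)

\alpha^{*} = \alpha + n, \qquad \beta^{*} = \beta + S

E[\lambda \mid t_1,\dots,t_n]
  = \frac{\beta + S}{\alpha + n - 1}
  = \frac{\alpha-1}{\alpha+n-1}\cdot\frac{\beta}{\alpha-1}
  + \frac{n}{\alpha+n-1}\cdot\bar{t},
\qquad \bar{t} = \frac{S}{n}
```

The weight on the data, n / (alpha + n - 1), approaches 1 as n grows.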

The value of alpha indicates the degree of certainty that you have, prior to collecting the data

Once you have specified the value for alpha

If you set alpha to a small value

The gray area represents the range of the 5th percentile (6.3 days) to the 95th percentile (84.4 days). These rates represent a broad range of accrual times.
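Percentiles like these can be computed without a statistics library when alpha is a whole number. The sketch below uses only the Python standard library. The values alpha = 2 and beta = 30 are assumptions, chosen here because they give a prior mean of 30 days and reproduce the quoted 6.3 and 84.4 days; the function names are illustrative.

```python
import math


def inverse_gamma_cdf(x, alpha, beta):
    """CDF of an inverse gamma(alpha, beta) distribution, integer alpha only.

    Uses the identity P(X <= x) = P(Poisson(beta / x) <= alpha - 1).
    """
    y = beta / x
    return sum(
        math.exp(-y + k * math.log(y) - math.lgamma(k + 1)) for k in range(alpha)
    )


def inverse_gamma_quantile(p, alpha, beta, tol=1e-9):
    """Invert the CDF by bisection on a wide bracket around beta."""
    lo, hi = beta / 1000.0, beta * 1000.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if inverse_gamma_cdf(mid, alpha, beta) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


alpha, beta = 2, 30.0  # assumed weak prior with mean beta / (alpha - 1) = 30 days
p05 = inverse_gamma_quantile(0.05, alpha, beta)
p95 = inverse_gamma_quantile(0.95, alpha, beta)
print(f"5th percentile: {p05:.1f} days, 95th percentile: {p95:.1f} days")
```

Bisection is slower than a closed-form inverse but is simple to verify, and the same two functions work for any whole-number alpha.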

If you set alpha to 10 and beta to 270

Finally

Let’s choose the first prior and apply it to the existing accrual times. On day 768

With only ten patients

The 5th and 95th percentiles are 43.8 and 115.2 days.

Let’s choose the last prior distribution and apply it to the existing accrual times. On day 768

Note that the average of this posterior distribution is 37.9 days and the 5th and 95th percentiles are 30.5 and 46.8 respectively. The strong prior belief outweighs the evidence from just 10 observations, but even here you can recognize problems. The 5th percentile for accrual rates is slightly above your target of 30 days in spite of your strong optimistic prior belief.
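The update behind these posterior means is simple arithmetic. The sketch below assumes that the ten observed waiting times add up to 768 days (that is, the tenth patient arrived on day 768). It also assumes alpha = 2, beta = 30 for the weak prior and alpha = 50, beta = 1470 for the strong prior; the strong-prior values are picked here because they give a prior mean of 30 days and reproduce the quoted posterior mean of 37.9 days.

```python
def posterior_parameters(alpha, beta, n, total_wait):
    """Inverse gamma posterior after n exponential waiting times summing to total_wait."""
    return alpha + n, beta + total_wait


def inverse_gamma_mean(alpha, beta):
    """Mean of an inverse gamma(alpha, beta) distribution; requires alpha > 1."""
    if alpha <= 1:
        raise ValueError("the mean is undefined for alpha <= 1")
    return beta / (alpha - 1)


n, total_wait = 10, 768.0  # assumed: ten patients, tenth arrival on day 768
priors = {"weak prior": (2, 30.0), "strong prior": (50, 1470.0)}  # assumed values

for label, (alpha, beta) in priors.items():
    post_alpha, post_beta = posterior_parameters(alpha, beta, n, total_wait)
    post_mean = inverse_gamma_mean(post_alpha, post_beta)
    # The same mean written as a weighted average of prior mean and data mean.
    weight_on_data = n / (post_alpha - 1)
    blended = (1 - weight_on_data) * inverse_gamma_mean(alpha, beta) \
        + weight_on_data * (total_wait / n)
    print(f"{label}: posterior mean {post_mean:.1f} days "
          f"(weight on data {weight_on_data:.2f}, blended {blended:.1f})")
```

The weight on the data is much smaller under the strong prior, which is why ten observations move its posterior mean only part of the way from 30 days toward the observed average.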

With the Bayesian models

The graph above shows what an exponential distribution looks like with lambda = 43.8.

With the last prior distribution

The graph shown above indicates what an exponential distribution looks like with lambda = 30.5.

To calculate a predictive distribution

The density function is closely related to a Pareto distribution. For large values of alpha

Notice the square root term in the standard deviation formula. This again represents the coefficient of variation. It is always larger than 1.
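One way to see where these statements come from is to integrate the exponential density against an inverse gamma(alpha, beta) distribution for lambda. This is a sketch under the shape and scale parameterization, with alpha and beta standing for the posterior values:

```latex
f(t) = \int_{0}^{\infty} \frac{1}{\lambda} e^{-t/\lambda}\,
       \frac{\beta^{\alpha}}{\Gamma(\alpha)} \lambda^{-\alpha-1} e^{-\beta/\lambda}\, d\lambda
     = \frac{\alpha\, \beta^{\alpha}}{(\beta + t)^{\alpha+1}},
\qquad t \ge 0

E[T] = \frac{\beta}{\alpha-1} \quad (\alpha > 1),
\qquad
\mathrm{SD}[T] = \frac{\beta}{\alpha-1}\sqrt{\frac{\alpha}{\alpha-2}} \quad (\alpha > 2)
```

This is a Pareto type II (Lomax) density. The square root factor exceeds 1 for every alpha > 2 and approaches 1 as alpha grows, so the predictive distribution moves closer to an exponential distribution as the posterior becomes more concentrated.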

The graph above shows what this predictive distribution looks like for the first prior distribution (black line). There is a small deviation between this distribution and the exponential distribution (red line).

The graph above shows what the predictive distribution looks like for the last prior distribution. There is almost no visible difference between the predictive distribution and the exponential distribution.

If you assume that the remaining 110 patients accrue at an exponential rate

which has mean and standard deviation equal to

Set theta to the posterior mean and k to the number of patients remaining. With the first prior distribution

  1. The mean of the remaining time in the study is 7,975 days (21.8 years) and the standard deviation is 760.4 days or 2.1 years. The 5th and 95th percentiles are 18.5 years and 25.4 years respectively.
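These figures can be reproduced from a gamma distribution with shape k and scale theta, whose mean is k times theta and whose standard deviation is the square root of k times theta. The sketch below uses only the standard library and assumes k = 110 and theta = 72.5 days (the value implied by 7,975 / 110), along with 365.25 days per year; the function names are illustrative.

```python
import math


def gamma_cdf(t, k, theta):
    """CDF of a gamma distribution with integer shape k and scale theta.

    Uses the identity P(T <= t) = 1 - P(Poisson(t / theta) <= k - 1).
    """
    y = t / theta
    poisson_below_k = sum(
        math.exp(-y + j * math.log(y) - math.lgamma(j + 1)) for j in range(k)
    )
    return 1.0 - poisson_below_k


def gamma_quantile(p, k, theta, tol=1e-9):
    """Invert the CDF by bisection on a wide bracket around the mean."""
    mean = k * theta
    lo, hi = mean / 100.0, mean * 100.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, k, theta) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


DAYS_PER_YEAR = 365.25  # assumed conversion factor
k, theta = 110, 72.5    # patients remaining, assumed posterior mean in days

mean_days = k * theta
sd_days = math.sqrt(k) * theta
p05_years = gamma_quantile(0.05, k, theta) / DAYS_PER_YEAR
p95_years = gamma_quantile(0.95, k, theta) / DAYS_PER_YEAR
print(f"mean {mean_days:.0f} days, sd {sd_days:.1f} days")
print(f"5th percentile {p05_years:.1f} years, 95th percentile {p95_years:.1f} years")
```

Fixing theta at the posterior mean ignores the remaining uncertainty about the accrual rate, so the interval from this sketch is likely narrower than one that averages over the posterior.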

For the last prior distribution

The 5th and 95th percentiles are 9.7 and 13.3 years. Since we have already spent 2.1 years in the study

If the number of patients remaining in the study is large

This page was written by Steve Simon while working at Children’s Mercy Hospital. Although I do not hold the copyright for this material, I am reproducing it here as a service, as it is no longer available on the Children’s Mercy Hospital website. Need more information? I have a page with general help resources. You can also browse for pages similar to this one at [Category: Accrual problems in clinical trials](../category/AccrualProblems.html) or [Category: Bayesian statistics](../category/BayesianStatistics.html).