% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/stan_rw.R
\name{stan_rw}
\alias{stan_rw}
\title{Time series models for mortality and disease incidence}
\source{
Brandt P and Williams JT. Multiple time series models. Thousand Oaks, CA: SAGE Publications, 2007.

Clayton, DG. Generalized linear mixed models. In: Gilks WR, Richardson S, Spiegelhalter DJ, editors. Markov Chain Monte Carlo in Practice: Interdisciplinary Statistics. Boca Raton, FL: CRC Press, 1996. p. 275-302.

Donegan C, Hughes AE, and Lee SC (2022). Colorectal Cancer Incidence, Inequalities, and Prevention Priorities in Urban Texas: Surveillance Study With the "surveil" Software Package. \emph{JMIR Public Health & Surveillance} 8(8):e34589. \doi{10.2196/34589}

Stan Development Team. Stan Modeling Language Users Guide and Reference Manual, 2.28. 2021. https://mc-stan.org
}
\usage{
stan_rw(
  data,
  group,
  time,
  cor = FALSE,
  family = poisson(),
  prior = list(),
  chains = 4,
  cores = 1,
  iter = 3000,
  refresh = 1500,
  control = list(adapt_delta = 0.98),
  ...
)
}
\arguments{
\item{data}{A \code{data.frame} containing the following columns: \describe{
\item{Count}{Number of cases or deaths; this column must be named 'Count'.}
\item{Population}{Size of population at risk; this column must be named 'Population'.}
\item{time}{Time period indicator. (Provide the (unquoted) column name using the \code{time} argument.)}
\item{group}{Optional grouping variable. (Provide the (unquoted) column name using the \code{group} argument.)}
}}

\item{group}{If \code{data} is aggregated by demographic group, provide the (unquoted) name of the column in \code{data} containing the grouping structure, such as age brackets or race-ethnicity. E.g., if \code{data} has column names \code{Year}, \code{Race}, \code{Cases}, and \code{Population}, then you would provide \code{group = Race}.}

\item{time}{Specify the (unquoted) name of the time variable in \code{data}, as in \code{time = Year}. This variable must be numeric-alike (i.e., \code{as.numeric(data$time)} will not fail).}

\item{cor}{For correlated random walks use \code{cor = TRUE}; default value is \code{FALSE}. Note this only applies when the \code{group} argument is used.}

\item{family}{The default specification is a Poisson model with log link function (\code{family = poisson()}). For a Binomial model with logit link function, use \code{family = binomial()}.}

\item{prior}{Optionally provide a named \code{list} with prior parameters. If any of the following items are missing, default priors will be assigned and printed to the console.

\describe{
\item{eta_1}{The first value of log-risk in each series must be assigned a Gaussian prior probability distribution. Provide the location and scale parameters for each demographic group in a list, where each parameter is a \code{k}-length vector.

For example, with \code{k=2} demographic groups, the following code will assign priors of \code{normal(-6.5, 5)} to the starting values of both series: \verb{prior = list(eta_1 = normal(location = -6.5, scale = 5, k = 2))}. Because \code{eta} is the log-rate, centering the prior for \code{eta_1} on \code{-6.5} is similar to centering the prior rate on \verb{exp(-6.5)*100,000 = 150} (approximately) cases per 100,000 person-years at risk. Note, however, that the translation from log-rate to rate is non-linear.}

\item{sigma}{Each demographic group has a scale parameter assigned to its log-rate. This is the scale of the annual deviations from the previous year's log-rate. The scale parameters are assigned independent half-normal prior distributions (that is, normal distributions restricted to positive values).}

\item{omega}{If \code{cor = TRUE}, an LKJ prior is assigned to the correlation matrix, Omega.}
}}

\item{chains}{Number of independent MCMC chains to initiate (passed to \code{\link[rstan]{sampling}}).}

\item{cores}{The number of cores to use when executing the Markov chains in parallel (passed to \code{\link[rstan]{sampling}}).}

\item{iter}{Total number of MCMC iterations. Warmup draws are automatically half of \code{iter}.}

\item{refresh}{How often to print the MCMC sampling progress to the console.}

\item{control}{A named list of parameters to control Stan's sampling behavior. The most common parameters to control are \code{adapt_delta}, which may be raised to address divergent transitions, and \code{max_treedepth}. For example, \code{control = list(adapt_delta = 0.99, max_treedepth = 13)} may be a reasonable specification to address a warning from Stan about divergent transitions or maximum treedepth.}

\item{...}{Other arguments passed to \code{\link[rstan]{sampling}}.}
}
\value{
The function returns a list, also of class \code{surveil}, containing the following elements:
\describe{

\item{summary}{A \code{data.frame} with posterior means and 95 percent credible intervals, as well as the raw data (Count, Population,  time period, grouping variable if any, and crude rates).}

\item{samples}{A \code{stanfit} object returned by \code{\link[rstan]{sampling}}. This contains the MCMC samples from the posterior distribution of the fitted model.}

\item{cor}{Logical value indicating if the model included a correlation structure.}

\item{time}{A list containing the name of the time-period column in the user-provided data and a \code{data.frame} of observed time periods and their index.}

\item{group}{If a grouping variable was used, this will be a list containing the name of the grouping variable and a \code{data.frame} with group labels and index values.}

\item{family}{The user-provided \code{family} argument.}
}
}
\description{
Model time-varying incidence rates given a time series of case (or death) counts and population at risk.
}
\details{
By default, the models have Poisson likelihoods for the case counts, with log link function. Alternatively, a Binomial model with logit link function can be specified using the \code{family} argument (\code{family = binomial()}).

For time t = 1,...,n, the models assign a Poisson probability distribution to the case counts, given log-risk \code{eta} and population at risk \code{p}; the log-risk is modeled using the first-difference (or random-walk) prior:

\if{html}{\out{<div class="sourceCode">}}\preformatted{ y_t ~ Poisson(p_t * exp(eta_t))
 eta_t ~ Normal(eta_\{t-1\}, sigma)
 eta_1 ~ Normal(-6, 5) (-Inf, 0)
 sigma ~ Normal(0, 1) (0, Inf)
}\if{html}{\out{</div>}}

This style of model has been discussed in Bayesian (bio)statistics for quite some time. See Clayton (1996).
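
As a rough illustration of the data-generating process described by these equations, the following base R sketch simulates one series. It is not the code used by the package; the values chosen for \code{n}, \code{p}, \code{sigma}, and the starting log-risk are arbitrary assumptions, and the starting value is fixed rather than drawn from its prior.

\if{html}{\out{<div class="sourceCode">}}\preformatted{ set.seed(1)
 n <- 20              # number of time periods (assumed)
 p <- rep(1e5, n)     # population at risk, held constant (assumed)
 sigma <- 0.05        # scale of period-to-period changes in log-risk (assumed)
 eta <- numeric(n)
 eta[1] <- -6.5       # starting log-risk, fixed for illustration
 # random walk: each log-risk is centered on the previous value
 for (t in 2:n) eta[t] <- rnorm(1, mean = eta[t - 1], sd = sigma)
 # Poisson counts with mean equal to population times risk
 y <- rpois(n, lambda = p * exp(eta))
}\if{html}{\out{</div>}}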

The above model can be used for multiple distinct groups; in that case, each group will have its own independent time series model.

It is also possible to add a correlation structure to that set of models. Let \code{Y_t} be a k-length vector of observations for each of k groups at time t (the capital letter now indicates a vector). Then:

\if{html}{\out{<div class="sourceCode">}}\preformatted{ Y_t ~ Poisson(P_t * exp(Eta_t))
 Eta_t ~ MVNormal(Eta_\{t-1\}, Sigma)
 Eta_1 ~ Normal(-6, 5)  (-Inf, 0)
 Sigma = diag(sigma) * Omega * diag(sigma)
 Omega ~ LKJ(2)
 sigma ~ Normal(0, 1) (0, Inf)
}\if{html}{\out{</div>}}

where \code{Omega} is a correlation matrix and \code{diag(sigma)} is a diagonal matrix with scale parameters on the diagonal. This was adopted from Brandt and Williams (2007); for the LKJ prior, see the Stan Users Guide and Reference Manual.
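
The construction of \code{Sigma} can be sketched in base R as follows. This is only an illustration under assumed values: the scales in \code{sigma}, the correlation in \code{Omega}, and the log-risks in \code{Eta_prev} are made up, and the single random-walk step is drawn using the Cholesky factor of \code{Sigma} rather than any function from the package.

\if{html}{\out{<div class="sourceCode">}}\preformatted{ k <- 2
 sigma <- c(0.05, 0.08)                    # per-group scales (assumed)
 Omega <- matrix(c(1, 0.6, 0.6, 1), k, k)  # correlation matrix (assumed)
 Sigma <- diag(sigma) \%*\% Omega \%*\% diag(sigma)
 # one step of the correlated random walk: Eta_t ~ MVNormal(Eta_\{t-1\}, Sigma)
 Eta_prev <- c(-6.5, -6.0)                 # previous log-risks (assumed)
 Eta_next <- Eta_prev + drop(t(chol(Sigma)) \%*\% rnorm(k))
}\if{html}{\out{</div>}}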

If the binomial model is used instead of the Poisson, then the first line of the model specifications will be:

\if{html}{\out{<div class="sourceCode">}}\preformatted{ y_t ~ Binomial(p_t, inverse_logit(eta_t))
}\if{html}{\out{</div>}}

All else remains the same. The logit function is \code{log(r/(1-r))}, where \code{r} is a rate between zero and one; the inverse-logit function is \code{exp(x)/(1 + exp(x))}.
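
For reference, base R provides both transformations in the \code{stats} package: \code{qlogis} is the logit and \code{plogis} is the inverse-logit. A minimal check, using an arbitrary example rate \code{r}:

\if{html}{\out{<div class="sourceCode">}}\preformatted{ r <- 0.0015                     # example rate (assumed)
 x <- qlogis(r)                  # logit of r
 all.equal(x, log(r / (1 - r)))  # matches the formula given above
 all.equal(plogis(x), r)         # inverse-logit recovers the rate
}\if{html}{\out{</div>}}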
}
\examples{
data(msa)
dat <- aggregate(cbind(Count, Population) ~ Year, data = msa, FUN = sum)

fit <- stan_rw(dat, time = Year)

## print summary of results
print(fit)
print(fit$summary)

## plot time trends (rates per 10,000)
plot(fit, scale = 10e3)
plot(fit, style = 'lines', scale = 10e3)

## Summary with MCMC diagnostics (n_eff, Rhat; from rstan)
print(fit$samples)

## cumulative percent change
fit_pc <- apc(fit)
print(fit_pc$cpc)
plot(fit_pc, cumulative = TRUE)

\donttest{
## age-specific rates 
data(cancer)
cancer2 <- subset(cancer, grepl("55-59|60-64|65-69", Age))
fit <- stan_rw(cancer2, time = Year, group = Age,
               chains = 3, iter = 1e3) # for speed only

## plot trends 
plot(fit, scale = 10e3)
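
## correlated random walks across the same age groups (a sketch;
## cor = TRUE only has an effect because the group argument is used)
fit_cor <- stan_rw(cancer2, time = Year, group = Age, cor = TRUE,
                   chains = 3, iter = 1e3) # for speed only
## logical flag recording that a correlation structure was included
print(fit_cor$cor)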

## age-standardized rates
data(standard)
fit_stands <- standardize(fit,
                          label = standard$age,
                          standard_pop = standard$standard_pop)
print(fit_stands)
plot(fit_stands)

## percent change for age-standardized rates
fit_stands_apc <- apc(fit_stands)
plot(fit_stands_apc)
print(fit_stands_apc)
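
## a sketch only: binomial model with a user-specified prior for eta_1
## and stricter sampler settings; the prior values are illustrative
## assumptions, not recommendations
exp(-6.5) * 1e5 # prior center expressed as cases per 100,000
fit_b <- stan_rw(cancer2, time = Year, group = Age,
                 family = binomial(),
                 prior = list(eta_1 = normal(location = -6.5, scale = 5, k = 3)),
                 control = list(adapt_delta = 0.99, max_treedepth = 13),
                 chains = 3, iter = 1e3) # for speed only
print(fit_b)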
}

}
\seealso{
\code{vignette("demonstration", package = "surveil")} \code{vignette("age-standardization", package = "surveil")} \code{\link[surveil]{apc}} \code{\link[surveil]{standardize}}
}
\author{
Connor Donegan (Connor.Donegan@UTSouthwestern.edu)
}
