% Generated by roxygen2 (4.1.1): do not edit by hand
% Please edit documentation in R/sir_devaus.R
\name{sir}
\alias{sir}
\title{Calculate SIR or SMR}
\usage{
sir(coh.data, coh.obs, coh.pyrs, ref.data = NULL, ref.obs = NULL,
  ref.pyrs = NULL, ref.rate = NULL, subset = NULL, print = NULL,
  adjust = NULL, mstate = NULL, test.type = "homogeneity", alpha = 0.95,
  p.adj = NULL, EAR = FALSE, round.by = 2, round.by.pvalue = 4)
}
\arguments{
\item{coh.data}{aggregated cohort data, see e.g. \code{\link{lexpand}}}

\item{coh.obs}{variable name for observed cases}

\item{coh.pyrs}{variable name for person years in cohort data (in quotes)}

\item{ref.data}{population data. Can be left NULL if \code{coh.data} is stratified in \code{print}.}

\item{ref.obs}{variable name for observed cases in population data}

\item{ref.pyrs}{variable name for person-years in population data}

\item{ref.rate}{population rate variable (cases/person-years). Overwrites arguments
\code{ref.pyrs} and \code{ref.obs}.}

\item{subset}{logical condition to select data from \code{coh.data} before any computations}

\item{print}{variable names for computing and outputting results separately}

\item{adjust}{variable names for adjusting without stratifying output}

\item{mstate}{set column names for cause specific observations. Relevant only
when \code{coh.obs} length is two or more. See details.}

\item{test.type}{Test for equal SIRs. Available tests are 'homogeneity' and 'trend'.}

\item{alpha}{confidence level for the confidence intervals; the default 0.95 gives a 95\% CI.}

\item{p.adj}{add multiple comparison p-value adjust type for univariate model,
check \code{help(p.adjust)} for options. Default NULL doesn't add adjusted p-values.}

\item{EAR}{logical; TRUE calculates Excess Absolute Risks for univariate SIRs
(see details).}

\item{round.by}{set number of digits in results}

\item{round.by.pvalue}{set number of digits in p-values}
}
\value{
A list of 5: three \code{data.table} objects, a vector of strata variables and a global p-value.
}
\description{
Poisson-modelled standardised incidence or mortality ratios (SIRs / SMRs), i.e. the
indirect method for calculating standardised rates. A SIR is the ratio of observed to expected cases.
Expected cases are derived by multiplying the strata-specific population rate with the
corresponding person-years of the cohort.
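
In symbols (a sketch of the definition above; the notation \eqn{O}, \eqn{E},
\eqn{\lambda_i}{lambda_i} and \eqn{T_i} is introduced here for illustration only),
with strata indexed by \eqn{i}:
\deqn{SIR = \frac{O}{E}, \qquad E = \sum_i \lambda_i T_i,}{SIR = O / E,  E = sum_i(lambda_i * T_i),}
where \eqn{O} is the observed number of cases in the cohort,
\eqn{\lambda_i}{lambda_i} is the population rate in stratum \eqn{i} and
\eqn{T_i} is the cohort person-years in that stratum.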
}
\details{
\code{sir} is a comprehensive tool for modelling SIRs/SMRs with flexible
options to adjust and print SIRs, test homogeneity and utilize
multistate data. The cohort data and the variable names for observation
counts and person-years are required.
The reference data is optional, since the cohort data
can be stratified (with \code{print}) and compared to total.


\strong{Adjust and print}

A SIR can be adjusted by the covariates found in both \code{coh.data} and \code{ref.data}.
Variables to adjust by are supplied as character
strings of the names of variables to \code{adjust}. Variable names need to
match in both \code{coh.data} and \code{ref.data}. Typical variables to adjust by are
gender, age group and calendar period.

\code{print} is used to stratify the SIR output. In other words, the variables
assigned to \code{print} are the covariates of the Poisson model.
Variable levels are treated as categorical.
Variables can be assigned in both \code{print} and \code{adjust}.
This means the output is adjusted and printed by these variables.

\code{print} can also be a list of expressions. This allows changing variable
names or transforming variables with functions such as \code{cut} and \code{round}.
For example, the existing variables \code{agegroup} and \code{year} could be
transformed to new levels using \code{cut} by

\code{print = list( age.category = cut(agegroup, breaks = c(seq(0,85,5), 120)), year.cat = cut(year, seq(1950,2015,10)))}


\strong{ref.rate or ref.obs & ref.pyrs}

The population rate variable can be given to the \code{ref.rate} parameter.
That is, when using e.g. the \code{popmort} or a comparable data file, one may
supply \code{ref.rate} instead of \code{ref.obs} and \code{ref.pyrs}, which
will be ignored if \code{ref.rate} is supplied.


Note that if all the stratifying variables in
\code{ref.data} aren't listed in \code{adjust},
or when the categories are otherwise combined,
the (unweighted) mean of rates is used for computing expected cases.
This might incur a small bias in comparison to when exact numbers of observations
and person-years are available.
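
As a rough illustration of that bias (the numbers are hypothetical and not taken
from any data set), suppose two reference strata with 10 cases in 10000
person-years and 6 cases in 5000 person-years are combined. The unweighted mean
of the two rates differs slightly from the pooled rate that the exact counts
would give:
\deqn{\frac{0.0010 + 0.0012}{2} = 0.0011 \quad \textrm{versus} \quad \frac{10 + 6}{10000 + 5000} \approx 0.00107.}{(0.0010 + 0.0012)/2 = 0.0011 versus (10 + 6)/(10000 + 5000) = approx. 0.00107.}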



\strong{mstate}

E.g. with \code{lexpand} it's possible to compute counts for several outcomes
so that the population at risk is the same for each
outcome, such as a certain kind of cancer.
The transition counts are in wide data format,
and the relevant columns can be supplied to \code{sir}
in a vector via the \code{coh.obs} argument.
The name of the corresponding new column in \code{ref.data} is given in
\code{mstate}. It's recommended to include the \code{mstate} variable in \code{adjust},
so the corresponding information should also be available in \code{ref.data}.

This approach is analogous to calculating the SIRs separately in their
own function calls.


\strong{Other parameters}

The univariate multiple-comparison-adjusted p-value uses \code{\link[stats]{p.adjust}}.
Univariate confidence intervals are calculated using exact
Poisson intervals (\code{poisson.ci}). The multivariate result
is based on a Poisson regression model with profile-likelihood confidence intervals
when possible. Otherwise the Wald normal approximation is used.

The p-value is a test for the levels of \code{print}. The test can be either
\code{"homogeneity"}, a likelihood ratio test where the model with variable(s) in
\code{print} (categorical factor) is compared to the constant model.
Option \code{"trend"} is the same likelihood ratio test, except that the
variable(s) in \code{print} are treated as continuous.
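
For orientation, the standard form of such a likelihood ratio test is sketched
below. The symbols \eqn{G}, \eqn{\ell_1}{l_1} and \eqn{\ell_0}{l_0} are
illustrative notation only, and the exact computation inside \code{sir} is not
shown here. With \eqn{\ell_1}{l_1} the maximised log-likelihood of the model
containing \code{print} and \eqn{\ell_0}{l_0} that of the constant model,
\deqn{G = 2(\ell_1 - \ell_0),}{G = 2 * (l_1 - l_0),}
which is compared to a chi-squared distribution whose degrees of freedom equal
the difference in the number of parameters. Assuming a single \code{print}
variable with \eqn{k} levels, that difference is \eqn{k - 1} for
\code{"homogeneity"} and 1 for \code{"trend"}.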


\strong{EAR: Excess Absolute Risk}

A simple way to quantify the absolute difference between cohort risk and
population risk.
Make sure that the person-years are calculated accordingly before using EAR.

Formula for EAR:
\deqn{EAR = \frac{observed - expected}{person years} \times 1000.}{EAR = (obs - exp)/pyrs * 1000.}
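
As a worked example with hypothetical numbers (12 observed cases, an assumed 8
expected cases and 315 person-years):
\deqn{EAR = \frac{12 - 8}{315} \times 1000 \approx 12.7,}{EAR = (12 - 8)/315 * 1000 = approx. 12.7,}
i.e. roughly 12.7 excess cases per 1000 person-years.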

\strong{Data format}

The data should be given in aggregated format, i.e. the number of observations
and person-years are represented for each stratum.
The extra variables and levels are reduced automatically before estimating SIRs.
Example of data format:

\tabular{rrrrr}{
  sex \tab age \tab period \tab obs \tab pyrs \cr
  0 \tab 1 \tab 2010 \tab 0 \tab 390 \cr
  0 \tab 2 \tab 2010 \tab 5 \tab 385 \cr
  1 \tab 1 \tab 2010 \tab 3 \tab 308 \cr
  1 \tab 2 \tab 2010 \tab 12 \tab 315
}
}
\examples{
data(popmort)
data(sire)
c <- lexpand( sire, status = status, birth = bi_date, exit = ex_date, entry = dg_date,
              breaks = list(per = 1950:2013, age = 1:100, fot = c(0,10,20,Inf)),
              aggre = list(fot, agegroup = age, year = per, sex) )
## SMR due to other causes: status = 2
se <- sir( coh.data = c, coh.obs = 'from0to2', coh.pyrs = 'pyrs',
           ref.data = popmort, ref.rate = 'haz',
           adjust = c('agegroup', 'year', 'sex'), print = 'fot')
se
## for examples see: vignette('sir')
}
\author{
Matti Rantanen, Joonas Miettinen
}
\seealso{
\code{\link{plot.sir}}, \code{\link{lexpand}},
\href{../doc/sir.html}{A SIR calculation vignette}
}

