% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/replicate.weights.R
\name{replicate.weights}
\alias{replicate.weights}
\title{Replicate weights}
\usage{
replicate.weights(
  data,
  method = c("JKn", "dCV", "bootstrap", "subbootstrap", "BRR", "split", "extrapolation"),
  cluster = NULL,
  strata = NULL,
  weights = NULL,
  design = NULL,
  k = 10,
  R = 1,
  B = 200,
  train.prob = 0.7,
  method.split = c("dCV", "bootstrap", "subbootstrap"),
  rw.test = FALSE,
  dCV.sw.test = FALSE
)
}
\arguments{
\item{data}{A data frame containing (at least) the cluster and strata identifiers and the sampling weights. It can be \code{NULL} if the sampling design is given in the \code{design} argument (see \code{design}).}

\item{method}{A character string indicating the method used to define the replicate weights. One of \code{JKn}, \code{dCV}, \code{bootstrap}, \code{subbootstrap}, \code{BRR}, \code{split} or \code{extrapolation}.}

\item{cluster}{A character string indicating the name of the column with the cluster identifiers in the data frame given in \code{data}. It can be \code{NULL} if the sampling design is given in the \code{design} argument (see \code{design}).}

\item{strata}{A character string indicating the name of the column with the strata identifiers in the data frame given in \code{data}. It can be \code{NULL} if the sampling design is given in the \code{design} argument (see \code{design}).}

\item{weights}{A character string indicating the name of the column with the sampling weights in the data frame given in \code{data}. It can be \code{NULL} if the sampling design is given in the \code{design} argument (see \code{design}).}

\item{design}{An object of class \code{survey.design} generated by \code{survey::svydesign()}. It can be \code{NULL} if \code{cluster}, \code{strata}, \code{weights} and \code{data} are provided.}

\item{k}{A numeric value indicating the number of folds to be defined. Default is \code{k=10}. Only applies to the \code{dCV} method.}

\item{R}{A numeric value indicating the number of times the sample is partitioned. Default is \code{R=1}. Only applies to the \code{dCV}, \code{split} and \code{extrapolation} methods.}

\item{B}{A numeric value indicating the number of bootstrap resamples. Default is \code{B=200}. Only applies to the \code{bootstrap} and \code{subbootstrap} methods.}

\item{train.prob}{A numeric value between 0 and 1 indicating the proportion of clusters (for the \code{split} method) or strata (for the \code{extrapolation} method) to be assigned to the training sets. Default is \code{train.prob=0.7}. Only applies to the \code{split} and \code{extrapolation} methods.}

\item{method.split}{A character string indicating how the replicate weights should be defined in the \code{split} method. One of \code{dCV}, \code{bootstrap} or \code{subbootstrap}. Only applies to the \code{split} method.}

\item{rw.test}{A logical value. If \code{TRUE}, the output also includes the replicate weights for the corresponding test sets. If \code{FALSE}, only the replicate weights of the training sets are returned. Default is \code{rw.test = FALSE}.}

\item{dCV.sw.test}{A logical value. If \code{TRUE}, the original sampling weights for the units in the test sets are returned instead of the replicate weights. Default is \code{dCV.sw.test = FALSE}. Only applies to the \code{dCV} method.}
}
\value{
A new data frame containing the original data plus new columns, each of them holding the replicate weights for a different training or test subset.
}
\description{
Computes replicate weights for data collected under a sampling design with clusters, strata and sampling weights, using one of several methods.
}
\details{
Some of these methods (specifically \code{JKn}, \code{bootstrap}, \code{subbootstrap} and \code{BRR})
are already implemented in the \code{survey} R package and are available through the function
\code{as.svrepdesign()} (the method names are kept as in \code{as.svrepdesign()}).
For these options, \code{replicate.weights()} relies on that function to define the replicate weights.
In contrast, \code{dCV}, \code{split} and \code{extrapolation} have been implemented specifically
for this function.

Whichever method is selected, the object returned by this function is a new data frame
that adds new columns to the original data set, each of them containing the replicate
weights for a training subset (always) or a test subset (optionally, controlled by the argument \code{rw.test}).
The number of new columns and their names depend on the values of the arguments
and, in particular, on the replicate weights method selected. The columns for training and test sets
follow a similar naming structure for all methods. Specifically, the names of the training-set columns
have the form \code{rw_r_x_train_t}, where \code{x=1,...,R} indicates the \code{x}-th partition of the sample and
\code{t=1,...,T} the \code{t}-th training set. Similarly, the names of the test-set columns
have the form \code{rw_r_x_test_t} or \code{sw_r_x_test_t}, where \code{x} indicates the partition and \code{t}
the number of the test set. In addition, for some of the methods, the fold or set to which each unit
in the data set has been assigned in each partition is also recorded. This information is included as \code{fold_t} or \code{set_t},
depending on the method. See more detailed information below.
}
\examples{
data(simdata_lasso_binomial)

# JKn ---------------------------------------------------------------------
newdata <- replicate.weights(data = simdata_lasso_binomial,
                             method = "JKn",
                             cluster = "cluster",
                             strata = "strata",
                             weights = "weights",
                             rw.test = TRUE)

# dCV ---------------------------------------------------------------------
newdata <- replicate.weights(data = simdata_lasso_binomial,
                             method = "dCV",
                             cluster = "cluster",
                             strata = "strata",
                             weights = "weights",
                             k = 10, R = 20,
                             rw.test = TRUE)
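
# A possible way to inspect the dCV result, assuming the new columns follow
# the naming pattern described in Details (rw_r_x_train_t, rw_r_x_test_t).
# The object names train.cols and test.cols are only illustrative.
train.cols <- grep("^rw_r_[0-9]+_train_[0-9]+$", names(newdata), value = TRUE)
test.cols <- grep("^rw_r_[0-9]+_test_[0-9]+$", names(newdata), value = TRUE)
length(train.cols)
head(train.cols)
head(test.cols)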

# subbootstrap ------------------------------------------------------------
newdata <- replicate.weights(data = simdata_lasso_binomial,
                             method = "subbootstrap",
                             cluster = "cluster",
                             strata = "strata",
                             weights = "weights",
                             B = 100)

# BRR ---------------------------------------------------------------------
newdata <- replicate.weights(data = simdata_lasso_binomial,
                             method = "BRR",
                             cluster = "cluster",
                             strata = "strata",
                             weights = "weights",
                             rw.test = TRUE)

# split -------------------------------------------------------------------
newdata <- replicate.weights(data = simdata_lasso_binomial,
                             method = "split",
                             cluster = "cluster",
                             strata = "strata",
                             weights = "weights",
                             R = 20,
                             train.prob = 0.5,
                             method.split = "subbootstrap",
                             rw.test = TRUE)

# extrapolation -----------------------------------------------------------
newdata <- replicate.weights(data = simdata_lasso_binomial,
                             method = "extrapolation",
                             cluster = "cluster",
                             strata = "strata",
                             weights = "weights",
                             R = 20,
                             train.prob = 0.5,
                             rw.test = TRUE)
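
# design ------------------------------------------------------------------
# A minimal sketch, assuming the sampling design can be supplied through the
# design argument instead of cluster, strata and weights (see Arguments).
# The object name mydesign is only illustrative; data is set to NULL
# explicitly, as the documentation of the data argument allows.
mydesign <- survey::svydesign(ids = ~cluster,
                              strata = ~strata,
                              weights = ~weights,
                              nest = TRUE,
                              data = simdata_lasso_binomial)
newdata <- replicate.weights(data = NULL,
                             method = "JKn",
                             design = mydesign)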
}
