% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/ot_indices.R
\name{ot_indices}
\alias{ot_indices}
\title{Calculate Optimal Transport sensitivity indices for multivariate y}
\usage{
ot_indices(
  x,
  y,
  M,
  cost = "L2",
  discrete_out = FALSE,
  solver = "sinkhorn",
  solver_optns = NULL,
  scaling = TRUE,
  boot = FALSE,
  stratified_boot = FALSE,
  R = NULL,
  parallel = "no",
  ncpus = 1,
  conf = 0.95,
  type = "norm"
)
}
\arguments{
\item{x}{A matrix or data.frame containing the input values. The values
can be numeric, factors, or strings. The type of data changes the
partitioning. If the values are continuous (double), the function
partitions the data into \code{M} sets. If the values are discrete (integers,
strings, factors), the number of partitioning sets is data-driven.}

\item{y}{A matrix containing the output values. Each column represents a
different output variable, and each row represents a different observation.
Only numeric values are allowed.}

\item{M}{A scalar representing the number of partitions for continuous
inputs.}

\item{cost}{(default \code{"L2"}) A string or function defining the cost function
of the Optimal Transport problem. It should be "L2" or a function taking as
input y and returning a cost matrix. If \code{cost="L2"}, \code{ot_indices} uses the
squared Euclidean metric.}

\item{discrete_out}{(default \code{FALSE}) Logical. By default, the output samples
in \code{y} are equally weighted. If \code{discrete_out=TRUE}, the function tries to
create a histogram of the realizations and to use the histogram as
weights. It works if the output is discrete or mixed and the number of
realizations is large. The advantage of this option is to reduce the
dimension of the cost matrix.}

\item{solver}{Solver for the Optimal Transport problem. Currently supported
options are:
\itemize{
\item \code{"sinkhorn"} (default), Sinkhorn's solver \insertCite{cuturi2013sinkhorn}{gsaot}.
\item \code{"sinkhorn_log"}, Sinkhorn's solver in log scale \insertCite{peyre2019computational}{gsaot}.
\item \code{"transport"}, a solver of the non-regularized OT problem using \code{\link[transport:transport]{transport::transport()}}.
}}

\item{solver_optns}{(optional) A list containing the options for the Optimal
Transport solver. See details for allowed options and default ones.}

\item{scaling}{(default \code{TRUE}) Logical that sets whether or not to scale the
cost matrix.}

\item{boot}{(default \code{FALSE}) Logical that sets whether or not to perform
bootstrapping of the OT indices.}

\item{stratified_boot}{(default \code{FALSE}) Logical that sets the type of
resampling performed. With \code{stratified_boot=FALSE}, the function resamples
the dataset and then creates the partitions. Otherwise, first, it
creates the partitions and then it performs stratified bootstrapping with
strata being the partitions.}

\item{R}{(default \code{NULL}) Positive integer, number of bootstrap replicas.}

\item{parallel}{(default \code{"no"}) The type of parallel operation to be used
(if any). If missing, the default is taken from the option \code{boot.parallel}
(and if that is not set, \code{"no"}). Only considered if \code{boot = TRUE}. For
more information, check the \code{\link[boot:boot]{boot::boot()}} function.}

\item{ncpus}{(default \code{1}) Positive integer: number of processes to be used
in parallel operation. Typically, one would set this to the number of
available CPUs. Check the \code{ncpus} option of the \code{\link[boot:boot]{boot::boot()}} function in
the boot package.}

\item{conf}{(default \code{0.95}) Number between \code{0} and \code{1} representing the
confidence level. Only considered if \code{boot = TRUE}.}

\item{type}{(default \code{"norm"}) Method to compute the confidence interval.
Only considered if \code{boot = TRUE}. For more information, check the \code{type}
option of \code{\link[boot:boot.ci]{boot::boot.ci()}}.}
}
\value{
A \code{gsaot_indices} object containing:
\itemize{
\item \code{method}: a string that identifies the type of indices computed.
\item \code{indices}: a named array containing the sensitivity indices between 0 and 1
for each column in x, indicating the influence of each input variable on
the output variables.
\item \code{bound}: a double representing the upper bound of the separation measure or
an array representing the mean of the separation for each input according
to the bootstrap replicas.
\item \code{x}, \code{y}: input and output data provided as arguments of the function.
\item \code{inner_statistic}: a list of matrices containing the values of the inner
statistics for the partitions defined by \code{partitions}. If \code{method = wasserstein-bures}, each matrix has three rows containing the
Wasserstein-Bures indices, the Advective, and the Diffusive components.
\item \code{partitions}: a matrix containing the partitions built to calculate the
sensitivity indices. Each column contains the partition associated to the
same column in \code{x}.
}

If \code{boot = TRUE}, the object also contains:
\itemize{
\item \code{indices_ci}: a \code{data.frame} whose first column is the input and whose
second and third columns are the lower and upper bounds of the confidence interval.
\item \code{inner_statistic_ci}: a list of matrices. Each element of the list contains
the lower and upper confidence bounds for the partition defined by the row.
\item \code{bound_ci}: a list containing the lower and upper bounds of the confidence
intervals of the separation measure bound.
\item \code{type}, \code{conf}: type of confidence interval and confidence level, provided
as arguments.
}
}
\description{
\code{ot_indices} calculates sensitivity indices using Optimal
Transport (OT) for a multivariate output sample \code{y} with respect to input
data \code{x}. Sensitivity indices measure the influence of inputs on outputs,
with values ranging between 0 and 1.
}
\details{
\subsection{Solvers}{

OT is a widely studied topic in Operational Research and Calculus. The
reference for the OT solvers in this package is
\insertCite{peyre2019computational;textual}{gsaot}. The default solver is
\code{"sinkhorn"}, Sinkhorn's solver introduced in
\insertCite{cuturi2013sinkhorn;textual}{gsaot}. It solves the
entropic-regularized version of the OT problem. The \code{"sinkhorn_log"} solver
solves the same OT problem in log scale. It is more stable for low values of
the regularization parameter but slower to converge. The option
\code{"transport"} selects a solver for the non-regularized OT
problem. Under the hood, the function calls \code{\link[transport:transport]{transport::transport()}} from
package \code{transport}. This option does not define the algorithm per se;
the algorithm is set through the argument \code{solver_optns}. See the next
section for more information.
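
As a minimal sketch of how the solver can be switched, the snippet below runs
the three options on simulated data. The data and the object names
(\code{res_sinkhorn}, \code{res_log}, \code{res_exact}) are illustrative
assumptions and not part of the package; runtime and convergence depend on
the data.

\preformatted{
library(gsaot)

# Simulated inputs and a two-dimensional output (illustrative only)
set.seed(1)
N <- 500
x <- data.frame(x1 = rnorm(N), x2 = rnorm(N))
y <- cbind(x$x1 + 0.5 * x$x2, x$x1 * x$x2)

# Entropic-regularized OT (default solver)
res_sinkhorn <- ot_indices(x, y, M = 10, solver = "sinkhorn")

# Same problem solved in log scale
res_log <- ot_indices(x, y, M = 10, solver = "sinkhorn_log")

# Non-regularized OT through the transport package
res_exact <- ot_indices(x, y, M = 10, solver = "transport")
}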
}

\subsection{Solver options}{

The argument \code{solver_optns} should be \code{NULL} (for default options) or a list
with all or some of the solver parameters. All the parameters not
included in the list are set to their default values. The solvers
\code{"sinkhorn"} and \code{"sinkhorn_log"} have the same options:
\itemize{
\item \code{numIterations} (default \code{1e3}): a positive integer defining the maximum number
of Sinkhorn iterations allowed. If the solver does not converge within this
number of iterations, it throws an error.
\item \code{epsilon} (default \code{0.01}): a positive real number defining the regularization
coefficient. If the value is too low, the solver may return \code{NA}.
\item \code{maxErr} (default \code{1e-9}): a positive real number defining the
approximation error threshold between the marginal histogram of the
partition and the one computed by the solver. The solver may fail to
converge in \code{numIterations} if this value is too low.
}
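
As a sketch, a partial list overrides only the named parameters and leaves
the others at their defaults. The values below are illustrative, not
recommendations, and the data and the name \code{opts} are assumptions made
for the example.

\preformatted{
library(gsaot)

# Simulated data (illustrative only)
set.seed(1)
N <- 500
x <- data.frame(x1 = rnorm(N), x2 = rnorm(N))
y <- cbind(x$x1 + 0.5 * x$x2, x$x1 * x$x2)

# Raise the iteration cap and the regularization coefficient;
# maxErr is not listed, so it keeps its default value
opts <- list(numIterations = 5e3, epsilon = 0.05)

res <- ot_indices(x, y, M = 10, solver = "sinkhorn_log", solver_optns = opts)
}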

The solver \code{"transport"} has the parameters:
\itemize{
\item \code{method} (default \code{"networkflow"}): string defining the solver of the OT
problem.
\item \code{control}: a named list of parameters for the chosen method or the result
of a call to \code{\link[transport:trcontrol]{transport::trcontrol()}}.
\item \code{threads} (default \code{1}): an integer specifying the number of threads used
in parallel computing.
}

For details regarding this solver, check the \code{\link[transport:transport]{transport::transport()}} help
page.
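
A minimal sketch of passing options to the \code{"transport"} solver is given
below. It spells out the documented defaults explicitly; the data and the
name \code{opts} are assumptions made for the example.

\preformatted{
library(gsaot)

# Simulated data (illustrative only)
set.seed(1)
N <- 500
x <- data.frame(x1 = rnorm(N), x2 = rnorm(N))
y <- cbind(x$x1 + 0.5 * x$x2, x$x1 * x$x2)

# Select the algorithm and the number of threads for transport::transport()
opts <- list(method = "networkflow", threads = 1)

res <- ot_indices(x, y, M = 10, solver = "transport", solver_optns = opts)
}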
}
}
\examples{
N <- 1000

mx <- c(1, 1, 1)
Sigmax <- matrix(data = c(1, 0.5, 0.5, 0.5, 1, 0.5, 0.5, 0.5, 1), nrow = 3)

x1 <- rnorm(N)
x2 <- rnorm(N)
x3 <- rnorm(N)

x <- cbind(x1, x2, x3)
x <- mx + x \%*\% chol(Sigmax)

A <- matrix(data = c(4, -2, 1, 2, 5, -1), nrow = 2, byrow = TRUE)
y <- t(A \%*\% t(x))

x <- data.frame(x)

M <- 25

# Calculate sensitivity indices
sensitivity_indices <- ot_indices(x, y, M)
sensitivity_indices
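
# The two sketches below reuse x, y and M defined above. They are wrapped in
# \donttest because they can be slow. The names l1_cost, indices_l1 and
# indices_boot are hypothetical and not part of gsaot.
\donttest{
# Custom cost: a function that takes y and returns a cost matrix.
# Here, pairwise Manhattan distances between the rows of y.
# Solver options may need tuning for costs other than "L2".
l1_cost <- function(y) as.matrix(dist(y, method = "manhattan"))
indices_l1 <- ot_indices(x, y, M, cost = l1_cost)
indices_l1

# Bootstrap confidence intervals with 100 replicas (illustrative value).
# With stratified_boot = TRUE the partitions are built first and then used
# as strata for the resampling.
indices_boot <- ot_indices(x, y, M,
                           boot = TRUE, stratified_boot = TRUE, R = 100,
                           conf = 0.95, type = "norm")
indices_boot
}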

}
\references{
\insertAllCited{}
}
\seealso{
\code{\link[=ot_indices_1d]{ot_indices_1d()}}, \code{\link[=ot_indices_wb]{ot_indices_wb()}}
}
