% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/prob-roc_curve.R
\name{roc_curve}
\alias{roc_curve}
\alias{roc_curve.data.frame}
\title{Receiver operating characteristic (ROC) curve}
\usage{
roc_curve(data, ...)

\method{roc_curve}{data.frame}(
  data,
  truth,
  ...,
  na_rm = TRUE,
  event_level = yardstick_event_level(),
  case_weights = NULL,
  options = list()
)
}
\arguments{
\item{data}{A \code{data.frame} containing the columns specified by \code{truth} and
\code{...}.}

\item{...}{A set of unquoted column names or one or more
\code{dplyr} selector functions to choose which variables contain the
class probabilities. If \code{truth} is binary, only 1 column should be selected,
and it should correspond to the value of \code{event_level}. Otherwise, there
should be as many columns as factor levels of \code{truth} and the ordering of
the columns should be the same as the factor levels of \code{truth}.}

\item{truth}{The column identifier for the true class results
(that is a \code{factor}). This should be an unquoted column name although
this argument is passed by expression and supports
\link[rlang:topic-inject]{quasiquotation} (you can unquote column
names). For \verb{_vec()} functions, a \code{factor} vector.}

\item{na_rm}{A \code{logical} value indicating whether \code{NA}
values should be stripped before the computation proceeds.}

\item{event_level}{A single string. Either \code{"first"} or \code{"second"} to specify
which level of \code{truth} to consider as the "event". This argument is only
applicable when \code{truth} is binary; it is ignored for multiclass \code{truth}.
The default uses an internal helper that defaults to \code{"first"}.}

\item{case_weights}{The optional column identifier for case weights.
This should be an unquoted column name that evaluates to a numeric column
in \code{data}. For \verb{_vec()} functions, a numeric vector,
\code{\link[hardhat:importance_weights]{hardhat::importance_weights()}}, or \code{\link[hardhat:frequency_weights]{hardhat::frequency_weights()}}.}

\item{options}{\verb{[deprecated]}

No longer supported as of yardstick 1.0.0. If you pass something here it
will be ignored with a warning.

Previously, these were options passed on to \code{pROC::roc()}. If you need
support for this, use the pROC package directly.}
}
\value{
A tibble with class \code{roc_df} or \code{roc_grouped_df} having
columns \code{.threshold}, \code{specificity}, and \code{sensitivity}.
}
\description{
\code{roc_curve()} constructs the full ROC curve and returns a
tibble. See \code{\link[=roc_auc]{roc_auc()}} for the area under the ROC curve.
}
\details{
\code{roc_curve()} computes the sensitivity and specificity at every
unique value of the probability column (in addition to infinity and
minus infinity).

There is a \code{\link[ggplot2:autoplot]{ggplot2::autoplot()}} method for quickly visualizing the curve.
This works for binary and multiclass output, and also works with grouped
data (e.g., from resamples). See the examples.
}
\section{Multiclass}{


If a multiclass \code{truth} column is provided, a one-vs-all
approach will be taken to calculate multiple curves, one per level.
In this case, there will be an additional column, \code{.level},
identifying the "one" column in the one-vs-all calculation.
}

\section{Relevant Level}{


There is no common convention on which factor level should
automatically be considered the "event" or "positive" result
when computing binary classification metrics. In \code{yardstick}, the default
is to use the \emph{first} level. To alter this, change the argument
\code{event_level} to \code{"second"} to treat the \emph{last} level of the factor as the
level of interest. For multiclass extensions involving one-vs-all
comparisons (such as macro averaging), this option is ignored and
the "one" level is always the relevant result.
}

\examples{
# ---------------------------------------------------------------------------
# Two class example

# `truth` is a 2 level factor. The first level is `"Class1"`, which is the
# "event of interest" by default in yardstick. See the Relevant Level
# section above.
data(two_class_example)

# Binary metrics using class probabilities take a factor `truth` column,
# and a single class probability column containing the probabilities of
# the event of interest. Here, since `"Class1"` is the first level of
# `truth`, it is the event of interest and we pass in probabilities for it.
roc_curve(two_class_example, truth, Class1)
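
# To treat the second level, `"Class2"`, as the event instead, set
# `event_level = "second"` and pass the matching probability column.
# This assumes `Class2` holds the probabilities for the second level,
# as it does in `two_class_example`.
roc_curve(two_class_example, truth, Class2, event_level = "second")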

# ---------------------------------------------------------------------------
# `autoplot()`

# Visualize the curve using ggplot2 manually
library(ggplot2)
library(dplyr)
roc_curve(two_class_example, truth, Class1) \%>\%
  ggplot(aes(x = 1 - specificity, y = sensitivity)) +
  geom_path() +
  geom_abline(lty = 3) +
  coord_equal() +
  theme_bw()

# Or use autoplot
autoplot(roc_curve(two_class_example, truth, Class1))
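
# ---------------------------------------------------------------------------
# What one point on the curve represents

# A minimal base R sketch, assuming an observation is called the event when
# its probability is at or above the threshold. This is an illustration only,
# not the internal code of yardstick; `threshold`, `is_event`, `pred_event`,
# and `point` are hypothetical names chosen for this example.
threshold <- 0.5
is_event <- two_class_example$truth == "Class1"
pred_event <- two_class_example$Class1 >= threshold

point <- c(
  # Share of true events that are called events
  sensitivity = sum(pred_event & is_event) / sum(is_event),
  # Share of true non-events that are called non-events
  specificity = sum(!pred_event & !is_event) / sum(!is_event)
)
point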

\dontrun{

# Multiclass one-vs-all approach
# One curve per level
hpc_cv \%>\%
  filter(Resample == "Fold01") \%>\%
  roc_curve(obs, VF:L) \%>\%
  autoplot()

# Same as above, but with all of the resamples
hpc_cv \%>\%
  group_by(Resample) \%>\%
  roc_curve(obs, VF:L) \%>\%
  autoplot()
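
# With multiclass output, the `.level` column records which level was
# treated as the "one" in each one-vs-all curve. `fold01_curves` is a
# name chosen here for illustration.
fold01_curves <- roc_curve(hpc_cv[hpc_cv$Resample == "Fold01", ], obs, VF:L)
unique(fold01_curves$.level)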
}

}
\seealso{
Compute the area under the ROC curve with \code{\link[=roc_auc]{roc_auc()}}.

Other curve metrics: 
\code{\link{gain_curve}()},
\code{\link{lift_curve}()},
\code{\link{pr_curve}()}
}
\author{
Max Kuhn
}
\concept{curve metrics}
