% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/stat-poly-eq.R
\name{stat_poly_eq}
\alias{stat_poly_eq}
\title{Equation, p-value, R^2, AIC or BIC of fitted polynomial}
\usage{
stat_poly_eq(
  mapping = NULL,
  data = NULL,
  geom = "text_npc",
  position = "identity",
  ...,
  formula = NULL,
  eq.with.lhs = TRUE,
  eq.x.rhs = NULL,
  coef.digits = 3,
  rr.digits = 2,
  f.digits = 3,
  p.digits = 3,
  label.x = "left",
  label.y = "top",
  label.x.npc = NULL,
  label.y.npc = NULL,
  hstep = 0,
  vstep = NULL,
  output.type = "expression",
  na.rm = FALSE,
  show.legend = FALSE,
  inherit.aes = TRUE
)
}
\arguments{
\item{mapping}{The aesthetic mapping, usually constructed with
\code{\link[ggplot2]{aes}} or \code{\link[ggplot2]{aes_}}. Only needs to be
set at the layer level if you are overriding the plot defaults.}

\item{data}{A layer specific dataset, only needed if you want to override
the plot defaults.}

\item{geom}{The geometric object to use to display the data.}

\item{position}{The position adjustment to use for overlapping points on this
layer}

\item{...}{other arguments passed on to \code{\link[ggplot2]{layer}}. This
can include aesthetics whose values you want to set, not map. See
\code{\link[ggplot2]{layer}} for more details.}

\item{formula}{A formula object, written using aesthetic names (\code{x} and
\code{y}) instead of the original variable names.}

\item{eq.with.lhs}{If \code{character}, the string is pasted to the front of
the equation label before parsing; alternatively a \code{logical} (see note).}

\item{eq.x.rhs}{\code{character} this string will be used as replacement for
\code{"x"} in the model equation when generating the label before parsing
it.}

\item{coef.digits, rr.digits, f.digits, p.digits}{integer Number of significant
digits to use for the fitted coefficients, R^2, F-value and P-value in labels.}

\item{label.x, label.y}{\code{numeric} with range 0..1 "normalized parent
coordinates" (npc units) or character if using \code{geom_text_npc()} or
\code{geom_label_npc()}. If using \code{geom_text()} or \code{geom_label()}
numeric in native data units. If too short they will be recycled.}

\item{label.x.npc, label.y.npc}{\code{numeric} with range 0..1 (npc units)
DEPRECATED, use label.x and label.y instead; together with a geom
using npcx and npcy aesthetics.}

\item{hstep, vstep}{numeric in npc units, the horizontal and vertical step
used between labels for different groups.}

\item{output.type}{character One of "expression", "LaTeX", "text",
"markdown" or "numeric".}

\item{na.rm}{a logical indicating whether NA values should be stripped before
the computation proceeds.}

\item{show.legend}{logical. Should this layer be included in the legends?
\code{NA}, the default, includes if any aesthetics are mapped. \code{FALSE}
never includes, and \code{TRUE} always includes.}

\item{inherit.aes}{If \code{FALSE}, overrides the default aesthetics, rather
than combining with them. This is most useful for helper functions that
define both data and aesthetics and shouldn't inherit behaviour from the
default plot specification, e.g. \code{\link[ggplot2]{borders}}.}
}
\description{
\code{stat_poly_eq} fits a polynomial and generates several labels including
the equation, p-value, coefficient of determination (R^2), 'AIC' and
'BIC'.
}
\details{
This stat can be used to automatically annotate a plot with R^2,
  adjusted R^2 or the fitted model equation. It supports only linear models
  fitted with function \code{lm()}. The R^2 and adjusted R^2 annotations can
  be used with any linear model formula. The fitted equation label is
  correctly generated for polynomials or quasi-polynomials through the
  origin. Model formulas can use \code{poly()} or be defined algebraically
  with terms of powers of increasing magnitude with no missing intermediate
  terms, except possibly for the intercept indicated by "- 1" or "-1" in the
  formula. The validity of the \code{formula} is not checked in the current
  implementation, and for this reason the default aesthetics set R^2 as the
  label for the annotation. This stat only generates labels; the predicted
  values need to be added to the plot separately. To make sure that the same
  model formula is used in all steps, it is best to save the formula as an
  object and supply this object as argument to the different statistics.

  A ggplot statistic receives as data a data frame that is not the one passed
  as argument by the user, but instead a data frame with the variables mapped
  to aesthetics. \code{stat_poly_eq()} mimics how \code{stat_smooth()} works, except that
  only polynomials can be fitted. In other words, it respects the grammar of
  graphics. This helps ensure that the model is fitted to the same data as
  plotted in other layers.
}
\note{
For backward compatibility a logical is accepted as argument for
  \code{eq.with.lhs}, giving the same output as the current default
  character value. By default "x" is retained as independent variable as this
  is the name of the aesthetic. However, it can be substituted by providing a
  suitable replacement character string through \code{eq.x.rhs}.
}
\section{Aesthetics}{
 \code{stat_poly_eq} understands \code{x} and \code{y},
  to be referenced in the \code{formula} and \code{weight} passed as argument
  to parameter \code{weights} of \code{lm()}. All three must be mapped to
  \code{numeric} variables. In addition, the aesthetics understood by the geom
  used (\code{"text_npc"} by default) are understood and grouping respected.
}

\section{Computed variables}{

If \code{output.type} is different from \code{"numeric"} the returned tibble contains
columns:
\describe{
  \item{x,npcx}{x position}
  \item{y,npcy}{y position}
  \item{coef.ls, r.squared, adj.r.squared, AIC, BIC}{as numeric values extracted from fit object}
  \item{eq.label}{equation for the fitted polynomial as a character string to be parsed}
  \item{rr.label}{\eqn{R^2} of the fitted model as a character string to be parsed}
  \item{adj.rr.label}{Adjusted \eqn{R^2} of the fitted model as a character string to be parsed}
  \item{f.value.label}{F value and degrees of freedom for the fitted model as a whole.}
  \item{p.value.label}{P-value for the F-value above.}
  \item{AIC.label}{AIC for the fitted model.}
  \item{BIC.label}{BIC for the fitted model.}
  \item{hjust, vjust}{Set to "inward" to override the default of the "text" geom.}}

If output.type is \code{"numeric"} the returned tibble contains columns:
\describe{
  \item{x,npcx}{x position}
  \item{y,npcy}{y position}
  \item{coef.ls}{list containing the "coefficients" matrix from the summary of the fit object}
  \item{r.squared, adj.r.squared, f.value, f.df1, f.df2, p.value, AIC, BIC}{numeric values extracted or computed from fit object}
  \item{hjust, vjust}{Set to "inward" to override the default of the "text" geom.}}

To explore the computed values returned for a given input we suggest the use
of \code{\link[gginnards]{geom_debug}} as shown in the example below.
}

\section{Parsing may be required}{
 If using the computed labels with
  \code{output.type = "expression"}, then \code{parse = TRUE} is needed,
  while if using \code{output.type = "LaTeX"}, \code{parse = FALSE} is needed.
}

\examples{
# generate artificial data
set.seed(4321)
x <- 1:100
y <- (x + x^2 + x^3) + rnorm(length(x), mean = 0, sd = mean(x^3) / 4)
my.data <- data.frame(x = x, y = y,
                      group = c("A", "B"),
                      y2 = y * c(0.5,2),
                      w = sqrt(x))

# give a name to a formula
formula <- y ~ poly(x, 3, raw = TRUE)

# no weights
ggplot(my.data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(formula = formula, parse = TRUE)

ggplot(my.data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(formula = formula, parse = TRUE,
               label.y = "bottom", label.x = "right")

ggplot(my.data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(formula = formula, parse = TRUE,
               label.y = 0.1, label.x = 0.9)

# using weights
ggplot(my.data, aes(x, y, weight = w)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(formula = formula, parse = TRUE)

# no weights, digits for R square
ggplot(my.data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(formula = formula, rr.digits = 4, parse = TRUE)

# user specified label
ggplot(my.data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(aes(label =  paste(stat(eq.label),
                                  stat(adj.rr.label), sep = "*\", \"*")),
               formula = formula, parse = TRUE)

ggplot(my.data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(aes(label =  paste(stat(f.value.label),
                                  stat(p.value.label), sep = "*\", \"*")),
               formula = formula, parse = TRUE)

# user specified label and digits
ggplot(my.data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(aes(label =  paste(stat(eq.label),
                                  stat(adj.rr.label), sep = "*\", \"*")),
               formula = formula, rr.digits = 3, coef.digits = 4,
               parse = TRUE)
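
# user specified left- and right-hand sides of the equation
# A minimal sketch of eq.with.lhs and eq.x.rhs as described under Arguments.
# The replacement strings below are illustrative choices, written in R's
# plotmath syntax because the label is parsed; "h" and "z" are made-up names.
ggplot(my.data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(aes(label = stat(eq.label)),
               eq.with.lhs = "italic(hat(y))~`=`~",
               formula = formula, parse = TRUE)

ggplot(my.data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(aes(label = stat(eq.label)),
               eq.with.lhs = "italic(h)~`=`~",
               eq.x.rhs = "~italic(z)",
               formula = formula, parse = TRUE)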

# geom = "text"
ggplot(my.data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(geom = "text", label.x = 100, label.y = 0, hjust = 1,
               formula = formula, parse = TRUE)
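
# grouping
# A sketch using the otherwise unused columns "group" and "y2" of my.data.
# The model is expected to be fitted separately to each group, with one
# label per group; how far apart the labels are placed is assumed to be
# controlled by hstep and vstep (left at their defaults here).
ggplot(my.data, aes(x, y2, colour = group)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(formula = formula, parse = TRUE)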

# using numeric values
# Here we use column "Estimate" from the matrix.
# Other available columns are "Std. Error", "t value" and "Pr(>|t|)".
my.format <-
  "b[0]~`=`~\%.3g*\", \"*b[1]~`=`~\%.3g*\", \"*b[2]~`=`~\%.3g*\", \"*b[3]~`=`~\%.3g"
ggplot(my.data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(formula = formula,
               output.type = "numeric",
               parse = TRUE,
               mapping =
                aes(label = sprintf(my.format,
                                    stat(coef.ls)[[1]][[1, "Estimate"]],
                                    stat(coef.ls)[[1]][[2, "Estimate"]],
                                    stat(coef.ls)[[1]][[3, "Estimate"]],
                                    stat(coef.ls)[[1]][[4, "Estimate"]])
                                    )
                   )

# Examples using geom_debug() to show computed values
#
# This provides a quick way of finding out which variables are available for
# use in mapping of aesthetics when using other geoms as in the examples
# above.

library(gginnards)

ggplot(my.data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(formula = formula, geom = "debug")

ggplot(my.data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(aes(label = stat(eq.label)),
               formula = formula, geom = "debug",
               output.type = "markdown")
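
# A sketch mirroring the example above, but requesting LaTeX-formatted
# labels. The labels are inspected with geom_debug() rather than drawn,
# as rendering them is assumed to need a device that understands LaTeX.
ggplot(my.data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(aes(label = stat(eq.label)),
               formula = formula, geom = "debug",
               output.type = "LaTeX")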

ggplot(my.data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(formula = formula, geom = "debug", output.type = "text")

ggplot(my.data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(formula = formula, geom = "debug", output.type = "numeric")

# show the content of a list column
ggplot(my.data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(formula = formula, geom = "debug", output.type = "numeric",
               summary.fun = function(x) {x[["coef.ls"]][[1]]})

}
\references{
Written as an answer to a question at Stack Overflow.
  \url{https://stackoverflow.com/questions/7549694/adding-regression-line-equation-and-r2-on-graph}
}
\seealso{
This \code{stat_poly_eq} statistic can return ready formatted labels
  depending on the argument passed to \code{output.type}. This is possible
  because only polynomial models are supported. For other types of models,
  statistics \code{\link{stat_fit_glance}} and \code{\link{stat_fit_tidy}}
  should be used instead and the code for
  construction of character strings from numeric values and their mapping to
  aesthetic \code{label} needs to be explicitly supplied in the call.

Other statistics for linear model fits: 
\code{\link{stat_fit_deviations}()},
\code{\link{stat_fit_residuals}()}
}
\concept{statistics for linear model fits}
