% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/a.R
\name{predict.fru}
\alias{predict.fru}
\title{Predict with the fru model}
\usage{
\method{predict}{fru}(object, x, votes = FALSE, threads = 0L, ...)
}
\arguments{
\item{object}{A model used for prediction; has to hold the forest (\code{forest=TRUE} flag passed to \code{fru}) to make predictions on new data, or has to have OOB scores (\code{oob=TRUE} flag passed to \code{fru}) to return OOB scores.}

\item{x}{Data frame to predict; if missing or \code{NULL}, the method will return OOB scores.}

\item{votes}{If set to \code{TRUE}, changes the output to sums of votes cast by the ensemble on each class; useful as a prediction confidence score, for instance for ROC analysis.
Only makes sense for classification; passing this flag together with a regression forest will throw an error.}

\item{threads}{Number of threads to use; by default, or when set to 0, fru will try to use all available computing cores.}

\item{...}{Ignored.}
}
\value{
By default (\code{votes=FALSE}), a vector with a prediction for each row of \code{x} or, when \code{x} is not given, an OOB prediction for each row of the original training data.
For \code{votes=TRUE}, a data frame with as many columns as decision classes, rows corresponding to the rows of \code{x} or of the training data, and cells holding the counts of votes for each class.
}
\description{
Either predicts given new data or returns the OOB predictions of the model; optionally, for classification forests, returns raw votes for each decision class.
}
\details{
If given, new data has to hold the same features as the training data, and the method will match them by name (order is irrelevant, additional features will be ignored); matched features have to be of the same type.
Moreover, factor features have to have exactly the same levels in the same order as in training; this will be checked.

Voting in the classification case may lead to ties, in which case \code{predict} will use a PRNG to resolve them.
In the OOB mode, a constant seed is used, so that OOB scores for the same forest model will always be the same, mimicking the behaviour of other packages, which usually calculate predictions during training and store them with ties resolved at that time.
For new data prediction, the PRNG is seeded from R's random state, so, in principle, ties will be resolved differently on each prediction.
If determinism is desired, it is best to use the votes output in which ties are evident.
Regression is performed using leaf averages, which is a deterministic process (not counting numerical issues possibly caused by nondeterministic order in which trees are produced when using multi-threading).

The OOB predictions may contain NAs when a given object was not an OOB object of any tree, which may happen for small ensembles (in particular, certainly when \code{trees=1}).
Similarly, the OOB votes for each object will not sum up to the ensemble size, although they will for new data prediction.

By the nature of the method, new data prediction for the training data is usually close to a perfect reproduction of the training decision, and so is of practically no use.

This method matches the input structure against the training data structure retained in the object, which may take some time; this matters especially when the data is large or when short prediction latency is required.
In that case, one may use the non-exported \code{unsafe_fru_predict} function, which expects \code{x} to be in exactly the same form as the training data, jumps straight to the compiled code and returns the predictions in the raw form (classes are level indices, vote matrix is unrolled, etc.).
}
\examples{
set.seed(1)
data(iris)
iris[c(TRUE,FALSE),]->iris_train
iris[c(FALSE,TRUE),]->iris_test
fru(iris_train[,-5],iris_train[,5],threads=2,forest=TRUE)->model
print(model)
table(predict(model,iris_test,threads=2),iris_test$Species)
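# The calls below are an illustrative sketch, not a definitive recipe; they
# rely only on arguments described on this page (votes, and the forest and
# oob flags of fru) and assume these behave as documented above.
# Vote counts instead of class labels; any ties are evident in this output
predict(model,iris_test,votes=TRUE,threads=2)->test_votes
head(test_votes)
# For new data, the votes in each row should sum up to the ensemble size
range(rowSums(test_votes))
# Features are matched by name, so reversed columns (including the extra
# Species column, which should be ignored) are expected to be accepted
predict(model,iris_test[,5:1],votes=TRUE,threads=2)->reversed_votes
head(reversed_votes)
# OOB predictions need a model trained with oob=TRUE; x is left out here
fru(iris_train[,-5],iris_train[,5],threads=2,forest=TRUE,oob=TRUE)->oob_model
table(predict(oob_model),iris_train$Species,useNA="ifany")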
}
