% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/perm.test.R
\name{perm.test}
\alias{perm.test}
\title{Permutation Test on Monothetic Tree}
\usage{
perm.test(
  object,
  data,
  auto.pick = FALSE,
  sig.val = 0.05,
  method = c("sw", "rl", "rn"),
  rep = 1000L,
  stat = c("f", "aw"),
  bon.adj = TRUE,
  ncores = 1L
)
}
\arguments{
\item{object}{The \code{MonoClust} object as the result of the clustering.}

\item{data}{The data set which is being clustered.}

\item{auto.pick}{Whether the algorithm stops when the p-value becomes larger
than \code{sig.val} or keeps testing and lets the researcher pick the final
splitting tree. Default value is \code{FALSE}.}

\item{sig.val}{Significance level used to decide when to stop splitting. This
option is ignored if \code{auto.pick = FALSE}, and is 0.05 by default when
\code{auto.pick = TRUE}.}

\item{method}{Permutation method: one of \code{"sw"} (simple-withhold,
default), \code{"rl"} (resplit-limit), or \code{"rn"} (resplit-nolimit). See
Details.}

\item{rep}{Number of permutations used to build the reference distribution of
the test statistic.}

\item{stat}{Statistic to use. Either \code{"f"} (Calinski-Harabasz pseudo-F;
Calinski and Harabasz, 1974) or \code{"aw"} (average silhouette width;
Rousseeuw, 1987).}

\item{bon.adj}{Whether to adjust the p-values for multiple testing using the
Bonferroni correction. See Details.}

\item{ncores}{Number of CPU cores to use for parallel processing.}
}
\value{
The same \code{MonoClust} object with an extra column (p-value), as well
as the \code{numofclusters} object if \code{auto.pick = TRUE}.
}
\description{
Tests the significance of each monothetic clustering split by permutation
methods. The "simple-withhold" method (\code{"sw"}) shuffles the observations
between the two groups, with the splitting variable withheld. The other two
methods shuffle the values of the splitting variable to create a new data set,
then either split again on that variable ("resplit-limit", \code{"rl"}) or use
all variables as splitting candidates ("resplit-nolimit", \code{"rn"}).
}
\details{
\subsection{Permutation Methods}{
\subsection{Simple-Withhold: Shuffle the observations between two proposed clusters}{

The \code{stat} values calculated from the shuffles form the reference
distribution used to find the p-value. Because the splitting variable that was
chosen is already the best in terms of reduction of inertia, that variable is
withheld from the distance matrix used in the permutation test.
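
As a rough illustration only (not the package's internal code), the idea can
be sketched in base R. The toy data, the helper \code{pseudo_f}, and the
p-value formula with the added 1 are assumptions made for this sketch. With a
single remaining variable and Euclidean distance, the pseudo-F coincides with
the one-way ANOVA F statistic, which is what the helper computes.

\preformatted{
set.seed(1)
# Toy data: x1 plays the role of the splitting variable and is withheld
toy <- data.frame(x1 = c(1:10, 21:30), x2 = c(rnorm(10), rnorm(10, 3)))
labels <- rep(1:2, each = 10)  # two proposed clusters from a split on x1
# Hypothetical helper: one-way ANOVA F on the remaining variable
pseudo_f <- function(y, g) anova(lm(y ~ factor(g)))[1, "F value"]
obs <- pseudo_f(toy$x2, labels)
# Shuffle observations between the two clusters to build the reference
perm <- replicate(999, pseudo_f(toy$x2, sample(labels)))
p_value <- (1 + sum(perm >= obs)) / (1 + length(perm))
}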
}

\subsection{Resplit-Limit: Shuffle splitting variable, split again on that variable}{

This method shuffles the values of the splitting variable while keeping the
other variables fixed to create a new data set, then the chosen \code{stat} is
calculated for each permutation and compared with the observed \code{stat}.
}

\subsection{Resplit-Nolimit: Shuffle splitting variable, split on all variables}{

Similar to Resplit-Limit, but all variables are splitting candidates.
}

}

\subsection{Bonferroni Correction}{

A hypothesis test that occurs lower in the monothetic clustering tree can have
its p-value corrected for the multiple tests that were performed before it in
order to reach that node. The formula is
\deqn{adj.p = unadj.p \times depth,}
where \eqn{depth} is 1 at the root node.
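
For instance, assuming an unadjusted p-value of 0.01 at a node of depth 3
(values chosen only for illustration), the formula gives
\deqn{adj.p = 0.01 \times 3 = 0.03.}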
}
}
\note{
This function uses \code{\link[foreach:foreach]{foreach::foreach()}} to facilitate parallel
processing. The permutation replicates are distributed across processes.
}
\examples{
library(cluster)
data(ruspini)
\donttest{
ruspini6sol <- MonoClust(ruspini, nclusters = 6)
ruspini6.p_value <- perm.test(ruspini6sol, data = ruspini, method = "sw",
                              rep = 1000)
ruspini6.p_value
}
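# A hedged sketch of other documented options: stop automatically at
# sig.val and use the resplit-limit method without the Bonferroni
# adjustment. The argument values are illustrative assumptions, not
# recommendations.
\donttest{
ruspini6.auto <- perm.test(ruspini6sol, data = ruspini, auto.pick = TRUE,
                           sig.val = 0.05, method = "rl", rep = 200,
                           bon.adj = FALSE)
ruspini6.auto
}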
}
\references{
Calinski, T. and Harabasz, J. (1974). "A dendrite method for cluster
analysis". In: \emph{Communications in Statistics} 3.1, pp. 1-27.

Rousseeuw, P. J. (1987). "Silhouettes: A graphical aid to the interpretation
and validation of cluster analysis". In: \emph{Journal of Computational and
Applied Mathematics} 20, pp. 53-65. ISSN: 03770427. DOI:
10.1016/0377-0427(87)90125-7.
}
