% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/flxregbinom.R
\name{FLXMCregbinom}
\alias{FLXMCregbinom}
\title{FlexMix Driver for Regularized Binomial Mixtures}
\usage{
FLXMCregbinom(formula = . ~ ., size = NULL, hasNA = FALSE, alpha = 0, eps = 0)
}
\arguments{
\item{formula}{A formula which is interpreted relative to the
formula specified in the call to \code{\link[flexmix:flexmix]{flexmix::flexmix()}} using
\code{\link[stats:update.formula]{stats::update.formula()}}. Only the left-hand side (response)
of the formula is used. Default is to use the original model
formula specified in \code{\link[flexmix:flexmix]{flexmix::flexmix()}}.}

\item{size}{Number of trials (one or more). The default \code{NULL} means
that the number of trials is inferred column-wise from the
maximum observed value.}

\item{hasNA}{Logical indicating whether the data set may contain
\code{NA} values. Default is \code{FALSE}. For data sets without
\code{NA}s, both settings give the same results, but estimation runs
slightly faster when the absence of \code{NA}s can be assumed.}

\item{alpha}{A non-negative scalar acting as regularization
parameter. Can be regarded as adding \code{alpha} observations equal
to the population mean to each component.}

\item{eps}{A numeric value in [0, 1). When greater than zero,
probabilities are truncated to lie within [\code{eps}, 1-\code{eps}].}
}
\value{
An object of class \code{"FLXC"}.
}
\description{
This model driver can be used to cluster data using mixtures of
binomial distributions, with optional regularization of the
estimated success probabilities.
}
\details{
Using a regularization parameter \code{alpha} greater than zero can be
viewed as adding \code{alpha} observations equal to the population mean
to each component. This can be used to avoid degenerate solutions
(i.e., probabilities of 0 or 1). It also makes the clusters more
similar to each other the larger \code{alpha} is chosen. For small
values of \code{alpha}, however, this effect is mostly negligible.

Parameter estimation is achieved using the MAP estimator for each
component and variable using a Beta prior.
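
As a sketch of this interpretation (the notation is introduced here for
illustration only, and the exact internal parameterization is an
assumption): for a component with posterior weights \eqn{w_i},
observations \eqn{x_{ij}}, \eqn{n_j} trials for variable \eqn{j}, and
overall mean \eqn{\bar{x}_j}, adding \code{alpha} pseudo-observations
equal to the population mean would give
\deqn{\hat{p}_j = \frac{\sum_i w_i x_{ij} + \alpha \bar{x}_j}{n_j \left(\sum_i w_i + \alpha\right)},}
which reduces to the weighted maximum likelihood estimate for
\code{alpha = 0} and moves towards \eqn{\bar{x}_j / n_j} as \code{alpha}
grows.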
}
\examples{
library("flexmix")
library("flexord")
library("flexclust")

# Sample data
k <- 4     # nr of clusters
size <- 4  # nr of trials
N <- 100   # obs. per cluster

set.seed(0xdeaf)

# random probabilities per component
probs <- lapply(seq_len(k), \(ki) runif(10, 0.01, 0.99))

# sample data
dat <- lapply(probs, \(p) {
    lapply(p, \(p_i) {
        rbinom(N, size, p_i)
    }) |> do.call(cbind, args=_)
}) |> do.call(rbind, args=_)

# true cluster membership: N consecutive observations per cluster
true_clusters <- rep(seq_len(k), each = N)

# Cluster without regularization
m1 <- stepFlexmix(dat~1, model=FLXMCregbinom(size=size, alpha=0), k=k)

# Cluster with regularization
m2 <- stepFlexmix(dat~1, model=FLXMCregbinom(size=size, alpha=1), k=k)

# Both models are mostly able to reconstruct the true clusters (ARI ~ 0.96)
# (it's a very easy clustering problem)
# Small values for the regularization don't seem to affect the ARI (much)
randIndex(clusters(m1), true_clusters)
randIndex(clusters(m2), true_clusters)
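
# Rough base-R sketch of the two safeguards described above. It assumes that
# regularization adds `alpha` pseudo-observations equal to the population
# mean, as stated in Details; it is not necessarily the exact computation
# performed inside FLXMCregbinom(). All *_toy names are hypothetical and
# exist only for this illustration.
x_toy <- c(0, 0, 0, 0, 0)   # one variable, only zeros observed in a component
size_toy <- 4               # number of trials
pop_mean_toy <- 1.5         # assumed overall mean count of this variable
alpha_toy <- 1              # regularization strength

# Unregularized estimate: degenerate probability of 0
p_ml <- sum(x_toy) / (size_toy * length(x_toy))

# Regularized estimate: pulled away from 0 towards pop_mean_toy / size_toy
p_reg <- (sum(x_toy) + alpha_toy * pop_mean_toy) /
    (size_toy * (length(x_toy) + alpha_toy))
c(ml = p_ml, regularized = p_reg)

# Truncation as described for `eps`: clamp probabilities to [eps, 1 - eps]
eps_toy <- 0.01
p_raw <- c(0, 0.3, 1)
p_trunc <- pmin(pmax(p_raw, eps_toy), 1 - eps_toy)
p_trunc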
}
\references{
\itemize{
\item Ernst, D, Ortega Menjivar, L, Scharl, T, Grün, B
(2025).  \emph{Ordinal Clustering with the flex-Scheme.} Austrian
Journal of Statistics. \emph{Submitted manuscript}.
}
}
