% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/ctgm-lfdr.R
\name{ctgm_lfdr}
\alias{ctgm_lfdr}
\title{Fitting Conditional Two-Groups Models on Unmasked P-Values}
\usage{
ctgm_lfdr(x, pvals, models, dist = beta_family(), type = c("over", "raw"),
  params0 = list(pix = NULL, mux = NULL), niter = 50, cr = "BIC",
  verbose = TRUE)
}
\arguments{
\item{x}{covariates (i.e. side-information). Should be compatible to \code{models}. See Details}

\item{pvals}{a vector of values in [0, 1]. P-values}

\item{models}{an object of class "\code{adapt_model}" or a list of objects of class "adapt_model". See Details}

\item{dist}{an object of class "\code{\link{gen_exp_family}}". \code{\link{beta_family}()} as default}

\item{type}{a character. Either "over" or "raw" indicating the type of local FDR estimates. See Details}

\item{params0}{a list in the form of list(pix = , mux = ). Initial values of pi(x) and mu(x). Both can be set as NULL}

\item{niter}{a positive integer. Number of EM iterations.}

\item{cr}{a string. The criterion for model selection with BIC as default. Also support AIC, AICC and HIC}

\item{verbose}{a logical values in the form of list(fit = , ms = ). Indicate whether the progress of model fitting and model selection is displayed}
}
\value{
\itemize{
\item{lfdr}{a vector of values in [0, 1]. Local FDR estimates of each hypothesis.}
\item{model}{an \code{adapt_model} object. The selected model if multiple models are provided.}
}
}
\description{
\code{ctgm_lfdr} computes the oracle local FDR estimate, by using all p-values without masking.
}
\details{
\code{ctgm_lfdr} implements the EM algorithm to fit pi(x) and mu(x) on unmasked p-values. Although it is not related to FDR control of AdaPT, it provides useful measures for post-hoc justification and other purposes.
For instance, one can use these local FDR estimates for prioritizing the hypotheses if strict FDR control is not required.

In contrast to \code{adapt}, \code{cytm_lfdr} does not guarantee FDR control unless the model is correctly specified. It is recommended to use \code{ctgm_lfdr} only when FDR control is not required.

\code{x} should have a type compatible to the fitting functions in \code{models}. For GLM and GAM, \code{x} should be a data.frame. For glmnet, \code{x} should be a matrix.

\code{models} could either be an \code{adapt_model} object, if a single model is used, or a list of \code{adapt_model} objects, each of which corresponding to a model. Each element should be generated by \code{\link{gen_adapt_model}}. For glm/gam/glmnet, one can use the shortcut by running \code{\link{gen_adapt_model}} with name = "glm" or "gam" or "glmnet" but without specifying \code{pifun}, \code{mufun}, \code{pifun_init} and \code{mufun_init}. See examples below.

When \code{type = "over"}, it yields a conservative estimate of local FDR
\deqn{lfdr(p) = (1 - \pi_{1} + \pi_{1}f_{1}(1)) / (1 - \pi_{1} + \pi_{1}f_{1}(p)).}
When \code{type = "raw"}, it yields the original local FDR.
\deqn{lfdr(p) = (1 - \pi_{1}) / (1 - \pi_{1} + \pi_{1}f_{1}(p)).}
The former is shown to be more stable and reliable because the weak identifiability in conditional mixture models.
}
\examples{
\donttest{
# Load estrogen data
data(estrogen)
pvals <- as.numeric(estrogen$pvals)
x <- data.frame(x = as.numeric(estrogen$ord_high))
dist <- beta_family()

# Subsample the data for convenience
inds <- (x$x <= 5000)
pvals <- pvals[inds]
x <- x[inds,,drop = FALSE]

# Generate models for function adapt
library("splines")
formulas <- paste0("ns(x, df = ", 6:10, ")")
models <- lapply(formulas, function(formula){
    piargs <- muargs <- list(formula = formula)
    gen_adapt_model(name = "glm", piargs = piargs, muargs = muargs)
})

# Run ctgm_lfdr with two types of lfdr estimates
res_over <- ctgm_lfdr(x, pvals, models, type = "over")
res_raw <- ctgm_lfdr(x, pvals, models, type = "raw")

# Compare two estimates
par(mfrow = c(2, 1))
hist(res_over$lfdr)
hist(res_raw$lfdr)
}

}
