Title: Logic Forest
Version: 2.1.3
Depends: R (≥ 2.10)
Imports: LogicReg, methods, survival, utils
Suggests: data.table
Description: Logic Forest is an ensemble machine learning method that identifies important and interpretable combinations of binary predictors using logic regression trees to model complex relationships with an outcome. Wolf, B.J., Slate, E.H., Hill, E.G. (2010) <doi:10.1093/bioinformatics/btq354>.
License: GPL-3
Encoding: UTF-8
RoxygenNote: 7.3.1
NeedsCompilation: no
Packaged: 2026-02-13 19:08:09 UTC; nika02
Author: Bethany Wolf [aut], Melica Nikahd [ctb, cre], Andrew Gothard [ctb], Madison Hyer [ctb]
Maintainer: Melica Nikahd <melica.nikahd@osumc.edu>
Repository: CRAN
Date/Publication: 2026-02-13 19:30:08 UTC
Generate All Permutations of N Variables
Description
Creates a matrix representing all possible combinations of N binary variables. Each row corresponds to a unique combination, and each column represents a variable where 1 indicates inclusion and 0 indicates exclusion.
Usage
Perms(n)
Arguments
n: Integer. Number of variables to generate permutations for.
Details
This is an internal function called by TTab and is not intended for independent use.
Value
A matrix with 2^n rows and n columns, where each row is a unique permutation of 0s and 1s.
Author(s)
Bethany Wolf wolfb@musc.edu
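For illustration, an equivalent 2^n by n grid of 0/1 rows can be produced in base R (a sketch only, not the package's internal code; row ordering may differ):
n <- 3
perms <- as.matrix(expand.grid(rep(list(0:1), n)))
colnames(perms) <- paste0("V", 1:n)
dim(perms)   # 8 rows, 3 columns: one row per permutation of 0s and 1s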
Truth table
Description
Internal function to evaluate the importance of predictor combinations within a logic regression tree.
This function is called by prime.imp and is not intended to be used independently.
Usage
TTab(data, tree, Xs, mtype)
Arguments
data: A data frame containing input predictors.
tree: An object of class "logregtree".
Xs: A vector of predictor names corresponding to columns in data.
mtype: Model type: 1 = classification, 2 = linear regression, 3 = survival regression, 4 = other.
Details
Generates a matrix of all binary interactions contained in a single sample's logic regression tree. Only predictors that appear in the tree are included in the matrix. The resulting matrix can be used to evaluate the importance of specific predictor combinations.
Value
A matrix mat.truth of binary predictor values corresponding to the predictions of the logic tree.
Rows correspond to all permutations of predictors included in the tree that yield the "truth" outcome.
Author(s)
Bethany J. Wolf wolfb@musc.edu
References
Wolf BJ, Hill EG, Slate EH. Logic Forest: an ensemble classifier for discovering logical combinations of binary markers. Bioinformatics. 2010;26(17):2183-2189. doi:10.1093/bioinformatics/btq354
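The idea can be illustrated without a fitted tree (a conceptual sketch; here a hand-written Boolean rule, X1 AND NOT X2, stands in for a logregtree object):
Xs <- c("X1", "X2")
combos <- as.matrix(expand.grid(rep(list(0:1), length(Xs))))
colnames(combos) <- Xs
pred <- combos[, "X1"] == 1 & combos[, "X2"] == 0    # tree prediction for each permutation
mat.truth <- combos[pred, , drop = FALSE]            # rows yielding the "truth" outcome
mat.truth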
Building Interactions
Description
Builds the interactions found by a logic forest fit.
Usage
build.interactions(
fit,
test.data,
n_ints = NULL,
remove_negated = FALSE,
req_frequency = NULL
)
Arguments
fit: Fitted logic regression tree object containing outcome, model type, and logic tree information.
test.data: Any data set that contains the variables needed to create the interactions.
n_ints: Maximum number of interactions to build.
remove_negated: Whether to build interactions consisting only of negated PIs (TRUE/FALSE).
req_frequency: Minimum frequency required to build an interaction (between 0 and 1).
Details
This function creates, in the supplied data, the interaction variables identified by the logic forest.
Value
A data frame containing the input data together with the interactions built by the logic forest.
Author(s)
Andrew Gothard andrew.gothard@osumc.edu
References
Wolf BJ, Hill EG, Slate EH. Logic Forest: an ensemble classifier for discovering logical combinations of binary markers. Bioinformatics. 2010;26(17):2183–2189. doi:10.1093/bioinformatics/btq354
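A hedged usage sketch (the simulated data, chosen argument values, and object names below are purely illustrative):
set.seed(1)
dat <- as.data.frame(matrix(rbinom(200 * 10, 1, 0.5), ncol = 10))
colnames(dat) <- paste0("X", 1:10)
y <- rbinom(200, 1, plogis(-1 + dat$X1 + dat$X2))
lf <- logforest("bin", y, NULL, dat, nBS = 5, nleaves = 4, numout = 5)
int.dat <- build.interactions(lf, test.data = dat, n_ints = 5, req_frequency = 0.2)
head(int.dat)   # original columns plus the constructed interaction columns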
Find Complement of a Logic Regression Tree
Description
Constructs the complement of a given logic regression tree and computes the complement of its prime interactions (PIs).
Usage
find.ctree(tree)
Arguments
tree: An object of class "logregtree".
Details
This is an internal function called by pimp.import and is not intended
for independent use. It generates a new "logregtree" object where the
complements of the original tree's structure and PIs are calculated.
Value
An object of class "logregtree" that is the complement of tree.
Author(s)
Bethany Wolf wolfb@musc.edu
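The logical idea of a tree complement can be sketched with De Morgan's law (illustration only; this is not the find.ctree internals):
X1 <- c(0, 0, 1, 1)
X2 <- c(0, 1, 0, 1)
orig <- X1 == 1 & X2 == 1          # original tree: X1 AND X2
comp <- X1 == 0 | X2 == 0          # complement: (NOT X1) OR (NOT X2)
all(comp == !orig)                 # TRUE: the complement flips every prediction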
Evaluate Predicted Values for Logic Regression Trees
Description
Internal function to evaluate the importance of predictor combinations.
Prepares a data frame with responses, weights, censoring indicators, and evaluated predicted values for each tree in a fitted logic regression model.
It is called by predict.logreg2 and is not intended to be used independently.
Usage
frame.logreg2(fit, msz, ntr, newbin, newresp, newsep, newcens, newweight)
Arguments
fit: An object of class "logreg2".
msz: Maximum number of leaves on a tree (optional).
ntr: Number of trees in the fit (optional).
newbin: Binary matrix of predictors for new/out-of-sample data.
newresp: Vector of response values for new data.
newsep: Matrix of separate predictors for new data.
newcens: Vector of censoring indicators for new data (for survival models).
newweight: Optional vector of observation weights.
Details
This function constructs a data frame for evaluating predicted values from logic regression trees. It supports in-bag and out-of-sample data, handles optional censoring indicators, separate predictors, and observation weights. The resulting data frame contains columns for:
- Response variable
- Observation weights
- Censoring indicators (for survival models)
- Separate predictors (if applicable)
- Predicted values from each tree in the model
Value
A data.frame containing the response, weights, censoring indicators (if applicable), separate predictors, and evaluated predicted values for each tree.
Logic Forest & Logic Survival Forest
Description
Constructs an ensemble of logic regression models using bagging for classification or regression, and identifies important predictors and interactions. Logic Forest (LF) efficiently searches the space of logical combinations of binary variables using simulated annealing. It has been extended to support linear and survival regression.
Usage
logforest(
resp.type,
resp,
resp.time = data.frame(X = rep(1, nrow(resp))),
Xs,
nBSXVars,
anneal.params,
nBS = 100,
h = 0.5,
norm = TRUE,
numout = 5,
nleaves
)
Arguments
resp.type: String indicating regression type, e.g. "bin" for classification, "lin" for linear regression, or "exp_surv" for exponential survival (see Examples).
resp: Numeric vector of response values (binary for classification/survival, continuous for linear regression). For time-to-event, indicates event/censoring status.
resp.time: Numeric vector of event/censoring times (used only for survival models).
Xs: Matrix or data frame of binary predictor variables (0/1 only).
nBSXVars: Integer. Number of predictors sampled for each tree (default is all predictors).
anneal.params: A list of parameters for simulated annealing (see logreg.anneal.control).
nBS: Number of trees to fit in the logic forest.
h: Numeric. Minimum proportion of trees predicting "1" required to classify an observation as "1" (used for classification).
norm: Logical. If TRUE, variable importance scores are normalized so that the largest score is 1.
numout: Integer. Number of predictors and interactions to report.
nleaves: Integer. Maximum number of leaves (end nodes) allowed per tree.
Details
Logic Forest is designed to identify interactions between binary predictors without requiring their pre-specification. Using simulated annealing, it searches the space of all possible logical combinations (e.g., AND, OR, NOT) among predictors. Originally developed for binary outcomes in gene-environment interaction studies, it has since been extended to linear and time-to-event outcomes (Logic Survival Forest).
Value
A logforest object containing:
- Predictor.frequency
Frequency of each predictor across trees.
- Predictor.importance
Importance of each predictor.
- PI.frequency
Frequency of each interaction across trees.
- PI.importance
Importance of each interaction.
Note
Development of Logic Forest was supported by NIH/NCATS UL1RR029882. Logic Survival Forest development was supported by NIH/NIA R01AG082873.
Author(s)
Bethany J. Wolf wolfb@musc.edu
J. Madison Hyer madison.hyer@osumc.edu
References
Wolf BJ, Hill EG, Slate EH. (2010). Logic Forest: An ensemble classifier for discovering logical combinations of binary markers. Bioinformatics, 26(17):2183–2189. doi:10.1093/bioinformatics/btq354
Wolf BJ et al. (2012). LBoost: A boosting algorithm with application for epistasis discovery. PLoS One, 7(11):e47281. doi:10.1371/journal.pone.0047281
Hyer JM et al. (2019). Novel Machine Learning Approach to Identify Preoperative Risk Factors Associated With Super-Utilization of Medicare Expenditure Following Surgery. JAMA Surg, 154(11):1014–1021. doi:10.1001/jamasurg.2019.2979
See Also
pimp.import, logreg.anneal.control
Examples
## Not run:
set.seed(10051988)
N_c <- 50
N_r <- 200
init <- as.data.frame(matrix(0, nrow = N_r, ncol = N_c))
colnames(init) <- paste0("X", 1:N_c)
for(n in 1:N_c){
p <- runif(1, min = 0.2, max = 0.6)
init[,n] <- rbinom(N_r, 1, p)
}
X3X4int <- as.numeric(init$X3 == init$X4)
X5X6int <- as.numeric(init$X5 == init$X6)
y_p <- -2.5 + init$X1 + init$X2 + 2 * X3X4int + 2 * X5X6int
p <- 1 / (1 + exp(-y_p))
init$Y.bin <- rbinom(N_r, 1, p)
# Classification
LF.fit.bin <- logforest("bin", init$Y.bin, NULL, init[,1:N_c], nBS=10, nleaves=8, numout=10)
print(LF.fit.bin)
# Continuous
init$Y.cont <- rnorm(N_r, mean = 0) + init$X1 + init$X2 + 5 * X3X4int + 5 * X5X6int
LF.fit.lin <- logforest("lin", init$Y.cont, NULL, init[,1:N_c], nBS=10, nleaves=8, numout=10)
print(LF.fit.lin)
# Time-to-event
shape <- 1 - 0.05*init$X1 - 0.05*init$X2 - 0.2*init$X3*init$X4 - 0.2*init$X5*init$X6
scale <- 1.5 - 0.05*init$X1 - 0.05*init$X2 - 0.2*init$X3*init$X4 - 0.2*init$X5*init$X6
init$TIME_Y <- rgamma(N_r, shape = shape, scale = scale)
LF.fit.surv <- logforest("exp_surv", init$Y.bin, init$TIME_Y, init[,1:N_c],
nBS=10, nleaves=8, numout=10)
print(LF.fit.surv)
## End(Not run)
Generate All Combinations of N Variables with a Specified Conjunction Value
Description
Creates a matrix representing all possible combinations of n.pair variables.
Each row corresponds to a unique combination, and each column represents a variable
where 1 indicates inclusion and conj indicates exclusion.
Usage
p.combos(n.pair, conj = 0)
Arguments
n.pair: Integer. Number of predictors in the combination.
conj: Numeric. Value denoting absence of a variable in a combination (default is 0).
Details
This is an internal function called by prime.imp and is not intended for independent use.
Value
A matrix with 2^n.pair rows and n.pair columns, where each row is a unique combination of 1s and conj values.
Author(s)
Bethany Wolf wolfb@musc.edu
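For illustration, a grid with the same shape can be built in base R (a sketch, not the internal implementation; row ordering may differ):
n.pair <- 2
conj <- 0
combos <- as.matrix(expand.grid(rep(list(c(conj, 1)), n.pair)))
dim(combos)   # 2^n.pair rows, n.pair columns of 1s and conj values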
Predictor Importance – Variables and Interactions
Description
Calculates permutation-based importance measures for individual predictors and interactions within a logic regression tree in a logic forest.
Usage
pimp.import(fit, data, testdata, BSpred, pred, Xs, mtype)
Arguments
fit: Fitted logic regression tree object containing outcome, model type, and logic tree information.
data: In-bag sample (training data).
testdata: Out-of-bag sample (test data).
BSpred: Number of predictors included in the interactions (includes NOT-ed variables).
pred: Number of predictors in the model (used for constructing permuted matrices).
Xs: Matrix or data frame of 0/1 values representing all predictor variables.
mtype: Model type (see Details).
Details
This function calculates importance measures for each bootstrapped sample by comparing model fit between the original out-of-bag sample and a permuted out-of-bag sample. Model fit is evaluated using:
- Misclassification rate for classification models,
- Log2 mean squared error for linear regression,
- Harrell's C-index for survival regression (Cox-PH or exponential time-to-event models).
Value
A list with the following components:
- single.vimp
Vector of importance estimates for individual predictors.
- pimp.vimp
Vector of importance estimates for interactions (pimps).
- Ipimat
Matrix indicating which predictors (and NOT-ed predictors) are used in each interaction.
- vec.Xvars
Vector of predictor IDs used in the tree.
- Xids
Vector of predictor column indices corresponding to vec.Xvars.
Author(s)
Bethany J. Wolf wolfb@musc.edu
J. Madison Hyer madison.hyer@osumc.edu
References
Wolf BJ, Hill EG, Slate EH. Logic Forest: an ensemble classifier for discovering logical combinations of binary markers. Bioinformatics. 2010;26(17):2183–2189. doi:10.1093/bioinformatics/btq354
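The permutation step can be sketched generically for the classification case (a conceptual sketch, not the internal pimp.import code; f is a placeholder prediction function and oob/y a placeholder out-of-bag sample):
perm.importance <- function(f, oob, y, var) {
  err <- mean(f(oob) != y)                      # out-of-bag misclassification rate
  oob.perm <- oob
  oob.perm[[var]] <- sample(oob.perm[[var]])    # permute a single predictor
  err.perm <- mean(f(oob.perm) != y)
  err.perm - err                                # increase in error = importance
}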
Predictor Importance Matrix – Classification
Description
Internal function called by pimp.import to construct a binary/logical matrix
representing which predictors (columns) are used in each interaction (rows) of a sample.
Usage
pimp.mat.bin(pimps.out, testdata)
Arguments
pimps.out: R object containing the prime interactions (PIs) identified for a sample.
testdata: Data frame or matrix of out-of-bag (OOB) samples.
Details
Note: For regression models, see pimp.mat.nonbin which accommodates complements of logic trees.
Value
A list with the following components:
- pimp.names
Vector of predictor names.
- pimp.datamat
Logical matrix indicating which predictors (columns) are used in each interaction (rows).
Author(s)
Bethany J. Wolf wolfb@musc.edu
References
Wolf BJ, Hill EG, Slate EH. Logic Forest: an ensemble classifier for discovering logical combinations of binary markers. Bioinformatics. 2010;26(17):2183–2189. doi:10.1093/bioinformatics/btq354
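The kind of matrix returned can be illustrated by hand (a sketch only; the interaction labels and naming format used here are hypothetical):
pimp.names <- c("X1 & X3", "!X2 & X4")
Xs <- paste0("X", 1:4)
pimp.datamat <- t(sapply(strsplit(pimp.names, " & "),
                         function(p) Xs %in% gsub("!", "", p, fixed = TRUE)))
dimnames(pimp.datamat) <- list(pimp.names, Xs)
pimp.datamat   # TRUE where a predictor (column) appears in an interaction (row)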
Predictor Importance Matrix – Regression
Description
Internal function called by pimp.import to construct a binary/logical matrix
representing which predictors (columns) are used in each interaction (rows) of a sample.
Usage
pimp.mat.nonbin(pimps.out, testdata)
Arguments
pimps.out: R object containing the prime interactions (PIs) identified for a sample.
testdata: Data frame or matrix of out-of-bag (OOB) samples.
Details
Note: For classification models, see pimp.mat.bin.
Value
A list with the following components:
- pimp.names
Vector of predictor names.
- pimp.datamat
Logical matrix indicating which predictors (columns) are used in each interaction (rows).
Author(s)
Bethany J. Wolf wolfb@musc.edu
References
Wolf BJ, Hill EG, Slate EH. Logic Forest: an ensemble classifier for discovering logical combinations of binary markers. Bioinformatics. 2010;26(17):2183–2189. doi:10.1093/bioinformatics/btq354
Predict Outcomes Using a Logic Forest Model
Description
Computes predicted values for new observations or the out-of-bag (OOB) predictions
for a logic forest model fitted using logforest.
Usage
## S3 method for class 'logforest'
predict(object, newdata, cutoff, ...)
Arguments
object: An object of class "logforest".
newdata: A matrix or data frame of new predictor values. If omitted, predictions are made for the original data used to fit the model (OOB predictions).
cutoff: A numeric value between 0 and 1 specifying the minimum proportion of trees that must predict a class of 1 for the overall prediction to be 1. Ignored for non-classification models.
...: Additional arguments (currently ignored).
Details
For classification models, predictions are determined based on the cutoff proportion.
For regression or time-to-event models, the function returns predicted values and OOB statistics if newdata is not provided.
Value
An object of class "LFprediction" containing:
- LFprediction
Numeric vector of predicted responses.
- proportion_one
Numeric vector of the proportion of trees predicting class 1 (classification only).
- AllTrees
Matrix or data frame with predicted values from each tree, the proportion of trees predicting 1, and the overall predicted class (classification), or predicted values for regression/time-to-event models.
Author(s)
Bethany Wolf wolfb@musc.edu
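A hedged usage sketch, reusing the LF.fit.bin object and init data from the logforest examples earlier in this manual:
pred.oob <- predict(LF.fit.bin)                                   # OOB predictions for the training data
pred.new <- predict(LF.fit.bin, newdata = init[, 1:50], cutoff = 0.5)
print(pred.new)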
Predict Method for Logic Regression Objects (Internal)
Description
Internal function that evaluates the importance of predictor combinations and generates predictions from a fitted logic regression object.
Usage
## S3 method for class 'logreg2'
predict(object, msz, ntr, newbin, newsep, newcens, ...)
Arguments
object: An object of class "logreg2".
msz: Integer. Maximum number of leaves in a tree.
ntr: Integer. Number of trees in the fitted object.
newbin: Matrix containing binary predictor values for new data points.
newsep: Integer. Number of separate predictors in the new data.
newcens: Vector. Censoring indicator for survival data (if applicable).
...: Additional arguments (currently ignored).
Details
This function is typically called internally by other functions and is not intended for direct use by package users.
Depending on the model type (object$type), this function produces:
- Classification predictions (0/1) if type == "classification".
- Predicted probabilities if type == "logistic".
- Survival model predictions if type == "proportional.hazards".
Value
A numeric vector or matrix of predictions.
Extract Prime Variable Interactions from a Logic Regression Tree
Description
Internal function called by pimp.import.
It is not intended to be used independently.
Generates a list of all variables and variable interactions identified by a specific
logic regression tree within a logic forest or LBoost model.
Usage
prime.imp(tree, data, Xs, mtype)
Arguments
tree: An object of class "logregtree".
data: Data frame used to fit the logic forest.
Xs: A vector of predictor names corresponding to columns in data.
mtype: Model type (e.g., classification, linear regression, survival regression).
Details
This function constructs all possible interactions of the predictors contained in the tree, identifies those that contribute to a positive outcome ("prime interactions"), and returns information about which variables and interactions are included in each.
Value
An object of class "primeImp" with the following elements:
vec.primes: Character vector of variable interactions in logical format.
tmp.mat: Matrix of all binary interactions contained in the tree.
vec.pimpvars: Sorted vector of column indices in the data for the predictors included in the interactions.
list.pimps: List of vectors, each containing the indices of predictors involved in each interaction.
Author(s)
Bethany Wolf wolfb@musc.edu
Print Method for Logic Forest Predictions
Description
Displays predictions from a logic forest model, including the predicted classes and, for classification models, the proportion of trees predicting a class of one.
Usage
## S3 method for class 'LFprediction'
print(x, ...)
Arguments
x: An object of class "LFprediction".
...: Additional arguments (currently ignored).
Details
For classification models, this method prints the predicted classes for each observation and the proportion of trees in the logic forest that predict class 1. For linear regression models, it prints the predicted values and, if available, the out-of-bag mean squared error.
Value
No return value. This function is called for its side effects (printing).
Author(s)
Bethany Wolf wolfb@musc.edu
Print Method for Logic Forest Models
Description
Prints the most important predictors and interactions from a fitted logic forest model, along with their importance scores and frequency of occurrence.
Usage
## S3 method for class 'logforest'
print(x, sortby = "importance", ...)
Arguments
x: An object of class "logforest".
sortby: Character string specifying whether to sort the output by predictor importance or by frequency of occurrence; the default is "importance".
...: Additional arguments (currently ignored).
Details
This method displays a matrix of the top predictors and interactions from a logic forest model.
If x$norm = TRUE, the variable importance scores are normalized such that the largest
score is 1 and all other scores are scaled accordingly.
Value
No return value. This function is called for its side effect of printing.
Author(s)
Bethany Wolf wolfb@musc.edu
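A hedged usage sketch, again assuming the LF.fit.bin object from the logforest examples:
print(LF.fit.bin, sortby = "importance")   # sortby default shown in Usage above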
Proportion Positive Predictions
Description
Internal function used by predict.logforest to determine the proportion of logic regression trees
within a logic forest that predict a class of one for new observations.
It also returns the predicted class values based on a specified cutoff.
Usage
proportion.positive(predictmatrix, cutoff)
Arguments
predictmatrix: A matrix of predicted values from each tree (rows = observations, columns = trees).
cutoff: Numeric value specifying the proportion of trees that must predict a class of one for the overall prediction to be class one.
Details
This function is called internally by predict.logforest and is not intended for direct use.
It calculates, for each observation, the fraction of trees in the logic forest predicting a positive outcome,
and then assigns a predicted class based on whether this fraction meets or exceeds the cutoff.
Value
A list with:
predmat: A two-column matrix where the first column is the proportion of trees predicting class one for each observation, and the second column is the binary predicted class (0 or 1).
Note
This is a supplementary function and not intended to be used independently of the other functions in the package.
Author(s)
Bethany Wolf wolfb@musc.edu
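The underlying calculation can be sketched in a few lines (illustrative only; the simulated prediction matrix below stands in for the per-tree predictions):
set.seed(1)
predictmatrix <- matrix(rbinom(5 * 10, 1, 0.5), nrow = 5)   # 5 observations, 10 trees
cutoff <- 0.5
prop.one <- rowMeans(predictmatrix)              # fraction of trees predicting class one
pred.class <- as.numeric(prop.one >= cutoff)     # class one if the fraction meets the cutoff
cbind(proportion = prop.one, prediction = pred.class)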