Title: Logic Forest
Version: 2.1.3
Depends: R (≥ 2.10)
Imports: LogicReg, methods, survival, utils
Suggests: data.table
Description: Logic Forest is an ensemble machine learning method that identifies important and interpretable combinations of binary predictors using logic regression trees to model complex relationships with an outcome. Wolf, B.J., Slate, E.H., Hill, E.G. (2010) <doi:10.1093/bioinformatics/btq354>.
License: GPL-3
Encoding: UTF-8
RoxygenNote: 7.3.1
NeedsCompilation: no
Packaged: 2026-02-13 19:08:09 UTC; nika02
Author: Bethany Wolf [aut], Melica Nikahd [ctb, cre], Andrew Gothard [ctb], Madison Hyer [ctb]
Maintainer: Melica Nikahd <melica.nikahd@osumc.edu>
Repository: CRAN
Date/Publication: 2026-02-13 19:30:08 UTC
Generate All Permutations of N Variables
Description
Creates a matrix representing all possible combinations of N binary variables. Each row corresponds to a unique combination, and each column represents a variable where 1 indicates inclusion and 0 indicates exclusion.
Usage
Perms(n)
Arguments
n: Integer. Number of variables to generate permutations for.
Details
This is an internal function called by TTab and is not intended for independent use.
Value
A matrix with 2^n rows and n columns, where each row is a unique permutation of 0s and 1s.
Author(s)
Bethany Wolf wolfb@musc.edu
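For illustration, an equivalent 2^n by n grid of 0/1 rows can be produced in base R (a sketch only, not the package's internal code; row ordering may differ):
n <- 3
perms <- as.matrix(expand.grid(rep(list(0:1), n)))
colnames(perms) <- paste0("V", 1:n)
dim(perms)   # 8 rows, 3 columns: one row per permutation of 0s and 1s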
Truth table
Description
Internal function to evaluate the importance of predictor combinations within a logic regression tree.
This function is called by prime.imp and is not intended to be used independently.
Usage
TTab(data, tree, Xs, mtype)
Arguments
data: A data frame containing input predictors.
tree: An object of class "logregtree".
Xs: A vector of predictor names corresponding to columns in data.
mtype: Model type: 1 = classification, 2 = linear regression, 3 = survival regression, 4 = other.
Details
Generates a matrix of all binary interactions contained in a single sample's logic regression tree. Only predictors that appear in the tree are included in the matrix. The resulting matrix can be used to evaluate the importance of specific predictor combinations.
Value
A matrix mat.truth of binary predictor values corresponding to the predictions of the logic tree.
Rows correspond to all permutations of predictors included in the tree that yield the "truth" outcome.
Author(s)
Bethany J. Wolf wolfb@musc.edu
References
Wolf BJ, Hill EG, Slate EH. Logic Forest: an ensemble classifier for discovering logical combinations of binary markers. Bioinformatics. 2010;26(17):2183-2189. doi:10.1093/bioinformatics/btq354
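The idea can be illustrated without a fitted tree (a conceptual sketch; here a hand-written Boolean rule, X1 AND NOT X2, stands in for a logregtree object):
Xs <- c("X1", "X2")
combos <- as.matrix(expand.grid(rep(list(0:1), length(Xs))))
colnames(combos) <- Xs
pred <- combos[, "X1"] == 1 & combos[, "X2"] == 0    # tree prediction for each permutation
mat.truth <- combos[pred, , drop = FALSE]            # rows yielding the "truth" outcome
mat.truth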
Building Interactions
Description
Builds the interactions found by a logic forest fit.
Usage
build.interactions(
fit,
test.data,
n_ints = NULL,
remove_negated = FALSE,
req_frequency = NULL
)
Arguments
fit: Fitted logic regression tree object containing outcome, model type, and logic tree information.
test.data: Any data set that contains the variables needed to create the interactions.
n_ints: Maximum number of interactions to build.
remove_negated: Whether to build interactions consisting only of negated PIs (TRUE/FALSE).
req_frequency: Minimum frequency required to build an interaction (between 0 and 1).
Details
This function creates, in the supplied data, the interaction variables identified by the logic forest.
Value
A data frame containing the input data together with the interactions built by the logic forest.
Author(s)
Andrew Gothard andrew.gothard@osumc.edu
References
Wolf BJ, Hill EG, Slate EH. Logic Forest: an ensemble classifier for discovering logical combinations of binary markers. Bioinformatics. 2010;26(17):2183–2189. doi:10.1093/bioinformatics/btq354
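A hedged usage sketch (the simulated data, chosen argument values, and object names below are purely illustrative):
set.seed(1)
dat <- as.data.frame(matrix(rbinom(200 * 10, 1, 0.5), ncol = 10))
colnames(dat) <- paste0("X", 1:10)
y <- rbinom(200, 1, plogis(-1 + dat$X1 + dat$X2))
lf <- logforest("bin", y, NULL, dat, nBS = 5, nleaves = 4, numout = 5)
int.dat <- build.interactions(lf, test.data = dat, n_ints = 5, req_frequency = 0.2)
head(int.dat)   # original columns plus the constructed interaction columns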
Find Complement of a Logic Regression Tree
Description
Constructs the complement of a given logic regression tree and computes the complement of its prime interactions (PIs).
Usage
find.ctree(tree)
Arguments
tree: An object of class "logregtree".
Details
This is an internal function called by pimp.import and is not intended
for independent use. It generates a new "logregtree" object where the
complements of the original tree's structure and PIs are calculated.
Value
An object of class "logregtree" that is the complement of tree.
Author(s)
Bethany Wolf wolfb@musc.edu
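The logical idea of a tree complement can be sketched with De Morgan's law (illustration only; this is not the find.ctree internals):
X1 <- c(0, 0, 1, 1)
X2 <- c(0, 1, 0, 1)
orig <- X1 == 1 & X2 == 1          # original tree: X1 AND X2
comp <- X1 == 0 | X2 == 0          # complement: (NOT X1) OR (NOT X2)
all(comp == !orig)                 # TRUE: the complement flips every prediction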
Evaluate Predicted Values for Logic Regression Trees
Description
Internal function to evaluate the importance of predictor combinations.
Prepares a data frame with responses, weights, censoring indicators, and evaluated predicted values for each tree in a fitted logic regression model.
It is called by predict.logreg2 and is not intended to be used independently.
Usage
frame.logreg2(fit, msz, ntr, newbin, newresp, newsep, newcens, newweight)
Arguments
fit: An object of class "logreg2".
msz: Maximum number of leaves on a tree (optional).
ntr: Number of trees in the fit (optional).
newbin: Binary matrix of predictors for new/out-of-sample data.
newresp: Vector of response values for new data.
newsep: Matrix of separate predictors for new data.
newcens: Vector of censoring indicators for new data (for survival models).
newweight: Optional vector of observation weights.
Details
This function constructs a data frame for evaluating predicted values from logic regression trees. It supports in-bag and out-of-sample data, handles optional censoring indicators, separate predictors, and observation weights. The resulting data frame contains columns for:
- Response variable
- Observation weights
- Censoring indicators (for survival models)
- Separate predictors (if applicable)
- Predicted values from each tree in the model
Value
A data.frame containing the response, weights, censoring indicators (if applicable), separate predictors, and evaluated predicted values for each tree.
Logic Forest & Logic Survival Forest
Description
Constructs an ensemble of logic regression models using bagging for classification or regression, and identifies important predictors and interactions. Logic Forest (LF) efficiently searches the space of logical combinations of binary variables using simulated annealing. It has been extended to support linear and survival regression.
Usage
logforest(
resp.type,
resp,
resp.time = data.frame(X = rep(1, nrow(resp))),
Xs,
nBSXVars,
anneal.params,
nBS = 100,
h = 0.5,
norm = TRUE,
numout = 5,
nleaves
)
Arguments
resp.type: String indicating regression type, e.g. "bin" for classification, "lin" for linear regression, or "exp_surv" for exponential survival (see Examples).
resp: Numeric vector of response values (binary for classification/survival, continuous for linear regression). For time-to-event, indicates event/censoring status.
resp.time: Numeric vector of event/censoring times (used only for survival models).
Xs: Matrix or data frame of binary predictor variables (0/1 only).
nBSXVars: Integer. Number of predictors sampled for each tree (default is all predictors).
anneal.params: A list of parameters for simulated annealing (see logreg.anneal.control).
nBS: Number of trees to fit in the logic forest.
h: Numeric. Minimum proportion of trees predicting "1" required to classify an observation as "1" (used for classification).
norm: Logical. If TRUE, variable importance scores are normalized so that the largest score is 1.
numout: Integer. Number of predictors and interactions to report.
nleaves: Integer. Maximum number of leaves (end nodes) allowed per tree.
Details
Logic Forest is designed to identify interactions between binary predictors without requiring their pre-specification. Using simulated annealing, it searches the space of all possible logical combinations (e.g., AND, OR, NOT) among predictors. Originally developed for binary outcomes in gene-environment interaction studies, it has since been extended to linear and time-to-event outcomes (Logic Survival Forest).
Value
A logforest object containing:
- Predictor.frequency
Frequency of each predictor across trees.
- Predictor.importance
Importance of each predictor.
- PI.frequency
Frequency of each interaction across trees.
- PI.importance
Importance of each interaction.
Note
Development of Logic Forest was supported by NIH/NCATS UL1RR029882. Logic Survival Forest development was supported by NIH/NIA R01AG082873.
Author(s)
Bethany J. Wolf wolfb@musc.edu
J. Madison Hyer madison.hyer@osumc.edu
References
Wolf BJ, Hill EG, Slate EH. (2010). Logic Forest: An ensemble classifier for discovering logical combinations of binary markers. Bioinformatics, 26(17):2183–2189. doi:10.1093/bioinformatics/btq354
Wolf BJ et al. (2012). LBoost: A boosting algorithm with application for epistasis discovery. PLoS One, 7(11):e47281. doi:10.1371/journal.pone.0047281
Hyer JM et al. (2019). Novel Machine Learning Approach to Identify Preoperative Risk Factors Associated With Super-Utilization of Medicare Expenditure Following Surgery. JAMA Surg, 154(11):1014–1021. doi:10.1001/jamasurg.2019.2979
See Also
pimp.import, logreg.anneal.control
Examples
## Not run:
set.seed(10051988)
N_c <- 50
N_r <- 200
init <- as.data.frame(matrix(0, nrow = N_r, ncol = N_c))
colnames(init) <- paste0("X", 1:N_c)
for(n in 1:N_c){
p <- runif(1, min = 0.2, max = 0.6)
init[,n] <- rbinom(N_r, 1, p)
}
X3X4int <- as.numeric(init$X3 == init$X4)
X5X6int <- as.numeric(init$X5 == init$X6)
y_p <- -2.5 + init$X1 + init$X2 + 2 * X3X4int + 2 * X5X6int
p <- 1 / (1 + exp(-y_p))
init$Y.bin <- rbinom(N_r, 1, p)
# Classification
LF.fit.bin <- logforest("bin", init$Y.bin, NULL, init[,1:N_c], nBS=10, nleaves=8, numout=10)
print(LF.fit.bin)
# Continuous
init$Y.cont <- rnorm(N_r, mean = 0) + init$X1 + init$X2 + 5 * X3X4int + 5 * X5X6int
LF.fit.lin <- logforest("lin", init$Y.cont, NULL, init[,1:N_c], nBS=10, nleaves=8, numout=10)
print(LF.fit.lin)
# Time-to-event
shape <- 1 - 0.05*init$X1 - 0.05*init$X2 - 0.2*init$X3*init$X4 - 0.2*init$X5*init$X6
scale <- 1.5 - 0.05*init$X1 - 0.05*init$X2 - 0.2*init$X3*init$X4 - 0.2*init$X5*init$X6
init$TIME_Y <- rgamma(N_r, shape = shape, scale = scale)
LF.fit.surv <- logforest("exp_surv", init$Y.bin, init$TIME_Y, init[,1:N_c],
nBS=10, nleaves=8, numout=10)
print(LF.fit.surv)
## End(Not run)
Generate All Combinations of N Variables with a Specified Conjunction Value
Description
Creates a matrix representing all possible combinations of n.pair variables.
Each row corresponds to a unique combination, and each column represents a variable
where 1 indicates inclusion and conj indicates exclusion.
Usage
p.combos(n.pair, conj = 0)
Arguments
n.pair: Integer. Number of predictors in the combination.
conj: Numeric. Value denoting absence of a variable in a combination (default is 0).
Details
This is an internal function called by prime.imp and is not intended for independent use.
Value
A matrix with 2^n.pair rows and n.pair columns, where each row is a unique combination of 1s and conj values.
Author(s)
Bethany Wolf wolfb@musc.edu
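For illustration, a grid with the same shape can be built in base R (a sketch, not the internal implementation; row ordering may differ):
n.pair <- 2
conj <- 0
combos <- as.matrix(expand.grid(rep(list(c(conj, 1)), n.pair)))
dim(combos)   # 2^n.pair rows, n.pair columns of 1s and conj values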
Predictor Importance – Variables and Interactions
Description
Calculates permutation-based importance measures for individual predictors and interactions within a logic regression tree in a logic forest.
Usage
pimp.import(fit, data, testdata, BSpred, pred, Xs, mtype)
Arguments
fit: Fitted logic regression tree object containing outcome, model type, and logic tree information.
data: In-bag sample (training data).
testdata: Out-of-bag sample (test data).
BSpred: Number of predictors included in the interactions (includes NOT-ed variables).
pred: Number of predictors in the model (used for constructing permuted matrices).
Xs: Matrix or data frame of 0/1 values representing all predictor variables.
mtype: Model type (see Details).
Details
This function calculates importance measures for each bootstrapped sample by comparing model fit between the original out-of-bag sample and a permuted out-of-bag sample. Model fit is evaluated using:
- Misclassification rate for classification models,
- Log2 mean squared error for linear regression,
- Harrell's C-index for survival regression (Cox-PH or exponential time-to-event models).
Value
A list with the following components:
- single.vimp
Vector of importance estimates for individual predictors.
- pimp.vimp
Vector of importance estimates for interactions (pimps).
- Ipimat
Matrix indicating which predictors (and NOT-ed predictors) are used in each interaction.
- vec.Xvars
Vector of predictor IDs used in the tree.
- Xids
Vector of predictor column indices corresponding to vec.Xvars.
Author(s)
Bethany J. Wolf wolfb@musc.edu
J. Madison Hyer madison.hyer@osumc.edu
References
Wolf BJ, Hill EG, Slate EH. Logic Forest: an ensemble classifier for discovering logical combinations of binary markers. Bioinformatics. 2010;26(17):2183–2189. doi:10.1093/bioinformatics/btq354
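The permutation step can be sketched generically for the classification case (a conceptual sketch, not the internal pimp.import code; f is a placeholder prediction function and oob/y a placeholder out-of-bag sample):
perm.importance <- function(f, oob, y, var) {
  err <- mean(f(oob) != y)                      # out-of-bag misclassification rate
  oob.perm <- oob
  oob.perm[[var]] <- sample(oob.perm[[var]])    # permute a single predictor
  err.perm <- mean(f(oob.perm) != y)
  err.perm - err                                # increase in error = importance
}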
Predictor Importance Matrix – Classification
Description
Internal function called by pimp.import to construct a binary/logical matrix
representing which predictors (columns) are used in each interaction (rows) of a sample.
Usage
pimp.mat.bin(pimps.out, testdata)
Arguments
pimps.out: R object containing the prime interactions (PIs) identified for a sample.
testdata: Data frame or matrix of out-of-bag (OOB) samples.
Details
Note: For regression models, see pimp.mat.nonbin which accommodates complements of logic trees.
Value
A list with the following components:
- pimp.names
Vector of predictor names.
- pimp.datamat
Logical matrix indicating which predictors (columns) are used in each interaction (rows).
Author(s)
Bethany J. Wolf wolfb@musc.edu
References
Wolf BJ, Hill EG, Slate EH. Logic Forest: an ensemble classifier for discovering logical combinations of binary markers. Bioinformatics. 2010;26(17):2183–2189. doi:10.1093/bioinformatics/btq354
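The kind of matrix returned can be illustrated by hand (a sketch only; the interaction labels and naming format used here are hypothetical):
pimp.names <- c("X1 & X3", "!X2 & X4")
Xs <- paste0("X", 1:4)
pimp.datamat <- t(sapply(strsplit(pimp.names, " & "),
                         function(p) Xs %in% gsub("!", "", p, fixed = TRUE)))
dimnames(pimp.datamat) <- list(pimp.names, Xs)
pimp.datamat   # TRUE where a predictor (column) appears in an interaction (row)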
Predictor Importance Matrix – Regression
Description
Internal function called by pimp.import to construct a binary/logical matrix
representing which predictors (columns) are used in each interaction (rows) of a sample.
Usage
pimp.mat.nonbin(pimps.out, testdata)
Arguments
pimps.out: R object containing the prime interactions (PIs) identified for a sample.
testdata: Data frame or matrix of out-of-bag (OOB) samples.
Details
Note: For classification models, see pimp.mat.bin.
Value
A list with the following components:
- pimp.names
Vector of predictor names.
- pimp.datamat
Logical matrix indicating which predictors (columns) are used in each interaction (rows).
Author(s)
Bethany J. Wolf wolfb@musc.edu
References
Wolf BJ, Hill EG, Slate EH. Logic Forest: an ensemble classifier for discovering logical combinations of binary markers. Bioinformatics. 2010;26(17):2183–2189. doi:10.1093/bioinformatics/btq354
Predict Outcomes Using a Logic Forest Model
Description
Computes predicted values for new observations or the out-of-bag (OOB) predictions
for a logic forest model fitted using logforest.
Usage
## S3 method for class 'logforest'
predict(object, newdata, cutoff, ...)
Arguments
object: An object of class "logforest".
newdata: A matrix or data frame of new predictor values. If omitted, predictions are made for the original data used to fit the model (OOB predictions).
cutoff: A numeric value between 0 and 1 specifying the minimum proportion of trees that must predict a class of 1 for the overall prediction to be 1. Ignored for non-classification models.
...: Additional arguments (currently ignored).
Details
For classification models, predictions are determined based on the cutoff proportion.
For regression or time-to-event models, the function returns predicted values and OOB statistics if newdata is not provided.
Value
An object of class "LFprediction" containing:
- LFprediction
Numeric vector of predicted responses.
- proportion_one
Numeric vector of the proportion of trees predicting class 1 (classification only).
- AllTrees
Matrix or data frame with predicted values from each tree, the proportion of trees predicting 1, and the overall predicted class (classification), or predicted values for regression/time-to-event models.
Author(s)
Bethany Wolf wolfb@musc.edu
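A hedged usage sketch, reusing the LF.fit.bin object and init data from the logforest examples earlier in this manual:
pred.oob <- predict(LF.fit.bin)                                   # OOB predictions for the training data
pred.new <- predict(LF.fit.bin, newdata = init[, 1:50], cutoff = 0.5)
print(pred.new)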
Predict Method for Logic Regression Objects (Internal)
Description
Internal function that evaluates the importance of predictor combinations and generates predictions from a fitted logic regression object.
Usage
## S3 method for class 'logreg2'
predict(object, msz, ntr, newbin, newsep, newcens, ...)
Arguments
object: An object of class "logreg2".
msz: Integer. Maximum number of leaves in a tree.
ntr: Integer. Number of trees in the fitted object.
newbin: Matrix containing binary predictor values for new data points.
newsep: Integer. Number of separate predictors in the new data.
newcens: Vector. Censoring indicator for survival data (if applicable).
...: Additional arguments (currently ignored).
Details
This function is typically called internally by other functions and is not intended for direct use by package users.
Depending on the model type (object$type), this function produces:
- Classification predictions (0/1) if type == "classification".
- Predicted probabilities if type == "logistic".
- Survival model predictions if type == "proportional.hazards".
Value
A numeric vector or matrix of predictions.
Extract Prime Variable Interactions from a Logic Regression Tree
Description
Internal function called by pimp.import.
It is not intended to be used independently.
Generates a list of all variables and variable interactions identified by a specific
logic regression tree within a logic forest or LBoost model.
Usage
prime.imp(tree, data, Xs, mtype)
Arguments
tree: An object of class "logregtree".
data: Data frame used to fit the logic forest.
Xs: A vector of predictor names corresponding to columns in data.
mtype: Model type (e.g., classification, linear regression, survival regression).
Details
This function constructs all possible interactions of the predictors contained in the tree, identifies those that contribute to a positive outcome ("prime interactions"), and returns information about which variables and interactions are included in each.
Value
An object of class "primeImp" with the following elements:
vec.primes: Character vector of variable interactions in logical format.
tmp.mat: Matrix of all binary interactions contained in the tree.
vec.pimpvars: Sorted vector of column indices in the data for the predictors included in the interactions.
list.pimps: List of vectors, each containing the indices of predictors involved in each interaction.
Author(s)
Bethany Wolf wolfb@musc.edu
Print Method for Logic Forest Predictions
Description
Displays predictions from a logic forest model, including the predicted classes and, for classification models, the proportion of trees predicting a class of one.
Usage
## S3 method for class 'LFprediction'
print(x, ...)
Arguments
x: An object of class "LFprediction".
...: Additional arguments (currently ignored).
Details
For classification models, this method prints the predicted classes for each observation and the proportion of trees in the logic forest that predict class 1. For linear regression models, it prints the predicted values and, if available, the out-of-bag mean squared error.
Value
No return value. This function is called for its side effects (printing).
Author(s)
Bethany Wolf wolfb@musc.edu
Print Method for Logic Forest Models
Description
Prints the most important predictors and interactions from a fitted logic forest model, along with their importance scores and frequency of occurrence.
Usage
## S3 method for class 'logforest'
print(x, sortby = "importance", ...)
Arguments
x: An object of class "logforest".
sortby: Character string specifying whether to sort the output by predictor importance or by frequency of occurrence; the default is "importance".
...: Additional arguments (currently ignored).
Details
This method displays a matrix of the top predictors and interactions from a logic forest model.
If x$norm = TRUE, the variable importance scores are normalized such that the largest
score is 1 and all other scores are scaled accordingly.
Value
No return value. This function is called for its side effect of printing.
Author(s)
Bethany Wolf wolfb@musc.edu
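A hedged usage sketch, again assuming the LF.fit.bin object from the logforest examples:
print(LF.fit.bin, sortby = "importance")   # sortby default shown in Usage above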
Proportion Positive Predictions
Description
Internal function used by predict.logforest to determine the proportion of logic regression trees
within a logic forest that predict a class of one for new observations.
It also returns the predicted class values based on a specified cutoff.
Usage
proportion.positive(predictmatrix, cutoff)
Arguments
predictmatrix: A matrix of predicted values from each tree (rows = observations, columns = trees).
cutoff: Numeric value specifying the proportion of trees that must predict a class of one for the overall prediction to be class one.
Details
This function is called internally by predict.logforest and is not intended for direct use.
It calculates, for each observation, the fraction of trees in the logic forest predicting a positive outcome,
and then assigns a predicted class based on whether this fraction meets or exceeds the cutoff.
Value
A list with:
predmat: A two-column matrix where the first column is the proportion of trees predicting class one for each observation, and the second column is the binary predicted class (0 or 1).
Note
This is a supplementary function and not intended to be used independently of the other functions in the package.
Author(s)
Bethany Wolf wolfb@musc.edu
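The underlying calculation can be sketched in a few lines (illustrative only; the simulated prediction matrix below stands in for the per-tree predictions):
set.seed(1)
predictmatrix <- matrix(rbinom(5 * 10, 1, 0.5), nrow = 5)   # 5 observations, 10 trees
cutoff <- 0.5
prop.one <- rowMeans(predictmatrix)              # fraction of trees predicting class one
pred.class <- as.numeric(prop.one >= cutoff)     # class one if the fraction meets the cutoff
cbind(proportion = prop.one, prediction = pred.class)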