Brute force model search with a min and max number of features. — model.subset • moleculaR

Ranks candidate models by their K-fold cross-validated Q2.
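
As a rough illustration of the ranking metric, Q2 is commonly defined as 1 - PRESS/TSS, where PRESS sums the squared errors of held-out predictions. The sketch below assumes that definition and an ordinary `lm` fit. `kfold_q2` and the `toy` data are hypothetical and not part of moleculaR, whose own computation may differ.

```r
# Hypothetical helper: K-fold cross-validated Q2 for a linear model.
# Assumes Q2 = 1 - PRESS / TSS; not taken from the moleculaR source.
kfold_q2 <- function(data, formula, folds = nrow(data)) {
  n <- nrow(data)
  y <- model.response(model.frame(formula, data))
  # Randomly assign each row to one of `folds` groups of near-equal size.
  fold_id <- sample(rep(seq_len(folds), length.out = n))
  pred <- numeric(n)
  for (k in seq_len(folds)) {
    held_out <- fold_id == k
    # Fit on the remaining rows, predict the held-out rows.
    fit <- lm(formula, data = data[!held_out, , drop = FALSE])
    pred[held_out] <- predict(fit, newdata = data[held_out, , drop = FALSE])
  }
  press <- sum((y - pred)^2)   # prediction error sum of squares
  tss <- sum((y - mean(y))^2)  # total sum of squares
  1 - press / tss
}

set.seed(1)
toy <- data.frame(x1 = rnorm(30), x2 = rnorm(30))
toy$y <- 2 * toy$x1 - toy$x2 + rnorm(30, sd = 0.1)

q2_loo <- kfold_q2(toy, y ~ x1 + x2)             # folds = nrow(toy): LOOCV
q2_5fold <- kfold_q2(toy, y ~ x1 + x2, folds = 5)
```

With `folds` equal to the number of rows, every observation is held out exactly once, which is leave-one-out cross-validation.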

Usage

model.subset(
  data,
  out.col = dim(data)[2],
  min = 2,
  max = floor(dim(data)[1]/5),
  folds = nrow(data),
  iterations = 1,
  cutoff = 0.85,
  cross.terms = F
)
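
The default expressions in the signature are evaluated against `data`. A small base R sketch on a made-up data frame (`toy`, an assumption for illustration) shows what they resolve to. No moleculaR function is called here.

```r
# Toy data: 40 observations, 6 feature columns, output in the last column.
set.seed(42)
toy <- as.data.frame(matrix(rnorm(40 * 6), nrow = 40))
toy$y <- rnorm(40)

out_col <- dim(toy)[2]                 # index of the last column: 7
max_features <- floor(dim(toy)[1] / 5) # one feature per 5 observations: 8
n_folds <- nrow(toy)                   # one fold per observation: 40
```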

Arguments

data: data frame containing the features and the output column
out.col: index of the output column (defaults to the last column)
min: minimum number of features (default = 2)
max: maximum number of features (defaults to the number of observations / 5, rounded down)
folds: number of cross-validation folds; defaults to nrow(data), i.e. leave-one-out CV (LOOCV)
iterations: number of cross-validation iterations; defaults to 1
cutoff: Q2 threshold; searches for models with Q2 above it (default = 0.85) and, if there are none, looks for models with lower Q2
cross.terms: if TRUE, includes feature interactions (the number of candidate models grows explosively; avoid if possible)
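
To give a feel for why interaction terms make the search expensive, the sketch below counts candidate models for a hypothetical 10-feature data set. It assumes that cross terms add every pairwise product of features to the pool, which the documentation does not state. `n_models` is an illustrative helper, not a moleculaR function.

```r
# Number of candidate models when choosing between `min` and `max` terms
# from a pool of `n_terms` candidate terms.
n_models <- function(n_terms, min, max) sum(choose(n_terms, min:max))

p <- 10                          # hypothetical number of features
pool_plain <- p                  # features only
pool_cross <- p + choose(p, 2)   # assumption: plus every pairwise product (55)

plain_count <- n_models(pool_plain, 2, 4)  # 45 + 120 + 210 = 375
cross_count <- n_models(pool_cross, 2, 4)  # 1485 + 26235 + 341055 = 368775
```

Under these assumptions, the same search range goes from 375 to 368775 model fits, each of which is cross-validated.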

Value

A table of the best models (at most 10).
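
The overall idea, enumerating feature subsets, scoring each and keeping the top ten, can be sketched in base R. This is an illustration under stated assumptions and not the moleculaR implementation: `loo_q2` and `brute_force` are hypothetical names, models are plain `lm` fits, and LOOCV is computed with the leave-one-out residual identity e_i / (1 - h_ii) instead of refitting.

```r
# LOOCV Q2 for an lm fit, using leave-one-out residuals e_i / (1 - h_ii).
loo_q2 <- function(fit) {
  y <- model.response(model.frame(fit))
  press <- sum((residuals(fit) / (1 - hatvalues(fit)))^2)
  1 - press / sum((y - mean(y))^2)
}

# Fit every subset of `min` to `max` features and rank the fits by Q2.
brute_force <- function(data, out.col = ncol(data), min = 2, max = 3, top = 10) {
  features <- names(data)[-out.col]
  response <- names(data)[out.col]
  rows <- list()
  for (k in min:max) {
    for (s in combn(features, k, simplify = FALSE)) {
      fit <- lm(reformulate(s, response = response), data = data)
      rows[[length(rows) + 1]] <- data.frame(
        formula = paste(response, "~", paste(s, collapse = " + ")),
        Q2 = loo_q2(fit)
      )
    }
  }
  ranked <- do.call(rbind, rows)
  ranked <- ranked[order(ranked$Q2, decreasing = TRUE), ]
  head(ranked, top)  # at most `top` rows
}

set.seed(7)
toy <- data.frame(a = rnorm(25), b = rnorm(25), c = rnorm(25), d = rnorm(25))
toy$y <- 1.5 * toy$a - 2 * toy$c + rnorm(25, sd = 0.2)
best <- brute_force(toy)
```

Here four features with `min = 2` and `max = 3` give 6 + 4 = 10 candidate models, so the returned table contains all of them, sorted by decreasing Q2.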