Resizes (under-samples) classification-oriented data sets. Returns the row numbers of the sampled observations.

Usage

simi.sampler(
  data,
  class,
  compare.with = 0,
  plot = F,
  sample.size = min(summary(as.factor(data$class)))
)

Arguments

data

data frame with a column named class that holds the class labels

class

number of the class of interest

compare.with

the class against which similarity is computed. Defaults to 0, which compares each sample with its own group. Any other number compares samples with the class represented by that number.

plot

logical; if TRUE, creates a plot of the similarity before and after sampling. Defaults to FALSE.

sample.size

number of observations to sample. Defaults to the number of samples in the smallest class.
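
The default for sample.size can be reproduced with base R alone. The following is a minimal sketch on a made-up data frame (toy and its values are illustrative assumptions, not package data); it evaluates the same expression that appears in the Usage section.

```r
# Toy data frame (hypothetical). The label column must be named "class",
# because the default argument reads data$class.
toy <- data.frame(
  x = c(0.1, 0.4, 0.3, 0.9, 1.2, 1.1, 2.0, 2.2, 2.1),
  class = c(1, 1, 1, 1, 2, 2, 3, 3, 3)
)

# summary() on a factor returns the number of observations per level.
class_counts <- summary(as.factor(toy$class))
print(class_counts)

# The default sample.size is the size of the smallest class.
default_size <- min(class_counts)
print(default_size)
```

Here class 2 is the smallest group with two rows, so the default would request two samples.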

Value

A vector with the row numbers of the sampled observations.
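
Because the result is a vector of row numbers, it can be used to index the original data frame. The sketch below assumes the row numbers refer to rows of data; sampled_rows is a hand-written stand-in for a real result, and toy is a made-up data frame.

```r
# Toy data frame (hypothetical), with the label column named "class".
toy <- data.frame(
  x = c(0.1, 0.4, 0.3, 0.9, 1.2, 1.1, 2.0, 2.2, 2.1),
  class = c(1, 1, 1, 1, 2, 2, 3, 3, 3)
)

# Stand-in for a returned vector of row numbers. In practice this would
# come from a call such as simi.sampler(toy, class = 1, sample.size = 2).
sampled_rows <- c(2L, 4L)

# Subset the original data frame by row number.
reduced <- toy[sampled_rows, ]
print(reduced)
```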