A similarity-based sampling algorithm
simi.sampler.Rd
Under-samples classification-oriented data sets to reduce their size. Returns the row numbers of the selected samples.
Arguments
- data
data frame that includes a class column
- class
number of the class of interest
- compare.with
the class against which similarity is computed. Defaults to 0, which compares each sample with its own group. Any other number compares the samples with the class represented by that number.
- plot
whether to create a plot of the similarity before and after sampling.
- sample.size
number of samples to draw. Defaults to the number of samples in the smallest class.
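
To make the defaults above concrete, here is a minimal base R sketch of similarity-based under-sampling. It is an illustration only and not the implementation behind `simi.sampler`: the helper `sketch_sampler`, the toy data, the column name `class`, and the choice of Euclidean distance to a class centroid as the similarity measure are all assumptions made for this example.

```r
# Illustrative sketch only: NOT the simi.sampler implementation.
# Assumptions: class labels live in a column named "class", and
# "similarity" is approximated by Euclidean distance to a class centroid.
set.seed(42)
toy <- data.frame(
  x = c(rnorm(8, mean = 0), rnorm(4, mean = 3)),
  y = c(rnorm(8, mean = 0), rnorm(4, mean = 3)),
  class = c(rep(1, 8), rep(2, 4))
)

# Default sample size as described above: size of the smallest class
default_size <- min(table(toy$class))

# Hypothetical helper: returns row numbers of the `size` samples of
# `class_id` closest to the centroid of the reference class.
# compare_with = 0 means "compare each sample with its own class".
sketch_sampler <- function(data, class_id, compare_with = 0,
                           size = min(table(data$class))) {
  ref_id <- if (compare_with == 0) class_id else compare_with
  features <- setdiff(names(data), "class")
  centroid <- colMeans(data[data$class == ref_id, features, drop = FALSE])
  rows <- which(data$class == class_id)
  m <- as.matrix(data[rows, features, drop = FALSE])
  # Distance of each candidate row to the reference centroid
  dists <- sqrt(rowSums(sweep(m, 2, centroid)^2))
  rows[order(dists)][seq_len(min(size, length(rows)))]
}

# Row numbers of class 1 samples, reduced to the smallest class size
picked <- sketch_sampler(toy, class_id = 1)
```

Like the function documented here, the sketch returns row numbers rather than the rows themselves, so the reduced data set would be obtained with something like `toy[picked, ]`. Passing a non-zero `compare_with` switches the reference group to that class, mirroring the `compare.with` argument.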