Research Article

Automatic Benchmark Generation Framework for Malware Detection

Algorithm 1: Sampling on one cluster.
Input: cluster i containing n samples
Output: k selected samples
Step 1:
for each sample in cluster i:
    calculate the MD of the sample
end for
for each sample in cluster i:
    calculate the selection probability of the sample
    map the sample to a range on the roulette wheel
end for
while number of selected samples < k:
    generate a random number r between 0 and 1
    map r to the corresponding range on the roulette wheel
    select the corresponding sample
end while
return the k selected samples
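
The roulette-wheel step of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function name `roulette_sample` and its `scores` argument are hypothetical. The sketch assumes that each sample's precomputed score (for example its MD) is used directly as its weight on the wheel, and that the k selected samples must be distinct, so a draw that lands on an already selected sample is repeated.

```python
import bisect
import itertools
import random


def roulette_sample(scores, k, rng=None):
    """Pick k distinct sample indices by roulette-wheel selection.

    `scores` holds one non-negative value per sample in the cluster.
    Using the score directly as the wheel weight is an assumption of
    this sketch; the source does not state the exact mapping.
    """
    rng = rng or random.Random()
    if any(s < 0 for s in scores):
        raise ValueError("scores must be non-negative")
    selectable = sum(1 for s in scores if s > 0)
    if not 0 <= k <= selectable:
        raise ValueError("k must not exceed the number of samples with a positive score")
    if k == 0:
        return []

    total = sum(scores)
    # Upper bound of each sample's slice on a wheel of total length 1.
    bounds = list(itertools.accumulate(s / total for s in scores))

    selected = []
    chosen = set()
    while len(selected) < k:
        r = rng.random()  # uniform in [0, 1)
        # Index of the slice that contains r.
        idx = bisect.bisect_right(bounds, r)
        # Floating-point round-off can leave the last bound slightly below 1.
        idx = min(idx, len(scores) - 1)
        # Skip repeats and zero-width slices, then draw again.
        if idx not in chosen and scores[idx] > 0:
            chosen.add(idx)
            selected.append(idx)
    return selected
```

For example, `roulette_sample([0.2, 1.5, 0.7, 3.1], k=2, rng=random.Random(42))` returns two distinct indices into the cluster, with higher-scoring samples more likely to be drawn. If lower scores should be favored instead, the weights would need to be inverted before the wheel is built.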