Research Article
Similarity Digest Search: A Survey and Comparative Analysis of Strategies to Perform Known File Filtering Using Approximate Matching
Table 2
Similarity digest search strategies: performance assessment of different properties.
| Strategy | Memory requirements (1 GiB) | Memory requirements (10 GiB) | Memory requirements (100 GiB) | Memory requirements (1 TiB) | Single lookup complexity | False positives | Resemblance/containment detection |
|---|---|---|---|---|---|---|---|
| Brute force (sdhash) | 25.60 MiB (2.50%) | 256.00 MiB (2.50%) | 2.50 GiB (2.50%) | 25.60 GiB (2.50%) | | No | ✓/✓ |
| Brute force (ssdeep) | 0.19 MiB (0.02%) | 1.87 MiB (0.02%) | 18.75 MiB (0.02%) | 192.00 MiB (0.02%) | | No | ✓/× |
| Brute force (TLSH) | 0.07 MiB (0.01%) | 0.68 MiB (0.01%) | 6.84 MiB (0.01%) | 70.00 MiB (0.01%) | | No | ✓/× |
| DHTnil | 32.49 MiB (3.17%) | 33.05 MiB (0.32%) | 38.68 MiB (0.04%) | 96.43 MiB (0.01%) | | No | ✓/× |
| iCTPH | 96.62 MiB (9.44%) | 98.30 MiB (0.96%) | 115.18 MiB (0.11%) | 288.43 MiB (0.03%) | | No | ✓/× |
| F2S2 | 1.71 MiB (0.17%) | 17.07 MiB (0.17%) | 170.70 MiB (0.17%) | 1.71 GiB (0.17%) | | No | ✓/× |
| MRSH-NET | 16.00 MiB (1.56%) | 128.00 MiB (1.25%) | 1.00 GiB (1.00%) | 16.00 GiB (1.56%) | | Yes | ✓/✓ |
| BF-based tree | 176.00 MiB (17.19%) | 1.79 GiB (17.90%) | 17.64 GiB (17.64%) | 336.00 GiB (32.81%) | | Yes | ✓/✓ |
| MRSH-CF | 14.00 MiB (1.37%) | 140.00 MiB (1.37%) | 1.37 GiB (1.37%) | 14.00 GiB (1.37%) | | Yes | ✓/✓ |
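The percentages in parentheses appear to express each strategy's memory footprint relative to the size of the data set in the column heading. The following minimal sketch recomputes a few of them under that assumption. The helper names `to_bytes` and `overhead_percent` are hypothetical and chosen only for illustration, and binary units are assumed throughout.

```python
# Binary unit sizes in bytes (assumption: the table uses IEC binary prefixes).
UNITS = {"MiB": 2**20, "GiB": 2**30, "TiB": 2**40}


def to_bytes(value: float, unit: str) -> float:
    """Convert a (value, unit) pair such as (25.60, "MiB") to bytes."""
    return value * UNITS[unit]


def overhead_percent(mem_value: float, mem_unit: str,
                     data_value: float, data_unit: str) -> float:
    """Memory footprint as a percentage of the data set size."""
    return 100.0 * to_bytes(mem_value, mem_unit) / to_bytes(data_value, data_unit)


# (strategy, memory footprint, data set size), values copied from the table.
examples = [
    ("Brute force (sdhash)", (25.60, "MiB"), (1, "GiB")),
    ("MRSH-NET", (128.00, "MiB"), (10, "GiB")),
    ("BF-based tree", (336.00, "GiB"), (1, "TiB")),
]

for name, mem, data in examples:
    print(f"{name}: {overhead_percent(*mem, *data):.2f}%")
```

Under this reading, strategies whose percentage stays constant across columns (for example sdhash at 2.50%) grow linearly with the data set, while those whose percentage shrinks (for example DHTnil and iCTPH) are dominated by a fixed-size component.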