Research Article

Similarity Digest Search: A Survey and Comparative Analysis of Strategies to Perform Known File Filtering Using Approximate Matching

Table 2

Similarity digest search strategies: performance assessment of different properties.

| Strategy | Memory requirements (1 GiB) | Memory requirements (10 GiB) | Memory requirements (100 GiB) | Memory requirements (1 TiB) | Single lookup complexity | False positives | Resemblance/containment detection |
|---|---|---|---|---|---|---|---|
| Brute force (sdhash) | 25.60 MiB (2.50%) | 256.00 MiB (2.50%) | 2.50 GiB (2.50%) | 25.60 GiB (2.50%) | | No | ✓/✓ |
| Brute force (ssdeep) | 0.19 MiB (0.02%) | 1.87 MiB (0.02%) | 18.75 MiB (0.02%) | 192.00 MiB (0.02%) | | No | ✓/× |
| Brute force (TLSH) | 0.07 MiB (0.01%) | 0.68 MiB (0.01%) | 6.84 MiB (0.01%) | 70.00 MiB (0.01%) | | No | ✓/× |
| DHTnil | 32.49 MiB (3.17%) | 33.05 MiB (0.32%) | 38.68 MiB (0.04%) | 96.43 MiB (0.01%) | | No | ✓/× |
| iCTPH | 96.62 MiB (9.44%) | 98.30 MiB (0.96%) | 115.18 MiB (0.11%) | 288.43 MiB (0.03%) | | No | ✓/× |
| F2S2 | 1.71 MiB (0.17%) | 17.07 MiB (0.17%) | 170.70 MiB (0.17%) | 1.71 GiB (0.17%) | | No | ✓/× |
| MRSH-NET | 16.00 MiB (1.56%) | 128.00 MiB (1.25%) | 1.00 GiB (1.00%) | 16.00 GiB (1.56%) | | Yes | ✓/✓ |
| BF-based tree | 176.00 MiB (17.19%) | 1.79 GiB (17.90%) | 17.64 GiB (17.64%) | 336.00 GiB (32.81%) | | Yes | ✓/✓ |
| MRSH-CF | 14.00 MiB (1.37%) | 140.00 MiB (1.37%) | 1.37 GiB (1.37%) | 14.00 GiB (1.37%) | | Yes | ✓/✓ |
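
The percentages in parentheses appear to express each strategy's memory requirement as a fraction of the data size named in the column header. A minimal sketch of that arithmetic follows, assuming binary units (1 GiB = 1024 MiB); the helper names `to_mib` and `memory_ratio_percent` are hypothetical and introduced only for illustration.

```python
# Conversion factors to MiB, assuming binary (IEC) units.
UNITS_IN_MIB = {"MiB": 1, "GiB": 1024, "TiB": 1024 * 1024}


def to_mib(value: float, unit: str) -> float:
    """Convert a size given in MiB, GiB, or TiB to MiB."""
    return value * UNITS_IN_MIB[unit]


def memory_ratio_percent(mem_value: float, mem_unit: str,
                         data_value: float, data_unit: str) -> float:
    """Memory requirement as a percentage of the data size."""
    return 100.0 * to_mib(mem_value, mem_unit) / to_mib(data_value, data_unit)


if __name__ == "__main__":
    # Input values copied from the table: (strategy, memory, data size).
    rows = [
        ("Brute force (sdhash)", (25.60, "MiB"), (1, "GiB")),
        ("BF-based tree", (336.00, "GiB"), (1, "TiB")),
    ]
    for name, (mem, mem_unit), (data, data_unit) in rows:
        pct = memory_ratio_percent(mem, mem_unit, data, data_unit)
        print(f"{name}: {pct:.2f}% of {data} {data_unit}")
```

Under that assumption, both sample rows reproduce the percentages printed in the table, which suggests the parenthesized figures are simple size ratios rather than separately measured values.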