Research Article

Effective and Fast Near Duplicate Detection via Signature-Based Compression Metrics

Figure 1

Normalized compression distance under SigNCD and NCD for the first bytes of two identical documents.
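
For readers unfamiliar with the quantity plotted in Figure 1, the following is a minimal sketch of the standard normalized compression distance, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), evaluated on growing prefixes of two documents. It assumes zlib as the compressor C, which may differ from the compressor used for the figure, and it does not implement SigNCD. The helper names `compressed_size`, `ncd`, and `ncd_prefix_curve` are hypothetical and chosen for illustration only.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Return C(data): the length in bytes of the zlib-compressed input.

    zlib is an assumption made for this sketch, not necessarily the
    compressor behind Figure 1.
    """
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Standard normalized compression distance between x and y."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation xy
    # zlib output is never empty, so the denominator is always positive.
    return (cxy - min(cx, cy)) / max(cx, cy)


def ncd_prefix_curve(x: bytes, y: bytes, lengths):
    """Return (n, NCD(x[:n], y[:n])) for each prefix length n in lengths."""
    return [(n, ncd(x[:n], y[:n])) for n in lengths]
```

Calling `ncd_prefix_curve(doc, doc, range(1024, len(doc) + 1, 1024))` on a single document `doc` gives one distance per prefix length for a pair of identical inputs. With zlib specifically, the DEFLATE match window is limited to 32 KiB, so the behaviour of such a curve on long prefixes depends on that compressor choice.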