Research Article
CBR-Based Decision Support Methodology for Cybercrime Investigation: Focused on the Data-Driven Website Defacement Analysis
Table 2
Values and weights of the similarity score for each case vector. All similarity score values are normalized to the range 0 to 1.
| Case vector | Weight | Impact | Similarity measure between a new case and existing cases | Value |
| --- | --- | --- | --- | --- |
| Encoding | 0.5 | High | - | 0 or 1 |
| IP address | 0.2 | Medium | All four octets match (e.g., 143.248.1.6 and 143.248.1.6) | 1 |
| | | | The 1st, 2nd, and 3rd octets match (e.g., 143.248.1.6 and 143.248.1.8) | 0.75 |
| | | | The 1st and 2nd octets match (e.g., 143.248.1.6 and 143.248.4.4) | 0.5 |
| | | | Only the 1st octet matches (e.g., 143.248.1.6 and 143.13.2.4) | 0.25 |
| | | | No common octet (e.g., 143.248.1.6 and 163.13.2.5) | 0 |
| Domain | 0.15 | Medium | Identical domain | 1 |
| | | | Service name matches, and one of the gTLD and ccTLD matches | 0.8 |
| | | | gTLD and ccTLD match | 0.3 |
| | | | Service name matches | 0.1 |
| | | | ccTLD matches | 0.1 |
| | | | gTLD matches | 0.1 |
| | | | Nonidentical domain | 0 |
| Date | 0.1 | Low | Within about 6 months before or after (1-year window) | 1 |
| | | | Within about 18 months before or after (3-year window) | 0.75 |
| | | | Within about 30 months before or after (5-year window) | 0.5 |
| | | | Within about 42 months before or after (7-year window) | 0.25 |
| | | | More than about 42 months before or after (over 7 years) | 0 |
| OS | 0.05 | Low | - | 0 or 1 |
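The scoring rules in Table 2 can be sketched in code. The following is a minimal illustration, not the authors' implementation, and it rests on several assumptions that the table does not state: the overall score is taken to be the weighted sum of the per-vector scores (the weights do sum to 1.0), IP octets are compared as a leading prefix, the "about N months" thresholds are inclusive and use an average month of 30.44 days, and domains arrive already split into a hypothetical `(service_name, gtld, cctld)` tuple with `None` for a missing part. All function and key names are illustrative.

```python
from datetime import date

# Weights taken from Table 2; they sum to 1.0.
WEIGHTS = {"encoding": 0.5, "ip": 0.2, "domain": 0.15, "date": 0.1, "os": 0.05}


def exact_match(a: str, b: str) -> float:
    """Binary score (0 or 1), used here for Encoding and OS."""
    return 1.0 if a.strip().lower() == b.strip().lower() else 0.0


def ip_similarity(a: str, b: str) -> float:
    """0.25 per matching leading octet (assumption: prefix comparison)."""
    matched = 0
    for x, y in zip(a.split("."), b.split(".")):
        if x != y:
            break
        matched += 1
    return matched / 4


def _same(x, y) -> bool:
    """A part only counts as matched when it is present in both domains."""
    return x is not None and x == y


def domain_similarity(a: tuple, b: tuple) -> float:
    """a and b are hypothetical (service_name, gtld, cctld) tuples."""
    if a == b:
        return 1.0
    name, gtld, cctld = _same(a[0], b[0]), _same(a[1], b[1]), _same(a[2], b[2])
    if name and (gtld or cctld):
        return 0.8
    if gtld and cctld:
        return 0.3
    if name or gtld or cctld:
        return 0.1
    return 0.0


def date_similarity(a: date, b: date) -> float:
    """Step function over the gap in months (assumption: 30.44 days/month)."""
    months = abs((a - b).days) / 30.44
    for limit, score in ((6, 1.0), (18, 0.75), (30, 0.5), (42, 0.25)):
        if months <= limit:
            return score
    return 0.0


def case_similarity(new: dict, old: dict) -> float:
    """Weighted sum of per-vector scores (assumed aggregation rule)."""
    scores = {
        "encoding": exact_match(new["encoding"], old["encoding"]),
        "ip": ip_similarity(new["ip"], old["ip"]),
        "domain": domain_similarity(new["domain"], old["domain"]),
        "date": date_similarity(new["date"], old["date"]),
        "os": exact_match(new["os"], old["os"]),
    }
    return sum(WEIGHTS[key] * scores[key] for key in WEIGHTS)
```

Under these assumptions, two cases with the same encoding, IP addresses that share the first three octets, an identical domain, dates two months apart, and different operating systems would score 0.5·1 + 0.2·0.75 + 0.15·1 + 0.1·1 + 0.05·0 = 0.90.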