Research Article

Using XGBoost to Discover Infected Hosts Based on HTTP Traffic

Table 1

Description of the selected fields in the HTTP request header.

| Field name | Description |
| --- | --- |
| URI | The uniform resource identifier (URI) is a reference to a resource available on the Internet, such as an HTML document, image, or video. The URI field plays an important role in detecting malicious traffic [19, 22]. |
| Host | The Host field records the domain name of the server and the TCP port number on which the server listens. |
| User-Agent | The User-Agent field is still an effective indicator of compromised hosts because malware may carry forged browser-like information or its own unique identifier. |
| Request-Method | The request type (HTTP method), such as GET or POST. |
| Request-Version | The HTTP protocol version. |
| Accept | This field lists the media types the client accepts and the relative priority of each media type. |
| Accept-Encoding | This field lists the content encodings the client can accept in the response, usually one or more compression algorithms. |
| Connection | The Connection field indicates the connection state between the client and the server (for example, keep-alive or close). |
| Content-type | The value of Content-type can help the model filter out some legitimate traffic. HTTP carries data of various types, such as text, pictures, sound, and video, so the Content-type values of legitimate traffic tend to vary significantly. In contrast, most malware uses text-related values such as text/html; charset=UTF-8. |
| Cache-Control | The Cache-Control field specifies the caching mechanisms that must be applied to the request. |
| Content-length | This field indicates the size of the entity body in bytes. |
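To make the field selection in Table 1 concrete, the following is a minimal sketch of how the listed fields might be pulled out of one raw HTTP request before feature construction. It is not the extraction code used in this work: the function name `extract_selected_fields`, the constant `SELECTED_FIELDS`, and the sample request are hypothetical, and the sketch assumes a single, complete, text-decoded request. URI, Request-Method, and Request-Version are taken from the request line; the remaining fields are looked up by case-insensitive header name, and absent fields are recorded as `None`.

```python
# Field names as listed in Table 1 (assumed to be the full selection).
SELECTED_FIELDS = [
    "URI", "Host", "User-Agent", "Request-Method", "Request-Version",
    "Accept", "Accept-Encoding", "Connection", "Content-type",
    "Cache-Control", "Content-length",
]


def extract_selected_fields(raw_request: str) -> dict:
    """Return the Table 1 fields of one raw HTTP request (None if absent)."""
    # The header section ends at the first blank line; any body is ignored.
    head = raw_request.replace("\r\n", "\n").split("\n\n", 1)[0]
    lines = head.split("\n")

    # Request line has the form: METHOD SP URI SP VERSION
    parts = lines[0].split()
    if len(parts) != 3:
        raise ValueError("malformed request line: %r" % lines[0])
    method, uri, version = parts

    # Header names are case-insensitive, so store them lowercased.
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()

    record = {field: headers.get(field.lower()) for field in SELECTED_FIELDS}
    # These three fields come from the request line, not from a header.
    record["URI"] = uri
    record["Request-Method"] = method
    record["Request-Version"] = version
    return record


# Hypothetical request used only for illustration.
example = (
    "POST /gate.php?id=42 HTTP/1.1\r\n"
    "Host: example.com:8080\r\n"
    "User-Agent: Mozilla/4.0\r\n"
    "Content-Type: text/html; charset=UTF-8\r\n"
    "Content-Length: 5\r\n"
    "\r\n"
    "hello"
)
record = extract_selected_fields(example)
```

Recording missing fields as `None` rather than dropping them keeps every request at the same width, which is convenient when the records are later encoded as fixed-length feature vectors; how the values are actually encoded is outside the scope of this sketch.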