Using XGBoost to Discover Infected Hosts Based on HTTP Traffic
Table 1
Description of the selected fields in the HTTP request header.
Selected field name
Description
URI
The URI (uniform resource identifier) is a reference to a resource available on the Internet, such as an HTML document, image, or video. The URI field plays an important role in detecting malicious traffic [19, 22]
Host
The Host field records the domain name of the server and the TCP port number on which the server listens
User-Agent
The User-Agent field remains an effective indicator of compromised hosts because malware may send forged browser-like information or its own unique identifier
Request-Method
HTTP request method, such as GET or POST
Request-Version
HTTP protocol version
Accept
The Accept field lists the media types the client accepts and the relative priority of each media type
Accept-Encoding
The Accept-Encoding field lists the content encodings that the client can accept, usually compression algorithms
Connection
The Connection field indicates the state of the connection between the client and the server, for example whether it is kept alive or closed
Content-type
The value of the Content-type field can help our model filter out some legitimate traffic. HTTP carries data of various types, such as text, pictures, sounds, and videos, so the Content-type values of legitimate traffic tend to vary significantly. In contrast, most malware chooses text-related values such as text/html; charset=UTF-8
Cache-Control
The Cache-Control field specifies the caching mechanisms that must be followed for the request
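As an illustration of how the fields in Table 1 might be pulled out of a captured request, the following is a minimal sketch and not the extraction procedure used in this work. The function name extract_fields, the SELECTED_HEADERS list, and the choice to represent an absent field as an empty string are all assumptions made for the example. The first three fields come from the request line; the remaining ones are looked up case-insensitively among the header lines.

```python
# Header fields from Table 1 that appear as "Name: value" lines.
# (URI, Request-Method and Request-Version come from the request line instead.)
SELECTED_HEADERS = [
    "Host", "User-Agent", "Accept", "Accept-Encoding",
    "Connection", "Content-type", "Cache-Control",
]


def extract_fields(raw_request: str) -> dict:
    """Hypothetical helper: map a raw HTTP request header to the Table 1 fields."""
    # Keep only the header part; anything after the blank line is the body.
    head = raw_request.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")

    # Request line: "<method> <request-target> <version>"
    parts = lines[0].split(" ")
    if len(parts) != 3:
        raise ValueError("malformed request line: %r" % lines[0])
    method, uri, version = parts

    # Header names are case-insensitive, so store them lower-cased.
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()

    record = {"URI": uri, "Request-Method": method, "Request-Version": version}
    for name in SELECTED_HEADERS:
        # Assumption: a missing field is recorded as an empty string.
        record[name] = headers.get(name.lower(), "")
    return record


if __name__ == "__main__":
    sample = (
        "POST /gate.php?id=7 HTTP/1.1\r\n"
        "Host: example.com:8080\r\n"
        "User-Agent: Mozilla/5.0\r\n"
        "Content-Type: text/html; charset=UTF-8\r\n"
        "\r\n"
    )
    for field, value in extract_fields(sample).items():
        print(field, "=", repr(value))
```

Each resulting record has one entry per row of Table 1, so a set of requests can be turned into a table of raw string values before any encoding into numerical features for the classifier.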