Research Article

Visual Analysis of Multisource Heterogeneous Data Based on Improved DPCA Algorithm

Table 1. The format of the data.

| Form | Field name | Field meaning | Relevant description |
|---|---|---|---|
| Login.csv | Time | Log generation time | |
| | user | User name | Login user name |
| | proto | Application protocol | SSH, MySQL, and so on |
| | dip | Destination IP | IP that was logged in to |
| | dport | Destination port | Port that was logged in to |
| | sip | Source IP | IP that initiated the login |
| | sport | Source port | Port that initiated the login |
| | State | Login result | Success or failure |
| Weblog.csv | Time | Log generation time | |
| | sip | Source IP | Client IP |
| | sport | Source port | Client application port |
| | dip | Destination IP | Server IP |
| | dport | Destination port | Server application port |
| | Host | Requested domain name | Host field of the HTTP header |
| TcpLog.csv | stime | Start time of TCP data flow | The start time of the TCP stream, that is, the time when the first SYN packet of the stream is received |
| | dtime | End time of TCP data flow | The end time of the TCP stream, that is, the time when the last packet of the stream is received |
| | proto | Protocol | Protocol field value in the IP packet header |
| | dip | Destination IP | Server IP of the TCP data stream |
| | dport | Destination port | Server application port of the TCP data stream |
| | sip | Source IP | Client (initiating) IP of the TCP data stream |
| | sport | Source port | Client application port of the TCP data stream |
| | uplink_length | Uplink bytes | The total number of bytes of application layer data sent from the client to the server, from the establishment of the TCP stream to its end |
| | downlink_length | Downlink bytes | The total number of bytes of application layer data sent from the server to the client, from the establishment of the TCP stream to its end |
| Email.csv | Time | Mail sending/receiving time | Sending/receiving time of the mail, taken from the mail header |
| | proto | Application protocol | SMTP |
| | sip | Source IP | IP header source IP address |
| | sport | Source port | TCP header source application port |
| | dip | Destination IP | IP header destination IP address |
| | dport | Destination port | TCP header destination application port |
| | from | Mail sender | Taken from the corresponding field in the mail header |
| | to | Mail recipient | Taken from the corresponding field in the mail header. Multiple recipients are separated by semicolons. |
| | Subject | Mail subject | Taken from the corresponding field in the mail header |
| Checking.csv | id | Employee ID | |
| | Day | Date | |
| | checkin | Check-in time | |
| | checkout | Off-duty check-out time | |
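
To illustrate how the record layout in Table 1 might be used in practice, the following is a minimal sketch that checks a CSV file's header against the listed field names and splits the semicolon-separated recipient field of Email.csv. It is not the authors' implementation. The names `SCHEMA`, `read_log`, and `recipients` are hypothetical, the presence of a header row is an assumption, and header names are compared case-insensitively because the capitalization shown in the table may not match the raw files. The sample row is invented for illustration only.

```python
import csv
import io

# Field names per log file, taken from Table 1 and lowercased.
# Assumption: each file starts with a header row carrying these names.
SCHEMA = {
    "Login.csv": ["time", "user", "proto", "dip", "dport", "sip", "sport", "state"],
    "Weblog.csv": ["time", "sip", "sport", "dip", "dport", "host"],
    "TcpLog.csv": ["stime", "dtime", "proto", "dip", "dport", "sip", "sport",
                   "uplink_length", "downlink_length"],
    "Email.csv": ["time", "proto", "sip", "sport", "dip", "dport",
                  "from", "to", "subject"],
    "Checking.csv": ["id", "day", "checkin", "checkout"],
}


def read_log(name, text):
    """Parse CSV text for one of the Table 1 files into a list of dicts."""
    rows = csv.reader(io.StringIO(text))
    # Normalize header names so that "Time" and "time" are treated alike.
    header = [h.strip().lower() for h in next(rows, [])]
    missing = set(SCHEMA[name]) - set(header)
    if missing:
        raise ValueError(f"{name} is missing fields: {sorted(missing)}")
    return [dict(zip(header, row)) for row in rows]


def recipients(email_row):
    """Split the semicolon-separated `to` field into individual addresses."""
    return [addr.strip() for addr in email_row["to"].split(";") if addr.strip()]


# Hypothetical sample data, made up for illustration only.
sample = (
    "Time,proto,sip,sport,dip,dport,from,to,Subject\n"
    "2017-11-01 09:00:00,SMTP,10.0.0.1,50000,10.0.0.2,25,"
    "a@example.com,b@example.com;c@example.com,Hello\n"
)
rows = read_log("Email.csv", sample)
print(recipients(rows[0]))  # ['b@example.com', 'c@example.com']
```

Keeping the field lists in one dictionary means the five heterogeneous sources can be validated by the same routine before they are merged for visual analysis.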