Security and Communication Networks

Special Issue

Machine Learning for Wireless Multimedia Data Security 2020


Research Article | Open Access

Volume 2020 | Article ID 8848315 | https://doi.org/10.1155/2020/8848315

Mingqian Wang, Weijie Gu, Changshen Ma, "A Multimode Network Steganography for Covert Wireless Communication Based on BitTorrent", Security and Communication Networks, vol. 2020, Article ID 8848315, 14 pages, 2020. https://doi.org/10.1155/2020/8848315

A Multimode Network Steganography for Covert Wireless Communication Based on BitTorrent

Academic Editor: Zhaoqing Pan
Received 11 Mar 2020
Revised 15 Jun 2020
Accepted 20 Jun 2020
Published 05 Jul 2020

Abstract

Network steganography is a covert communication technique that uses legitimate traffic as the vehicle to transfer secret information over an untrusted network. BitTorrent (BT) is one of the most prevalent P2P services for transmitting video files over wireless networks. The enormous amount of video data transmitted continuously over BitTorrent makes its traffic a potential carrier for confidential information transfer. Hence, in this paper, the BitTorrent file-sharing service is chosen as the host for information hiding, and a multimode steganographic method based on the Bitfield message is proposed. Taking advantage of BitTorrent cooperative transmission and the absence of content authentication for the Bitfield message, the secret information is delivered during the exchange of BitMapInfo between two peers. The steganographic mode is dynamically selected according to the secret size, achieving adaptive bandwidth. The experimental results show that our scheme effectively resists statistical detection and outperforms the existing method by obtaining a lower detection rate under machine learning-based steganalysis.

1. Introduction

Data transmission security, including information hiding, is an important field that has been discussed in many works [1, 2]. As a subfield of data transmission security, network steganography exploits normal traffic as the carrier to transmit information stealthily via untrusted networks. Compared with static multimedia steganography, such as image steganography [3], network steganography makes it difficult for a monitor to locate and extract the covert data within an enormous network flow. Hence, network steganography is an effective means of transporting confidential information in networks. In recent years, it has become a hot research topic in the field of information security owing to these properties of network traffic.

There are two broad types of network covert channels: covert storage channels and covert timing channels. A covert storage channel embeds the secret information into the redundancies of network protocols [4–7]. Although it is simple to implement, it can be easily detected by existing methods. A covert timing channel delivers the secret information by exploiting time-relevant events of network packets, and it is stealthier than a covert storage channel. Generally, covert timing channels can be divided into four subclasses: On-Off covert channels [8], interpacket delay- (IPD-) based covert channels [9–12], packet sorting-based channels [13, 14], and combination-based ones [15, 16]. Synchronization is a persistent problem because the covert timing channel is susceptible to unstable network conditions, such as jitter and delay. To guarantee reliability, Archibald and Ghosal [17] designed a mechanism that uses TCP ACKs to synchronize the covert channel. Houmansadr [18] and Archibald [19] encoded the secret information with an error correction code to increase accuracy, which sacrificed the bandwidth of the covert channel and increased the transmission overhead. To counter these deficiencies, network steganography tends to mimic normal traffic by shape-fitting: the feature model is considered in the modulation process of the secret information to resist statistical detection tools. In particular, popular, reliable, and secure network services are sought as steganographic carriers.

The 2017 network traffic report reveals that P2P traffic accounts for 79.25%, 83.46%, 63.94%, and 67.19% of the total flow in Germany, Eastern Europe, the Middle East, and Australia, respectively, ranking first in each region. Within P2P traffic, the BitTorrent (BT) file-sharing service accounts for 50% to 70%. Additionally, traffic monitoring between the boundary route of the Jiangsu Education Network and the national trunk route indicates that BitTorrent traffic accounts for 60% of the total, making BitTorrent the most prevalent P2P protocol.

An ideal steganographic carrier should possess two properties, popularity and complexity, since the massive communication traffic and complex patterns of such a carrier improve the undetectability of steganography. Nowadays, wireless networks have become a predominant means of data transmission, which keeps the network steganography subfield evolving. Against this background, recent network steganography solutions exploit popular services such as P2P file-sharing systems like BitTorrent, search features like Google Suggest, and multimedia and Voice-over-IP services like Skype and WeChat [20–24]. The continuous transmission of large amounts of video data during BitTorrent wireless communication provides an opportunity for steganography.

Kopiczko et al. [25] proposed a BitTorrent-based network steganographic method called StegTorrent. It modifies the order of data packets in the peer-to-peer data exchange protocol and provides a steganographic bandwidth of up to 270 b/s. However, inherent network noise such as jitter or packet loss may disturb the order of packets on the receiver side. Meanwhile, reordering packets with distinct IP addresses inevitably alters the interpacket delays, which an adversary can easily distinguish from the normal distribution of BitTorrent IPDs. To overcome these drawbacks, the steganographic procedure should retain the normal communication mode of the network service; to improve undetectability, the modulation of secret information must not generate abnormal traffic or properties. Therefore, in this paper, BitTorrent is selected as the steganographic carrier, and its system structure, communication mode, and protocols are analyzed comprehensively. On this basis, the protocol redundancy in which the secret information can be embedded is exploited. The main contributions of this paper are as follows:

(1) Taking advantage of the absence of content authentication for the Bitfield message, the secret information is transferred during the exchange of BitMapInfo (BitMap Information) between two legitimate BT peers (clients) and is embedded into the content of <bitfield> according to the given format. Thus, the covert traffic of our scheme preserves normal behavior and properties, so as to resist detection.

(2) A multimode steganographic method is proposed that exploits the cooperative transmission of BT peers. The steganographic mode (Single-Link Steg or Multi-Link Steg) is dynamically selected according to the secret size, achieving adaptive bandwidth. In Multi-Link Steg, multiple peers participate in the transmission of secret information concurrently to accomplish collaborative steganography. Meanwhile, the method tolerates network noise owing to the reliable transmission mechanism of TCP, which makes the proposed scheme more robust.

(3) Multidimensional steganalysis is employed to comprehensively verify the undetectability of the proposed scheme. Detection methods based on statistical properties, such as the Entropy test (EN-test), are used to measure the traffic regularity. A nonparametric statistical method, the Kolmogorov–Smirnov test (K-S test), is used to measure the distance between two distributions. Machine learning-based steganalysis, such as SVM, Random Forest, and Deep Neural Networks, is used to classify the tested traffic as covert or normal.

The remainder of this paper is organized as follows. In Section 2, related works are reviewed. The basics of the BitTorrent system are described in Section 3. In Section 4, the proposed scheme is introduced in detail. In Section 5, experimental results are presented and analyzed. Finally, the whole paper is concluded in Section 6.

2. Related Works

Kopiczko et al. [25] proposed a BitTorrent-based network steganographic method called StegTorrent, which is illustrated in Figure 1. It is assumed that both the sending and receiving sides of the secret data control a certain number of BitTorrent clients and, as mentioned above, that their IP addresses are known to each other. For the sake of clarity, Figure 1 presents steganographic transmission in a single direction only, but end-to-end bidirectional communication is possible, and the other direction is analogous. No knowledge of the network's topology is necessary. The hidden data sender uses a modified BitTorrent client (the StegTorrent client) to share a resource that is downloaded by the second StegTorrent client, which consists of a group of controlled BitTorrent clients.

For the sake of the proposed method’s description and analysis, the term data package is defined as a set of IP addresses that are sent within the IP packets in a predetermined order and the term data package size as the total number of elements in this set. For example, it is assumed that the data package size is 2. In this case, two packets with two different IP addresses (e.g., IP1 and IP2) are used to send bits of hidden data. In this simple scenario, if the order of the packets is modified for steganographic purposes and the BitTorrent client receives a packet that was sent from IP1 and then from IP2, then it will be interpreted as binary “0” and in other cases as binary “1.” It is assumed that the data package and its size are a shared secret between transmitting and receiving StegTorrent clients.
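The size-2 example above can be sketched in a few lines. This is only an illustration of the packet-order idea under the stated assumptions; the function names and the placeholder addresses "IP1" and "IP2" are hypothetical and are not taken from StegTorrent itself.

```python
from typing import List, Sequence, Tuple


def encode_bits(bits: Sequence[int], package: Tuple[str, str]) -> List[str]:
    """Return the order of source IPs that encodes `bits` (data package size 2)."""
    first, second = package
    order: List[str] = []
    for bit in bits:
        # bit 0: keep the agreed order (IP1 then IP2); bit 1: swap the two packets
        order.extend((first, second) if bit == 0 else (second, first))
    return order


def decode_bits(arrivals: Sequence[str], package: Tuple[str, str]) -> List[int]:
    """Read arrivals two at a time; the agreed order means 0, anything else means 1."""
    bits: List[int] = []
    for i in range(0, len(arrivals) - 1, 2):
        pair = (arrivals[i], arrivals[i + 1])
        bits.append(0 if pair == package else 1)
    return bits


package = ("IP1", "IP2")          # shared secret between both StegTorrent clients
sent = encode_bits([0, 1, 1, 0], package)
received = decode_bits(sent, package)
```

If the network swaps two packets in transit, the corresponding bit flips, which is the sensitivity to jitter and packet loss discussed next.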

It must be noted that the performance of this method depends on the size of the data package, which in turn relies on the number of available receiving IP addresses (receiving BitTorrent clients under control). However, inherent network noise such as jitter or packet loss may disturb the order of packets on the receiver side. Meanwhile, reordering packets with distinct IP addresses inevitably alters the interpacket delays, which an adversary can easily distinguish from the normal distribution of BitTorrent IPDs.

3. BitTorrent Analysis

BitTorrent is a P2P file-sharing system that allows its users to distribute large files over networks. BitTorrent is distinguished from similar file transfer applications in that, instead of downloading a resource from a single central server, users download fragments of the file from other users simultaneously. As a result, the file transfer time is considerably decreased, because the group of users that share the same resource or part of it may consist of several to thousands of hosts. The users interested in the same resource are known as “peers,” and they cooperate with a central component known as a “tracker.” The combination of peers and trackers is called a “swarm.” Trackers are responsible for controlling the resource transfer between peers. Peers that hold a particular resource or part of it are required to share the resource and to perform the transfer.

There are two types of BitTorrent peers based on the stage at which they are involved in downloading or sharing a given resource:

(1) Seeders: peers that possess the complete resource and are only sharing it.

(2) Leechers: peers that do not yet possess the complete resource but are interested in obtaining it. They also share the fragments they have already downloaded. When a leecher obtains all the remaining fragments of the resource, it automatically becomes a seeder.

In order to preserve the communication mode and properties of normal BT traffic during the steganographic process, it is essential to analyze the operation mode and protocols of the BT system. The concrete communication procedure of BT file sharing is shown in Figure 2:

(1) A seed file (.torrent) is produced by the seeder and then released to the Tracker, which is combined with the web service.

(2) Peer1 (Leecher) queries and downloads the seed file of the required resource from the Tracker.

(3) Peer1 (Leecher) requests from the Tracker the list of peers that possess the shared resource.

(4) The Tracker returns the corresponding peer list to Peer1 (Leecher).

(5) Peer1 (Leecher) conducts the TCP “three-way handshake” with the other peer, and the connection between them is established.

(6) Once Peer1 (Leecher) is connected with Peer2, they immediately send and reply to the Handshake messages in order to confirm their identities.

(7) Peer1 (Leecher) exchanges the Bitfield message with Peer2, each informing the other of the indexes of the file fragments it already owns.

(8) Peer1 (Leecher) exchanges a series of negotiation messages with Peer2, such as choke, unchoke, interested, and not interested.

(9) Peer1 (Leecher) sends the Request message to Peer2, asking for specific file fragments.

(10) Peer2 replies to Peer1 (Leecher) with the Piece message, which contains the corresponding file fragments.
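The Handshake message of step (6) is not specified in this paper. As a rough sketch, assuming the layout of the public BitTorrent specification (BEP 3): a 1-byte protocol-string length of 19, the string "BitTorrent protocol", 8 reserved bytes, a 20-byte info hash, and a 20-byte peer ID. The function names and the sample values are hypothetical.

```python
PSTR = b"BitTorrent protocol"  # protocol identifier defined in BEP 3


def build_handshake(info_hash: bytes, peer_id: bytes) -> bytes:
    """Build the 68-byte handshake sent right after the TCP connection is set up."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must both be 20 bytes")
    # <pstrlen><pstr><8 reserved bytes><info_hash><peer_id>
    return bytes([len(PSTR)]) + PSTR + bytes(8) + info_hash + peer_id


def parse_handshake(data: bytes):
    """Return (info_hash, peer_id) or raise if the message is not a valid handshake."""
    if len(data) != 68 or data[0] != len(PSTR) or data[1:20] != PSTR:
        raise ValueError("not a BitTorrent handshake")
    return data[28:48], data[48:68]


# Hypothetical identifiers, for illustration only.
example_hash = bytes(range(20))
example_peer = b"-XX0001-abcdefghijkl"
handshake = build_handshake(example_hash, example_peer)
```

Both peers check that the info hash matches the torrent they expect before moving on to the Bitfield exchange of step (7).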

Among them, the Bitfield message is used to indicate the bitmap information of certain file fragments, which have already been obtained by the current peer. In the BT client, a file is generally divided into several fragments whose size is 256 kB. Then, the fragments are indexed from 0 in sequence. Since the number of fragments is distinct for each file, the length of the Bitfield message is variable.
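The relationship between file size and <bitfield> length can be sketched as follows, assuming one bit per fragment, a fixed fragment size of 256 kB interpreted as 256 × 1024 bytes, and zero-padding of the last byte. The helper names are hypothetical.

```python
PIECE_SIZE = 256 * 1024  # assumed fragment size in bytes (256 kB)


def piece_count(file_size: int, piece_size: int = PIECE_SIZE) -> int:
    """Number of fragments: the last fragment may be shorter, so round up."""
    return -(-file_size // piece_size)


def bitfield_length(file_size: int, piece_size: int = PIECE_SIZE) -> int:
    """Length X of <bitfield> in bytes: one bit per fragment, rounded up to whole bytes."""
    return -(-piece_count(file_size, piece_size) // 8)


one_gib = 1024 ** 3
pieces = piece_count(one_gib)          # 4096 fragments
x_bytes = bitfield_length(one_gib)     # 512 bytes
```

Under these assumptions, a larger shared file yields a longer <bitfield>, which is why the length of the Bitfield message varies from file to file.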

The format of the Bitfield message is shown in Figure 3, where len is the length of the Bitfield message and occupies 4 Bytes, and id is the identifier of the Bitfield message, whose value is set to 5 and which occupies 1 Byte. The <bitfield> of X Bytes indicates the possession of specific file fragments, as depicted in Figure 4. The fragment with index 0 corresponds to the highest bit of the first byte, and so on. A bit of “1” in a position means that the corresponding fragment is possessed by the peer, while a bit of “0” means that it is not.
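A minimal sketch of this layout is given below. It assumes, following the public BitTorrent specification (BEP 3) rather than this paper, that len is a big-endian integer counting the id byte plus the X bytes of <bitfield>. The function names are hypothetical.

```python
import struct

BITFIELD_ID = 5  # identifier of the Bitfield message


def pack_bitfield(bitfield: bytes) -> bytes:
    """Serialize <len><id><bitfield>; len covers the id byte and the bitfield."""
    return struct.pack(">IB", 1 + len(bitfield), BITFIELD_ID) + bitfield


def unpack_bitfield(message: bytes) -> bytes:
    """Return the raw <bitfield> after checking the id and the length prefix."""
    length, msg_id = struct.unpack(">IB", message[:5])
    if msg_id != BITFIELD_ID or length != len(message) - 4:
        raise ValueError("not a well-formed Bitfield message")
    return message[5:]


def has_piece(bitfield: bytes, index: int) -> bool:
    """Fragment 0 is the highest bit of the first byte, and so on."""
    byte_pos, bit_pos = divmod(index, 8)
    return bool(bitfield[byte_pos] & (0x80 >> bit_pos))


message = pack_bitfield(b"\x80\x01")   # fragments 0 and 15 are possessed
```

Nothing in this format lets the receiver check whether the bits reflect the fragments actually held, which is the redundancy the proposed scheme relies on.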

It can be observed that the Bitfield message is sent only immediately after the handshake is completed. Since the BT client has no content-authentication mechanism for the Bitfield message, modifying <bitfield> does not raise any anomaly; in other words, the altered <bitfield> is accepted as if it were the original content. Although the Bitfield message is exchanged only once during a single interaction between two peers, the amount of data it carries is considerable. Therefore, the Bitfield message is employed as the steganographic carrier in this paper.

4. The Proposed Scheme

4.1. System Model

The proposed steganographic system model is presented in Figure 5. The steganographic peers comprise a steganographic sender and a receiver, which are disguised as legitimate BT clients. The open-source code of the BT client is modified according to the proposed scheme, which is implemented as follows:

(1) Steg-preparing: first, the steganographic mode (Single-Link Steg or Multi-Link Steg) is selected by the sender-peer in accordance with the secret size, and a suitable video file is chosen as the shared resource. Second, the critical information of the shared video file, such as file name and format, is delivered to the receiver-peer via e-mail, instant messaging, and so on.

(2) Normal BitTorrent communication/before-steg: the steganographic peers request the common file resource from the Tracker and establish a TCP link with each other.

(3) Steg-synchronization: the steganographic peers exchange the Handshake message to authenticate their identities in covert communication.

(4) Steg-implementation: the sender-peer embeds the secret information into the Bitfield message according to the selected steganographic mode. Then, the altered Bitfield message is sent to the receiver-peer, which extracts the secret information from it.

(5) Normal BitTorrent communication/after-steg: after the transmission of secret information is accomplished, the steganographic peers still exchange the negotiation messages and transfer the required video file fragments, like other normal BT peers.

4.2. Multimode Steganography

In BT communication, two peers exchange the Bitfield message only once during the entire process of video file transfer, in order to share their bitmap information. As mentioned above, the bitmap information informs the other peer which file fragments a peer already possesses and is sent immediately after the handshake is completed. Hence, the amount of secret data that can be transferred during a single interaction between two peers is limited. If more secret data needs to be delivered, multiple peers can be employed to share the common resource and participate in the transmission of secret information concurrently, accomplishing cooperative steganography. Accordingly, two steganographic modes are proposed based on the size of the secret information: Single-Link Steg and Multi-Link Steg. The main notations and symbols of our scheme are presented in Table 1.


Table 1: Main notations and symbols.

| Notation | Description |
|---|---|
| Single-Link Steg | Single-link steganography mode |
| Multi-Link Steg | Multi-link steganography mode |
| Mode | Steganography mode |
| S_len | The length of secret information |
| Secret_info | The content of secret information |
| Padding | The remainder of Bitfield field |
| File | The shared video file |
| sizeof | Function of calculating the shared file size |
| Index | The index of secret data block |
| S_block (i) | The i-th secret data block |

4.2.1. Single-Link Steg

The Single-Link Steg mode is suitable for transmitting small amounts of secret information, such as keys and parameters. In this scenario, only two peers participate in covert communication. As mentioned above, the steganographic sender must be a seeder. The Single-Link Steg is implemented as follows:

Step 1. The bitmap information <bitfield> is partitioned into four steganographic fields, as shown in Figure 6. Assume that the length of <bitfield> is X Bytes. The meaning of each field is as follows:

(i) Mode refers to the steganographic mode, which occupies 1 Byte. A value of “0” denotes that the steganography is working in the Single-Link state.

(ii) S_len refers to the length of the secret information, which occupies 1 Byte. Its value is defined as L Bytes.

(iii) Secret_info refers to the content of the secret information, whose size is L Bytes.

(iv) Padding refers to the remaining original content of <bitfield> after the substitution, whose size is (X − L − 2) Bytes. Since this size cannot be negative, it should be satisfied that $X \geq L + 2$.

Step 2. The original <bitfield> is substituted with the secret information according to the aforementioned steganographic format. In addition, the video file shared between the steganographic peers must be selected in accordance with the secret size L. In particular, because each bit of <bitfield> corresponds to one fragment, the size of the video file should satisfy

$$\mathrm{sizeof}(\mathit{File}) \geq 8\,(L + 2) \times 256\ \text{kB},$$

where sizeof is the function that calculates the file size, and the video file is generally divided into fragments whose size is 256 kB.
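The Single-Link format can be sketched as below. This is a minimal illustration, not the authors' implementation: it assumes that the fields are laid out in the order Mode, S_len, Secret_info, Padding, and that Padding is simply the untouched tail of the original <bitfield>. The function names are hypothetical.

```python
MODE_SINGLE = 0  # Mode value for the Single-Link state


def embed_single(bitfield: bytes, secret: bytes) -> bytes:
    """Overwrite the first L + 2 bytes of <bitfield> with Mode, S_len and the secret."""
    length = len(secret)
    if length > 255:
        raise ValueError("S_len occupies 1 Byte, so L cannot exceed 255")
    if len(bitfield) < length + 2:
        raise ValueError("<bitfield> too short: X >= L + 2 is required")
    padding = bitfield[length + 2:]           # remaining original content
    return bytes([MODE_SINGLE, length]) + secret + padding


def extract_single(stego_bitfield: bytes) -> bytes:
    """Recover the secret on the receiver side from the agreed format."""
    if stego_bitfield[0] != MODE_SINGLE:
        raise ValueError("not a Single-Link Steg bitfield")
    length = stego_bitfield[1]
    return stego_bitfield[2:2 + length]


original = bytes(range(16))                   # stand-in for a real 16-byte <bitfield>
stego = embed_single(original, b"key")
recovered = extract_single(stego)
```

The stego <bitfield> keeps the original length X, so the Bitfield message has the same size as an unmodified one.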

4.2.2. Multi-Link Steg

In order not to disrupt the legitimate BT communication of file sharing, the steganographic peers are not allowed to send the Bitfield message several times, even when a larger amount of secret data has to be transferred. Thus, the Multi-Link Steg mode is used when more secret information needs to be delivered. Cooperative steganography is realized by the collaborative transfer of multiple BT peers. In this scenario, the steganographic peers are disguised as legitimate BT clients that intend to download the common video resource, and they collaborate to transfer the secret segments in accordance with prior planning. The Multi-Link Steg, shown in Figure 7, is implemented as follows:

Step 1. The bitmap information <bitfield> is partitioned into five steganographic fields, as shown in Figure 8. Assume that the length of <bitfield> is X Bytes. The meaning of each field is as follows:

(i) Mode refers to the steganographic mode, which occupies 1 Byte. A value of “1” denotes that the steganography is working in the Multi-Link state.

(ii) S_len refers to the length of the secret block, which occupies 1 Byte. Its value is defined as L Bytes.

(iii) Index refers to the index of the secret block, which starts from 1.

(iv) S_block refers to the content of the secret block, whose size is L Bytes.

(v) Padding refers to the remaining original content of <bitfield> after the substitution, whose size is (X − L − 3) Bytes. Since this size cannot be negative, it should be satisfied that $X \geq L + 3$.

Step 2. The secret information is divided into n blocks, whose size is L. S_block (i) refers to the i-th secret data block, where i = 1, 2, …, n.

Step 3. n peers (legitimate BT clients) are controlled by the steganographic sender to transfer the secret blocks collaboratively.

Step 4. Each sender-peer then connects with the steganographic receiver, establishing n covert links.

Step 5. For each sender-peer, the original <bitfield> is substituted with the secret block according to the aforementioned steganographic format.

Step 6. The steganographic receiver extracts the secret blocks according to the agreed format. Then, the blocks are reordered by index and concatenated to retrieve the complete secret information, which is denoted as Secret_info:

$$\mathrm{Secret\_info} = \mathrm{S\_block}(1) \,\|\, \mathrm{S\_block}(2) \,\|\, \cdots \,\|\, \mathrm{S\_block}(n).$$
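The Multi-Link steps can be sketched as follows. This is an illustration under assumptions the text does not state explicitly: the fields are laid out in the order Mode, S_len, Index, S_block, Padding; Index occupies 1 Byte (implied by the padding size X − L − 3); and the last block may be shorter than L. All function names are hypothetical.

```python
from typing import Iterable, List, Tuple

MODE_MULTI = 1  # Mode value for the Multi-Link state


def split_secret(secret: bytes, block_len: int) -> List[bytes]:
    """Step 2: divide the secret into blocks of at most block_len bytes."""
    return [secret[i:i + block_len] for i in range(0, len(secret), block_len)]


def embed_block(bitfield: bytes, index: int, block: bytes) -> bytes:
    """Step 5: overwrite the first L + 3 bytes with Mode, S_len, Index and the block."""
    length = len(block)
    if length > 255 or not 1 <= index <= 255:
        raise ValueError("S_len and Index are assumed to occupy 1 Byte each")
    if len(bitfield) < length + 3:
        raise ValueError("<bitfield> too short: X >= L + 3 is required")
    return bytes([MODE_MULTI, length, index]) + block + bitfield[length + 3:]


def extract_block(stego_bitfield: bytes) -> Tuple[int, bytes]:
    """Step 6 (first half): return (Index, S_block) from one covert link."""
    if stego_bitfield[0] != MODE_MULTI:
        raise ValueError("not a Multi-Link Steg bitfield")
    length, index = stego_bitfield[1], stego_bitfield[2]
    return index, stego_bitfield[3:3 + length]


def reassemble(stego_bitfields: Iterable[bytes]) -> bytes:
    """Step 6 (second half): reorder the blocks by Index and concatenate them."""
    blocks = dict(extract_block(s) for s in stego_bitfields)
    return b"".join(blocks[i] for i in sorted(blocks))


secret = b"attack at dawn!!"
cover = bytes(range(32))                      # stand-in <bitfield> of each sender-peer
links = [embed_block(cover, i, blk)
         for i, blk in enumerate(split_secret(secret, 6), start=1)]
restored = reassemble(reversed(links))        # arrival order does not matter
```

Because each block carries its own index, the receiver does not depend on the order in which the n covert links are established.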

5. Experiment Results and Analysis

5.1. Data Set and Implementation

Single-Link Steg and Multi-Link Steg are both realized in the experiment. The open-source BT clients are modified to implement the proposed scheme and deliver the secret information covertly. Under the Single-Link Steg mode, the steganographic receiver is disguised as the BT seeder. The size of the secret information is 255 bytes, and the selected shared video file has a size of 104 MB. The communication packets between the steganographic peers are captured by Wireshark, as shown in Figure 9. It can be seen that the secret data is transferred successfully by substituting part of the content of the Bitfield message according to the agreed format. Besides, it is verified that the legitimate BT communication is not affected by the modification of the Bitfield message: the negotiation messages such as Interested and Unchoke are exchanged subsequently, and so are the file fragment transmission messages such as Request and Piece. Thus, it can be concluded that the proposed steganography retains normal communication without introducing any additional anomaly.

Under the Multi-Link Steg mode, the size of the secret information is 1 kB, and the selected shared video file has a size of 90 MB. In this scenario, three steganographic peers are involved in the covert communication, of which peer1 and peer2 are both controlled by the steganographic sender in order to transfer the secret data cooperatively. Peer3 is the steganographic sender, which acts as the BT seeder. Figure 10 presents the Bitfield messages of steganographic peer1 and peer2, each of which contains a secret block.

Further experiments are performed to evaluate the main performance metrics of the proposed scheme, namely undetectability, robustness, and capacity. As undetectability and robustness are not affected by the number of steganographic peers, only the Single-Link Steg mode is considered in the corresponding experiments.

5.2. Undetectability

As the core property, undetectability means that the covert traffic cannot be differentiated from the normal traffic, which depends entirely on the similarity between the two. Therefore, in order to improve undetectability, the modulation of secret information must not generate abnormal traffic or properties. In the experiment, normal traffic of downloading general video files with BT clients (BitTorrent, μTorrent, and Vuze) is captured by Wireshark. Then, the lengths of <bitfield> in the Bitfield messages are extracted to form the normal samples. The numbers of normal and steganographic samples are 20,000 each. In the following, statistical and machine learning-based steganalysis are used in turn to detect our proposed scheme.

5.2.1. Statistical-Based Steganalysis

Statistical-based steganalysis is the most common method of detecting potential covert traffic; it exploits statistical properties such as traffic regularity or the distribution function to distinguish normal from covert traffic. The histogram is a significant property that reveals the statistical distribution of traffic. Therefore, the histograms of the normal traffic and the covert traffic of our scheme are compared in Figure 11, where the x-axis shows the field length of <bitfield>, ranging from 0 to 2500 Bytes, and the y-axis indicates the number of lengths that fall within each bin (the x-axis is divided into eight bins). As shown in the figure, the field length of normal <bitfield> occurs most often between 800 and 1200 Bytes, with a peak around 1000 Bytes. The histogram of our scheme matches the normal one quite well. The maximum <bitfield> length of 2500 Bytes corresponds to a calculated file size of approximately 4.9 GB.
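The file size quoted for the 2500-Byte upper bound can be checked with a short calculation, assuming one bit of <bitfield> per fragment, fragments of 256 kB as stated for BT clients, and binary prefixes (1 MB = 1024 kB, 1 GB = 1024 MB); the choice of binary prefixes is an assumption, not something the text states.

```latex
\begin{align*}
\text{fragments} &= 2500\ \text{Bytes} \times 8\ \text{bits/Byte} = 20{,}000, \\
\mathrm{sizeof}(\mathit{File}) &\approx 20{,}000 \times 256\ \text{kB}
  = 5{,}120{,}000\ \text{kB} = 5000\ \text{MB} \approx 4.88\ \text{GB}.
\end{align*}
```

This is consistent with the approximate value of 4.9 GB given above.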

Meanwhile, two well-known detection methods, the Entropy test [26] and the Kolmogorov–Smirnov test [27], are employed to quantitatively evaluate the detection resistance of our scheme compared with StegTorrent [25]. The normal and covert samples are each divided into 20 consecutive windows of size 1000. A statistical feature of each window is calculated and used during the detection process, as depicted in Figure 12.

(1) Entropy Test. Entropy describes the degree of disorder in a process. In the Entropy test (EN-test), it is used to measure the regularity of data traffic [26]. If the traffic is less regular, the Entropy value is larger, and vice versa: less regularity indicates more randomness, so more information is contained in the traffic. The Entropy value is obtained by calculating the statistical average of all possible self-information:

$$H(X) = \sum_{i=1}^{n} p(x_i)\, I(x_i) = -\sum_{i=1}^{n} p(x_i) \log p(x_i),$$

where X represents a one-dimensional discrete random variable whose set of values is Ω = {xi | i = 1, 2, …, n}, the self-information of xi is I(xi), and the probability of xi is denoted as p(xi). The Entropy values of the 20 windows for normal and covert samples are compared in Figure 13. From the result, it can be found that most Entropy values of the normal samples range approximately from 0.5 to 1.3, whereas those of the covert samples generated by StegTorrent range from 0.8 to 1.5. The values of our scheme, however, mix with those of the normal samples and can hardly be differentiated.
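A per-window Entropy computation can be sketched as below. It is only a sketch: the logarithm base and any binning of the <bitfield> lengths are not stated in the text, so base 2 and raw values are assumed here, and the resulting magnitudes need not match Figure 13. The function names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence


def entropy(values: Sequence[int], base: float = 2.0) -> float:
    """Shannon entropy of the empirical distribution of `values`."""
    total = len(values)
    counts = Counter(values)
    # statistical average of the self-information -log p(x_i)
    return -sum((c / total) * math.log(c / total, base) for c in counts.values())


def window_entropies(samples: Sequence[int], window: int = 1000) -> List[float]:
    """Split the samples into consecutive windows and compute one value per window."""
    return [entropy(samples[i:i + window])
            for i in range(0, len(samples) - window + 1, window)]


regular = [1000] * 2000                 # perfectly regular traffic
mixed = [900, 1100] * 1000              # two equally likely lengths
```

Perfectly regular traffic yields an Entropy of zero, and the value grows as the observed lengths become more varied.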

Then, the 20 windows of normal samples and the 20 windows of covert samples, each of size 1000, are tested using the Entropy test. The results are presented in Table 2, where the detection threshold is denoted as THD. It is observed that the false-negative (FN) rate, i.e., the rate at which normal samples are misclassified as covert, declines when the threshold increases. The detection rates (true-positive rates, TP) of the covert samples are also shown in the table. The detection rate of StegTorrent ranges from 91% to 98%, while that of our scheme is at most 7%. Hence, the Entropy test fails to distinguish the covert samples of our scheme from the normal ones.


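Table 2: Detection results of the Entropy test.

| Scheme | THD = 0.95, TP (%) | THD = 0.95, FN (%) | THD = 0.98, TP (%) | THD = 0.98, FN (%) | THD = 1.03, TP (%) | THD = 1.03, FN (%) |
|---|---|---|---|---|---|---|
| Our scheme | 7 | 9 | 4 | 7 | 2 | 4 |
| StegTorrent | 98 | 9 | 92 | 7 | 91 | 4 |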
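The thresholding behind Table 2 can be sketched as follows, assuming that a window is flagged as covert when its statistic exceeds THD (this direction is inferred from the fact that both rates fall as THD rises). The per-window values below are invented purely for illustration and are not the paper's data; the function name is hypothetical.

```python
from typing import Sequence


def flag_rate(window_values: Sequence[float], threshold: float) -> float:
    """Fraction of windows whose statistic exceeds the threshold (flagged as covert)."""
    return sum(v > threshold for v in window_values) / len(window_values)


# Hypothetical per-window Entropy values, for illustration only.
normal_windows = [0.6, 0.8, 0.9, 1.0]
covert_windows = [0.9, 1.1, 1.2, 1.4]

thd = 0.95
tp = flag_rate(covert_windows, thd)   # covert windows correctly flagged
fn = flag_rate(normal_windows, thd)   # normal windows wrongly flagged ("FN" in this paper)
```

Raising THD lowers both rates at once, so the threshold trades missed covert windows against false alarms on normal traffic.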

(2) Kolmogorov–Smirnov Test. The K-S test [27] measures the maximum distance between two distributions. A small value indicates that the two distributions are close to each other; conversely, a large value means that one distribution does not fit the other. The Kolmogorov–Smirnov test value (KS-test value) is obtained by taking the supremum of the absolute difference between two empirical distribution functions over all x:

$$D = \sup_{x} \left| S_1(x) - S_2(x) \right|,$$

where S1(x) and S2(x) refer to the empirical distribution functions of the two samples. The comparison of KS-test values between the normal and covert samples is shown in Figure 14. Likewise, 20 windows of normal and covert samples are tested in the experiment. The x-axis is the window number, and the y-axis shows the corresponding KS-test value. It is observed that the KS-test values of our scheme are under 0.15 and are indistinguishable from those of the normal traffic. Thus, the distribution of our scheme is close to the normal one. In contrast, the corresponding values of StegTorrent range from 0.15 to 0.25, which deviates from the normal case.
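The KS-test value can be computed directly from the two empirical distribution functions, as in the sketch below. Since both functions are step functions that only change at observed values, it is enough to evaluate the difference at every observed value. The function name is hypothetical.

```python
import bisect
from typing import Sequence


def ks_statistic(sample1: Sequence[float], sample2: Sequence[float]) -> float:
    """Maximum absolute difference between the empirical CDFs S1(x) and S2(x)."""
    s1, s2 = sorted(sample1), sorted(sample2)
    distance = 0.0
    for x in sorted(set(s1) | set(s2)):
        cdf1 = bisect.bisect_right(s1, x) / len(s1)   # S1(x): share of values <= x
        cdf2 = bisect.bisect_right(s2, x) / len(s2)   # S2(x)
        distance = max(distance, abs(cdf1 - cdf2))
    return distance


identical = ks_statistic([1, 2, 3], [1, 2, 3])
disjoint = ks_statistic([1, 2], [3, 4])
shifted = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
```

A value near 0 means the two windows follow almost the same distribution, and a value near 1 means they barely overlap.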

Then, the covert traffic is detected using the K-S test, and the detection results are shown in Table 3, where the detection threshold is denoted by THD. FN (false negative) refers to a normal sample that is misclassified as covert. It is observed that the FN rate of the normal traffic declines when the threshold increases. Hence, the detection threshold is set between 0.13 and 0.15 in order to keep the FN rate at no more than 1%. The true-positive (TP) rates of the covert samples, which represent the detection rate in this paper, are also presented in the table. The results show that the detection rate of StegTorrent is at least 92% for all tested thresholds, whereas that of our scheme is at most 3%, indicating that the Kolmogorov–Smirnov test cannot effectively detect the covert traffic generated by our scheme.


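Table 3: Detection results of the Kolmogorov–Smirnov test.

| Scheme | THD = 0.13, TP (%) | THD = 0.13, FN (%) | THD = 0.14, TP (%) | THD = 0.14, FN (%) | THD = 0.15, TP (%) | THD = 0.15, FN (%) |
|---|---|---|---|---|---|---|
| Our scheme | 3 | 1 | 1 | 0 | 0 | 0 |
| StegTorrent | 99 | 1 | 95 | 0 | 92 | 0 |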

5.2.2. Machine Learning-Based Steganalysis

Recently, machine learning techniques have performed quite well in resolving complex problems in various domains. In particular, they have progressively become a novel and effective means of detecting covert channels. In machine learning-based steganalysis, various statistical metrics (features) of normal and covert samples are fed to classifier models, which are trained to distinguish covert traffic. The classifiers used in machine learning-based detection mainly include SVM, Neural Network, Logistic Regression, Naive Bayes, Random Forest, and Deep Neural Network [28–30]. In this paper, the Deep Neural Network (DNN) is principally employed to further estimate the undetectability of our scheme compared with StegTorrent.


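Table 4: Statistical features used as input variables.

| Input variable | Feature | Formula | Explanation |
|---|---|---|---|
| x1 | Mean | $\mu = \frac{1}{n}\sum_{i=1}^{n} l_i$ | li is the length of <bitfield>; n is the subsample size |
| x2 | Median | $l_{((n+1)/2)}$ for odd n; $\frac{1}{2}\left(l_{(n/2)} + l_{(n/2+1)}\right)$ for even n | Where the lengths are sorted in ascending order |
| x3 | Entropy | $-\sum_{i} p(l_i) \log p(l_i)$ | p(li) is the probability of length li |
| x4 | Standard deviation | | li is the length of <bitfield>; μ is the mean of the lengths |
| x5 | Root of average mean error | | li is the length of <bitfield>; μ is the mean of the lengths |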

(1) Detection Process. The proposed scheme is detected using DNN by the following steps, as depicted in Figure 15.

Step 1. Network traffic of downloading general video files with BT clients is captured by Wireshark. Then, the lengths of <bitfield> are extracted to form the normal and covert samples, each of size 5,000,000. Each sample set is divided into 10,000 subsamples containing 500 lengths each.

Step 2. For each subsample, the values of five statistical features (mean, median, entropy, standard deviation, and root of average mean error) are calculated as described in Table 4. The resulting data set of statistical features contains two types of samples, normal and covert, and is then used for training or testing the classifier.

Step 3. The data set is divided into two parts: 70% is used for training the DNN classifier model, and 30% is used for testing. The normal traffic is labeled “0” and the covert traffic is labeled “1.” After the DNN classifier is trained, it can be used to detect covert traffic online.
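Steps 1 to 3 can be sketched as below for four of the five features. Several choices here are assumptions rather than the paper's settings: the population form of the standard deviation, base-2 entropy over raw lengths, and a seeded random shuffle before the 70/30 split. The fifth feature (root of average mean error) is omitted because its formula is not given in the text. All function names are hypothetical.

```python
import math
import random
import statistics
from collections import Counter
from typing import List, Sequence, Tuple

Row = Tuple[List[float], int]  # (feature vector, label)


def features(lengths: Sequence[int]) -> List[float]:
    """Mean, median, entropy and standard deviation of one subsample."""
    n = len(lengths)
    ent = -sum((c / n) * math.log2(c / n) for c in Counter(lengths).values())
    return [statistics.fmean(lengths), float(statistics.median(lengths)),
            ent, statistics.pstdev(lengths)]


def build_dataset(normal: Sequence[int], covert: Sequence[int],
                  sub_size: int = 500) -> List[Row]:
    """Cut both sample sets into subsamples and label them 0 (normal) or 1 (covert)."""
    rows: List[Row] = []
    for label, sample in ((0, normal), (1, covert)):
        for i in range(0, len(sample) - sub_size + 1, sub_size):
            rows.append((features(sample[i:i + sub_size]), label))
    return rows


def split(rows: Sequence[Row], train_frac: float = 0.7,
          seed: int = 0) -> Tuple[List[Row], List[Row]]:
    """Shuffle the rows and split them into training and testing parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]


# Toy stand-ins for captured <bitfield> lengths, for illustration only.
toy_normal = [1000] * 5000
toy_covert = [900, 1100] * 2500
dataset = build_dataset(toy_normal, toy_covert)
train, test = split(dataset)
```

The resulting feature vectors and labels can then be passed to any classifier; the paper uses a DNN and compares it with SVM, Logistic Regression, Naive Bayes, and Random Forest.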

The structure of the DNN is shown in Figure 16. In the input layer, the 5 statistical features are fed to the DNN as the input variables. Each hidden layer consists of a number of neurons involved in the prediction phase. Each neuron adjusts its weights during the learning process and participates in calculating the coefficients of the final equations, which are used to determine the class label (normal or covert) of the tested samples. The output layer is responsible for determining the predicted value of the class label.
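To make the layer structure concrete, the following is a minimal forward-pass sketch of a fully connected network with 5 inputs and one output. It is not the authors' model: the ReLU hidden activation, the sigmoid output, the 0.5 decision threshold, the random untrained weights, and all function names are assumptions chosen for illustration, since the paper does not specify them in this section.

```python
import math
import random


def make_network(layer_sizes, seed=0):
    """Create random (untrained) weights, e.g. layer_sizes = [5, 13, 13, 1]."""
    rng = random.Random(seed)
    return [
        {
            "weights": [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
            "biases": [0.0] * n_out,
        }
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    ]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def forward(network, x):
    """Propagate one feature vector through the network; returns a score in [0, 1]."""
    activation = list(x)
    for index, layer in enumerate(network):
        z = [
            sum(w * a for w, a in zip(row, activation)) + b
            for row, b in zip(layer["weights"], layer["biases"])
        ]
        is_output = index == len(network) - 1
        # Assumed activations: ReLU in hidden layers, sigmoid at the output.
        activation = [sigmoid(v) if is_output else max(0.0, v) for v in z]
    return activation[0]


def predict(network, x, threshold=0.5):
    """Map the output score to a class label: 0 = normal, 1 = covert."""
    return 1 if forward(network, x) >= threshold else 0
```

In practice the weights would be learned from the training set and the features would normally be scaled before being fed to the network; the sketch only shows how the five inputs flow through the hidden layers to a single class label.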

(2) Detection Result. Figure 17 depicts the effect on the detection rate of covert samples when increasing the number of neurons inside the DNN hidden layers. The detection rate improves as the number of neurons increases until it reaches 13, where the highest rate of 37% is achieved in detecting our proposed scheme. In contrast, up to 96% of the StegTorrent samples are identified successfully by the DNN classifier.

Subsequently, the effect on the detection rate of increasing the number of hidden layers in the DNN is shown in Figure 18. The detection rate also increases with the number of hidden layers until it reaches a peak, and then declines because the classifier model becomes overfitted. With 5 hidden layers, 43% of the covert samples of our scheme are detected, while the detection rate of StegTorrent exceeds 97% under the same circumstances.

Finally, the proposed scheme is tested with other machine learning-based detection methods, such as SVM, Logistic Regression, Naive Bayes, and Random Forest. The detection rates of our scheme and StegTorrent are compared in Figure 19. Between 24% and 43% of the covert samples of our scheme are detected by the different classifiers, while the detection rates of StegTorrent range from 92% to 98%. The proposed scheme thus clearly outperforms StegTorrent by obtaining a lower detection rate. Therefore, it can be concluded that our scheme possesses better undetectability than the existing method.
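The detection rates quoted above can be read as the share of covert test samples that a classifier labels as covert. The sketch below computes that quantity; this reading of "detection rate" and the function name are assumptions, as the paper does not give an explicit formula in this section.

```python
def detection_rate(true_labels, predicted_labels):
    """Fraction of covert samples (label 1) that the classifier also labels 1."""
    covert_predictions = [p for t, p in zip(true_labels, predicted_labels) if t == 1]
    if not covert_predictions:
        raise ValueError("the test set contains no covert samples")
    detected = sum(1 for p in covert_predictions if p == 1)
    return detected / len(covert_predictions)
```

For example, if four covert samples are tested and only one is flagged, `detection_rate([1, 1, 1, 1, 0, 0], [1, 0, 0, 0, 1, 0])` evaluates to 0.25; the two normal samples do not enter the ratio.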

5.3. Robustness

Robustness requires the covert channel to keep working with relatively high accuracy and a low bit error rate (BER) under network noise such as jitter, packet disorder, and packet loss. In the experiment, the robustness of our proposed scheme is measured with respect to packet loss (pl) and packet disorder (pd). The BERs of the proposed scheme are compared with those of StegTorrent at different rates of packet disorder/loss, as given in Figure 20. The secret information of our scheme is recovered accurately under the different rates of packet loss or disorder. However, the BER of StegTorrent increases with the packet loss/disorder rate. The BER of StegTorrent reaches up to 11% when 20% of packets are lost, which degrades the reliability of covert communication in StegTorrent.

On the one hand, the good performance of our scheme in resisting packet loss and disorder is due to the reliable TCP transmission mechanism of normal BT traffic, which serves as the carrier of our steganography. Therefore, the proposed method is noise-tolerant. On the other hand, packet loss or disorder alters the packet arrival order in StegTorrent, which leads to incorrect recovery of the secret data on the receiver side. Hence, we can conclude that our scheme is superior to StegTorrent with respect to robustness.
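For reference, the BER used in this comparison can be computed by counting the bits that differ between the sent and the recovered secret data. The sketch below assumes both byte strings have the same length; how the experiment accounts for bits that never arrive is not stated in the paper, so that case is rejected rather than guessed. The function name is hypothetical.

```python
def bit_error_rate(sent: bytes, received: bytes) -> float:
    """Fraction of bit positions in which `received` differs from `sent`."""
    if len(sent) != len(received):
        raise ValueError("sent and received data must have the same length")
    if not sent:
        raise ValueError("cannot compute a BER for empty data")
    # XOR leaves a 1 in every bit position where the two bytes disagree.
    flipped = sum(bin(a ^ b).count("1") for a, b in zip(sent, received))
    return flipped / (8 * len(sent))
```

A BER of 0.0 corresponds to the exact recovery reported for the proposed scheme, whereas the 11% reported for StegTorrent at 20% packet loss would correspond to a return value of about 0.11.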

5.4. Capacity

Capacity is the maximum data size that can be reliably transmitted over the covert channel per second or per packet. In other words, capacity refers to the transfer rate of the secret information. It is closely related to the bandwidth of the normal carrier and to the steganographic modulation algorithm. As revealed in Figure 21, the field length of <bitfield> ranges from 0 to 2500 Bytes in normal BT communication, which means that the maximum capacity of Single-Link Steg is 2500 B/P. Meanwhile, in Multi-Link Steg, the capacity increases linearly with the number of steganographic peers, which is shown in Figure 21. Since the field length of a normal <bitfield> most often falls between 800 and 1200 Bytes, as mentioned above, secret data of a certain size (L) is transmitted by each peer engaged in the steganography. When 64 peers transfer the secret information concurrently, the capacity reaches up to 76,800 B/P.
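The arithmetic behind these figures can be checked with a short sketch. The per-peer size of 1200 Bytes used below is inferred from 76,800 / 64 and coincides with the upper end of the 800 to 1200 Byte range; the paper does not state the value of L explicitly, and the function and constant names are hypothetical.

```python
MAX_BITFIELD_BYTES = 2500  # upper bound of the <bitfield> length reported for normal BT traffic


def multi_link_capacity(peers: int, bytes_per_peer: int) -> int:
    """Capacity in bytes per packet (B/P) when each peer carries `bytes_per_peer` bytes."""
    if peers < 1:
        raise ValueError("at least one steganographic peer is required")
    if not 0 <= bytes_per_peer <= MAX_BITFIELD_BYTES:
        raise ValueError("bytes_per_peer must fit inside a normal <bitfield>")
    return peers * bytes_per_peer


print(multi_link_capacity(1, MAX_BITFIELD_BYTES))  # Single-Link Steg upper bound
print(multi_link_capacity(64, 1200))               # 64 peers, assumed L = 1200 Bytes
```

Because the relation is a plain product, doubling the number of peers doubles the capacity, which is the linear growth described for Multi-Link Steg.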

However, more peers might increase the overhead of system resources and the complexity of the steganographic control mechanism, which would make the scheme more difficult to implement. Thus, the tradeoff between the number of steganographic peers and the system overhead will be taken into consideration in future research. The capacity of the Multi-Link Steg mode can then be analyzed under the optimal number of steganographic peers.

6. Conclusions

The BitTorrent file-sharing protocol of P2P is a steganographic carrier with high covertness, owing to its massive network traffic and complex communication mechanism. Because of the cooperative transmission in the P2P network, the steganographic peers are hidden among numerous legitimate BT peers. Thus, it is extremely difficult to locate steganographic peers in the tremendous BT traffic. The steganographic peers disguise themselves as legitimate BT clients that are interested in the same video file. They participate in downloading the same resource following the normal BT communication mode, without introducing any extra traffic. Taking advantage of the non-content-authentication mechanism of the Bitfield message, the secret information is embedded into the content of <bitfield> according to the given format. The altered Bitfield message can bypass the security censorship of the BT system and of network monitoring devices. In our experiments, the scheme showed better undetectability and robustness than the existing method. In future work, another BitTorrent-based steganographic algorithm will be designed and studied, in which the tradeoff between the number of steganographic peers and the system overhead will be taken into consideration. The optimal steganographic mode can then be analyzed and selected.

Data Availability

The software code and data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

All authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the Natural Science Foundation of the Higher Education Institutions of Jiangsu Province under Grant no. 19KJB510019, Innovation and Entrepreneurship Training Program for College Students of Jiangsu Province under Grant no. 201913114004Y, Changzhou Key Laboratory of Industrial Internet and Data Intelligence under Grant no. CM20183002, and the Project of Changzhou Vocational Institute of Mechatronic Technology under Grant no. 2019-YBKJ-05.

References

  1. X. Chen, J. Li, J. Weng, J. Ma, and W. Lou, “Verifiable computation over large database with incremental updates,” IEEE Transactions on Computers, vol. 65, no. 10, pp. 3184–3195, 2016.
  2. Z. Zhou, Y. Cao, M. Wang, E. Fan, and Q. M. J. Wu, “Faster-RCNN based robust coverless information hiding system in cloud environment,” IEEE Access, vol. 7, pp. 179891–179897, 2019.
  3. Z. Zhou, Y. Mu, and Q. M. J. Wu, “Coverless image steganography using partial-duplicate image retrieval,” Soft Computing, vol. 23, no. 13, pp. 4927–4938, 2019.
  4. M. A. Elsadig and Y. A. Fadlalla, “Survey on covert storage channel in computer network protocols: detection and mitigation techniques,” International Journal of Advances in Computer Networks and Its Security, vol. 6, no. 3, pp. 11–17, 2016.
  5. R. Sun, L. Shi, C. Yin, and J. Wang, “An improved method in deep packet inspection based on regular expression,” The Journal of Supercomputing, vol. 75, no. 6, pp. 3317–3333, 2019.
  6. W. Mazurczyk and K. Szczypiorski, “Evaluation of steganographic methods for oversized IP packets,” Telecommunication Systems, vol. 49, no. 2, pp. 210–217, 2012.
  7. Y. Jiang, M. Zhao, C. Hu, L. He, H. Bai, and J. Wang, “A parallel FP-growth algorithm on World Ocean Atlas data with multi-core CPU,” The Journal of Supercomputing, vol. 75, no. 2, pp. 732–745, 2019.
  8. S. Cabuk, C. Brodley, and C. Shields, “IP covert timing channels: design and detection,” in Proceedings of the 2004 ACM Conference on Computer and Communications Security, pp. 55–74, Washington, DC, USA, October 2004.
  9. X. Zi, L. Yao, L. Pan, and J. Li, “Implementing a passive network covert timing channel,” Computers & Security, vol. 29, no. 6, pp. 686–696, 2010.
  10. T. Zhu, Y. Lin, Y. Liu, W. Zhang, and J. Zhang, “Minority oversampling for imbalanced ordinal regression,” Knowledge-Based Systems, vol. 166, no. 15, pp. 140–155, 2019.
  11. S. Gianvecchio, H. Wang, and D. Wijesekera, “Model based covert timing channels: automated modeling and evasion,” Lecture Notes in Computer Science, Springer, Berlin, Germany, 2008.
  12. G. Liu, J. Zhai, and Y. Dai, “Network covert timing channel with distribution matching,” Telecommunication Systems, vol. 49, no. 2, pp. 199–205, 2012.
  13. X. Zhang, C. Liang, Q. Zhang, Y. Li, J. Zheng, and Y.-a. Tan, “Building covert timing channels by packet rearrangement over mobile networks,” Information Sciences, vol. 445-446, pp. 66–78, 2018.
  14. X. Zhang, L. Zhu, X. Wang, C. Zhang, H. Zhu, and Y.-a. Tan, “A packet-reordering covert channel over VoLTE voice and video traffics,” Journal of Network and Computer Applications, vol. 126, pp. 29–38, 2019.
  15. Z. Pan, X. Yi, Y. Zhang, B. Jeon, and S. Kwong, “Efficient in-loop filtering based on enhanced deep convolutional neural networks for HEVC,” IEEE Transactions on Image Processing, vol. 29, pp. 5352–5366, 2020.
  16. X. Luo, E. W. W. Chan, P. Zhou, and R. K. C. Chang, “Robust network covert communications based on TCP and enumerative combinatorics,” IEEE Transactions on Dependable and Secure Computing, vol. 9, no. 6, pp. 890–902, 2012.
  17. R. Archibald and D. Ghosal, “Design and performance evaluation of a covert timing channel,” Security and Communication Networks, vol. 9, no. 8, pp. 755–770, 2016.
  18. A. Houmansadr and N. Borisov, “CoCo: coding-based covert timing channels for network flows,” in Proceedings of the 13th International Conference on Information Hiding, pp. 314–328, Prague, Czech Republic, May 2011.
  19. R. Archibald and D. Ghosal, “A covert timing channel based on fountain codes,” in Proceedings of the IEEE 11th International Conference on Trust, Security and Privacy in Computing and Communications, pp. 970–977, Liverpool, UK, June 2012.
  20. J. Lei, D. Li, Z. Pan, Z. Sun, S. Kwong, and C. Hou, “Fast intra prediction based on content property analysis for low complexity HEVC-based screen content coding,” IEEE Transactions on Broadcasting, vol. 63, no. 1, pp. 48–58, 2017.
  21. F. W. Xu, “Research on the hidden anonymous communication system based on P2P,” Beijing University of Posts and Telecommunications, Beijing, China, 2013, M. S. thesis.
  22. W. Mazurczyk, M. Karas, and K. Szczypiorski, “SkyDe: a skype-based steganographic method,” International Journal of Computers, Communications and Control, vol. 8, no. 3, pp. 1841–1847, 2013.
  23. J. Lei, J. Sun, Z. Pan, S. Kwong, J. Duan, and C. Hou, “Fast mode decision using inter-view and inter-component correlations for multiview depth video coding,” IEEE Transactions on Industrial Informatics, vol. 11, no. 4, pp. 978–986, 2015.
  24. J. Lv, C. Zhu, S. Tang, and C. Yang, “Deepflow: hiding anonymous communication traffic in P2P streaming networks,” Wuhan University Journal of Natural Sciences, vol. 19, no. 5, pp. 417–425, 2014.
  25. P. Kopiczko, W. Mazurczyk, and K. Szczypiorski, “StegTorrent: a steganographic method for the P2P file sharing service,” IEEE Security and Privacy Workshops, vol. 42, no. 6, pp. 151–157, 2013.
  26. S. Gianvecchio and H. Wang, “An entropy-based approach to detecting covert timing channels,” IEEE Transactions on Dependable and Secure Computing, vol. 8, no. 6, pp. 785–797, 2011.
  27. D. Zhang, G. Wang, X. Wang, Z. Li, W. Li, and J. Wang, “Cyberspace security for future Internet,” Security and Communication Networks, vol. 2018, Article ID 5313980, p. 1, 2018.
  28. Y. Chen, J. Xiong, W. Xu, and J. Zuo, “A novel online incremental and decremental learning algorithm based on variable support vector machine,” Cluster Computing, vol. 22, no. 8, pp. 7435–7445, 2019.
  29. Y. Chen, W. Xu, J. Zuo, and K. Yang, “The fire recognition algorithm using dynamic feature fusion and IV-SVM classifier,” Cluster Computing, vol. 22, no. 10, pp. 7665–7675, 2019.
  30. D. Omar, A.-F. Ala, B. B. Ghassen, and J. Ilyes, “Using hierarchical statistical analysis and deep neural networks to detect covert timing channels,” Applied Soft Computing Journal, vol. 82, Article ID 105546, 2019.

Copyright © 2020 Mingqian Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

