Abstract
With the rapid development of the 5G era, the number of messages on the network has increased sharply, and traditional opportunistic network algorithms show shortcomings in processing this data. Most traditional algorithms divide the nodes into communities and then transmit data according to those communities. However, these algorithms do not sufficiently consider node characteristics when dividing communities, so two positively related nodes may be assigned to different communities. Accurate community division therefore remains a challenging issue. We propose an efficient data transmission strategy for community detection (EDCD) algorithm. When dividing communities, we use mobile edge computing to combine network topology attributes with social attributes. When forwarding a message, we select the optimal relay node for transmission according to the channel coefficients. In the simulation experiments, we analyze the efficiency of the algorithm on four different real datasets. The results show that the algorithm performs well in terms of delivery ratio and routing overhead.
1. Introduction
With the boom in information technology and the popularization of wireless network equipment [1], people have a growing demand for the network. As a new type of self-organizing network [2], the opportunistic social network has attracted researchers’ attention [3]. There is no complete end-to-end path between nodes in opportunistic social networks [4]; instead, nodes use the encounter opportunities brought by node movement to communicate hop by hop [5]. At present, opportunistic social networks are widely used in various fields, such as mobile phones [6], handheld electronic devices [7], vehicular networks with mobile intelligent devices on the road [8], wildlife tracking [9], and network transmission in remote areas [10].
Traditional social network methods for data transmission face significant challenges [11], which become an obstacle to information exchange and sharing [12]. To enhance data transmission in a 5G wireless network [13], we should design a more convenient model that forwards data flexibly [14]. User terminal equipment needs to transmit a large amount of data and to carry out computation-intensive tasks [15]. To enhance wireless devices’ computing ability, mobile edge computing (MEC) has been proposed [16–18]. Because the mobile edge server is located at the edge of the wireless network and closer to the users, it can efficiently provide services to the surrounding users; integrating the concept of opportunistic social networks into mobile edge computing reduces the consumption of source nodes [19].
However, each node has many social attributes [20]. They represent the relationships among different users, and nodes in the same community are more closely connected [21]. The network nodes can therefore be divided into communities by their different attributes to improve the algorithm’s performance [22]. The existing algorithms do not fully consider nodes’ characteristics, so there is considerable room for improvement in community detection accuracy and efficiency [23]. That is why it is necessary to propose an efficient community detection algorithm.
Opportunistic social networks use the “storing-carrying-forwarding” strategy to handle the energy consumption problem in the data transmission process [24]. Messages are forwarded through encounter opportunities produced by node movement. In this paper, network topology attributes and social attributes are used to measure the similarity between nodes, and a hierarchical clustering method divides the nodes into communities [25]. During data transmission, if a mobile device does not have a suitable transmission target, the message occupies a large amount of cache, and data transmission in the community is likely to wait a long time, causing transmission delay [26]. After dividing the community, we further establish the weight distribution between nodes and the community to reduce the time complexity and overhead cost, and we construct a set of candidate relay nodes based on the relationship between information forwarders and adjacent nodes. From the perspective of minimizing the bit error rate, we analyze the channel coefficients of the two channels, from the source node to the relay node and from the relay node to the destination node, and select the optimal relay node from the set of candidate relay nodes for transmission. In summary, we propose an efficient data transmission strategy for community detection in opportunistic social networks using mobile edge computing combined with network topology and social attributes. The transmission strategy is divided into two periods: the initialization period and the routing period.
The contributions of this research study are as follows:
(1) Initialization period: using network topology attributes and social attributes to measure the similarity between nodes, a community detection algorithm is proposed through hierarchical clustering.
(2) Routing period: based on the relationship between the message forwarder and the adjacent nodes, a set of candidate relay nodes is constructed. By analyzing the channel coefficients from the source node to the relay node and from the relay node to the destination node, a method for selecting the optimal relay node is proposed.
(3) Simulation results show that the proposed EDCD algorithm performs well in terms of delivery ratio, routing overhead, and average end-to-end delay on different real datasets.
2. Related Works
Many researchers have studied routing and forwarding algorithms in opportunistic social networks and, in recent years, have proposed effective approaches for different application scenarios. Routing algorithms can be roughly divided into two categories: social-ignorant algorithms and social-aware algorithms [27].
Social-ignorant algorithms do not use social information about nodes to make adaptive forwarding decisions during data transmission. Vahdat and Becker [28] proposed the epidemic routing algorithm. The epidemic algorithm is essentially a flooding algorithm in which each node forwards information to all its neighbors. As a result, many message copies circulate in the network and consume substantial network resources. Sisodiya et al. [29] proposed a flooding-based routing algorithm, the spray and wait algorithm, which divides the information forwarding process into two steps: the first step copies the message, and the second step carries out the transmission. It can easily lead to excessive transmission delay and data redundancy.
Sharma et al. [30] proposed a routing protocol named MLProph, which uses machine learning (ML) algorithms, namely, decision trees and neural networks, to determine the probability of successful message delivery, but this algorithm has significant limitations. Tang et al. [31] proposed a scheme based on reinforcement learning (RL), which can be applied to opportunistic routing transmissions that require high reliability and low latency. However, this opportunistic routing scheme can only be used in specific scenarios and is not suitable for all networks. Wu et al. [32] proposed an algorithm that adjusts the cache by analyzing the importance of message propagation. This algorithm has a small routing overhead, but because it avoids deleting cached data, the data shared by adjacent nodes cause data redundancy.
Social-aware algorithms use the social relationships between nodes to measure the transmission relevance between them. Yan et al. [33] established an effective data transmission strategy (ENPSR), which uses the priority of nodes and social relationships in opportunistic social networks. It obtains the data transmission priority by measuring the social attributes and historical information of the node and then uses a prediction scheme to determine the appropriate message delivery decision. Wu and Chen [34] proposed an optimal routing scheme for cooperative nodes based on opportunistic network features, which can be used in social networks. Reliability, availability, and weighting factors are used as the weights of human activities to obtain the optimal cooperative node, but the algorithm has a high routing overhead. Drǎgan et al. [35] proposed dividing nodes into several communities according to their intimacy and the time they spend together. This community detection method does not fully consider all of the nodes in the community.
Zeng et al. [36] proposed a social-based clustering and routing scheme, in which each node selects the nodes with close social relationships to form a local cluster, but this can cause data redundancy issues. Liu et al. [37] proposed an algorithm using node similarity (FCNS) based on fuzzy routing and forwarding. This algorithm performs well in data transfer ratio and routing overhead but has a high transmission delay. Niu et al. [38] proposed a predictive and extended routing protocol, which uses a Markov chain as a node mobility model to capture the social characteristics of nodes. It does not consider node communication between different places, and nodes only upload and send messages in the same place.
Because the above-mentioned traditional methods do not fully consider node characteristics, among other problems, this paper proposes a model that combines network topology and social attributes to detect communities and analyzes the channel coefficients from the source node to the relay node and from the relay node to the destination node to select the optimal relay node for information transmission in opportunistic social networks. This model effectively addresses the challenge of improving data transmission and achieves low delay and low routing overhead.
3. Model Design
In opportunistic social networks, we define the topological structure as $G = (V, E, W)$, where $V$ is the node set of the network, $E$ is the edge set reflecting the relationships between the nodes, and $W$ is the set of edge weights. For nodes $v_i, v_j \in V$, $w_{ij}$ is the weight of the edge between node $v_i$ and node $v_j$. Community division partitions $G$ into community subgraphs, which requires the vertices within each community subgraph to be more densely connected by edges. Because nodes differ from one another, we use the number of encounters between nodes to weight each edge. This paper measures the similarity between different nodes in terms of network topology attributes and social attributes. The greater the similarity between nodes, the more likely they are to belong to the same community.
Firstly, we must reasonably define the similarity between nodes. For a real social network, and considering the network topology, we also need to consider the social attributes between nodes. We must collect the data of the node, and the process is shown in Figure 1. The nodes information collection method is that the base station collects all node information in the area within a period of time. When the node has a transmission task, request the probability table of the source node and the destination node from the base station that has collected the information and use edge computing to transmit decision information to reduce node’s workload. Because many communities can usually only share messages based on one or two nodes, there must be enough cache to improve data transmission efficiency. The node requires obtaining the position, speed, and moving direction of itself and the destination node. However, the encounter of nodes in opportunistic social networks is random. Combining the characteristics of node movement to calculate the probability of node encounters, in this paper, means the probability of nodes and meeting in a period of time , and the node meeting interval time obeys the exponential distribution; then the probability of node and node meeting within the sensing range is where is the source node, is the destination node, is the encounter strength of node and node , and is the average time between node and node : where is the time of the kth encounter, and we define . In short, combine the formula to get where is the remaining time to live of the message, is the initial time to live of the message, and is the current time the message has been alive.
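The encounter model described above can be illustrated with a minimal sketch. It assumes that the encounter strength is estimated as the reciprocal of the mean interval between recorded encounters and that the time window is the message's remaining time to live; the function and parameter names are hypothetical and are not taken from the paper.

```python
import math


def encounter_rate(encounter_times):
    """Estimate the encounter rate as 1 / (mean interval between encounters).

    Assumption: encounter_times is a sorted list of timestamps at which the
    two nodes met. With fewer than two encounters no interval is observable,
    so the rate is taken to be 0.
    """
    if len(encounter_times) < 2:
        return 0.0
    intervals = [b - a for a, b in zip(encounter_times, encounter_times[1:])]
    mean_interval = sum(intervals) / len(intervals)
    return 1.0 / mean_interval if mean_interval > 0 else 0.0


def encounter_probability(encounter_times, ttl_initial, ttl_elapsed):
    """Probability that the two nodes meet before the message expires.

    For exponentially distributed inter-meeting times with rate lam, the
    probability of at least one meeting within a window T is 1 - exp(-lam*T).
    Here T is the remaining time to live (initial TTL minus elapsed time).
    """
    remaining = max(ttl_initial - ttl_elapsed, 0.0)
    lam = encounter_rate(encounter_times)
    return 1.0 - math.exp(-lam * remaining)
```

For example, two nodes that met at times 0, 10, 20, and 30 have an estimated rate of 0.1 encounters per time unit, so a message with 10 time units of life remaining reaches the peer with probability 1 - e^{-1} under this model.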
Secondly, construct the encounter probability matrix. The number of encounters between nodes to a certain extent only reflects the number of encounters of the node in a period of time.
Use the number of encounters between nodes to weight each edge. is the set of edge weights, where is the number of encounters between two nodes. represents the encounter matrix of node and node in a period of time .where represents the number of encounters the node has met with other nodes within a certain period of time.
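As an illustration of edge weighting by encounter counts, the following sketch builds a symmetric encounter matrix from a contact log. The log format (a list of node-index pairs, one per encounter) is an assumption made for the example.

```python
def encounter_matrix(num_nodes, contacts):
    """Build a symmetric matrix of encounter counts.

    contacts is an assumed log format: an iterable of (i, j) node-index
    pairs, one entry per encounter observed in the time period.
    """
    matrix = [[0] * num_nodes for _ in range(num_nodes)]
    for i, j in contacts:
        if i == j:
            continue  # a node does not encounter itself
        matrix[i][j] += 1
        matrix[j][i] += 1
    return matrix
```

Entry (i, j) of the result can then serve as the weight of the edge between node i and node j.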
In opportunistic social networks, network topology attributes reflect the status of the network. It requires more edges between the vertices in each community subgraph.(1)The strength of nodes describes how close the node is to the surrounding network, and the node strength is equal to the degree of the node, that is, the number of neighbor nodes. The defined formula is where is the node connection strength between node and node .is the set of neighbor nodes connected to a node in current times, and is the set of neighbor nodes connected to a node in current times. We have to consider that two nodes may share a set of similar neighbor nodes, so the higher the relationship between them, the higher the probability of data transmission.(2)The direct connection strength represents the influence of the direct connection between two nodes. When there is an edge between two nodes, the edge weight measures the strength of the connection between them. We define the sum of the weights of all edges adjacent to node as , where is the set of neighbor nodes of . For any , there is a relationship between node and node . So the formula for direct connection strength is as follows: where is the strength of the direct connection between two nodes and is also the ratio of the weight of the two nodes to the weight of their adjacent edge.(3)The indirect connection strength indicates the influence of the indirect connection between two nodes; just as when node and node have a common adjacent node , then node and node also have a certain chance to connect. The more adjacent nodes that two nodes have in common, the closer the two nodes are. So the formula for indirect connection strength is as follows: where and represent the connection strength between node and node through node , and the indirect connection strength between nodes is the sum of the strengths of all common neighbor connections. 
That is to say, the more common adjacent nodes the two nodes have, the greater the indirect connection strength is.
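The three topology terms can be sketched as follows. The exact formulas are not reproduced here, so the forms below are assumptions chosen to match the verbal definitions: neighbor overlap is taken as a Jaccard ratio of neighbor sets, direct strength as the edge weight divided by the summed adjacent weights of both endpoints, and indirect strength as the sum over common neighbors of the product of the two direct strengths. All names are hypothetical.

```python
def adjacent_weight(graph, i):
    """Sum of the weights of all edges adjacent to node i."""
    return sum(graph[i].values())


def neighbor_overlap(graph, i, j):
    """Assumed form of node strength similarity: Jaccard ratio of neighbor sets."""
    a = set(graph[i]) - {j}
    b = set(graph[j]) - {i}
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def direct_strength(graph, i, j):
    """Assumed form: edge weight relative to the adjacent weights of both nodes."""
    w = graph[i].get(j, 0.0)
    total = adjacent_weight(graph, i) + adjacent_weight(graph, j)
    return w / total if total > 0 else 0.0


def indirect_strength(graph, i, j):
    """Assumed form: sum over common neighbors k of strength(i,k) * strength(k,j)."""
    common = set(graph[i]) & set(graph[j])
    return sum(direct_strength(graph, i, k) * direct_strength(graph, k, j)
               for k in common)


# Weighted undirected graph as a dict of dicts (each edge stored in both directions).
example_graph = {
    0: {1: 2.0, 2: 1.0},
    1: {0: 2.0, 2: 1.0},
    2: {0: 1.0, 1: 1.0},
}
```

Whatever the exact formulas, the qualitative behavior stated in the text holds in this sketch: adding a common neighbor can only increase the indirect connection strength.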
In the network topology attributes, we classify the possible relationships between two nodes into the following four types, where we use to express topological similarity between node and node .(a)No direct and no indirect connection:(b)Indirect but no direct connection:(c)Direct but no indirect connection:(d)Direct and indirect connection: where is the coefficient of the strength of node, is the coefficient of the direct connection strength, and is the coefficient of the indirect connection strength. The higher the topological similarity between nodes, the greater the chance of communication between nodes, which can improve data transmission efficiency.
The social attributes between nodes measure the social similarity between two nodes.(1)The geographic relevance of nodes: the node has mobile characteristics; the mobile node’s trajectory information is used to analyze the geographic location correlation of the node. The trajectory information refers to the geographic location information of the sensing area. The sensing area is the area where the node can transmit messages within a certain range. Specifically, in the time period , if the nodes’ geographical locations are close, it means that the probability of node information transmission is high; that is to say, the probability of meeting in the same area will also be increased. The geographical correlation between nodes can be expressed as where is the geographic relevance of nodes, represents the similarity function of node and node at position , represents trajectory information of node , and represents trajectory information of node . where takes the maximum value between and , is the time when node enters the sensing area for the time, and is the time when node enters the sensing area for the time. represents take the minimum value between and , is the time when node quits the sensing area for the time, and is the time when node quits the sensing area for the time.(2)The interesting relevance of nodes: users with common interests will visit the same business. Naturally, mobile users with the same interests will spend more time and energy communicating together. The information transmission between nodes will be carried out between mobile users with the same interest in the time period . The interesting relevance between nodes can be expressed as where represents the interesting relevance between node and node . represents the ratio of time occupied by node and node during the kth transmission of information in time period . 
represents the ratio of the time occupied by node and other nodes except node in the k1th transmission information in time period .(3)The separating time relevance of nodes: two nodes can make a connection and communicate. The average interval between two nodes can be defined as the time interval when two nodes meet each other. If there is no communication for a long time, the relationship between the two nodes is not close enough. Conversely, a shorter separation means that the two nodes are closely related. The separating time relevance of nodes can be expressed as where represents the separate time relevance of node and node to convey information. is the time of the kth transmission of information in the time interval . is the time of the first transmission of information in the time interval .
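The geographic relevance rests on the overlap of the intervals during which two nodes stay in the same sensing area: the later of the two entry times and the earlier of the two exit times bound the shared stay. A minimal sketch of that computation follows; normalizing the overlap by the length of the observation period is an assumption, and the names are hypothetical.

```python
def overlap_duration(visits_i, visits_j):
    """Total time two nodes spend together in a sensing area.

    visits_i and visits_j are lists of (enter_time, exit_time) pairs.
    For each pair of visits, the shared stay starts at the maximum of the
    entry times and ends at the minimum of the exit times.
    """
    total = 0.0
    for enter_i, exit_i in visits_i:
        for enter_j, exit_j in visits_j:
            start = max(enter_i, enter_j)
            end = min(exit_i, exit_j)
            if end > start:
                total += end - start
    return total


def geographic_relevance(visits_i, visits_j, period):
    """Assumed normalization: shared time as a fraction of the period, capped at 1."""
    if period <= 0:
        return 0.0
    return min(overlap_duration(visits_i, visits_j) / period, 1.0)
```

For instance, if node i is present during [0, 10] and [20, 30] and node j during [5, 25], the shared time is 5 + 5 = 10 time units.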
Through the above calculation of social attribute values, we can quantify the relationship between node and node . represents the similarity of social attributes as follows: where is the coefficient of the geographic relevance of nodes, is the coefficient of the interesting relevance of nodes, and is the coefficient of the separating time relevance of nodes. The higher the node’s social attribute value, the higher the closeness between the nodes and the higher the probability of encountering communication, which will improve the efficiency of information transfer between nodes.
Node similarity is affected by the network topology and social attributes. represents the similarity between node and node . Correspondingly, in this paper, we define node similarity to be composed of network topology and social attributes, and the node similarity formula is
Through the above description, we can characterize the relationship between nodes more accurately. The higher the node similarity, the more frequent the communication between nodes. By establishing a community, the source node can accurately find the relay node and then transmit information to the destination node [39]. Information transmission in this process is more efficient, and the time delay is reduced.
The nodes within the same community are closely connected. Community detection is essentially the clustering of nodes with a tight structure in the network. This paper uses a hierarchical clustering algorithm to divide the community. Lead in modularity , which is used to measure the degree of community division. The fast unfolding algorithm considering data scale, running time, and other aspects of the community division results is ideal. The algorithm is stable and will continuously merge nodes to construct new graphs, which significantly reduces the calculation amount. The algorithm steps are as follows: Step 1: initialize and calculate the node similarity; divide each node into the community where the adjacent node is located. As shown in Figure 2, the source node is in community one. We try to move the node to community two and community three. Calculate the corresponding modularity value, and move the node to the corresponding community with the largest change value. We lead in modularity to measure the degree of community division. The specific calculation formula is as follows: where is the modularity, represents the number of connections within the community, represents the sum of degrees of all nodes in the community, and is the sum of weights in the network. Step 2: select each node one by one, and calculate the modularity gain divided into the community where the adjacent point is located. represents modularity gain, and the calculation formula is as follows: where is the sum of weights from node to the community and is the sum of the weights of node . After calculating the modularity gain, we have to determine whether it is a positive number; if it is a positive number, it will be divided into the corresponding community; otherwise, no division will be made. Step 3: repeat Step 2 until the node’s community no longer changes. 
Step 4: construct a new graph; each point in the new graph is each community divided in Step 3; continue to execute until the community structure does not change.
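The modularity and modularity gain used in Steps 1 and 2 can be sketched with the standard definitions from the fast unfolding (Louvain) method, which are assumed here: Q sums, over communities, the internal weight fraction minus the squared total-degree fraction, and the gain is computed for a node that currently sits alone in its own community. The graph representation (dict of dicts, each undirected edge stored in both directions, no self-loops) and the names are hypothetical.

```python
def total_weight(graph):
    """Sum of all edge weights m (each undirected edge is stored twice)."""
    return sum(w for nbrs in graph.values() for w in nbrs.values()) / 2.0


def modularity(graph, community_of):
    """Q = sum over communities of sigma_in / (2m) - (sigma_tot / (2m))^2.

    sigma_in counts each internal edge from both endpoints; sigma_tot is the
    sum of weighted degrees of the nodes in the community.
    """
    m = total_weight(graph)
    if m == 0:
        return 0.0
    sigma_in, sigma_tot = {}, {}
    for u, nbrs in graph.items():
        c = community_of[u]
        for v, w in nbrs.items():
            sigma_tot[c] = sigma_tot.get(c, 0.0) + w
            if community_of[v] == c:
                sigma_in[c] = sigma_in.get(c, 0.0) + w
    return sum(sigma_in.get(c, 0.0) / (2 * m) - (sigma_tot[c] / (2 * m)) ** 2
               for c in sigma_tot)


def modularity_gain(graph, community_of, node, target):
    """Gain from moving `node`, alone in its community, into `target`."""
    m = total_weight(graph)
    k_i = sum(graph[node].values())
    k_i_in = sum(w for v, w in graph[node].items() if community_of[v] == target)
    sigma_tot = sum(sum(graph[u].values())
                    for u in graph if community_of[u] == target)
    return k_i_in / m - sigma_tot * k_i / (2 * m * m)


def two_triangles():
    """Two unit-weight triangles {0,1,2} and {3,4,5} joined by the edge 2-3."""
    edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
    graph = {n: {} for n in range(6)}
    for u, v in edges:
        graph[u][v] = 1.0
        graph[v][u] = 1.0
    return graph
```

On the two-triangle graph, moving node 2 from its own community into {0, 1} gives a positive gain, whereas moving it into {3, 4, 5} gives a negative gain, so Step 2 would assign it to the first triangle.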
This paper roughly divides the above algorithm steps into two stages: Stage 1: divide each node into the community where the adjacent node is located so that the modularity value increases. Stage 2: the communities divided in the first stage are each aggregated into one point, and the network is reconstructed until the structure of the network no longer changes.
This paper draws on the hierarchical clustering idea of the fast unfolding algorithm. We use network topology attributes and social attributes to express node similarity and comprehensively calculate node similarity to update network weights. In the first stage of node merging, we form an initial community to merge and improve the overall modularity and then calculate the modularity gain; if the gain is positive, the two communities are merged; otherwise, they are not merged. The modularity gain is calculated repeatedly, and the final division result is output.
Nodes move randomly, so establishing communities is vital. In opportunistic social networks, many communities can usually deliver messages through only one or two nodes. If these nodes do not have enough cache or overhead, data transmission in the community is likely to wait a long time. Therefore, after dividing the communities, we need to further establish the weight distribution between the nodes and the reconstructed communities so as to reduce the time complexity and overhead cost. Below we prove how the community of the source node changes during movement.
We define at time , is the degree of modularity of the community, is the total weight of weight, is total weight of the edges of community , is the degree of node in community , and is the increment of edge weight.
Proposition 1. In opportunistic social networks, if the weight of the edges between a node and its adjacent nodes in the network increases, the community relevance also increases.
Proof. With time , the modularity in the community is .
When the time increases to , the modularity change in the community can be expressed asWe can get , so we just need proof .
In other words,It is known that is the total of nodes in the network, and no community in the network appears more than . In short, we are aware that increasing the weight can increase the community’s relevance in opportunistic social networks. For this paper, the weight will affect the community’s relevance in opportunistic social networks, and the proposition holds.
Proposition 2. If the weight of an edge of two communities increases, node is in community , will be increased, and will be decreased. The community corresponding to the node will change, and the weight of an edge between the node and the community is ; if the weight of the edge can be changed, the result of the community will also change.
Proof. Before the weight changes, for node ,After the weight changes, for node ,Because , when ,All in all, if the weight of one side increases, then for node increases. Then,Because , , , for all edges in the network , .
If the weight of one side increases, then decreases.
If the weight of an edge of two communities increases, node is in community , then will increase and will decrease.
Proposition 3. If node and node are connected, and one of the nodes has one and only one edge, when the weight between node and node drops, the community will not divide.
Proof. Let us assume that the community is divided; then the following three conditions must be met: As the weight changes, the formula can also be expressed asSo, it can be seen from the above proof and we conclude that is false.
For a node in opportunistic social networks, if it has only one edge connected to another node, the community will not divide when the weight between the two nodes decreases.
After community detection, we construct a set of candidate relay nodes according to the relationship between the information forwarder and adjacent nodes. Select the optimal relay node from the set of candidate relay nodes to undertake the transmission task. Therefore, selecting one or more relays among multiple relay nodes to participate in transmission has become our concern. As shown in Figure 3, when the community is established and transmitted between each community, it is necessary to find a reliable relay node to transmit information. To achieve higher efficiency, construct a set of candidate relay nodes from the neighbor nodes of the source node; from the perspective of minimizing the bit error rate, this paper analyzes the channel coefficients of the two segments of the source node to the relay node and the relay node to the destination node and chooses the AF protocol as the relay node’s forwarding method, which is suitable for the information transmission process of various channel qualities [40]. Calculate the sum of the channel coefficients of the channel corresponding to each relay node, and find the largest coefficient of the relay node, which is the optimal relay node and will improve the efficiency of information transmission.
Let us suppose there are a source node , destination node , and relay nodes , when transferring information between communities. The communication model is as shown in Figure 4. In this case, the channels from the source node to the destination node and the source node to each relay node are all Rayleigh fading channels, which obey the Rayleigh distribution. We assume that the channel coefficient from the source node to the destination node is , the channel coefficient from the source node to the nth relay node is , and the channel coefficient from the nth relay node to the destination node is .
The transmit power of the source node is , and the transmission power of the relay node is . When there is a direct transmission from the source node to the destination node, the power . When the source node sends information to the destination node and the relay node is with power , noise from the source node to destination node is and noise from the source node to the relay node is . So information received by the relay node and the destination node is as follows:In the AF protocol, when the relay node receives the signal from the source node and forwards it to the destination node, it will amplify the received signal, and the scaling factor isWe can know that the signal from the relay node to the destination node is , and then the information sent by the relay node to the destination node isThis paper’s focus on selecting the optimal relay node is how to find an optimal relay node that makes the channel coefficients of the source node to the relay node and the relay node to the destination node larger.
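The amplify-and-forward step can be sketched as follows, assuming the standard AF signal model: the relay receives the source signal scaled by the channel coefficient plus noise, scales it by a gain that normalizes its transmit power, and forwards it. The symbol names and the single noise-power parameter are assumptions made for the example.

```python
import math


def af_gain(p_source, p_relay, h_sr, noise_power):
    """Standard AF scaling factor: sqrt(P_r / (P_s * |h_sr|^2 + N0))."""
    return math.sqrt(p_relay / (p_source * abs(h_sr) ** 2 + noise_power))


def af_forward(x, p_source, p_relay, h_sr, h_rd, n_sr, n_rd, noise_power):
    """Return (signal received at the relay, signal received at the destination).

    x is the transmitted symbol; h_sr and h_rd are the channel coefficients
    of the source-relay and relay-destination hops (real or complex);
    n_sr and n_rd are the noise samples on the two hops.
    """
    y_sr = math.sqrt(p_source) * h_sr * x + n_sr
    beta = af_gain(p_source, p_relay, h_sr, noise_power)
    y_rd = beta * h_rd * y_sr + n_rd
    return y_sr, y_rd
```

In this model a relay whose two hops both have large channel coefficients delivers a stronger signal to the destination, which is the motivation for the selection rule that follows.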
The channel coefficient matrix from the source node to the relay node is , and the channel coefficient matrix from the relay node to the destination node is . Then,We define a threshold for the number of candidate relay nodes and set ; we have to consider the following situations:(1)If , compare the channel coefficients of each relay node corresponding to matrices and , find the smaller of the two, and store the smaller value in the matrix . Sort the matrix elements from largest to smallest, select the first m relay nodes with a larger value from them, and store them in the matrix and , where is the smaller value of the channel coefficient of the two channels corresponding to the relay node . where is one of the first elements in the matrix after sorting. The value of largely depends on the number of candidate relay nodes , the larger the value, the lower the bit error rate. Bit error rate refers to the index of the accuracy of data transmission within a specified time. where is bit error rate, is the bit errors in transmission, and is the total number of codes transmitted. We add the two channel coefficients of these m relay nodes, and the relay node with the largest sum is the optimal relay node as follows:(2)Otherwise, when the number of candidate relay nodes is less than the threshold, we must pay attention to the accuracy of being selected as the optimal relay node; calculate the sum of the channel coefficients of the channel corresponding to each relay node and the relay node with the largest sum, which is the optimal relay node.Based on the above definition, we propose an efficient data transmission algorithm EDCD and the algorithm steps are as follows: Step 1: calculate the encounter probability of node and node , construct the encounter probability matrix, and use the number of encounters between nodes to weight each edge. Step 2: define node similarity, which is composed of network topology attributes and social attributes. 
Network topology attributes are composed of the strength of node, the direct connection strength, and the indirect connection strength. Social attributes are composed of the geographic relevance of nodes, the interesting relevance of nodes, and the separating time relevance of nodes. Step 3: use a hierarchical clustering algorithm to divide the community and lead in the modularity . The modularity is used to measure the degree of community division. And the fast unfolding algorithm is used to calculate the node similarity to update the network weight comprehensively. Step 4: from the perspective of minimizing the bit error rate, after the community is divided into a multihop wireless network, construct a set of candidate relay nodes based on the relationship between the information forwarder and adjacent nodes and select the optimal relay node from the set of candidate relay nodes to undertake the transmission task. Analyze the channel coefficients of the channels from the source node to the relay node and the relay node to the destination node, and select the AF protocol as the relay node forwarding method for routing and forwarding.To enhance the understanding and readability of the entire algorithm, the specific calculation flowchart of the EDCD algorithm is shown in Figure 5. Algorithm 1 gives the initialization and community establishment phase of the proposed algorithm, and Algorithm 2 presents the routing and forwarding phase of the proposed algorithm.
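The relay selection rule of Step 4 can be sketched as follows. The sketch assumes that the first case applies when the number of candidates is at least the threshold, and it takes the channel coefficients as non-negative magnitudes; the function and parameter names are hypothetical.

```python
def select_relay(h_sr, h_rd, threshold, m):
    """Return the index of the selected relay node.

    h_sr[n] and h_rd[n] are the channel coefficient magnitudes of the
    source-relay and relay-destination hops of candidate n.
    If the number of candidates reaches the threshold, only the m candidates
    whose weaker hop is strongest are kept; the relay with the largest sum
    of the two coefficients is then chosen.
    """
    if len(h_sr) != len(h_rd) or not h_sr:
        raise ValueError("need two non-empty lists of equal length")
    candidates = list(range(len(h_sr)))
    if len(candidates) >= threshold:
        # Rank by the smaller of the two hop coefficients, largest first.
        candidates.sort(key=lambda n: min(h_sr[n], h_rd[n]), reverse=True)
        candidates = candidates[:m]
    return max(candidates, key=lambda n: h_sr[n] + h_rd[n])
```

The pre-filtering matters when one hop is very weak: a relay with coefficients (2.0, 0.05) has the largest sum among (2.0, 0.05), (0.5, 0.5), and (0.6, 0.7), but it is discarded once candidates are ranked by their weaker hop, and the third relay is selected instead.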


4. Simulation and Analysis
To assess the performance of the EDCD, we use a simulation tool called ONE (Opportunistic Network Environment) [41] and we compare with the following four typical routing algorithms.
Spray and wait [29]: this algorithm sprays the copies to the network and waits for these nodes to reach the destination node. The number of copies of the algorithm will affect performance, reduce the message delivery success rate, and increase the delivery delay.
SCR (Social-based Clustering and Routing Scheme) [36]: this algorithm measures the social relations between nodes in a mobile opportunistic network and builds a social-based clustering and routing scheme on that measurement.
SECM (status estimation and cache management) [42]: this algorithm uses state estimation and cache management to identify surrounding neighbors and evaluate the transmission probability between nodes, so that messages go to nodes with a high transmission probability and the cache is adjusted accordingly.
EIMST (effective information transmission based on socialization nodes) [2]: this algorithm relies on social nodes to achieve effective information transmission. A stop time is defined: before the stop time is reached, the node forwards the message with the highest probability, and once it is reached, the node stops sending the message.
We download real datasets from the Network Repository for the experiments. According to the information required for data transmission in opportunistic social networks, we choose four datasets for the simulation experiments: pagesgovernment [43], wikielec [44], advogato [45], and slashdot [46]. The characteristics of the four experimental datasets are shown in Table 1.
In the simulation experiment, we set the following metrics according to the characteristics of data transmission. The EDCD algorithm and the other four algorithms run in the same simulation environment to compare their performance.
(1) Delivery ratio: the probability of choosing a suitable node as the next-hop node, given by the ratio of the number of messages received by the destination node to the total number of sent messages.
(2) Routing overhead: the overhead between nodes when transmitting information, expressed in terms of the total time of the transmission between nodes and the time to transmit a successful message between nodes.
(3) Average end-to-end delay: the delay in selecting the optimal next hop, given by the total delay over all nodes divided by the total number of nodes that successfully receive messages.
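As a small illustration of how two of these metrics could be computed from simulation counts, consider the following Python sketch. It is hedged accordingly: the function names `delivery_ratio` and `average_end_to_end_delay` and their arguments are hypothetical and are not part of the ONE simulator. Routing overhead is left out because its exact expression is not restated here.

```python
from typing import Sequence


def delivery_ratio(received: int, sent: int) -> float:
    """Messages received by destination nodes divided by messages sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    if not 0 <= received <= sent:
        raise ValueError("received must lie between 0 and sent")
    return received / sent


def average_end_to_end_delay(delays: Sequence[float]) -> float:
    """Total delay divided by the number of successful receptions.

    delays holds one delay value per node that successfully received
    a message (an assumed input format for this sketch).
    """
    if not delays:
        raise ValueError("no successful deliveries to average over")
    return sum(delays) / len(delays)
```

For example, 76 delivered messages out of 100 sent give a delivery ratio of 0.76, and per-node delays of 10, 20, and 30 time units give an average end-to-end delay of 20.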
The correlation between simulation time and delivery ratio in the four real datasets is shown in Figures 6–9. Figure 6 shows the delivery ratio of the spray and wait, SCR, SECM, EIMST, and EDCD algorithms in the pagesgovernment dataset. When the simulation time is less than one day, the advantage of the EDCD algorithm is not apparent in the four real datasets. However, as the simulation time increases, the delivery ratio of the EDCD algorithm is always higher than that of the other algorithms. The EDCD algorithm divides communities by node similarity, and the effective nodes in each community carry out data transmission, so its delivery ratio is better than that of the other four algorithms. The relationship between the delivery ratio and the simulation time in the wikielec dataset is shown in Figure 7. The SCR algorithm delivers information to nodes and communities by flooding, which leads to heavy information loss. The delivery ratio of SECM is 0.65–0.78. The delivery ratios of the EIMST and EDCD algorithms are higher than those of the others. The EIMST algorithm controls the time interval of information delivery, which improves the transmission and reception of effective information, and its delivery ratio reaches 0.66–0.81. Because the EDCD algorithm combines network topology and social attributes, its delivery ratio is the highest among all algorithms, reaching 0.67–0.84.
The correlation between the delivery ratio and simulation time in the advogato dataset is shown in Figure 8. The algorithm with the highest delivery ratio is the EDCD algorithm, reaching 0.85–0.88. The spray and wait algorithm uses flooding to transmit information at community nodes, so a large amount of information is lost and its delivery ratio is the lowest, only 0.67–0.70. Figure 9 shows the relationship between time and delivery ratio in the slashdot dataset, which has the largest number of nodes of the four datasets. When the simulation time is less than one and a half days, the delivery ratio of every algorithm rises sharply; by the time the simulation reaches three days, only the delivery ratios of the EDCD and EIMST algorithms are still rising. This is because, in the slashdot dataset, these two algorithms quantify the social attributes of nodes in the 5G environment. On the whole, the delivery ratio of the EDCD algorithm is 0.76 on average, which is higher than that of the other algorithms.
The correlation between time and routing overhead in the four real datasets is shown in Figures 10–13. The comparison of the routing overhead of the five algorithms in the pagesgovernment dataset is shown in Figure 10. The average routing overhead of the EDCD algorithm always remains the lowest, because the algorithm uses node similarity to divide the communities and uses the optimal relay node strategy to forward information. The routing overhead of the EDCD algorithm stays between 40 and 65.
Figure 11 shows the association between routing overhead and time in the wikielec dataset. In the spray and wait algorithm, redundant message copies require a lot of time and resources, which is the main reason for its large routing overhead. In the SCR algorithm, each node only forwards a copy of the message to a node that has the destination node as a cluster member, ignoring the current availability of the next-hop node, which causes overhead. In the SECM algorithm, the overhead is large because the node injects a lot of redundant data. In the EIMST algorithm, information and buffer space can be managed effectively, but resources are still consumed on some unavailable nodes. In terms of routing overhead, EDCD always performs best among these five algorithms. Figure 12 shows the relationship between time and routing overhead in the advogato dataset. Compared with the other algorithms, the EDCD algorithm selects the optimal relay node and sets up the weight distribution between nodes and communities to reduce the overhead. In the spray and wait algorithm, a lot of redundant information uses a lot of computing resources. For the SCR and SECM algorithms, the cooperation mechanism is conducive to a reasonable allocation of computing resources, so the overhead of these two algorithms is at a middle level. EIMST does not fully consider the transmission preference of nodes, so its performance is worse than that of the EDCD algorithm.
The relationship between routing overhead and time in the slashdot dataset is shown in Figure 13. The routing overhead increases sharply at first and becomes nearly stable by the third day. The routing overhead of the spray and wait algorithm increases dramatically: because the slashdot dataset has a large number of nodes, a large number of data copies are generated and need to be processed, so its routing overhead is higher than that of the other algorithms.
The association between time and average end-to-end delay in the four real datasets is shown in Figures 14–17. The relationship between the average end-to-end delay and time for each algorithm in the pagesgovernment dataset is shown in Figure 14. Compared with the other four algorithms, the EDCD algorithm has the lowest average end-to-end delay.
Since the EDCD algorithm divides communities by analyzing the comprehensive characteristics of nodes, it can exclude inefficient nodes that do not help the transmission process, which reduces the average end-to-end delay. The spray and wait algorithm has more message copies, which causes corresponding delays. The SCR algorithm effectively forwards the copy of the message to the destination node, so its transmission delay is lower than that of the spray and wait algorithm. The SECM algorithm also increases the cache of a node before data transmission, so there is a corresponding delay.
Figure 15 shows the association between average end-to-end delay and time in the wikielec dataset. We can see that the delay of the EIMST algorithm in this dataset is higher than its delay in the other datasets. Because the EIMST algorithm relies on node-based information management and there are more nodes in the wikielec dataset, the delay increases as the simulation time increases. In short, the average end-to-end delay of the EDCD algorithm in the wikielec dataset is lower than that of the other four algorithms.
Figure 16 shows the relationship between average end-to-end delay and time in the advogato dataset. Specifically, the maximum delay of the spray and wait algorithm can reach 95, because this method markedly increases routing and message forwarding delays. The SCR and SECM algorithms have lower delays than the spray and wait algorithm because both effectively control the number of message copies. In addition, the SCR algorithm implements community division and information management, whereas the SECM algorithm uses the cooperation mechanism between nodes to make reasonable use of the nodes' cache space and thus reduce the delay in the message forwarding process.
The average end-to-end delay of the EIMST algorithm is also significantly lower than that of the other algorithms. Figure 17 shows the correlation between the average end-to-end delay and time in the slashdot dataset. In this dataset with many nodes, the average end-to-end delay of the EIMST algorithm is significantly higher than in the other datasets. This is because the EIMST algorithm implements community detection, but its effect is only moderate when processing large amounts of data. The EDCD algorithm proposed in this paper has a lower delay than the other algorithms in the different real datasets.
5. Conclusions
In this study, we propose an effective data transmission scheme for opportunistic social networks that uses mobile edge computing, combined with network topology attributes and social attributes, to measure node similarity, divide communities, and select the optimal relay node. The algorithm is mainly based on the idea that the closeness between nodes inside a community is higher than the closeness with nodes outside it, and it provides a method for selecting the optimal relay node according to the sum of channel coefficients in the process of transmitting information. The simulation results show that the strategy performs well on different real datasets in terms of delivery ratio, routing overhead, and average end-to-end delay. The EDCD algorithm can be applied to 5G data transmission scenarios and, through efficient community division and information transmission, can cope with the stability and continuity that data require during interaction. In future work, we will enhance the performance of the algorithm and further study the security of data transmission in opportunistic social networks.
Data Availability
The data used to support the findings of this study are currently under embargo, while the research findings are commercialized. Requests for data, 12 months after publication of this article, will be considered by the corresponding author.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This research was supported by the National Science Foundation of China under Grant 61966035 (Research on Super-Resolution Reconstruction of Remote Sensing Images Based on Deep Learning of Spatio-Temporal Spectrum Features), by the Intelligent Multi-Modal Information Processing Project (XJEDU2017T002), and by the International Cooperation Project of the Autonomous Region's Science and Technology Department, "Data-driven China-Russia Cloud Computing Sharing Platform Construction," No. 2020E01023.