Abstract

Due to their limited energy and computing power, IoT devices cannot handle complex tasks. Edge computing effectively meets the computing-power and response-delay requirements of complex tasks by migrating computing capacity to the vicinity of IoT devices. For a separable complex task on an IoT terminal, we focus on how the data distribution, dependencies, and offloading sequence of its subtasks affect the total delay when the task is offloaded to edge servers. Considering these factors comprehensively, we study the slicing and choreographing of a complex task during the offloading process. Firstly, a task slicing method based on hierarchical clustering is presented, and an improved hierarchical clustering algorithm is used to obtain the optimal task partition. Secondly, a task choreographing method based on the overlapping longest path is presented. Finally, the effectiveness of our method is verified through simulation experiments on complex tasks with different structures and loads.

1. Introduction

In recent years, as the mobile Internet industry has matured, the rapid growth of the Internet of Things (IoT) has driven the vigorous development of mobile intelligent terminal devices, which are widely used in transportation, health, entertainment, and other fields. At the same time, the applications deployed on IoT devices are becoming more and more complex; for example, they must handle large amounts of IoT data through complex processing pipelines. Limited by their processing capacity and battery capacity, IoT devices are unable to meet the demands of these tasks. Cloud computing emerged as a computing mode with a virtually unlimited supply of resources and became an effective supplement to terminal processing capacity. Mobile devices transfer data to remote data centers, use the resources of the cloud to complete operations efficiently, and return the final results to users, achieving fast data exchange. Compared with performing a task directly on the user terminal, transferring data to the cloud for processing is faster and more efficient and can support richer functions in mobile applications. However, this mode also has disadvantages. For example, since users need to transfer large amounts of data to the cloud center for processing, the data transfer time may exceed the effective delay requirement of the application. In addition, the link between mobile devices and the cloud center is long and prone to interruption or instability, which can cause the feedback of results to fail.

The appearance and application of edge computing solve the above problems to a certain extent. Edge computing provides cloud services and an IT environment for IoT devices by sinking the processing capacity of the cloud platform to the network edge closer to IoT devices, so that tasks on IoT devices can be offloaded and processed more quickly. Compared with cloud computing, edge computing gives mobile devices a shorter data transmission path for offloading tasks, thus reducing the transmission and feedback delay of data and results and meeting the needs of delay-sensitive tasks on IoT devices. Meanwhile, processing tasks at the edge also helps relieve traffic pressure on the network.

There are still drawbacks to offloading tasks to edge servers. Due to the limited processing capacity of an edge server, the processing time of offloaded tasks increases when the computing requirement is relatively large. Such processing delay still cannot satisfy a delay-sensitive task, and some complex tasks even exceed the processing capacity of a single edge server. There are two solutions to this. One is to offload tasks to multiple edge servers through distributed computing to shorten the task execution delay. The other is to combine edge computing with cloud computing to complete the tasks together, which can improve the processing capacity and shorten the transmission time. In both cases, we need to divide complex tasks into smaller ones and choreograph them across multiple edge servers or the cloud.

In this paper, we focus on the slicing and choreographing method for a complex delay-sensitive task on edge servers. The main research work and contributions include the following two aspects: (1) Aiming at the problem that the response time of an offloaded task is too long to meet the delay requirements of IoT devices, a task slicing method based on hierarchical clustering is improved to reduce the communication cost between subtasks on different servers and minimize the time consumption of the task workflow while supporting the parallel and distributed execution of subtasks. (2) On the basis of task slicing, a subtask choreography method combining static sorting, dynamic sorting, and the earliest start time of the task workflow is proposed, and a scheduling algorithm for the task workflow on edge servers is designed.

2. Related Work

Mobile edge computing provides a promising solution for mobile terminal devices with limited computing capacity to complete complex, intensive, and delay-sensitive computing tasks. Therefore, in the MEC system, it is very important to optimize the assignment and scheduling of mobile terminal tasks. In recent years, many strategies, methods, and algorithms have been proposed. In this section, we introduce and analyze the representative work related to this paper.

Mobile edge computing provides enhanced computing power for mobile devices by deploying edge servers next to communication base stations. When a mobile terminal receives a task request, the first issue to be decided is whether to offload part or all of the task to an MEC server. Considering the computing needs of different tasks on mobile devices, optimization methods for task partition ratios have been proposed to minimize the maximum task latency. In [1], the authors divided multiple parts into a single subtask with prior knowledge and modeled the ordinal relation of the parts to guide the segmentation process in a circular way. In [2], each user could partition their computation task into an offloaded computing part and a locally computed part in multiuser MEC networks. In [3], the offloading location of the task was further extended to the cloud server: each task on the mobile device can be processed locally at the mobile device or offloaded to one of the edge servers or a cloud server. When the offloaded task is complex and the computing power of a single edge server is limited, offloading strategies have been proposed to divide a task into subtasks and deploy them on multiple servers. In [4], the authors assumed that each user's tasks were separable and proposed a distributed algorithm to obtain hierarchical multilevel offloading decisions. In [5], a multiserver system with dynamic speed and power management was modeled as a queueing system, and the issue of optimal task dispatching on multiple heterogeneous server systems was addressed. In [6], the authors allocated computing tasks to suitable cores of mobile devices or the cloud in the MCC and proposed an optimization framework to minimize the total energy consumption and maximize the system reliability. For the task offloading problem of a heterogeneous multilayer MEC (HetMEC), the authors of [7] designed a latency minimization algorithm that jointly coordinates the resources among the end devices, multilayer MEC servers, and the cloud center. In [8], the authors designed the LL-MLS algorithm to find an optimal partition of a given workload through task scheduling and energy allocation strategies. In [9], the authors proposed an energy-aware cooperative routing (ECoR) scheme for optimal handling of task offloading between source and target UAVs in a gridlocked swarm.

Offloading a task to the edge computing system not only provides the task with expanded processing capacity but also introduces the transmission delay caused by the offloading process. Therefore, the allocation of wireless resources for offloaded tasks is also the focus of many research works. In [10], the authors transformed the problem of joint task assignment and wireless resource assignment into a mixed-integer nonlinear program (MINLP) and proposed a suboptimal algorithm based on a relaxed convex problem to reduce the delay of offloaded tasks. In [11], the problem of task assignment in the MEC backhaul network was solved by a similar method. In [12], an online adaptive task allocation and computation offloading strategy was proposed, which coordinated and optimized the allocation of wireless and computing resources by considering dynamic wireless conditions and service delay constraints.

In order to efficiently implement the allocation and deployment of multiple tasks in the MEC, some researchers treat the problem as a joint optimization problem considering various offloading conditions. In [13], the authors minimized the energy consumption of all devices under their task delay constraints by co-optimizing the communication and computing resource allocation on devices and mobile edge servers. Considering the task completion time and the mobile device energy consumption, the authors in [14] proposed a heuristic offloading decision algorithm (HODA), which jointly optimized the offloading decisions, communication, and computing resources to maximize the system utility. In order to reduce the complexity of the joint optimization problem, the original problem was decomposed into two subproblems in [15–21]. In [15], the authors addressed the resource allocation problem using convex and quasi-convex optimization techniques and solved the task assignment problem with a heuristic algorithm. In [16], the task partitioning subproblem was treated as a set of univariate optimization problems, which can be easily solved, and the task scheduling subproblem was solved through a heuristic algorithm. In [17], the resource allocation problem was further decomposed into two stages: computing resource optimization and communication resource allocation. The authors proposed a subchannel allocation scheme, and the transmission power allocation was then formulated as a convex optimization problem based on this scheme and solved by the Lagrange multiplier method. For the resource allocation scheme, the authors in [18] proposed a computing framework based on the weighted sum of task completion time and energy consumption in the MEC system, while the authors in [19] proposed a task shunting and resource allocation algorithm based on a deep Q-network. In [20], the authors considered users' risk-seeking or loss-avoidance behaviors in their final decision. In [21], the authors proposed an energy-efficient multihop communication solution in a smart city environment.

In addition to finding a better task offloading strategy by optimizing the allocation and consumption of communication and computing resources during task offloading, some other factors, such as shared data among tasks, the offloading sequence of tasks, and the mobility of mobile devices, also have an important impact on the efficiency of task offloading. In [22], the authors studied task assignment algorithms in data-shared mobile edge computing systems and proposed three algorithms to deal with holistic tasks and divisible tasks, respectively. In [23], an adaptive slicing method for decentralized workflows based on clustering was proposed. A data-related task scheduling algorithm based on the correlation task model was then designed, which used an evaluation function to reduce intercore communication during task execution by assigning highly correlated tasks to the same core. In [24], the authors gave full consideration to the mobility of users in the MEC and proposed a device-to-device (D2D) cooperation method to expedite the task execution of mobile users by leveraging proximity-aware task offloading. In [25], user mobility and network constraints were considered, and a lightweight heuristic solution was proposed for fast scheduling. In [26], a task allocation solution for optimizing latency and service quality was proposed to support the mobility of vehicles, in which the constraints on service latency, quality loss, fog capacity, stationary task allocation, and mobile fog nodes were taken into account. In [27], an effective task offloading scheme in the MEC was designed, in which tasks are offloaded to the adjacent servers at the next AP in the direction of vehicle driving. In [28], the authors emphasized the importance of optimizing the operation sequence in a multiuser MEC system and established a computation offloading model to optimize the task operation sequences, the starting times for uploading, executing, and downloading, and the duration times for uploading and downloading. In [29], a spatiotemporal framework based on stochastic geometry and continuous-time Markov chains was proposed; the experimental results showed that the framework can find the optimal number of edge servers for parallel computing of a user task. In [30], the authors studied the merging and scheduling of parallel tasks for parallel deep learning applications in the MEC.

When the tasks of mobile terminals are offloaded to the edge computing system, and particularly when divisible tasks are offloaded to different edge servers, the execution sequence of tasks on the edge servers is key to determining the actual execution efficiency. For this, the authors in [31] focused on the problem of providing QoS and performance guarantees for divisible loads and proposed a linear algorithm for real-time divisible load scheduling that eliminates the need to generate exact schedules in the admission controller. In [32], the authors adopted a Markov decision process to handle the problem of computation task scheduling for MEC systems, where computation tasks were scheduled based on the queueing state of the task buffer, the execution state of the local processing unit, and the state of the transmission unit. In [33], partitioned fixed-priority real-time scheduling based on dependent task splitting on a homogeneous multicore platform was proposed, which converted dependent tasks into a series of sequential jobs and obtained the interrelated subtask paths as well as synthetic deadlines through a B-tree task model. In [34], the authors proposed a deep learning architecture based on densely connected networks and a corresponding multitask parallel scheduling algorithm. In [35], a peer-to-peer (P2P) enhanced task scheduling framework to minimize the average task duration in device-to-device (D2D) networks was proposed; in this framework, an iterative algorithm based on alternating optimization and sorting is used to find an approximately optimal scheduling solution.

To sum up, when a complex task is divided and offloaded to multiple edge servers or the cloud, the efficiency of task offloading is affected by many factors. The above studies put forward effective task offloading strategies from the perspectives of the processing capacity of mobile terminals and servers, communication channel allocation during task offloading, and the optimization of multitask deployment across multiple servers. However, these studies pay little attention to the effect of data interaction between subtasks and of the subtask offloading sequence on the execution delay of the whole task after it is divided into subtasks. Obviously, these factors also have a great impact on the efficiency of subsequent task scheduling. Although papers [31–35] focus on these two factors to optimize the task offloading process, they do not consider them as a whole. Task slicing is closely related to the offloading sequence, and different slicing schemes should correspond to different offloading sequences to minimize the execution delay of the task. So we focus on the following two problems in the offloading process: (1) in the distributed deployment of complex tasks on edge servers, the data dependencies among the subtasks are fully considered to minimize the communication delay caused by such dependencies during task execution; (2) in the process of task scheduling, the effect of the subtask offloading sequence on task execution is considered, and the execution delay of the whole task is minimized by overlapping subtask offloading with subtask execution.

3. Problem Definition and Formalization

Here, we consider a mobile task offloading scenario with multiple users and multiple edge servers. Users' mobile devices can connect and communicate with the base stations covering their signals. Edge servers are uniformly deployed near these base stations, with at least one edge server deployed near each base station. Assume that there are m mobile terminal devices and n edge servers in this scenario; D = {d1, d2, …, dm} represents the set of mobile terminal devices, and ES = {s1, s2, …, sn} represents the set of edge servers. The service request sent by each terminal device can be divided into a series of subtasks. We use a directed acyclic graph (DAG) to represent the tasks offloaded by the mobile terminal and the relationships among them, represented as G = (T, E, W, C), where T = {t1, t2, …, tk} represents all subtasks offloaded by the mobile terminal and k is the number of subtasks; E = {eij | the output of subtask ti is an input to subtask tj} represents the data dependencies between subtasks; W = {w1, w2, …, wk} represents the workload of each subtask in set T, that is, wi is the CPU cycles of subtask ti; and C = {cij | eij ∈ E} represents the size of the input data from ti to tj. An example of a mobile terminal task offloaded on edge servers is shown in Figure 1.
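To make the model concrete, the following is a minimal Python sketch of the task model G = (T, E, W, C); the literal values and helper names are illustrative and not taken from the paper.

# Minimal sketch of the task model G = (T, E, W, C); values are illustrative.
T = ["t1", "t2", "t3", "t4"]                                   # subtasks, k = 4
E = [("t1", "t2"), ("t1", "t3"), ("t2", "t4"), ("t3", "t4")]   # e_ij: output of t_i feeds t_j
W = {"t1": 4e8, "t2": 6e8, "t3": 5e8, "t4": 3e8}               # w_i: CPU cycles of subtask t_i
C = {("t1", "t2"): 2.0, ("t1", "t3"): 1.5,                     # c_ij: input data size (MB)
     ("t2", "t4"): 0.8, ("t3", "t4"): 1.2}

def predecessors(t):
    """Subtasks whose output t consumes."""
    return [i for (i, j) in E if j == t]

def successors(t):
    """Subtasks that consume t's output."""
    return [j for (i, j) in E if i == t]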

Given G, we need to offload multiple subtasks with dependency relationships to the edge service system. If all subtasks in T are deployed on the same edge server, the computing capacity constraints of a single server and the serial execution of tasks may make the feedback delay of the task too long to meet its needs. In order to take advantage of the edge service system to better meet the demands of the mobile terminal, we need to offload the task to multiple edge servers, execute some subtasks in parallel, and shorten the overall delay of the task. For example, several subtasks at the same layer in Figure 1 can be executed in parallel. A new challenge is that the data dependencies between subtasks introduce new transmission delays. In particular, when subtasks with large data dependencies are deployed on different servers, the newly introduced latency may even outweigh the time saved by executing the tasks in parallel. To solve this problem, the first goal of this paper is to find a task partitioning scheme based on the dependencies between subtasks, which we call task slicing, that reduces the newly introduced delay as much as possible while deploying all subtasks in a distributed way. In addition to task slicing, the offloading order of the subtasks also affects the feedback delay of the task, a problem that most existing studies ignore. In fact, due to the limited wireless communication resources in the MEC environment, when multiple tasks are offloaded at the same time, each task receives less wireless bandwidth, which inevitably increases the transmission delay of the offloaded subtasks. In G, the subtasks do not need to be executed at the same time, so it is not necessary to offload subtasks to different edge servers simultaneously. We just need to make sure that a subtask is offloaded before it is executed, which maximizes the offloading bandwidth allocated to each subtask and shortens its transmission delay. Therefore, we study a task choreography method based on task slicing to optimize the overall delay of the task. The problem in this paper is formally described as follows:

min over (S, Ord(S)): fte(Last(S)) − sto(First(S)), (1)

where

S = {TL1, TL2, …, TLM}, (2)
Ord(S) = {Prio(TL1), Prio(TL2), …, Prio(TLM)}, (3)

s.t.

TL1 ∪ TL2 ∪ … ∪ TLM = T, (4)
TLi ∩ TLj = ∅, ∀i ≠ j, (5)
ste(tj) ≥ fte(ti), ∀eij ∈ E, ti, tj ∈ TLx, (6)
ste(tj) ≥ fte(ti) + tt(TLy, TLx), ∀eij ∈ E, ti ∈ TLy, tj ∈ TLx, y ≠ x, (7)
ste(TLi) ≥ fto(TLi), ∀TLi ∈ S, (8)
sto(TLj) ≥ fto(TLi), if TLi and TLj are offloaded to the same edge server and TLi precedes TLj in Ord(S). (9)

Equation (1) is our optimization goal: to find a slicing and choreography scheme for the task offloading of a mobile device, that is, S and Ord(S), which minimizes the response time of the overall task. As shown in equation (2), S is a partition of T. Each task slice corresponds to a priority, which represents the order in which the task slices are offloaded, as shown in equation (3). ste(TLi) and fte(TLi) represent the start and end times for executing TLi, respectively. sto(TLi) and fto(TLi) represent the start and end times for offloading TLi, respectively. Last(S) and First(S) are used to obtain the last and first task slices in the choreography scheme, respectively. Equations (4)–(9) represent the constraint conditions that must be satisfied when offloading the task, where tt(TLj, TLi) represents the data transmission time between TLj and TLi and Pre(TLi) is used to obtain the preorder task slices of TLi. Equations (4) and (5) represent the basic requirements of task slicing. Equation (6) indicates that if there is a dependency relationship between subtasks within a task slice, the start time of the subsequent subtask must be later than the end times of all its preordering subtasks. Equation (7) indicates that if there is a dependency relationship between subtasks in different task slices, the subsequent subtask cannot be executed until the server where it is deployed has received the output data of all its preordering subtasks. Equation (8) indicates that any task slice must be offloaded to the corresponding edge server before it begins to execute. Equation (9) indicates that when multiple task slices need to be offloaded to the same edge server, they must be offloaded in sequence.
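As a sketch of how a candidate solution can be checked against constraints (4)–(7), the Python below validates a slicing and an execution schedule; the schedule representation (dictionaries of start and end times and a transmission-delay callable) is an assumption made for illustration.

# Check constraints (4)-(5): the task slices form a partition of T.
def is_partition(slices, T):
    covered = [t for TL in slices for t in TL]
    return sorted(covered) == sorted(T)

# Check constraints (6)-(7): every dependent subtask starts after its
# predecessors finish, plus the transmission delay when the predecessor
# sits in a different slice (and hence on a different edge server).
def respects_precedence(slices, E, ste, fte, tt):
    slice_of = {t: x for x, TL in enumerate(slices) for t in TL}
    for (ti, tj) in E:
        delay = 0.0 if slice_of[ti] == slice_of[tj] else tt(slice_of[ti], slice_of[tj])
        if ste[tj] < fte[ti] + delay:
            return False
    return True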

4. Task Slicing and Choreographing Model

For complex mobile terminal requests that can be divided into multiple subtasks, we can distribute these tasks on multiple edge servers to improve the processing capacity, which is conducive to reducing the execution delay of terminal tasks. In this section, we will establish a slicing and choreographing model for complex tasks to minimize the transmission and computation delay in the process of task offloading. Here, we name our method SSCS (Slicing Similarity and Choreograph Sequence).

4.1. Task Slicing Method Based on Data Dependencies

The objective of the task slicing model is to optimize the parallel execution of subtasks on different edge servers while minimizing the delay caused by data transmission between subtasks. We propose a slicing model based on the task workflow, in which the concept of task similarity is defined based on the dependency relationships between subtasks. For two subtasks, if there is a large amount of data exchange between them but little interaction with other subtasks, they will be divided into the same cluster. The related definitions are given below.

Definition 1. Subtask (ti). It is the basic unit of a task slice, referring to a task that cannot be divided further, corresponding to an element of the task set T.

Definition 2. Task slice (TLi). It is the basic unit of task deployment. After a task is split into multiple subtasks, similar subtasks are grouped into the same subtask set, called a task slice.

Definition 3. Task hierarchy association matrix (A). It is the hierarchical logic relation of task execution for a given G, represented as A = (aij)r×k, where k is the number of subtasks in T and r is the minimum number of logical levels in which the task workflow needs to be executed; formally, aij = 1 if L(tj) = i and aij = 0 otherwise, where L(tj) = i means tj is at the i-th layer. aij = 1 means that once all the preceding subtasks of tj have been executed, task tj can start to execute.
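A short sketch of how the layer function L(t) behind the matrix A can be computed by longest-path layering of the DAG; the function name is illustrative.

def layer_of(T, E):
    """L(t) = 1 for source subtasks; otherwise one more than the deepest
    predecessor. Row i of A then flags the subtasks with L(t) = i."""
    L, remaining = {}, set(T)
    while remaining:
        for t in sorted(remaining):
            preds = [i for (i, j) in E if j == t]
            if all(p in L for p in preds):
                L[t] = 1 + max((L[p] for p in preds), default=0)
                remaining.discard(t)
    return L

# A as a 0/1 matrix with r rows (levels) and k columns (subtasks):
# A = [[1 if L[t] == lvl else 0 for t in T] for lvl in range(1, max(L.values()) + 1)]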

Definition 4. Direct correlation of subtasks (dirCorrij). In G, if eij exists, then the direct correlation degree between tasks ti and tj is represented by dirCorrij = cij. It represents the amount of direct communication data between the two subtasks; the larger the value, the greater the probability that the two subtasks should be deployed on the same edge server.

Definition 5. Dependency between subtasks. In G, the dependencies between subtasks can be divided into three categories: single dependency (SD), split dependency (SpD), and joint dependency (JD). SD(ti, tj) means that task tj depends entirely on task ti; that is, eij ∈ E, tj is the only successor of ti, and ti is the only predecessor of tj. SpD(ti, tj) means that task tj depends partly on task ti; that is, eij ∈ E and ti has successors other than tj. JD(ti, tj) means that only a part of the input of task tj depends on task ti; that is, eij ∈ E and tj has predecessors other than ti. If eij ∉ E, then there is no direct dependency between ti and tj.

Definition 6. Subtask dependency correlation (). Based on different types of dependency relationships between subtasks, the dependency correlation degree between and is represented bywhere represents the amount of communication data between and ; if there is no dependency relationship between them, then .

Definition 7. Subtask distribution correlation (distCorrij). It reflects the influence of the data distribution between subtasks on the task partition results. The larger the data traffic between subtasks, the greater the probability of the subtasks coupling into the same slice. The communication correlation degree between ti and tj can be expressed by distCorrij, which is composed of an input correlation inCorrij and an output correlation outCorrij. In equation (13), inCorrij and outCorrij are the correlation degrees calculated from the input and output data traffic of ti and tj, respectively, and cij represents the communication data volume from ti to tj. If there is no dependency between them, then distCorrij = 0.

Definition 8. Subtask computation time (cti). Due to the different sizes of subtasks, their computation times differ. Assume that the computing power of the CPU is f; then cti can be expressed as cti = wi/f = (di · u)/f, where di is the amount of data processed by ti and u represents the CPU cycles required to process a unit of data (so that wi = di · u).

Definition 9. Sequential execution time between subtasks (seqTij). If there is a sequential dependency between two consecutive subtasks, their sequential execution time is the sum of their respective execution times, seqTij = cti + ctj; otherwise, the value is 0.

Definition 10. Similarity between subtasks (Simij). It represents the degree of comprehensive correlation between subtasks. The similarity between ti and tj is represented by Simij, which is computed from dirCorrij, depCorrij, seqTij, and distCorrij. Task similarity comprehensively measures the correlation between two subtasks in the whole workflow system, from the direct data dependency between the tasks to the relative importance of that dependency in the whole task flow, and serves as the basis for further coupling of subtasks. First, dirCorrij is the main factor that determines whether the subtasks can be aggregated into a task slice. In addition, depCorrij reflects the degree of association between ti or tj and other tasks; the lower this degree of association, the higher the probability that ti and tj are aggregated into a task slice. seqTij represents the computation time required to complete them; grouping subtasks that must be executed sequentially into one task slice helps to avoid the delays caused by data transfer between subtasks. distCorrij reflects the degree of correlation between ti and tj and their preceding and succeeding subtasks; the smaller this degree of correlation, the higher the probability that ti and tj are grouped into one task slice.
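Since the exact weighting of the similarity formula is not reproduced here, the following sketch only illustrates the spirit of Definition 10; the combination rule, function name, and epsilon guard are assumptions, not the paper's equation.

def similarity(i, j, dirCorr, depCorr, seqT, distCorr, eps=1e-9):
    """Illustrative similarity: direct traffic (dirCorr) and sequential
    execution (seqT) pull t_i and t_j into one slice; strong coupling of
    the pair to other subtasks (depCorr, distCorr) pushes them apart.
    This combination rule is an assumption, not the paper's equation."""
    pull = dirCorr[(i, j)] * (1.0 + seqT[(i, j)])
    push = eps + depCorr[(i, j)] + distCorr[(i, j)]
    return pull / push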
An example is given in Figure 2, where the weights between nodes represent the reciprocal of the similarity between subtasks. Firstly, starting from the first node of the task workflow, we look for subtasks that can be grouped into one task slice from top to bottom. The merging rule is that each subtask is merged with the one of its subsequent subtasks that has the lowest weight with it, until the last node in the workflow is reached. Secondly, starting from the last node of the task workflow, we continue to look for subtasks that can be merged into one task slice from bottom to top. The merging rule is that each task is merged with the one of its preceding tasks that has the lowest weight with it, until the first node in the workflow is reached. For example, in Figure 2, nodes t1, t3, and t7 are merged into one task slice, and nodes t2, t5, t10, and t11 are merged into one task slice. Then, the minimum weight among the merged subtasks is set as the merge threshold, which is obtained by comprehensively considering all the similarities of the whole workflow. Subtasks that remain unmerged are merged when the weight between consecutive subtasks is less than this threshold. According to this rule, nodes t4 and t9 are merged into one task slice.
In the following, an improved hierarchical clustering algorithm is given to solve for the slicing scheme based on the similarity between subtasks, as shown in Algorithm 1.
Algorithm 1 provides a method to determine the optimal slicing scheme of the task workflow given the number of edge servers. The value of SM can be allocated statically according to the resource situation in the MEC system or solved dynamically by optimizing the overall computation delay of the task workflow. In Algorithm 1, lines 6–9 calculate the similarity between the subtasks in T using equations (10)–(18). Lines 10–18 solve the task slicing scheme through the improved hierarchical clustering process. Here, the concurrency of task slices is the main consideration, and subtasks at the same level cannot be divided into the same task slice, as shown in line 15.

(1)function [S] = Slice (T, SM, A)
(2)Input: T//Task workflow
(3)  SM//Number of edge servers, 0 < SM ≤ n
(4)  A//Logical hierarchy matrix
(5)output: S
(6)InitNum(T)//Initialize the subtasks
(7)taskNum = Count(T)//the number of subtasks in T
(8)for i = 1: taskNum
(9) TD = TaskSim(T)//TD is a taskNum × taskNum matrix; the similarity between subtasks is calculated
(10)sliceNum = taskNum
(11)while true
(12)if sliceNum ≤ SM
(13)  break;
(14)Stemp = MaxSim(TD);
(15)if Notlevel(Stemp, A)//Tasks at the same logical level cannot be divided into a task slice
(16)  Cluster = Merge(Stemp)//Task clustering, forming a new task slice division
(17)  sliceNum = Count(Cluster)
(18)S = Cluster
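A runnable Python sketch of the clustering loop in Algorithm 1 follows, assuming a precomputed pairwise similarity dictionary and the layer function L from Section 3; the helper names are illustrative, and the merge rule mirrors lines 10–18.

def slice_tasks(T, sim, L, SM):
    """Agglomerative clustering: repeatedly merge the most similar pair
    of slices until only SM slices remain, never merging two slices that
    contain subtasks at the same logical level (line 15 of Algorithm 1)."""
    slices = [frozenset([t]) for t in T]

    def pair_sim(a, b):
        vals = [sim.get((i, j), sim.get((j, i))) for i in a for j in b]
        vals = [v for v in vals if v is not None]
        return max(vals) if vals else None

    while len(slices) > SM:
        best = None  # (similarity, index_a, index_b)
        for x in range(len(slices)):
            for y in range(x + 1, len(slices)):
                if {L[t] for t in slices[x]} & {L[t] for t in slices[y]}:
                    continue  # same-level subtasks must stay in parallel slices
                s = pair_sim(slices[x], slices[y])
                if s is not None and (best is None or s > best[0]):
                    best = (s, x, y)
        if best is None:
            break  # no feasible merge remains
        _, x, y = best
        slices[x] = slices[x] | slices[y]
        del slices[y]
    return [set(s) for s in slices]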
4.2. Task Slice Choreography Method Based on the Longest Overlapping Path

Next, we need to choreograph these subtasks for offloading. Our goal is to accomplish sequential offloading of task workflow and shorten the task wait delay for executing while offloading. In this paper, we propose a subtask choreography method based on the longest overlapping path by analyzing the logical relationship and execution constraints among subtasks. The definitions are given below.

Definition 11. The computation time of a task slice (ct(TLi)). It refers to the sum of the execution delays of all subtasks in the task slice, that is, ct(TLi) = Σ tj∈TLi ctj.

Each task slice must wait until all the task slices on which it depends have completed before it begins to execute, so its earliest start time is determined by the longest path from the initial task slice to it. Therefore, we first carry out a static sort of all task slices. Here, we give an example of the task slice hierarchical relationship, as shown in Figure 3.
The static sorting rules are briefly described as follows. The node in the first layer is the first task slice of the task workflow, its subsequent task slices are in the second layer, and so on. In general, when the task slices at layer i complete, the task slices at layer i + 1 can start executing. But this rule is not strict: a task slice may depend on only some of the slices in the layer above, so it can be executed as soon as those slices complete, regardless of the others. If the computation delay of such a slice is significantly higher than that of its siblings, then the order in which the slices are offloaded will affect the overall delay of the task workflow; obviously, in this case, the slice on the longer path should be offloaded first. In addition, task slices deployed on different edge servers can be executed in parallel, and a slice may have more subsequent task slices than another slice at the same layer, such as TL6 in Figure 3. When the execution delay of the former is not greater than that of the latter, we should offload the former first, so that it is executed earlier. The offloading order of task slices thus has an important effect on the computation time of the task slices on the edge servers. Below we define the earliest execution start time of a task slice.

Definition 12. The earliest start time of a task slice (est(TLi)). Each task slice must wait until all the task slices on which it depends have been executed before it executes. The earliest start time of task slice TLi is determined by the longest path from the initial task slice to this task slice, and the calculation formula is

est(TLi) = max over TLj ∈ Pre(TLi) of {est(TLj) + ct(TLj) + tt(TLj, TLi)},

that is, the longest path to task slice TLi is composed of the longest paths of its presequence task slices, their computation times, and the transmission delays between them and TLi.
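A sketch of the recursion in Definition 12; pre is assumed to be a callable returning the presequence slices of a given slice, and tt the inter-slice transmission delay.

def est(i, pre, ct, tt, memo=None):
    """Earliest start time of slice i: the longest chain of predecessor
    earliest-start plus computation plus transmission delays (Definition 12).
    Source slices (no predecessors) can start at time 0."""
    if memo is None:
        memo = {}
    if i not in memo:
        memo[i] = max((est(j, pre, ct, tt, memo) + ct[j] + tt(j, i)
                       for j in pre(i)), default=0.0)
    return memo[i]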
As shown in Figure 4, the task slices are executed in a fixed order according to the static sort. However, since the size of each subtask differs, the computation times also differ, and scheduling tasks in a purely statically sorted manner can cause too much delay in the execution of the overall task workflow. Therefore, we need to combine the static sort with each task slice's earliest start time to produce a comprehensive sorting result.
Next, we use the concept of task priority to represent the choreography scheme of task slice set . The priority of each task slice is determined by its latest offloading time. Let the priority of the last task slice in the task workflow be 0. The higher the priority of the task slice is, the earlier it should be offloaded. The related definitions are as follows.

Definition 13. Task slice priority (Prio(TLi)). It represents the time to offload the task slice, that is, the latest time at which the task slice can be offloaded. The priority is calculated backward from the last task slice as

Prio(TLi) = min over TLx ∈ Succ(TLi) of {Prio(TLx) − ct(TLi) − c(TLi, TLx)/B},

where Succ(TLi) represents the set of subsequent task slices of TLi that have direct data dependence with TLi, Prio(TLx) is the task priority of TLx, c(TLi, TLx) is the volume of data transferred from TLi to TLx, and B represents the channel bandwidth between edge servers.
Prio(TLi) in the optimal choreography is the optimal start time of the transmission corresponding to TLi; in order to ensure that each task slice has been offloaded before it is executed on the edge server, fto(TLi) ≤ ste(TLi) must hold. The task choreography method based on the overlapping longest path solves for the best offloading time of each task slice in T, so as to maximize the parallelization of the transmission and execution of subtasks and realize the optimization goal in equation (1).
As shown in Figure 5, the choreography of subtasks proceeds as follows: (1) Firstly, a static sort is done. For example, t1 is executed in the first sequence, t2 and t3 are executed in the second sequence, and subtasks t4, t5, t6, t7, and t8 are executed in the third sequence. (2) After that, all the subtasks are choreographed according to their earliest start times. The earliest start times of t7 and t8 are earlier than those of t4, t5, and t6, so, within the third execution sequence, t7 and t8 should be executed earlier than t4, t5, and t6. (3) Finally, the above two sequences are dynamically sorted according to the following rules: nodes with more children take precedence, and nodes whose children have high computation times take precedence. In Figure 5, t4 and t5 have more child nodes, and the computation delays of their child nodes are longer, so t4 and t5 take priority over t6, t7, and t8.
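The three-step ordering above can be condensed into a single sort key, as in the hedged sketch below; the tuple ordering (static layer, earliest start time, child count, child computation time) paraphrases the stated rules and is not the paper's exact procedure.

def choreograph_order(slices, layer, est, children, ct):
    """Order task slices by: static layer first, then earliest start time,
    then prefer slices with more children and with heavier children."""
    def key(TL):
        kids = children(TL)
        return (layer[TL], est[TL], -len(kids), -sum(ct[c] for c in kids))
    return sorted(slices, key=key)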
We design a heuristic algorithm to solve the subtask choreography scheme (Algorithm 2).
The input of Algorithm 2 is the output of Algorithm 1. It uses dynamic iterative optimization to solve for the optimal choreography scheme Ord(S), namely, the offloading sequence and timing of each task slice in S. In this algorithm, lines 6–12 calculate the various time delays during offloading, the computation time of each task slice, and the computation time constraints of each task slice according to equations (19)–(21). In lines 13–21, a heuristic optimization algorithm is used to find the task choreographing scheme that minimizes the overall computation delay of the task workflow under the execution constraints. The combination of Algorithms 1 and 2 achieves the optimal task slicing and choreographing scheme for a given number of edge servers. When the resources of the MEC system are sufficient, we can traverse SM from 1 to k (the number of subtasks in T) to find the optimal number of task slices, that is, the optimal number of edge servers for the distributed deployment of the whole task workflow.
The computational complexity of an algorithm is determined by the number of basic operations for an input of size N. The SSCS algorithm proposed in this paper is a heuristic algorithm based on discrete optimization. It consists of generating the initial solutions, generating neighborhoods, and judging and removing infeasible task scheduling lists. Since the generation of the initial solutions is constrained by the earliest start time of each task, the computational complexity of this operation is determined by the hierarchical clustering result of the task workflow, that is, O(N log N). Similarly, the computational complexity of generating neighborhoods is determined by the maximum number of parallel tasks at each level and does not exceed that bound. The computational complexity of judging and removing infeasible task scheduling lists is O(L), where L is the length of the infeasible task list. So the overall complexity of the SSCS algorithm is O(Max_Gen · N log N), where Max_Gen is the maximum number of iterations.

(1)function [Ord(S)] = Choreography (G, S, En)
(2)Input: G//Task workflow
(3)  S//Task slice scheme
(4)  En//Environmental parameters, including channel bandwidth, server processing capacity, etc.
(5)output: Ord(S)
(6)CreTree(G, S)//Build the task slice level tree
(7)Init(En)//Initialize the offloading environment
(8)for i = 1:|S|
(9)T(i) = ExeTime(TLi)//Calculate the earliest start time of each task slice
(10)KP = LogicP(S, T)//Find the longest path for the task slices to execute
(11)NKP = DelP(S, KP)//Get the task slices not on the longest path
(12)Cons = priority(S, ExeTime)//Obtain the scheme constraints
(13)Sord = rand(Popsize, Cons, S)//Initialize the scheme population of size Popsize
(14)while (k ≤ maxnum)
(15)for i = 1: Popsize
(16)  F(i) = fitness(G, Sord)//Set the optimization target of the heuristic algorithm
(17)[globalMinT, ordi] = min(F)
(18)for i = 1: Popsize
(19)  Sord = IteV(Cons, KP, NKP)//Optimize the population of the choreography scheme under the constraints
(20)[BestMinT, bestord] = min(globalMinT, ordi)
(21)Ord(S) = BestMinT
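For reference, a minimal random-search analogue of the optimization loop in lines 13–21 of Algorithm 2 follows; the fitness and neighbour callables, population size, and iteration count are assumptions made for illustration.

import random

def heuristic_choreography(S, fitness, neighbour, popsize=20, max_gen=100):
    """Keep a population of candidate offloading orders, score each with
    the workflow-delay fitness, and iteratively perturb the population,
    remembering the best order found (cf. lines 13-21 of Algorithm 2)."""
    population = [random.sample(S, len(S)) for _ in range(popsize)]
    best_order, best_delay = None, float("inf")
    for _ in range(max_gen):
        for order in population:
            delay = fitness(order)
            if delay < best_delay:
                best_order, best_delay = list(order), delay
        population = [neighbour(order) for order in population]
    return best_order, best_delay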

5. Experiment and Analysis

In order to verify the effectiveness of the task slicing and choreographing method proposed in this paper, we set up two groups of simulation experiments. In the experiments, we first generated complex task workflows of mobile devices covering the three types of data dependencies and assigned corresponding parameters to each subtask and to the MEC environment. The setting ranges of the main parameters are shown in Table 1.

5.1. Verification Experiments of Task Slicing Method

Firstly, in order to demonstrate the advantages of our task slicing method, we use the HPD (Hierarchical Process Decentralization) and HIPD (Hierarchical Intelligent Process Decentralization) algorithms from papers [36, 37] to generate subtask slices and then conduct comparative experiments against our algorithm. The HPD algorithm applies breadth-first search/traversal to find the most relevant, closely related, and parallel activities in the workflow view and then encapsulates the closely related activities in the same broker to reduce the need for interbroker messaging. The HIPD algorithm combines HPD with a frequent-path mining algorithm. In this experiment, the task workflow to be offloaded by the mobile device and the data dependency relationships between subtasks are shown in Figure 5. We use these two methods to generate task slices of different granularities and compare the results with the slicing results of our algorithm. The partitioning results of the three slicing methods for the task workflow are shown in Table 2.

When the task is offloaded to the servers, this group of experiments compares the results of the different task slicing methods from three aspects: the overall computation delay of the task workflow, the load on the edge servers during task execution, and the idle time of the edge servers. In order to better demonstrate the stability and applicability of our algorithm, we assigned two load schemes under different conditions to the subtasks in Figure 5. In the first case, all the subtasks are executed exactly once; the experimental comparison results are shown in Figures 6(a)–6(c). In the second case, the subtasks are executed many times, and some subtasks may not be executed at all; the results are shown in Figures 6(d)–6(f). Figures 6(a)–6(c) show, for the different slicing results in Table 2, the overall workflow computation delay, the average data transfer load between edge servers, and the average idle time of the servers during execution. It can be seen from these three figures that, compared with the other methods, the SSCS method achieves the shortest total delay of the task workflow, a lower average data transmission volume between subtasks introduced by distributed deployment, and the highest server utilization rate. The task slicing schemes obtained by the HPD and HIPD methods are effective and shorten the overall delay of the task workflow, but they usually introduce a large amount of data transmission; at the same time, they are not superior to our method in terms of server utilization.

For example, the traffic volume introduced by the HPD1, HIPD0, and HIPD1 schemes is much higher than that of our method. The goal of the HPD2 and HIPD2 schemes is to slice tasks more evenly across complex workflows. Although in the long run they can balance the load among the different edge servers, in most cases the total task delay they produce is higher than that of our method. Figures 6(d)–6(f) show that the SSCS method is applicable to task loads of different sizes. In this case, slice deployment using the SSCS method achieves the lowest task computation delay while retaining good load balance and server utilization. This lays a good foundation for the choreography of subtasks.

5.2. Verification Experiments of Choreography Method

In this group of experiments, we verify the effect of our choreography method in terms of the overall feedback delay of the task workflow, as well as the number of subtasks completed on the edge servers and their completion times in each time period. We choose two common task offloading sorting methods for comparison. The first orders subtasks according to the hierarchical structure tree of the task workflow: lower-level subtasks are offloaded first, and subtasks at the same level are randomly ordered; we call this random sorting based on hierarchy (RS-HIE). The second follows the same idea, but subtasks at the same level are ordered according to the size of their load; we call this load prioritization based on hierarchy (LP-HIE). All three offloading sequencing methods operate on the task slice results of our method. The experimental results are shown in Figure 7. Figure 7(a) shows the offloading orders of the subtasks produced by the three methods; it can be seen that the orders generated by the different methods differ greatly. Figures 7(b) and 7(c) show the impact of the different offloading sequences on the completion times of the subtasks on the edge servers: Figure 7(b) shows the completion time of each subtask in the task workflow, and Figure 7(c) shows the number of subtasks completed in each period. As can be seen, the SSCS method offloads the most subtasks per unit time and has the shortest total computation time. When solving the choreography scheme, it gives full consideration to the key time points affecting the task workflow, reasonably arranges the offloading time of each subtask, and avoids as much as possible the delay caused by waiting for subtasks to be offloaded. Critical tasks that affect the overall latency of the workflow are offloaded first.

6. Conclusion

To resolve the offloading problem of complex tasks in the MEC system, we study the distributed deployment strategy and efficient offloading method of the task workflow. Considering the data distribution and logical relationships between subtasks in the process of task offloading and execution, a slicing method for task distribution and deployment is proposed based on the similarity between subtasks, and a choreographing method for the task offloading sequence is proposed based on the overlapping longest path. Finally, aiming to minimize the overall delay of the task workflow, a heuristic algorithm is designed to find an approximately optimal slicing and choreographing solution. Simulation experiments compare our method with other commonly used slicing and choreography methods and demonstrate its effectiveness and advantages in offloading complex tasks from various angles.

Data Availability

The data involved in this paper include the migration algorithm code and the simulation data (generated by the simulation algorithm). For copyright protection purposes, data access currently requires contacting the institutional authors. If the data are needed, please e-mail the authors at [email protected].

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grants 62162047 and 62162046; Natural Science Foundation of Inner Mongolia under Grants 2019ZD15 and 2019MS06029; Inner Mongolia Science and Technology Plan Project under Grants 2021GG0155 and 2019GG372; the Self-topic of Engineering Research Center of Ecological Big Data, Ministry of Education; the Open-topic of Inner Mongolia Big Data Laboratory for Discipline Inspection and Supervision (IMDBD2020012 and IMDBD2021014); Inner Mongolia Engineering Laboratory for Cloud Computing and Service Software; Inner Mongolia Key Laboratory of Social Computing and Data Processing; and Inner Mongolia Engineering Lab of Big Data Analysis Technology.