Abstract

In the era of big data, mining and analysis of the enormous amount of data has been widely used to support decision-making. This complex process including huge-volume data collecting, storage, transmission, and analysis could be modeled as workflow. Meanwhile, cloud environment provides sufficient computing and storage resources for big data management and analytics. Due to the clouds providing the pay-as-you-go pricing scheme, executing a workflow in clouds should pay for the provisioned resources. Thus, cost-effective resource provisioning for workflow in clouds is still a critical challenge. Also, the responses of the complex data management process are usually required to be real-time. Therefore, deadline is the most crucial constraint for workflow execution. In order to address the challenge of cost-effective resource provisioning while meeting the real-time requirements of workflow execution, a resource provisioning strategy based on dynamic programming is proposed to achieve cost-effectiveness of workflow execution in clouds and a critical-path based workflow partition algorithm is presented to guarantee that the workflow can be completed before deadline. Our approach is evaluated by simulation experiments with real-time workflows of different sizes and different structures. The results demonstrate that our algorithm outperforms the existing classical algorithms.

1. Introduction

Nowadays, the big data technology has been used in a wide range of applications including complex systems to support decision-making [1, 2]. Along with the enormous commercial benefits, scientific advances, management efficiency, and analytical accuracy brought by big data, this new technology raises many challenging problems such as high cost and latency of big data storage, transmission, and processing [35]. To tackle these problems, cloud computing environment and workflow modeling methods are recognized as the effective way.

Many large-scale scientific applications in areas such as astronomy, bioinformatics, and meteorology would generate and process large amounts of data; such applications consist of a large number of data processing tasks that are frequently modeled as workflows [6]. Normally, a workflow is represented by a Directed Acyclic Graph (DAG) with nodes and edges, where nodes represent tasks and edges represent data/control dependencies between tasks in a complex application system. Meanwhile, once a time-critical or real-time application system [7, 8] is modeled as a real-time workflow, the deadline constraint can be used to ensure the tasks complete on time effectively.

Cloud computing is being investigated as an effective platform that delivers hardware infrastructure and software applications as services for the tasks of big data management and analytics. And real-time workflows can also be executed in such high performance computing environment. There are various cloud providers offering large amount of services with different quality of service (QoS) [911] as well as different prices. A lot of efforts have been made in the area of service recommendation [12, 13] based on QoS from the perspective of service providers [1416], but few concerns have focused on how to select the services in cloud platform based on QoS from the perspective of users. Particularly, cloud computing provides a flexible pricing model (i.e., pay-as-you-go and on-demands services); users are charged based on their consumption of various resources with different QoS. Therefore, one of the most challenging problems with real-time workflows in cloud computing is to get a cost-effective way to complete the workflow within the deadline [17].

In reality, there are two main stages when executing a workflow in a cloud computing environment [18]. The first one is the resource provisioning stage: during this phase, computing resources from the clouds will be selected and reserved to prepare for the workflow’s execution. The second one is the task scheduling stage: during this phase, a schedule is generated and each task will be mapped into the best-suited resource. That is, the second stage determines where and when each task of a workflow will be executed, while the first stage decides what types and how many resources will be leased from the cloud service providers and therefore the total cost of the workflow is mainly decided at this stage. To address these distinctions between task scheduling and resource provisioning, we propose in this paper a novel cost optimization algorithm that focuses only on resource provisioning.

In order to satisfy the deadline of real-time workflow, our proposal partitions the original workflow into some sub-workflows which can be executed in parallel based on the critical path methodology. Then we use the dynamic programming knapsack algorithm to provision resources in clouds for these sub-workflows to minimize the total cost. Our approach is evaluated by simulation experiments with real-time workflows of different sizes and different structures. The results demonstrate that our algorithm outperforms the existing classical algorithms. Major contributions of this paper are stated as follows:(i)A global resource provisioning for real-time workflow strategy is addressed for cost optimization under deadline-constrained.(ii)A workflow partition algorithm based on critical path technique is proposed to get some sub-workflows being executed in parallel to ensure the deadline constraint.(iii)A dynamic programming algorithm is used to provision the most cost-effective resources to the sub-workflows.(iv)We perform extensive simulations and show the efficacy of our strategy over two existing algorithms, namely, simply DPK and IC-PCP.

The rest of the paper is organized as follows. Section 2 introduces related work followed by problem specification in Section 3. Section 4 explains the proposed algorithm while Section 5 presents the evaluation of the algorithm performance. Finally, conclusions and future work are summarized in Section 6.

Cost optimization for workflow execution has been widely studied over the years. Efficient resource utilization in parallel and distributed computing environment is a key issue for cost-effectiveness. To resolve this issue, numerous researches have been done by a variety of workflow scheduling methods in clouds. Accordingly, a significant number of real-time workflow scheduling algorithms focusing on reducing the overall execution cost of real-time workflow have been proposed. Alkhanak et al. [19] classified the cost optimization method of workflow in cloud computing environment into two categories: heuristic methods and meta-heuristic methods.

Abrishami et al. [17] presented a static algorithm IC-PCP (IaaS Clouds-Partial Critical Path) based on the heuristic for scheduling a single workflow instance on an IaaS cloud. This algorithm considers cloud features such as VM heterogeneity, pay-as-you-go, and time interval pricing. They try to minimize the execution cost of scheduling all tasks in a partial critical path on a single machine that can finish the tasks before their latest finishing time (which is calculated based on the application’s deadline and the fastest available instance). In [20], Verma and Kaushal proposed a greedy algorithm based on the classical HEFT for providing a suitable trade-off between execution cost and time. Zheng et al. [21] presented three novel scheduling heuristic algorithms to help users to schedule their big data processing workflow application on clouds so that the cost can be minimized and the deadline constraints can be satisfied; different configurations of CPU frequency were considered in their work. Meta-heuristic approaches such as Genetic Algorithm (GA) based [22], Ant Colony Optimization (ACO) based [23], Particle Swarm Optimization (PSO) based [24], and symbiotic organism search algorithms were used to address the same objectives that minimize the cost of workflow execution while considering the deadline. Wu et al. [25] proposed a meta-heuristic algorithm L-ACO as well as a simple heuristic ProLiS to minimize execution cost of a workflow in clouds under a deadline constraint; experimental results show that the meta-heuristic L-ACO performs better in terms of execution costs and success ratios of meeting deadlines but the heuristic ProLiS is more efficient. That is to say, the meta-heuristic based algorithm has achieved improved performance on cost optimization but with some compromise on the execution time.

Task scheduling assigns tasks to the most cost-efficient resource to optimize the execution cost of a workflow. But task scheduling is a well-known NP-hard problem. There is no scheduling algorithm that can obtain an optimal solution in polynomial time [17]. Otherwise, to tackle deadline-constrained workflow scheduling, deadline-distribution method is the most widely used method by both meta-heuristic based and heuristic based algorithms. The deadline-distribution method always distributes the deadline to each task in proportion to its minimum execution time [25]. In most case, this kind of methods ignores the tasks of workflow that can be executed in parallel, so the sub-deadline of each single task may be not appropriate. According to these sub-deadlines, task scheduling algorithms cannot get a global optimal result, because a task level optimization was used and hence failed to utilize the whole workflow’s structure and characteristics. Resource provisioning can get a minimum cost by selecting an optimal assembly of resources for the global workflow execution. Our proposal minimized the total cost of workflow execution from the perspective of resource provisioning to the whole workflow (or sub-workflows) without concerning each single task’s resource mapping, which can further improve resource utilization.

With respect to cost optimization, resource provisioning is more efficient and effective than task scheduling for a real-time workflow in cloud computing environment [26]. Many researches have been done to minimize the overall cost of workflow by resource provisioning [27, 28]. Both static and dynamic methods are used to provision the cloud resources [29]. Static method assumes that accurate information about workflow and cloud resource performance can be obtained before scheduling. Dynamic provisioning adopts no such assumption. Most scientists run the same workflows, thereby enabling the collection of such information through several trial runs. Therefore static method is an appropriate method for workflows [30]. Static scheduling is also eased by the fact that cloud providers usually declare the performance specifications of their resources. We use static method for resource provisioning in this paper.

3. Problem Specification

In order to facilitate the reading of the paper, all of the symbols used in this section are listed in Table 1.

3.1. Related Model
3.1.1. Workflow Model

A real-time workflow refers to using the workflow technology for modeling of a real-time application system. That is, temporal constraints are added to the original workflow model using DAG graphs.

Definition 1. A workflow application is represented by a Directed Acyclic Graph (DAG)  = (T, W, E), where is a set of n tasks, is the computational workload of each task in T, and is the set of directed edges between two tasks. An edge eij of the form (ti, tj) denotes that there is data dependency between ti and tj; ti is said to be the parent task of tj and tj is said to be the child task of ti. This relationship between ti and can be represented by ti = Parent(tj) and tj = Child(ti). The computational workload in is given by the number of mega-floating point operations that need to be executed. The temporal constraint for a real-time workflow is the deadline which is denoted by D.Based on this definition, a child task cannot be executed until all of its parent tasks are completed. Figure 1 shows an example workflow with six tasks. tentry and texit are additional dummy nodes with computational workload 0 to give the whole workflow a single entry and a single exit. The task t1 must be finished before tasks t3 and t4 can start. Also, both t3 and t4 must be finished before t5 can start. However, there is no path between t3 and t4. So t3 and t4 do not need to be executed in any particular sequence. Therefore, the tasks {t1, t3, t4, t5} can be executed as t1 ⟶ t3(t4) ⟶ t5. This means that the workflow segment t1, t3, t5 or t1, t4, t5 must be executed sequentially, but t3, t4 can be executed in parallel. There are two classic types of workflow processes in a large real-time workflow as shown in Figure 2. One is sequential workflow process Figure 2(a) in which all tasks must be executed one after the other in a certain order. The other is parallel workflow process Figure 2(b) in which all tasks can be executed simultaneously without any particular order.

3.1.2. Resource Model

This paper focuses on Infrastructure as a Service (IaaS) clouds which offer the user a virtual machine (VM) pool of unlimited and heterogeneous resources with different computational performance, memory capacity, and price that can be accessed on demand such as Amazon’s EC2 (Elastic Compute Cloud) [31]. Better computational performance or more memory implies higher price. We assume that there is no limitation on using each resource; i.e., the workflow can order any number of resources from each cloud provider at any time. Based on the profiling results about workflows given in [1] and the VM types offered by Amazon EC2, we assume that the VMs have sufficient memory to execute the workflow tasks. So, memory capacity of VMs will no longer be considered in this paper. We define as a VM type in terms of its computational performance Pj and PRICEj per billing cycle of τ time units.

For each of a certain type, we assume that the computational performance in terms of mega-floating point operations per second (MFLOPS) either is available from the provider or can be estimated [32]. This information is used to deduce the computational performance by Pj = MFLOPSj ∗ τ, where τ denotes the billing cycle of which is specified by the cloud provider. User’s payment for the usage of each is based on the billing cycle τ. Any partial utilization of the leased VM is charged as if the full billing cycle was consumed. For example, if τ = 60 seconds and a VM is used for 61 seconds, then the user will pay for two cycles of 60 seconds, that is, 120 seconds. Also, we assume that there is no limitation on the number of billing cycles of VMs that can be leased from the provider.

Definition 2. Assume there are m types of VMs that can be provided by cloud providers; the collection of available VMs will be denoted as . Each VMj has two parameters; one is MFLOPSj and the other is PRICEj. MFLOPSj is defined as mega-floating point operations per second of the VMj, and PRICEj is the price per τ seconds of the VMj, where τ is the billing cycle of the VMj. Pj = MFLOPSj × τ is the computational performance of VMj per billing cycle.

3.1.3. Cost Model

Normally, the final cost is based not only on the utilization of computational resources, but also on the data transfer between the parent task and its child task. If both parent and child tasks are in the same VM, there is no data transfer fee because the tasks share the same data center in a single VM. When the parent task and the child task are executed on different VMs that belong to the same cloud provider, there will not be any data transfer fee because most of the public cloud providers, such as Amazon EC2, do not charge for internal data transfers among their computational resources. So, when we use these public cloud computing resources, we can ignore the data transfer fee at this moment. For the rest of this paper, we will assume that we utilize computing resources from one cloud provider only.

The cost of using the computational resources is calculated from the resource occupation time multiplied by the price of the resource. The resource occupation time, denoted by RT, concerns not only the tasks execution time (ET), but also the data transfer time (TT) and the initial boot time (BT) of each resource provisioned by the workflow. Thus, RT = BT + TT + ET. However, as the development of Docker technology, the initial boot speed of virtual resources is of the order of seconds or milliseconds. It is so small compared with the execution time of a workflow that it can be ignored. That is, we can assume that BT = 0. When the data transfer occurs between a parent task ti and its child task tj, the transfer time depends only on the amount of data to be transfered and the bandwidth provided by the cloud provider. Both of these values are fixed, regardless of the type of VMs that is provisioned by the workflow. Therefore, the value of data transfer time of each task ti can be denoted by TTi, which is VM-independent. For a VMj, the resource occupation time can be denoted by , where ETij is the execution time of task ti (1 ≤ i ≤ n) on VMj and TTi is the data transfer time of ti which is VM-independent. Each task ti (1 ≤ i ≤ n) may or may not be executed by VMj (1 ≤ i ≤ m). Let aij be a 0-1 variable that indicate whether ti is executed on VMj; when ti is executed by VMj, aij = 1, otherwise, aij = 0. So the cost of each VMj can be denoted by . In order to optimize the total cost, we need to minimize each Cj. From the formula of Cj, we see that ETij and aij are the two variables that depend on both ti and VMj, while TTi depends only on ti but not VMj. Thus, TTi can be regarded as a constant from the cost optimization point of view. For the rest of the paper, we can ignore TTi.

Actually, a task does not have to be executed entirely by one VM. It can execute sequentially on several VMs as long as the execution coincides with a billing cycle. For example, suppose ti has executed on a VM for one billing cycle and it has not yet finished; then it has the option of staying with the same VM in the next billing cycle or switching to a different VM. Let ETi denote the execution time of task ti. Then, ti will be charged billing cycles, where the ceiling of ETi is due to the last billing cycle of VMj (assuming that VMj is the last VM provisioned by ti). Therefore, the cost of resource provisioning to task ti can be denoted as Ci = ⌈ETi⌉ × PRICE, where PRICE is the price vector . So, . The total cost of a workflow can be defined as follows:

Definition 3. Assume there is a workflow with n tasks as depicted in Definition 1, while there are m as defined in Definition 2 that can be provisioned to workflow . The total cost of resource provisioning to the workflow is , where , xij is the number of billing cycles of VMj which provision to task ti. So, . Letting , we have .
Note that, in the above definition, we have assumed that no two tasks are combined together. For example, suppose there are two tasks t1 and t2 each requiring execution of 1.5 billing cycles. If we combine the two tasks together, it only requires 3.0 billing cycles. However, in the above definition, they will require a total of 4 billing cycles, two billing cycles for each task. We will discuss how to combine two tasks together in later sections.

3.2. Cost-Effective Resource Provisioning
3.2.1. Optimization Objective

Resource provisioning in cloud computing environment may have different objectives. This work focuses on finding a provisioning strategy to execute a workflow on IaaS clouds such that the total execution cost is minimized and the deadline is met. We assume that there are m types of VMs; the computational performance and price per billing cycle of each are defined as in Definition 2. According to Definition 3, the objective of cost optimization is to find a resource provisioning to satisfy the following:where is the sum of all tasks’ workload, xj is the number of billing cycles used by VMj for all the tasks, D is the deadline of the workflow as defined in Definition 1, and is the total execution time of the workflow . It is generally known that is not only affected by each task’s execution time but also related to the execution sequence for all tasks in a workflow. That is, the execution time of a sequencial workflow as shown in Figure 2(a) will be the sum of each task’s execution time while a parallel workflow as shown in Figure 2(b) will take the longest running task’s run time as its execution time. So the execution time of a workflow is defined as follows:

Definition 4. Assume there is a workflow with n tasks as depicted in Definition 1. If the workflow is a sequential process, the execution time can be expressed as . If the workflow is a parallel process, the execution time can be expressed as . ETi in both formulas denote the execution time of each task in the workflow. So, if the workflow contains the same tasks the execution time of parallel workflow must be less than sequential workflow.
Shown in Table 2 is an example of three types of VMs with computational performance Pj and price PRICEj per billing cycle. In this example, we assume the billing cycle is one hour.
When we provision these three types of VMs to the tasks in the sample workflow shown in Figure 1, the cost of each task executing on each VM is shown in Table 3. As can be seen from this table, when , provisioning VM3 would take 1 hour and is charged 0.1, while provisioning VM1 or VM2 would take less than 1 hour but is charged 0.8 or 0.4. So, the optimal provisioning for t1 is one VM3 and with a charge of 0.1. In Table 3, the optimal single VM provisioning of each task is given in bold. Notice that for t1, t2, t3, and t5, the optimal provisioning for each of them is to use one type of VM. For t4, the optimal provisioning is to use one VM1 and one VM2 for a total charge of 1.2. For t6, the optimal provisioning is to use one VM2 and one VM3 for a total charge of 0.5.
The total workload of all the tasks in the workflow shown in Figure 1 is 146. The optimal provisioning is to use five VM1 for a total charge of 4.0. Note that this is less than the sum of each task’s optimal provisioning which is 4.2. This is because each individual optimal provisioning might have idle time. For example, for t5, we use one VM1 which can execute 30 MFLOPS in one hour, but t5 has a workload of only 26 MFLOPS. So, 4 MFLOPS are wasted. The columns CSW and NSW will be explained later.

3.2.2. Cost Optimization Stragety

is assumed to be the computational workload that need to be provisioned. The decision variable xk is the number of billing cycles of VMk that are leased from the IaaS provider. Let the VMs be sorted in nonincreasing order of performance versus price; i.e., P1/PRICE1 ≥ P2/PRICE2 ≥…≥ Pn/PRICEn. The VMs are considered in this order. That is, VM1 is considered first, followed by VM2, and so on. The optimal value function when restricted to using the first k VMs can be defined by the following equation.

The optimal value function is given by . To solve equation (2) we use dynamic programming by considering VM1 first. If there is some idle time left in using VM1 (because it does not use a full billing cycle), VM2 will then be considered for using the residual time left by VM1. If VM2 again has idle time left, VM3 will be considered next. This process is repeated until we reach VMm. The following is the dynamic programming algorithm to solve equation (2).

The final is the minimum cost of provisioning all m types of VMs to achieve the computational workload .

VMs provisioning of individual task could cause idle time unless the computational workload of the task matches exactly the computational performance of the VMs provisioned for it. But the probability of this matching is very low. If we assemble some tasks together to provision VMs, the efficiency of resource utilization can be significantly increased. As shown in Table 3, the optimal cost for and is 0.1 and 0.8, respectively. However, if we combine and together, the total workload is exactly 30. Therefore, the optimal result is to use one VM1 that will cost 0.8. Such cost is lower than the total cost of provisioning VMs to and separately. As a result, the more tasks we can assemble as a whole for provisioning, the more we can save money in leasing resources. In order to minimize the total cost of a workflow execution, it would be better to provision all tasks in the workflow as a whole. If we assemble all tasks as a single task, it would require that all tasks be executed sequentially. The execution time may exceed the temporal constraint of the workflow. According to Definition 4, some tasks should be partitioned from this sequential process while other tasks can be executed in parallel. In order to reduce the total execution time to be satisfied with the temporal constraint, some strategy of workflow partition will be discussed in the next section.

3.2.3. Deadline Assurance Method

When is greater than the temporal constraint of the workflow , the workflow cannot be finished before the deadline. In real life, meeting deadline constraint is very important because many real-time workflows are Urgent Computing [33] to support emergency computations such as severe weather prediction for hurricanes, flooding, earthquake, etc. Therefore, temporal constraint is a very important dimension for workflow quality of service (QoS). The temporal constraint of a workflow execution must be considered when we optimize the cost. To guarantee the workflow can be finished before the deadline, the workflow can be partitioned into several sub-workflows, where each sub-workflow can be executed in parallel and all tasks in one sub-workflow should be executed sequentially without exceeding the temporal constraint. Each sub-workflow executing on time will ensure the whole workflow to be finished before its deadline. So a series of temporal constraints must be defined for every sub-workflow.

The critical path of a workflow is the longest execution path between the entry and exit tasks of the workflow [34]. All the tasks that belong to the critical path are called critical tasks. The sum of the computational workload of the critical tasks is maximum compared with any other path in the workflow. The critical tasks are executed sequentially because of the parent-child relationship between them. The execution time of the critical path by the best performing VM must not exceed the temporal constraint D. Otherwise, there is no solution that can meet the deadline constraint. In the rest of this section, we assume that the execution time of the critical path does not exceed D.

Suppose the workflow is partitioned into a number of sub-workflows. One of the sub-workflows contains all of the critical tasks and some other (possibly none) noncritical tasks. This sub-workflow will be called the critical sub-workflow (CSW). The sub-workflows that contain no critical tasks are called noncritical sub-workflow (NSW). There can only be one CSW but r (r ≥ 0) NSWs.

Suppose there is a CSW . If we provision the best performing VM to with a resulting total execution time and if  ≤ D, then there will be a successful provisioning that can guarantee the workflow completed by the deadline, because the other NSWs can be executed with the CSW in parallel. On the other hand, if , it is impossible to get a successful provisioning of VMs to the CSW because of the violation of deadline. In this case, we have to partition this CSW into a new NSW and a new CSW which includes fewer tasks. The new CSW includes all critical tasks but fewer noncritical tasks. This procedure can be repeated until the CSW can be completed before the deadline. Also, any other NSWs’ execution time must be satisfied with the temporal constraints gained from the layered approach.

As a workflow, there must be a structure that determines the execution order of each task in the workflow. Actually, all parent tasks should be finished before their child tasks can begin. We can use a layered approach to ensure such a sequence. A workflow can be divided into several levels (layers) that would be executed sequentially. At the same time tasks within one level do not depend on each other, so they can execute in parallel.

Definition 5. Assume there is a workflow as defined in Definition 1. Define the exit task to be Layer 0. Layer d is composed of all tasks that have a longest path with d edges to Layer 0 in the workflow . So, Layer d is defined as , whereFor example, the workflow shown in Figure 1 can be divided into five levels as shown in Figure 3. As long as executing tasks from the high level to the low level, all parent tasks will finish before their child tasks start. There is no execution order of the tasks in the same level. So, the sample workflow shown in Figure 3 can be arranged as a sequence: t1 ⟶ t2 ⟶ t3 ⟶ t4 ⟶ t5 ⟶ t6 or t2 ⟶ t1 ⟶ t4 ⟶ t3 ⟶ t6 ⟶ t5.
In order to guarantee that the deadline can be met, we must have , where is the critical sub-workflow. Also, the total execution time of all tasks in each layer of each NSW must not be more than the execution time of all the tasks in the same layer of CSW. Therefore, according to the execution time of CSW, we present the Layer Minimum Execution Time (LMET) as follows:

Definition 6. Assume a workflow as defined in Definition 1 is divided into l layers as BL(0), BL(1), …, BL(l − 1). If , where the number of tasks in layer d is s, then the Layer Minimum Execution Time of BL(d) is , where Pj is the best performance that IaaS can provide and is the workload of .
As for a NSW, if ET(BL(d)) ≤ LMET(d), where , then the temporal constraint of NSW can be satisfied. Otherwise, a partial NSW should be partitioned from the original NSW. This process is repeated until the inequality ET(BL(d)) ≤ LMET(d) can be met.
Take the sample workflow shown in Figure 1 for instance; we assume that the deadline is four hours. Using the best performing VM, VM1, shown in Table 2 to execute this workflow, the result (shown in the TW column of Table 3) shows that using one VM1 with five billing cycles will have  = 4.87 > D. So the workflow have to be partitioned into two sub-workflows. Based on the critical path t2 ⟶ t4 ⟶ t5, we can get a CSW and a NSW as shown in Figure 4. After this partition,  = 3.87 < D, while ET(t1) < LMET(3) = 1 and ET(t3) < LMET(2) = 1.47. So, the temporal constraint of NSW as shown in Figure 4 can be met. The cost of each sub-workflow is also shown in Table 3 (the CSW and NSW columns of Table 3). The total cost of both CSW and NSW is 4 which is equal to the optimal cost. So the workflow partition as shown in Figure 4 is a successful cost optimization scheme.

4. Cost-Effective Resource Provisioning Algorithm

4.1. Algorithm Design

The basic idea of the global resource provisioning algorithm is that, for a given workflow as shown in Definition 1, a dynamic programming method is used to optimize the cost of resource provisioning, and a critical path based workflow partition technique is used to guarantee the deadline requirement of the real-time workflow. Specifically, whether the deadline is met has to be determined by a Resource Provisioning in Parallel Algorithm before dynamic programming procedure is employed; once the deadline cannot be satisfied, a workflow partition procedure is employed to divide the workflow into some sub-workflow; each sub-workflow can meet its own temporal constraint; then these sub-workflows can apply the Dynamic Programming Knapsack Algorithm to get the most cost-effective resource provisioning scheme. All of these algorithms are based on a Workflow Layer Algorithm. Therefore, the whole process of our strategy is presented in detail as follows.

We first divide the workflow into l layers according to Definition 5. Second, using the dynamic programming algorithm given in the previous section, we provision m types of VMs to the workflow as though all tasks are executed in order from BL(l − 1) to BL(0). If , the provisioning fails. We then partition the workflow into a CSW and a NSW. While , we keep partitioning until and for each layer we have ET(BL(d)) ≤ LMET(d). Now we apply the dynamic programming algorithm to provision VMs to the CSW and all of the NSWs. obtained by this algorithm is the optimal resource provisioning results, and the minimum cost is . As this algorithm can achieve the global optimization of cost, it is denoted as Global Resource Provisioning for Real-time Workflow Algorithm which is shown in Algorithm 1.

Input: 1. A workflow . 2. A collection of virtual machine types . 3. A fixed deadline .
Output: 1. Optimized Provision X. 2. Minimum Cost TC.
(1) Initial ;
(2) Add t0 ⟵ tentry, tn+1 ⟵ texit to ;
(3) WorkflowLayer ();
(4)X = ParallelProvision(, VMs, D);
(5);
(6) Output X, TC

The called procedure WorkflowLayer() in GRP4RW is a algorithm to layer the workflow according to Definition 5. The pseudocode of WorkflowLayer algorithm is shown in Algorithm 2.

(1)procedure (WorkflowLayer())
(2)for i = 0 to (n + 1) do
(3)  if i = 0 then
(4)   BD(ti) ⟵ 0;
(5)  else
(6)   ;
(7)  end if
(8)end for
(9)l ⟵ BD(tn+1);
(10)for d = 0 to l do
(11)  ;
(12)  Sort ti in non-increasing order of (ti ∈ Bl(d));
(13)end for
(14)end procedure

First, (Lines 2–8) calculate the longest path of each task ti to texit as BD(ti). Next, (Lines 9–13) assign the tasks with the same value of BD(ti) to the same layer. Then, sort the tasks in the same layer in nonincreasing order of their workload (Line 12). This has the effect of partitioning the larger task to a sub-workflow to minimize the number of sub-workflows. In this way the cost optimization will be more efficient. Using this algorithm, the workflow can be divided into l layers, where each layer contains all the tasks with the same length to texit.

The next called procedure ParallelProvision( VMs, D) in GRP4RW is an algorithm to realize parallel provisioning of resource to multiple sub-workflows in order to guarantee the cost-effectiveness of each sub-workflow on the premise of meeting deadline. The pseudocode of Resource Provisioning in Parallel Algorithm is shown in Algorithm 3.

(1)procedure (ParallelProvision(, VMs, D))
(2);
(3)if DPK(TW, VMs, D) ≠ Failure then
(4)  X ⟵ DPK(TW, VMs, D);
(5)  return;
(6)else
(7)  PartitionPath ()
(8)  ParallelProvision(, VMs, D)
(9)  tF ⟵ the first task of ;
(10)  tL ⟵ the last task of ;
(11)  ;
(12)  ParallelProvision(, VMs, D2);
(13)end if
(14)end procedure

This algorithm first attempts the Dynamic Programming Knapsack Algorithm (DPK) which is shown in Algorithm 4 to provision VMs to the original workflow (Lines 2-3). If successful, it returns the optimized provision X (Lines 4-5). Otherwise, it calls Partition Workflow Algorithm (PartitionPath) which is shown in Algorithm 5 to partition the workflow into and (Line 7). It then recursively call ParallelProvision procedure for CSW and NSW separately (Lines 8–12), until all sub-workflows provision VMs successfully. As to , the temporal constraint is always the deadline. The temporal constraint for a can get from the Layer Minimum Execution Time based on Definition 6.

(1)procedure (DPK(TW, VMs, D))
(2)if D × P1 < TW then
(3)  return Failure;
(4)else
(5)  m ⟵ the number of VMs;
(6)  TW1 ⟵ TW;
(7)  x1 ⟵ ⌈TW1/P1⌉;
(8)  f1 ⟵ x1 × PRICE1;
(9)  for j = 2 to m do
(10)   TWj ⟵ TWj−1 − (xj−1 − 1) × Pj−1;
(11)   xj ⟵ ⌈TWj/Pj⌉;
(12)   fj ⟵ fj−1 − PRICEj−1 + xj × PRICEj;
(13)   if fj ≥ fj−1 or xj−1 − 1 + TWj/Pj > D then;
(14)    fj ⟵ fj−1;
(15)    xj ⟵ 0;
(16)   else
(17)    xj−1 ⟵ xj−1 − 1;
(18)   end if
(19)  end for
(20)  return X;
(21)end if
(22)end procedure
(1)procedure (PartitionPath())
(2) CP ⟵ Critical Path of ;
(3)for each ti ∈ G do
(4)  if ti ∈ CP then
(5)   ti.assigned ⟵ true;
(6)  else
(7)   ti.assigned ⟵ false;
(8)  end if
(9)end for
(10)k ⟵ BD(t) where t is the first task of CP;
(11)while all of ti ∈ BL(k) with ti.assigned = true do
(12)  k ⟵ k − 1;
(13)end while
(14)t ⟵ the first ti ∈ BL(k) where ti.assigned = false;
(15) ⟵ ;
(16) ⟵ Null;
(17)while t.assigned = false do
(18)   ⟵  + {t};
(19)   ⟵  − {t};
(20)  t.assigned ⟵ true;
(21)  t′ ⟵ t;
(22)  for each tp′ ∈ Parent(t′) do
(23)   if tp′.assigned = false then
(24)    t′ = tp′;
(25)     ⟵  + {t′};
(26)     ⟵  − {t′};
(27)    t′.assigned ⟵ true;
(28)   end if
(29)  end for
(30)  t ⟵ Child(t);
(31)end while
(32)return ,
(33)end procedure

This Dynamic Programming Knapsack Algorithm is based on the idea of dynamic programming elaborated in equation (3). Primarily, this algorithm will judge whether the workflow can be completed within the fixed temporal constraint D (Line 2). If using the VM with the best computational performance to execute the workflow will not satisfy the deadline constraint, the procedure will return failure (Line 3). Otherwise, it utilizes the method of dynamic programming to find the VMs provisioning vector X and then return (Lines 5–20). Note that D is a real number expressed in terms of the number of billing cycles.

The mission of Partition Workflow Algorithm is to partition the current workflow or sub-workflow into two parts: and . The includes, but are not limited to, critical tasks. The other part, , does not contain any critical tasks. We use a Boolean variable ti.assigned to denote the feasibility of putting ti in . Because all critical tasks cannot be put into , the original value of ti.assigned should be set as “false” except the critical tasks (Lines 2–9). We choose the noncritical path between two critical tasks as the basic of (Lines 10–31). Once a task t belongs to , the parent tasks of t that have not yet been assigned must belong to (Lines 22–29). Such a principle can make the sub-workflow as large as possible to assign the best cost performance resources as explained in Section 3.2.2. Meanwhile, the parent and child tasks belonging to a single sub-workflow can guarantee the correct execution order of the tasks (parent tasks finished before child tasks).

4.2. Algorithm Analysis

We now consider the time complexity of our algorithm. We suppose the number of tasks in the workflow is n, and the maximum number of types of VMs is m. The most time-consuming part of the Global Resource Provisioning for Real-time Workflow (GRP4RW) Algorithm is the WorkflowLayer and ParallelProvision procedures. The time complexity of the remaining parts of GRP4RW algorithm is O(m).

The first part of WorkflowLayer algorithm (Lines 1–4) is a counting loop whose time complexity is O(n). Another counting loop (Lines 6–9) will repeat l times, depending on the number of layers that the workflow is divided into. l will be no more than n because there must be at least a task in each layer. So the total time complexity of this WorkflowLayer procedure is O(n).

The ParallelProvision procedure is a recursive procedure. First, it calls the DPK procedure, which provisions each type of VMs only once. So, the time complexity of the DPK procedure is O(m). Then, another procedure PartitionPath will likely be called, which has time complexity O(n). This is because each task belongs to either or , and each task will be assigned only once. Because the workflow will be partitioned into two sub-workflows by calling the PartitionPath procedure, the maximum number of times the ParallelProvision procedure will be called will be . Therefore, the time complexity of the ParallelProvision procedure is O(mn + nn) = O(n ⋅ (m + n)). Consequently, the overall time complexity of the Global Resource Provisioning for Real-time Workflow Algorithm is O(n) + O(n ⋅ (m + n)) + O(m) = O(n ⋅ (m + n)).

5. Performance Evaluation

In this section, we present our empirical studies of the Globlal Resource Provisioning for Real-time Workflow Algorithm.

5.1. Simulation Settings

To evaluate a resource provisioning algorithm, we measure its performance on some sample workflows. Deelman et al. developed a workflow generator, which can create synthetic workflows of arbitrary size that are similar to real world real-time workflows. Using this workflow generator, they created four different sizes for each workflow application in terms of the number of tasks. These workflows are available in DAX (Directed Acyclic Graph in XML) format from their website [35]. For our experiments, we chose three sizes which are small (about 50 tasks), medium (about 300 tasks), and large (about 1000 tasks). Meanwhile, In order to explore the influence of different workflow structures on our algorithm performance, three different lengths of critical path are concerned in our simulation. However, the critical path length of each kind of workflow generated by this workflow generator is fixed and less than 10 tasks, so we have to organize several workflows altogether to construct the longer critical path we need. For example, we can connect 5 critical paths with a length of 8 and 1 critical path with a length of 10 end-to-end, so as to get a critical path with a length of 50.

For our experiments, we modeled an IaaS provider that offers a single data center and six different types of VMs with different processor speeds and different prices. The VM configurations are based on the current Amazon EC2 offerings. We used the work of Ostermann et al. [32] to estimate the computational performance in MFLOPS based on the number of EC2 compute units. Another important parameter for the experiment is the time unit for one billing cycle. Most of the current commercial clouds, such as Amazon, charge users based on a billing cycle of one hour. So, we used a VM billing cycle of one hour.

As a large number of workflows with different attributes are used for our experiments, it is important to normalize the total cost of each workflow execution in order to clearly analyze the cost optimization issue. For this reason, we first take all the tasks of a workflow as a whole and use the dynamic programming method to solve it. We define this cost as Cheapest Cost (CC). Note that the Cheapest Cost is obtained by ignoring the deadline constraint. The Normalized Cost (NC) of a workflow execution is defined by NC = (total cost of provisioned VMs)/CC, where CC is the Cheapest Cost of provisioning the VMs for the same workflow. Note that the total cost of provisioning the VMs in NC has deadline constraint. Because of the deadline constraints, clearly we have NC ≥ CC.

To evaluate the proposed algorithm, we need to assign a deadline to each workflow. In order to set a series of proper deadlines to our experiments, we define an upper bound of the deadline as the makespan of the Cheapest Cost (CC) situation and a lower bound as the makespan of the critical path executing on the best performaning VM in a sequential order. DU and DL denote the upper and lower bounds of deadline, respectively. To set the deadlines for workflows, we use the deadline factor α, and we set the deadline of a workflow to be , where 0 ≤ α ≤ 1. We let α goes up in steps of 0.1; i.e., α = 0, 0.1, 0.2, …, 0.9, 1.0.

To the best of our knowledge, many works have combined resource provisioning and task scheduling to minimize the cost of workflow execution; IC-PCP algorithm is the classical one among these works. Therefore, we adopted IC-PCP algorithm as a baseline to evaluate our algorithm. As mentioned in Section 2, IC-PCP algorithm was designed to minimize the cost of workflow execution while meeting a user-defined deadline [17]. This algorithm begins by calculating the EST (Earliest Start Time), EFT (Earliest Finish Time), and LFT (Latest Finish Time) of each task. It then finds the partial critical paths associated with the EST, EFT, and LFT. The tasks on each path are scheduled on the same VM and are preferably assigned to an already leased instance which can meet the LFT of the tasks while the values of EST, EFT, and LFT of each unassigned task are updated. Finally, each unassigned task on the scheduled path is calculated and the process is repeated until all tasks have been scheduled. At the end of this process, each task has been assigned a VM and has start and end times associated with it. The IC-PCP algorithm also aims at cost optimization but focuses on task scheduling. The time complexity of IC-PCP algorithm is higher and it lacks flexibility. Additionally, we use the Simple DPK algorithm to each task of the workflow as another comparative reference to manifest the importance of global idea on cost optimization.

5.2. Simulation Results

Figure 5 shows the relationship between the cost optimization of resource provisioning by the GRP4RW algorithm and the deadline. Figure 5(a) shows the results with different sizes: small size (50 tasks), medium size (300 tasks), and large size (1000 tasks). These different sizes have similar variations in the trends of the normalized cost with each different deadline. The more tasks a workflow has, the more gentle change of normalized costs with each different deadline. The fluctuation of normalized cost of the workflow with the smallest size is obvious. Figure 5(a) shows that the performance of cost optimization of our proposed algorithm is more stable as the size of workflow is larger.

Figure 5(b) shows how the different structures of workflows influence the variation of normalized costs. Because a large size workflow shows more stable normalized cost, we use three large size workflows (1000 tasks each) with different lengths of critical path (CP). The lengths of the critical paths are 50 tasks, 100 tasks, and 200 tasks. The results show that there is a nearly inversely proportional linear relationship between the normalized cost and the deadline to the workflows with the longest critical path. However, the workflow with the shortest critical path has the most expedite convergence to the optimization as the deadline grows larger. Therefore, our proposed cost optimization algorithm can be more efficient when the deadline is set to a low level for the workflow with shorter critical path.

Since the performance of cost optimization for larger size and longer critical path is better and more stable, we take a workflow with 1000 tasks with the length of the critical path being 200 as the simulation instance to compare the performances of three cost optimization algorithms: GRP4RW, IC-PCP, and Simply DPK. Figure 6 shows the result of the comparison. The normalized cost of Simply DPK algorithm does not vary much with the change of deadline factor and it always maintains a high value. The normalized cost of IC-PCP algorithm decreases with the increase of deadline. The normalized cost of our GRP4RW algorithm has a certain reduction with the deadline changes and the values of the costs are generally lower than the IC-PCP and Simply DPK algorithm at each deadline. There are three reasons for these results. First, the Simply DPK algorithm uses the dynamic programming knapsack method to each single task of a workflow. So, its optimization effect is modest and deadline-invariant. Second, the IC-PCP algorithm containing the combined idea uses the partial critical path as the optimization unit. This makes the optimization effect improve significantly, but dynamic programming method not utilized in this algorithm results in suboptimal normalized costs for all different deadlines. Finally, both the global idea and the dynamic programming knapsack method are used in GRP4RW algorithm. So the optimization effect achieves the optimal result. In conclusion, global idea is the most important force in the cost optimization of resource provisioning algorithm for real-time workflow. At the same time, dynamic programming knapsack method also plays a positive role in the whole process.

Considering that the length of billing cycles could impact on the performance of the cost optimization algorithms, we studied the effect of the relationship between the length of billing cycles and the execution times of tasks in workflow on the performance of the algorithms by simulation experiments. As the execution time of task seriously depends on the workload of task, we used three different workflows with different workloads of tasks: one workflow denoted as Workflow with Small Workload in which more than 90% task workload are less than 1 MFLOPS, another workflow denoted as Workflow with Medium Workload in which more than 90% task workload are between 1 and 5 MFLOPS, and the third workflow denoted as Workflow with Large Workload in which more than 90% task workload are more than 5 MFLOPS. Meanwhile we use 30 seconds, 60 seconds, 120 seconds, 300 seconds, 600 seconds, and 1200 seconds as the different lengths of billing cycles. The experimental results are shown in Figure 7. It demonstrates that the lengths of billing cycles have impact on the algorithms performance; with the decrease of the length of billing cycles, the cost optimization effect of all algorithms is improved to some extent. When the length of billing cycle is 30 seconds, the performance of all these algorithms is optimal. When the length of billing cycle becomes 3600 seconds, the performance of algorithm Simply DPK decreases significantly, while that of algorithms IC-PCP and GRP4SW is passable. That is to say, when the execution times of tasks are extremely smaller than billing cycle such as the workflow with small workload under the situation that the length of billing cycle is 3600 seconds (seen in Figure 7(a)), the length of billing cycles impact on the performance of algorithms is more significant, and our proposed algorithm GRP4SW has the most advantage at that condition.

6. Conclusion

In this paper, we presented a global resource provisioning algorithm for executing real-time workflow in cloud. We modeled the problem as an optimization problem that aims to get a cost-effective resource provisioning to executive a workflow while meeting a deadline constraint. The problem was solved by using the dynamic programming knapsack algorithm. Our approach embodies basic IaaS cloud properties such as heterogeneity and elasticity of resource performance and pay-as-you-go model of price mechanism.

We perform simulation experiments using three different sizes and three different structures of real-time workflows separately. Our results show that the cost optimization of our proposed solution improves as the deadline increases. We compared the performance of our algorithm against two other algorithms (Simply DPK and IC-PCP). Our results show that our proposed GRP4RW algorithm has an overall better performance than both the Simply DPK and IC-PCP algorithms under every deadline constraints.

For future work, we intend to improve our strategy for optimizing both data storage and data transfer cost of a real-time workflow execution. In addition, more and more researches have been focused on the optimization objectives about security or energy consumption property of QoS [36, 37]; the most important issue is realizing multiobjective optimization.

Data Availability

The workflows used in this study can be accessed via the website http://confluence.pegasus.isi.edu/display/pegasus/WorkflowGenerator.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work is the partially supported by the National Natural Science Foundation of China (Project no. 61972001).