Abstract
Energy consumption in computer systems has become an increasingly important issue. High energy consumption already harms the environment to some extent, and the problem is especially acute in heterogeneous multiprocessors. In this paper, we first formulate and describe the energy-aware real-time task scheduling problem in heterogeneous multiprocessors. We then propose a particle swarm optimization (PSO) based algorithm that reduces both the energy cost and the time needed to search for feasible solutions. Experimental results show that the PSO-based energy-aware metaheuristic uses 40%–50% less energy than the GA-based and SFLA-based algorithms and spends 10% less time than the SFLA-based algorithm in finding solutions. It also finds 19% more feasible solutions than the SFLA-based algorithm.
1. Introduction
Heterogeneous computing platforms with multiple processors adapt to different types of computing needs. Using multiple processors improves system performance but also increases energy consumption. However, assigning real-time tasks to a multiprocessor platform is an NP-hard problem. Real-time task allocation in heterogeneous environments has been studied extensively in the literature. However, most studies focus on performance metrics, namely how to minimize the maximum utilization, and these problems can be mapped to the traditional makespan problem [1].
Energy consumption has become a major problem in computer systems. The processor consumes most of the energy, especially in embedded systems, where excessive energy consumption pollutes the natural environment and wastes resources [2, 3]. Reducing processor energy consumption has therefore become a widespread concern. We need to shift the focus from minimizing the maximum utilization to minimizing energy consumption, under the premise of meeting the specified task deadlines.
Although the problem is NP-hard, there are many approximation algorithms for real-time task allocation in a heterogeneous processor environment. These include traditional real-time task scheduling algorithms such as the deadline-monotonic (DM) algorithm [4], the rate-monotonic (RM) algorithm [5], the least-laxity-first (LLF) algorithm [6], the earliest-deadline-first (EDF) algorithm [5], and the linear-programming-based (LP) algorithm [7], as well as swarm intelligence algorithms such as ant colony optimization (ACO) [8], the genetic algorithm (GA) [9–11], and the shuffled frog-leaping algorithm (SFLA) [12, 13]. Most of these algorithms do not consider energy consumption. Moreover, maximizing the number of feasible solutions and saving energy are conflicting goals. Therefore, we need a new algorithm to solve this multiobjective optimization problem.
The heuristic algorithm in [14] is an adaptive algorithmic structure that can be applied to a relatively wide range of problems. Among the many existing heuristic algorithms, the particle swarm optimization (PSO) algorithm has emerged in recent years as a novel one. It is inspired by the social behavior of a flock of migratory birds trying to reach an unknown destination. Each bird is referred to as a particle. Each particle has a fitness value, determined by the function to be optimized, and a velocity that determines its flight direction and distance; the particles then search the solution space by following the current optimal particle. Compared with the genetic algorithm, the PSO algorithm has no reproduction, crossover, or mutation operations; it evolves through simple arithmetic only, which makes it easy to implement and more efficient.
The aim of this work is to propose a new algorithm for real-time task scheduling in a heterogeneous processor environment that reduces energy consumption under the premise of meeting all task deadlines.
The main objectives of the work are as follows.
(1) Formulate the energy-aware real-time task scheduling problem and add more constraints, taking energy consumption as the utility in the constraint conditions.
(2) Based on the PSO algorithm, propose an algorithm for the energy-aware real-time task scheduling problem that finds as many feasible solutions as possible before the specified deadlines and minimizes the energy consumption.
(3) Through a series of comprehensive experiments, compare the proposed algorithm with the existing traditional algorithms and other heuristic algorithms, and refine the algorithm to achieve the optimization goal.
This paper is organized as follows. Section 2 presents the state of the art of the task scheduling problem in multiprocessor platforms. Section 3 formulates the energy-aware real-time task scheduling problem in heterogeneous processors. Section 4 gives an overview of the PSO algorithm and introduces the proposed PSO-based energy-aware real-time task scheduling algorithm. Section 5 analyzes the performance and results of the proposed algorithm. Section 6 concludes the paper and provides directions for future work.
2. Related Works
Baruah [15] studies task scheduling in heterogeneous multiprocessor platforms, and improvements have been made to the ACO heuristic algorithms; the improved algorithm performs very well in finding feasible solutions under time constraints. Braun et al. [14] summarize 11 heuristic algorithms that can be applied to task scheduling in heterogeneous multiprocessor platforms to reduce the execution time. However, Braun et al. assume that the execution time of each task on each machine is known exactly and that tasks have no time constraints. In their experimental results, the GA performs well. Heuristic algorithms available so far include the GA, the ACO algorithm, the SA algorithm, the PSO algorithm, and the SFLA [12, 13], and these algorithms have been applied to task scheduling in multiprocessor platforms.
Baruah [16] converts the task scheduling problem in multiprocessor platforms into an ILP problem and proposes an approximate polynomial-time algorithm. However, the LP problem has many feasible solutions, and a polynomial-time algorithm is not guaranteed to find the fundamental solution. Similarly, Leung and Whitehead [4] convert the task scheduling problem in multiprocessor platforms, in which tasks can be divided and have priorities, into an LP problem. However, they assume that each task can be arbitrarily segmented, an assumption that is of limited use in practice.
Baruah [17] puts forward a polynomial-time algorithm for the task scheduling problem in multiprocessor platforms in which tasks are preemptive and transitive. Its purpose is to schedule tasks under a series of real-time constraints in heterogeneous processor platforms. However, it ignores the communication overhead and the possibility that a task is divisible.
The task scheduling problem in multiprocessor platforms requires not only increasing the number of feasible solutions found but also reducing the energy consumption of each feasible solution. Many researchers currently study task scheduling in multiprocessor platforms to reduce energy consumption, but in general the PSO algorithm has not been used in these studies.
Cheng et al. [8] propose an improved ACO algorithm for task scheduling in multiprocessor platforms that finds sufficient feasible solutions while satisfying the time constraints. However, this algorithm is not PSO related and does not improve energy consumption. Baruah [17] describes a non-ACO algorithm for multiprocessor scheduling applications. The algorithm effectively reduces energy consumption and scheduling time, but it requires that all tasks have the same computation time. In addition, the algorithm is not PSO-based. Aydin and Yang [18] propose the worst-fit-decreasing algorithm for multiprocessor task scheduling to reduce energy consumption and meet the deadlines. However, this heuristic algorithm is not PSO-based. Zhu et al. [19] put forward an effective algorithm for multiprocessor task scheduling, but it is not PSO-based either.
Evolutionary and swarm intelligence optimization algorithms, such as the ACO algorithm, evolution strategies, and the GA, can solve multiobjective combinatorial optimization problems and obtain good solutions, but they are complex and have low efficiency. Finding a more effective task scheduling and allocation algorithm is therefore very important.
The particle swarm optimization (PSO) algorithm [20, 21] is a relatively new global optimization algorithm. Like other swarm intelligence algorithms, it belongs to the family of intelligent evolutionary computation techniques: it randomly initializes a population and then evaluates it with the fitness function to decide whether to search further. However, the PSO-based algorithm has no reproduction, crossover, or mutation operations; it evolves through simple arithmetic only and is easy to implement. As an important optimization tool, the PSO-based algorithm can be applied in cloud computing and information retrieval [22, 23].
3. Problem Formulation
Each task is assigned to a particular processor such that neither the computing capacity of the processor nor the deadline of the task is exceeded. In general, the computation time and deadline of each task are known, although some real-time tasks now change dynamically. A set of periodic tasks is assigned to a set of heterogeneous processors without exceeding any deadline. The problem is NP-hard. We solve it with particle swarm optimization (PSO).
3.1. Heterogeneous Multiprocessors Platforms
$HMP = \{P_1, P_2, \ldots, P_m\}$ is a heterogeneous multiprocessor platform. Each processor $P_j$ executes only one instruction per clock cycle, and its speed depends on the type of task. $f_j$ is the clock frequency of $P_j$ and $s_{i,j}$ is its speed when performing a specific task $T_i$. $e_{i,j}$ refers to the execution time of $T_i$ on $P_j$, $e_{i,j} = c_i / s_{i,j}$, where $c_i$ is the number of clock cycles needed to execute the task.
3.2. Periodic Task Set
$PTS = \{T_1, T_2, \ldots, T_n\}$ consists of $n$ real-time tasks. Each task $T_i$ is described by a pair $(e_i, p_i)$, where $e_i$ is the worst-case execution time (WCET) and $p_i$ is the task period. $T_i$ generates an infinite sequence of task instances; each instance requires at most $e_i$ time units, and successive instances arrive $p_i$ time units apart. The deadline of each instance is $p_i$ time units after its arrival.
3.3. Real-Time Task Scheduling, Energy Utilization, and Energy Consumption
We build a task scheduling situation matrix $X_{n \times m}$ (see Table 2). Matrix element $x_{i,j}$ indicates whether task $T_i$ is assigned to processor $P_j$. The value of $x_{i,j}$ is 0 or 1: 0 indicates that task $T_i$ is not assigned to processor $P_j$, and 1 indicates that task $T_i$ is assigned to processor $P_j$.
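The energy consumption matrix in the real-time task scheduling problem on heterogeneous processors is denoted by $U_{n \times m}$. Its element is computed as $u_{i,j} = e_{i,j} / p_i$, the execution time of task $T_i$ on processor $P_j$ divided by the period of $T_i$, and shows the energy utilization required to execute task $T_i$ on processor $P_j$. $u_{i,j}$ is a real number within a preset range; if task $T_i$ cannot run on processor $P_j$, then $u_{i,j}$ is set to $+\infty$.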
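The matrix construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `utilization_matrix`, the list-of-lists layout, and the convention that a speed of 0 marks a processor that cannot run the task are all assumptions made for the example.

```python
import math

def utilization_matrix(cycles, periods, speeds):
    """Return u[i][j] = (cycles[i] / speeds[i][j]) / periods[i].

    Assumption: speeds[i][j] == 0 means task i cannot run on processor j,
    in which case the entry is +infinity.
    """
    u = []
    for c, p, row in zip(cycles, periods, speeds):
        u.append([(c / s) / p if s > 0 else math.inf for s in row])
    return u

# Two tasks, two processors (hypothetical numbers).
cycles = [100, 300]          # clock cycles needed per task instance
periods = [50.0, 100.0]      # task periods in time units
speeds = [[10, 4], [0, 5]]   # cycles per time unit; 0 = cannot run
U = utilization_matrix(cycles, periods, speeds)
```

Here the first task needs 100/10 = 10 time units on the first processor, so it occupies a fraction 10/50 of that processor.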
Energy consumption of in processor on each cycle is as follows: where and are constants. Thus, , and the energy consumption is linear.
The total energy consumption on the processors is
Here we define the theoretical maximum energy consumption value as
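As a worked sketch of how an energy model of this kind is commonly instantiated, the block below assumes the cubic dynamic-power law that is widespread in the voltage-scaling literature. The specific form, and the symbols $C_{ef}$, $k$, $E_{i,j}$, $E_{\text{total}}$, and $E_{\max}$, are assumptions introduced for illustration and should not be read as the exact equations of this paper.

```latex
\begin{align}
Power_{i,j} &= \frac{C_{ef}\, s_{i,j}^{3}}{k^{2}}, \\
E_{i,j} &= Power_{i,j} \cdot e_{i,j}
         = \frac{C_{ef}}{k^{2}}\, c_i\, s_{i,j}^{2}
         \quad \text{(using } e_{i,j} = c_i / s_{i,j} \text{)}, \\
E_{\text{total}} &= \sum_{i=1}^{n} \sum_{j=1}^{m} E_{i,j}\, x_{i,j}, \\
E_{\max} &= \sum_{i=1}^{n} \max_{1 \le j \le m} E_{i,j}.
\end{align}
```

Under this assumption the energy of a task on a given processor grows linearly with its cycle count $c_i$, and $E_{\max}$ corresponds to placing every task on its most expensive processor.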
3.4. Constraint Model
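On the basis of the energy consumption matrix defined above, the constraint model of the real-time task scheduling problem on heterogeneous processors is given. With $x_{i,j}$ the assignment indicator of task $T_i$ on processor $P_j$, $u_{i,j}$ the corresponding element of the energy consumption matrix, $n$ tasks, and $m$ processors, the constraint model consists of the following three parts:
(1) $\sum_{j=1}^{m} x_{i,j} = 1$, $i = 1, 2, \ldots, n$,
(2) $\sum_{i=1}^{n} u_{i,j}\, x_{i,j} \le U_{\max}$, $j = 1, 2, \ldots, m$,
(3) $x_{i,j}$ is either 0 or 1, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$,
where $U_{\max}$ represents the maximum amount of computation each processor allows and is set to 1 in our experiments.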
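A feasibility check for this constraint model might look like the following minimal sketch. It assumes a compact encoding in which `assignment[i]` holds the index of the processor chosen for task `i`, so that the "exactly one processor per task" and "0 or 1" constraints hold by construction and only the per-processor capacity remains to be checked. The function name and the encoding are assumptions for illustration.

```python
def is_feasible(assignment, u, u_max=1.0):
    """Check the per-processor capacity constraint.

    assignment[i] is the processor index of task i (hypothetical encoding);
    u[i][j] is the utilization of task i on processor j.
    """
    load = [0.0] * len(u[0])
    for i, j in enumerate(assignment):
        load[j] += u[i][j]   # an infinite entry makes the load infinite
    return all(l <= u_max for l in load)

u = [[0.2, 0.5],
     [0.7, 0.6],
     [0.3, 0.4]]
ok = is_feasible([0, 1, 0], u)    # loads 0.5 and 0.6
bad = is_feasible([0, 0, 0], u)   # load 1.2 on the first processor
```

An entry of +∞ (a task that cannot run on a processor) automatically fails the check, because any load containing it exceeds `u_max`.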
3.5. Calculation of the Fitness Function
The fitness function is defined as the ratio of actual energy consumption and theoretical energy consumption. By (1), we assume that and are constants and are set as 1. The theoretical maximum energy consumption is calculated as . The actual energy consumption can be computed as . Thus, the fitness function is defined as follows:
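The ratio described above can be sketched as follows. The per-task energy used here, cycles times speed squared with both constants set to 1, is an assumed instantiation (the cubic dynamic-power law), and the names `energy_matrix` and `fitness` are hypothetical; the sketch also assumes every task can run on every processor, so all entries are finite.

```python
def energy_matrix(cycles, speeds, c_ef=1.0, k=1.0):
    """Assumed model: E[i][j] = c_ef / k**2 * cycles[i] * speeds[i][j]**2."""
    return [[c_ef / k**2 * c * s**2 for s in row]
            for c, row in zip(cycles, speeds)]

def fitness(assignment, energy):
    """Actual energy of the assignment divided by the theoretical maximum."""
    actual = sum(energy[i][j] for i, j in enumerate(assignment))
    worst = sum(max(row) for row in energy)   # every task on its costliest processor
    return actual / worst

E = energy_matrix([100, 300], [[10, 4], [2, 5]])
f = fitness([1, 0], E)   # (1600 + 1200) / (10000 + 7500)
```

A smaller value is better: the ratio lies in (0, 1] and reaches 1 only when every task sits on its most expensive processor.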
3.6. Energy-Aware Real-Time Task Scheduling Problem for Heterogeneous Multiprocessors
Given a heterogeneous multiprocessor platform HMP and a periodic task set PTS, we call the energy-aware real-time task scheduling problem in heterogeneous processors eRTSP. eRTSP has two conflicting optimization aims. The first is to assign each task to a specific processor so that the utilization of each processor does not exceed its maximum utilization. The second concerns energy consumption: among the feasible solutions, find one that minimizes the energy consumed on the corresponding processors.
4. PSO Algorithm for Energy-Aware Real-Time Task Scheduling Problem
4.1. Introduction to the PSO Algorithm
The PSO algorithm [21] was first proposed by Kennedy and Eberhart; the inertia-weight form used in this paper is due to Shi and Eberhart. It belongs to the field of evolutionary computation. The PSO algorithm is inspired by the social behavior of a flock of migratory birds trying to reach an unknown destination. In the PSO algorithm, each solution is a bird in the flock and is called a particle. Every particle has a fitness value, determined by the function to be optimized, and a velocity that determines its flight direction and distance; the particle then searches the solution space by following the current optimal particle. The PSO algorithm and the GA are both iterative methods, and a particle is similar to a chromosome in the GA. Unlike the GA, however, the evolutionary process in the PSO algorithm does not generate new members from parent members; each particle only changes its own social behavior as it moves towards the destination.
In fact, the PSO algorithm imitates the communication of birds flying together. Each bird moves in a certain direction and, by communicating with the others, determines the best position found so far. Each bird then moves from its current position, at a particular velocity, towards the best bird. From its new location each bird examines the search space again, and the process repeats until the flock reaches the desired destination. It is important to note that the process involves both interaction and intelligence in the community: particles learn from their own experience (local search) and from the experience of the surrounding particles (global search).
The PSO algorithm is initialized with a group of random particles. The $i$th particle is represented as a point in a $D$-dimensional space, where $D$ is the number of variables. Throughout the PSO algorithm, each particle maintains three variables: its current position $X_i$, the best position it has reached in previous iterations $P_i$, and its flight velocity $V_i$. These three variables are written in component form as follows:
$$X_i = (x_{i1}, x_{i2}, \ldots, x_{iD}), \qquad P_i = (p_{i1}, p_{i2}, \ldots, p_{iD}), \qquad V_i = (v_{i1}, v_{i2}, \ldots, v_{iD}).$$
In each iteration, the global best position $P_g = (p_{g1}, p_{g2}, \ldots, p_{gD})$ is the position with the best fitness found by any particle. Each particle updates its velocity to catch up with the best particle as follows:
$$v_{id} = w\, v_{id} + c_1\, \mathrm{rand}()\, (p_{id} - x_{id}) + c_2\, \mathrm{Rand}()\, (p_{gd} - x_{id}). \qquad (6)$$
Using the new velocity, we update the position of the particle as follows:
$$x_{id} = x_{id} + v_{id}, \qquad (7)$$
in which $v_{id}$ is limited by $V_{\max}$.
Here $c_1$ and $c_2$ are two constants called the learning factors; rand() and Rand() are two random functions with values in $[0, 1]$; $V_{\max}$ is the maximum velocity limit of the particle; and $w$ is an inertia weight that controls the influence of the current velocity. In formula (6), the second term represents the particle's own cognition, comparing its current position with its own best position. The third term of formula (6) represents the cooperation between the particles, comparing the current position of a particle with the best position of the swarm.
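The velocity and position updates of the standard inertia-weight PSO can be sketched for a single particle as below. This is a generic illustration under stated assumptions, not the paper's Algorithm 1: the function name `update_particle`, the symmetric clamp to `[-v_max, v_max]`, and the use of Python's `random` module for rand() and Rand() are all choices made for the example.

```python
import random

def update_particle(x, v, p_best, g_best, w, c1, c2, v_max, rng=random):
    """One PSO step for one particle; returns (new_position, new_velocity)."""
    new_x, new_v = [], []
    for xd, vd, pd, gd in zip(x, v, p_best, g_best):
        # inertia term + cognitive term + social term
        vd = w * vd + c1 * rng.random() * (pd - xd) + c2 * rng.random() * (gd - xd)
        vd = max(-v_max, min(v_max, vd))   # assumed symmetric velocity limit
        new_v.append(vd)
        new_x.append(xd + vd)
    return new_x, new_v

# When the particle already sits on both best positions, only inertia acts.
x1, v1 = update_particle([1.0, 4.0], [0.5, -3.0],
                         p_best=[1.0, 4.0], g_best=[1.0, 4.0],
                         w=1.0, c1=2.0, c2=2.0, v_max=2.0)
```

In the usage example the cognitive and social terms vanish, so the second velocity component is simply clamped from -3.0 to -2.0 before the position moves.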
4.2. Applying the PSO Algorithm to eRTSP
4.2.1. Building Energy Matrix and Time-Consuming Matrix
The eRTSP problem can be represented as a bipartite graph with two types of nodes: PTS and HMP. A task is mapped to a node of PTS, and a processor is mapped to a node of HMP. There is an edge between two nodes if and only if the task can be assigned to the corresponding processor without exceeding its maximum computing capacity. The cost of this assignment is directly related to the energy consumption of the task on the processor.
Therefore, in general, we construct an energy matrix in which the rows represent the tasks, the columns represent the processors, and the element in row $i$ and column $j$ is the energy utilization of the $i$th task on the $j$th processor. Each value of the matrix is computed as the execution time of the task on that processor divided by the task period; if a task cannot be assigned to a particular processor, we set the corresponding element of the matrix to $+\infty$. We then define the constraints: in each row exactly one element can be visited, and the accumulated energy utilization of each column cannot exceed 1.
In the same way as the energy consumption matrix, we build a matrix that records the running time of each task on each processor. Each element of the execution time matrix is $e_{i,j}$ = nCycles/nSpeed.
4.2.2. The Update of the Velocity, Position, and Inertia Weight of the Particle
The velocity of the particles is the critical factor for their positions. It affects the overall convergence of the PSO algorithm and the efficiency of its global search. We consider (6) as a speed profile. The position update of a particle gives the next position of the task, that is, the processor it is assigned to. The position update follows formula (7) in Section 4.1, $x_{id} = x_{id} + v_{id}$. When $v_{id} > 0$, the processor index of the task needs to be adjusted, and then $x_{id} = v_{id}$; otherwise, the position of the particle remains unchanged; that is, $x_{id} = x_{id}$.
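The parameter $w$ in the PSO algorithm balances global searching and local searching. As the number of iterations increases, $w$ decreases linearly. The formula for updating $w$ is
$$w = w_{\max} - \frac{w_{\max} - w_{\min}}{iter_{\max}} \cdot iter,$$
where $iter$ is the current number of iterations, $iter_{\max}$ is the total number of iterations, and $w_{\max}$ and $w_{\min}$ are the initial and final values of the inertia weight.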
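A linearly decreasing inertia weight is easy to express directly. The sketch below is a generic version of that schedule; the default bounds 0.9 and 0.4 are common choices in the PSO literature and are assumptions here, not values reported in this paper.

```python
def inertia_weight(iteration, max_iterations, w_max=0.9, w_min=0.4):
    """Linearly decrease the inertia weight from w_max to w_min."""
    return w_max - (w_max - w_min) * iteration / max_iterations

start = inertia_weight(0, 1000)      # equals w_max
middle = inertia_weight(500, 1000)   # halfway between the bounds
end = inertia_weight(1000, 1000)     # equals w_min
```

A large weight early on favours exploration of the whole search space, while the smaller weight in late iterations lets particles settle around good positions.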
4.2.3. Optimization of the Energy Consumption
When the PSO algorithm finds a feasible solution, we often need to optimize it further to achieve the second objective, the energy consumption target. That is, a feasible solution with high energy consumption is improved by reassigning a task to another processor, or by exchanging the processors of two tasks, so as to reduce the overall energy consumption.
In the initial state, if the utilization of a processor is greater than 1, we extract the task with the maximal energy consumption on that processor, move it to the processor with the lowest utilization, and check whether the utilization of that processor then exceeds 1. If it does not, the corresponding coordinate of the task is updated.
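The repair step just described can be sketched as a single pass over the processors. This is a simplified illustration under assumptions: the function name `repair`, the list encoding `assignment[i] = processor of task i`, and the choice to attempt only one move per overloaded processor are all hypothetical details.

```python
def repair(assignment, u, energy, u_max=1.0):
    """Move the costliest task off each overloaded processor if it fits elsewhere."""
    m = len(u[0])
    loads = [0.0] * m
    for i, j in enumerate(assignment):
        loads[j] += u[i][j]
    fixed = list(assignment)
    for j in range(m):
        if loads[j] <= u_max:
            continue
        tasks = [i for i, pj in enumerate(fixed) if pj == j]
        worst = max(tasks, key=lambda i: energy[i][j])   # task with maximal energy
        target = min(range(m), key=lambda k: loads[k])   # least utilized processor
        if target != j and loads[target] + u[worst][target] <= u_max:
            loads[j] -= u[worst][j]
            loads[target] += u[worst][target]
            fixed[worst] = target                        # update the task's coordinate
    return fixed

u = [[0.6, 0.3], [0.5, 0.4], [0.2, 0.2]]
energy = [[9, 5], [4, 6], [1, 1]]
repaired = repair([0, 0, 1], u, energy)   # first processor starts at load 1.1
```

In the example the first processor is overloaded, its most energy-hungry task is task 0, and moving that task to the second processor brings both loads to 0.5.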
Thereafter, the speed of the particles is updated from formula (6), in accordance with the calculated local and global optimum positions of each task. Subsequently, we check the speed of the particles so that the utilization stays below the maximum utilization of each processor: if the speed is greater than $V_{\max}$, it is set to $V_{\max}$; if the speed is less than 0, it is set to 0.
In the optimization, we first back up the current solution and then analyze the following three cases.
(1) Particle.v > 0 and Particle.v < $V_{\max}$. We let Particle.x equal Particle.v and calculate the utilization of the corresponding processor. Provided that the utilization remains less than 1, we recalculate the fitness value. If the energy consumption ratio has decreased, we modify the original solution and update the value of Particle.x. If the energy consumption ratio has not decreased, we do not change the original solution.
(2) Particle.v ≥ $V_{\max}$. We let the value of Particle.x be $V_{\max}$ and recalculate the utilization of the corresponding processor. Provided that the utilization remains less than 1, we recalculate the fitness value and check whether the energy consumption ratio has decreased. If it has, we alter the original assignment; if not, we keep the original assignment.
(3) Particle.v ≤ 0. The general idea is the same as in the second case: we set Particle.x to 0 and recalculate the utilization of the corresponding processor. Provided that the utilization remains less than 1, we recalculate the fitness value and check whether the energy consumption ratio has decreased. If it has, we change the original schedule; if not, we keep the original assignment.
If the fitness value no longer decreases or remains the same, we quit the iteration; in any case, the PSO algorithm returns after 1000 iterations.
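The three cases share one pattern: clamp the velocity into the valid range, tentatively move the task, and keep the move only if the target processor stays within capacity and the energy drops. A minimal sketch of that pattern follows. Everything in it is an assumption made for illustration: processors are indexed 0 to `v_max`, the velocity is truncated to an integer index, and total energy is compared directly (equivalent to comparing the fitness ratio, since the theoretical maximum is a constant).

```python
def total_energy(assignment, energy):
    return sum(energy[i][j] for i, j in enumerate(assignment))

def try_move(assignment, task, velocity, u, energy, v_max):
    """Tentatively reassign one task; return the new or the original assignment."""
    target = int(max(0, min(v_max, velocity)))   # clamp, then truncate to an index
    candidate = list(assignment)                 # backup: the original is untouched
    candidate[task] = target
    load = sum(u[i][target] for i, j in enumerate(candidate) if j == target)
    if load > 1.0:
        return assignment                        # capacity violated: keep original
    if total_energy(candidate, energy) < total_energy(assignment, energy):
        return candidate                         # energy decreased: accept the move
    return assignment

u = [[0.2, 0.5], [0.7, 0.6]]
energy = [[10, 4], [3, 8]]
accepted = try_move([0, 0], 0, 1.7, u, energy, v_max=1)   # energy 13 -> 7
rejected = try_move([0, 0], 1, 5.0, u, energy, v_max=1)   # energy 13 -> 18
```

Working on a copy plays the role of the backup mentioned above: a rejected move never disturbs the original solution.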
4.2.4. PSO Algorithm for eRTSP
See Algorithm 1.

5. Experiment and Result Analysis
In this section, we first determine the parameters of the PSO algorithm for solving eRTSP. We then solve the eRTSP problem with the PSO algorithm and compare the performance of the PSO algorithm, the GA, and the SFLA on eRTSP in terms of solution quality and energy consumption.
5.1. Environment of the Experiments
CPU: Intel Core 2 CPU 1.67 GHz. Cache: 512 KB. Memory: 2074492 KB. Operating system: Windows 7. Development platform: Visual Studio 2003.NET.
We obtain the results by running the PSO algorithm on a large number of randomly generated problem sets. The problem sets cover many different situations, and each problem instance is initialized with a fixed number of processors and tasks.
5.2. The Parameter of the PSO Algorithm
According to the PSO algorithm, three parameters, $w$, $c_1$, and $c_2$, affect the performance of the PSO algorithm (see Table 1). $w$ denotes the inertia weight; $c_1$ and $c_2$ denote the acceleration coefficients. The following experiments are set up to determine the best combination of the three parameters. The results are shown in Table 4.
As seen from the results in Table 4, different parameter settings of the PSO algorithm give different results on the same eRTSP problem. The group with $w = 1$, $c_1 = 2$, and $c_2 = 2$ finds the largest number of feasible solutions when solving eRTSP, and its running time is the shortest. Therefore, in the subsequent experiments, we use the parameters of this group to solve eRTSP and compare the performance with the GA and SFLA.
5.3. The Comparison of the Results among the PSO Algorithm, GA, and SFLA in eRTSP
To cover a wider range of heterogeneous environments, the values of the matrix are varied. For a periodic task $T_i$, the task frequency is the average execution speed needed to finish the task before its deadline and is defined as $c_i / p_i$, the number of clock cycles the task needs divided by its period. In the PTS, the variance of the task frequencies is defined as the task heterogeneity. In the HMP, for a given task, the variance of its processing time across the processors is defined as the processor heterogeneity.
The method of generation of the consumption utilization matrix of the task is as follows.(1)Generate a vector with random elements in the range of ; its element represents the implementation of clock cycles for each of the tasks .(2)Construct a vector containing floatingpoint type elements ; the size of its elements is in the range . Said that the task of the heterogeneity. This vector can also reflect the frequency of the task .(3)For each column vector , it contains elements; the size of its elements ] indicates the degree of processor heterogeneity.(4)Configure utilization of an matrix, whose element is . Accordingly, the size of the elements is affected by the degree of task and processor heterogeneity. The element size range is ].
In order to obtain a true and objective evaluation of the performance of each algorithm, we characterize the utilization matrix by task heterogeneity, processor heterogeneity, and consistency. We therefore generated eight kinds of experiments by combining these features: high or low task heterogeneity, high or low processor heterogeneity, and a consistent or nonconsistent matrix. High task heterogeneity is represented as 100; low task heterogeneity is represented as 5. High processor heterogeneity is represented as 20; low processor heterogeneity is represented as 5. When one processor executes every task in a shorter time than another processor, we say that the matrix is consistent. A consistent utility matrix is generated by sorting each vector, so that the first processor has the fastest processing speed of all the processors and the last processor is the slowest. In contrast, in a nonconsistent utility matrix one processor is faster than another on certain tasks but slower on other tasks; it is an unsorted, randomly generated matrix. The eight experimental category combinations are shown in Table 5.
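One way to picture the consistent and nonconsistent cases is a range-based generator in which each task draws a baseline value bounded by the task heterogeneity and each processor entry multiplies it by a factor bounded by the processor heterogeneity. The sketch below follows that idea only loosely: the function name `make_cost_matrix`, the uniform distributions, the lower bound of 1, and the omission of any final scaling to utilizations are all assumptions for illustration and do not reproduce the generation procedure of the experiments.

```python
import random

def make_cost_matrix(n, m, task_het, proc_het, consistent, seed=0):
    """Generate an n x m cost matrix; rows are tasks, columns are processors."""
    rng = random.Random(seed)
    matrix = []
    for _ in range(n):
        base = rng.uniform(1, task_het)                        # task-level spread
        row = [base * rng.uniform(1, proc_het) for _ in range(m)]
        if consistent:
            row.sort()   # first processor is fastest on every task
        matrix.append(row)
    return matrix

# High task heterogeneity (100), high processor heterogeneity (20).
consistent_hi_hi = make_cost_matrix(4, 3, 100, 20, consistent=True)
inconsistent_lo_lo = make_cost_matrix(4, 3, 5, 5, consistent=False)
```

Sorting every row in the same direction is what makes the matrix consistent: the column order then ranks the processors identically for all tasks.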
Figure 1 shows that the energy consumption of the GA and SFLA is higher than that of the PSO algorithm; across the different test datasets, the energy consumption of the PSO algorithm is about 40–50% of that of the GA and SFLA.
Figure 2 and Table 3 show that the running times of the three algorithms differ in the same eRTSP environment; the running time of the GA is the longest, followed by the SFLA. In general, the PSO algorithm is the fastest across the 8 test groups. In particular, comparing the PSO algorithm and the SFLA on the first problem set, we find that the running time of the PSO algorithm is 10% of the running time of the SFLA. Under consistent utility matrix conditions, the running times of the SFLA and the PSO algorithm are essentially the same.
In Figure 3, we compare the number of feasible solutions found by the three algorithms. The SFLA and the PSO algorithm find fewer feasible solutions than the GA (see Table 6). The PSO algorithm is slightly worse than the GA at finding feasible solutions. The GA does not find all feasible solutions in the fourth set of experiments, but it finds all of them under the other conditions. The PSO algorithm, however, is better at finding feasible solutions than the SFLA: the number of feasible solutions found by the PSO algorithm is 50% more than that of the SFLA.
Therefore, the above results show that the energy consumption and running time of the PSO algorithm on eRTSP are relatively small compared with the SFLA and GA, which gives it certain advantages. The GA mainly uses crossover and mutation, so it is much slower in looking for a feasible solution and its energy consumption is larger. The SFLA is slower and finds fewer feasible solutions than the PSO algorithm, and the PSO algorithm finds a feasible solution in most cases. When the running times of the SFLA and the PSO algorithm are similar, the PSO algorithm consumes less energy than the SFLA. Therefore, considering the above tests, it can be said that the PSO algorithm outperforms the GA and SFLA.
6. Conclusions
This paper gives a formal description of the energy-aware real-time task scheduling problem in a heterogeneous environment and puts forward a new heuristic algorithm, based on the particle swarm optimization algorithm, to solve the problem. The proposed algorithm not only finds many more feasible solutions within the specified time but also optimizes the energy consumption. According to the results of extensive experiments, the PSO algorithm performs better in reducing energy consumption and running time and in increasing the number of feasible solutions. Its energy consumption is only 40–50% of that of the GA and SFLA. In addition, in terms of the number of feasible solutions found, the PSO algorithm finds a total of 19% more feasible solutions than the SFLA in 7 out of 8 sets of test data and about 8% fewer feasible solutions than the GA. In terms of running time, the PSO algorithm is faster than the GA and SFLA and about 10% faster than the SFLA.
The current study focuses on independent, nonpreemptive, periodic tasks, and many other factors in the real-time task scheduling problem in a heterogeneous processor environment are not taken into account. In the future, we will relax the constraint conditions and study real-time task scheduling with priorities and communication between tasks.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (NSFC) under Grant no. 61173145, the National Basic Research Program of China under Grant no. G2011CB302605, and the National High Technology Research and Development Program of China under Grant no. 2011AA010705. Albert M. K. Cheng is supported in part by the US National Science Foundation under Awards nos. 0720856 and 1219082.