Abstract

As an advanced computing paradigm, cloud computing is becoming increasingly popular. However, with the proliferation of large data centers hosting cloud applications, energy consumption has grown explosively. Surveys show that a remarkable part of the energy consumed in data centers results from over-provisioning network resources to meet requests during peak demand times. In this paper, we address this problem by constructing a dynamic energy-efficient resource management scheme. To save energy while maintaining cloud users' quality of experience, the scheme adopts a multitier cloud architecture and configures physical machines (PMs) into two pools: a hot (running) pool and a warm (turned on, but in dynamic sleep) pool. Each PM is configured with a resource search engine (RSE) that finds an available virtual machine (VM) for a request, and a synchronous sleep mechanism is introduced in the warm pool. To analyze the end-to-end performance of the cloud service under the proposed scheme, we establish a hybrid queueing system composed of three stochastic submodels and solve it using a matrix-geometric solution. Accordingly, we derive the average latency of requests and the energy-saving rate of the system. Through numerical results, we show the influence of the synchronous sleep mechanism on system performance. Moreover, from an economic perspective, we build a system cost function to study the trade-off between different performance measures. An improved Salp Swarm Algorithm (SSA) is presented to minimize the system cost and optimize the sleep parameter.

1. Introduction

As a direct result of the rapid growth in the number of cloud users, some cloud providers have built large numbers of data centers to satisfy the resource demands [1]. The consequences are massive increases in energy consumption, excessive growth in carbon emissions, and reduced benefits for cloud providers [2]. Statistical results show that an average data center can consume as much energy as many ordinary households [3]. Therefore, based on the concept of green computing, the development of greener, more energy-efficient resource management mechanisms for cloud systems is becoming increasingly desirable [4, 5].

The main contributions of this paper are summarized as follows:

(i) We present a cloud architecture composed of a task-scheduling decision layer, a resource-provisioning layer, and an actual service layer. Over this multitier cloud architecture, we propose an energy-efficient resource management scheme with a synchronous sleep mechanism.

(ii) We establish a queueing model composed of three subqueues to capture the proposed scheme. Using a Markov chain-based approach, we derive two performance measures: the average latency of requests and the energy-saving rate of the system.

(iii) Taking into account the trade-off between the average latency of requests and the energy-saving rate of the system, we build a system cost function and present an improved Salp Swarm Algorithm (SSA) to optimize the sleep mechanism.

2. Related Work

In this section, we review related work on energy conservation in cloud systems based on virtualization technology, sleep modes, and multitier cloud architectures. We then set forth the motivation for our research.

2.1. Virtualization Technology-Based Energy Conservation Research

In recent years, to utilize physical resources optimally, energy conservation strategies for virtual machine (VM) configuration, migration, and consolidation have become a focus of energy conservation research in cloud systems.

Auday et al. considered the migration and placement of VMs to enhance energy efficiency in cloud infrastructure. To minimize the additional energy consumption generated by VM migration, they proposed a distributed approach to an energy-efficient dynamic VM consolidation policy, which determines which VMs are migrated and where the selected VMs are placed [6]. To address the underutilization of servers in cloud systems, Zakarya et al. used VM consolidation to reduce the number of hosts in use. They explored the impact of VM allocation on energy efficiency and proposed a dynamic VM migration approach in which VMs are migrated only if the migration cost can be recovered [7].

Through modeling the energy-aware allocation and consolidation, Ghribi et al. presented an optimal allocation algorithm with a consolidation algorithm relying on migration of VMs to minimize the overall energy consumption in the cloud system. The allocation algorithm was solved as a bin-packing problem aiming to minimize the energy consumption. The consolidation algorithm was based on a linear and integer formulation of VM migration to adapt the placement for released resources [8]. Aiming to save energy and minimize resource wastage, Sharma et al. proposed a multiobjective VM allocation and migration scheme, in which the allocation of VMs was carried out using a hybrid approach of a genetic algorithm and particle swarm optimization [9]. Based on the virtualization technology, the above research improved the utilization rate of the physical resources in use and contributed to the energy conservation.

2.2. Sleep Mode-Based Energy Conservation Research

The sleep mode-based energy conservation strategy is implemented by switching the idle server to a low-power sleep state for the purpose of reducing idle energy consumption in the cloud system.

Jin et al. proposed a clustered VM allocation strategy on the resource layer of the cloud system based on a sleep mode with a wake-up threshold. By establishing a queue with an -policy and asynchronous vacations of partial servers, they derived the performance measures in terms of the average latency of requests and the energy-saving rate of the system [10]. By using a hybrid shuffled frog leaping algorithm, Luo et al. proposed a dynamic VM allocation scheme, which applied a live VM migration strategy and switched some free resource nodes into a sleep mode to reduce energy consumption [3]. Farahnakian et al. developed a dynamic VM consolidation method to solve the optimization problem for setting the number of active hosts based on the utilization of existing resources. The proposed method could make a decision on when to switch a host into the working or sleep mode [11].

Sridharshini et al. proposed an energy-aware scheduling algorithm and a live migration algorithm to efficiently utilize the resources in a cloud system. These two algorithms were used to consolidate heterogeneous workloads to minimize the number of physical machines (PMs) and switch the idle PMs to the sleep mode to reduce energy consumption [12]. The studies mentioned above showed a certain degree of enhanced energy efficiency due to the introduction of a sleep mode.

2.3. Energy Conservation Research under a Multitier Cloud Architecture

A multitier cloud architecture contains multiple separate parts such as an “application layer,” a “management layer,” and a “resource layer” [13]. Some works have appeared examining the energy consumption management in a multitier cloud architecture.

Usman et al. proposed a cloud architecture composed of four modules: broker, cloud manager, VM manager, and resource scheduler. By using an Interior Search Algorithm (ISA), they developed an energy-efficient VM allocation technique to overcome high energy consumption and reduce under-utilized resources in a cloud system [14]. Aiming to use the computing resources productively and energy efficiently, Beloglazov presented a three-tier cloud architecture composed of a global resource manager, user applications, and resource pools. He proposed a distributed dynamic VM consolidation approach utilizing fine-grained fluctuations in the application workloads to minimize the number of active physical nodes [15].

Zhu et al. proposed a cloud framework composed of four modules: application agent, VM allocation center, global scheduling center, and resource pools. In addition, they designed a resource allocation and scheduling strategy to reduce the energy consumption on both the system level and the component level [16]. In order to promote energy efficiency in a cloud system, Ghosh et al. developed a multitier cloud architecture composed of a resource provisioning decision layer, a VM deployment layer, and an actual service layer. Furthermore, for reducing the complexity of performance analysis, they developed a multilevel interactive stochastic submodel method to derive the performance measures of the system [17]. Obviously, it is more reasonable to study the energy consumption problem by considering a multitier cloud architecture.

2.4. Motivation for Our Research

Inspired by the work mentioned above, in this paper we propose a dynamic energy-efficient resource management scheme for cloud systems. Considering that it is more realistic to study energy conservation under a multitier cloud architecture, we present a cloud architecture composed of a task-scheduling decision layer, a resource-provisioning layer, and an actual service layer. Note that switching all the idle servers to a low-power sleep state may deteriorate response performance. To save energy while maintaining cloud users' quality of experience, we configure the PMs into two pools: a hot pool and a warm pool. The PMs in the hot pool keep working continuously to provide cloud services instantly to arriving requests. The PMs in the warm pool are turned on but remain in a dynamic sleep mode to reduce energy consumption.

In addition, this paper also considers the provisioning process of VMs in both of the two pools. Concretely, each PM is configured with a resource search engine (RSE) that finds an available VM for each request, and the RSE is set to sleep synchronously with all the VMs on the PM to conserve energy. To analyze the proposed scheme, we establish a hybrid queueing system composed of three stochastic submodels with synchronous multiple vacations, and we study the system performance through theoretical analysis and numerical experiments. By building a system cost function, we study the trade-off between different performance measures and present an improved SSA to optimize the sleep mechanism.

The remainder of this paper is organized as follows. In Section 3, by considering a multitier cloud architecture and two PM pools, we propose an energy-efficient resource management scheme with a synchronous sleep mechanism. In Section 4, we establish a hybrid queueing system composed of three submodels. In Section 5, we analyze the steady-state probability distribution of the queueing system by establishing a three-dimensional Markov chain. In Section 6, based on model analysis results, we evaluate the average latency of requests and the energy-saving rate of the system. In Section 7, we show the influence of the sleep mechanism on the performance measures by using numerical results. In Section 8, we present an improved intelligent algorithm to optimize the sleep mechanism. Finally, we summarize the whole paper in Section 9.

3. Scheme Description

Proper deployment of VMs is critical for energy conservation and the Quality of Service (QoS) guarantee in a cloud system. To save energy and maintain the QoS, this paper proposes a dynamic energy-efficient resource management scheme in which the PMs are grouped into two pools: a hot pool and a warm pool. In the hot pool, the PMs run continuously and the VMs hosted on a PM are always available. This means that requests allocated to the hot pool can be served quickly, so the QoS of the cloud system is guaranteed. In the warm pool, a synchronous sleep mechanism is introduced to achieve a better energy-saving effect; the service provided by the warm pool can therefore be delayed by the sleep mechanism. We refer to the PMs, RSEs, and VMs in the hot pool as hot PMs, hot RSEs, and hot VMs, and to those in the warm pool as warm PMs, warm RSEs, and warm VMs. Based on a multitier cloud architecture and this grouping approach for the PMs, we propose the novel resource management scheme shown in Figure 1.

In Figure 1, we assume that each PM is equipped with an RSE and that the maximum number of VMs deployed on one PM is . We also assume that the numbers of identical PMs in the hot pool and the warm pool are and , respectively, where and . The life cycle of a request under the resource management scheme proposed in this paper is as follows:

(1) All requests are assumed to be homogeneous and enter a first-come, first-served (FCFS) queue in the system buffer. The request at the head of the queue first receives the service of the Task Scheduling Decision Engine (TSDE). As long as the hot pool is not full, the request is allocated by the TSDE to the hot pool; otherwise, it is allocated to the warm pool.

(2) A request allocated to the hot pool randomly enters the FCFS queue in one of the hot PM buffers. The request at the head of the queue is processed by an RSE, which finds a VM on the selected PM for resource provision. If at least one idle VM exists on the hot PM, the RSE provisions an available VM to the request, and the request is immediately served by the running VM. After the service is completed, the request departs the system.

(3) A request allocated to the warm pool randomly enters the FCFS queue in one of the warm PM buffers. The service of the request at the head of the queue can be delayed due to the sleep mechanism. On a warm PM, once all the requests are processed, the RSE together with all the VMs enters a sleep period, and a sleep timer is started. When the sleep timer expires, if at least one request exists in the warm buffer, the RSE and all the VMs on the PM wake up; otherwise, they enter the next sleep period.
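The dispatch rule in step (1) and the random in-pool allocation in steps (2) and (3) can be illustrated with a small sketch. The Pool class, its methods, and the finite warm-buffer capacity below are illustrative assumptions on our part, not an interface from the paper.

```python
import random

class Pool:
    """A toy pool of PMs, each with its own FCFS buffer (illustrative only)."""
    def __init__(self, name, num_pms, buffer_capacity):
        self.name = name
        self.buffers = [[] for _ in range(num_pms)]
        self.buffer_capacity = buffer_capacity

    def is_full(self):
        # The pool is "full" when every PM buffer has reached its capacity.
        return all(len(b) >= self.buffer_capacity for b in self.buffers)

    def enqueue(self, request):
        # Steps (2)/(3): the request randomly enters one of the non-full PM buffers.
        candidates = [b for b in self.buffers if len(b) < self.buffer_capacity]
        random.choice(candidates).append(request)

def tsde_dispatch(request, hot, warm):
    # Step (1): prefer the hot pool; overflow goes to the warm pool.
    target = warm if hot.is_full() else hot
    target.enqueue(request)
    return target.name
```

For example, with a hot pool whose single PM buffer holds one request, the first arrival is dispatched to the hot pool and the second overflows to the warm pool.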

In the following sections, we build a hybrid queueing system to mathematically derive the system performance measures and to solve the performance optimization problem for the proposed scheme.

4. System Model

In this section, we model the proposed scheme as three submodels in a continuous-time setting. We then obtain the continuous-time Markov chains (CTMCs) of the hot PM and the warm PM, respectively.

4.1. TSDE Submodel

In cloud systems, some practical requests are independent of each other, while others are correlated. The computing requests initiated by users are usually uncorrelated. Therefore, a Poisson arrival process is considered appropriate for capturing the stochastic behavior of a cloud computing system with uncorrelated traffic [18].

In this research, we focus on user-initiated requests and make the following assumptions. In the request scheduling decision process, the interarrival times of requests and the service times of requests are assumed to be independent and identically distributed (i.i.d.) random variables. Request arrivals at the cloud system presented in this paper are supposed to follow a Poisson process with arrival rate . The service time of a request processed by the TSDE is supposed to follow an exponential distribution with service rate .

Therefore, we build a single-server queue for the task-scheduling decision process. We define the service intensity of the TSDE as the average number of request arrivals at the TSDE during the service time of a request. is given as follows:

We define the latency of a request in the TSDE buffer as the time from the instant the request arrives at the TSDE buffer to the instant it departs the buffer. The average latency of requests in the TSDE buffer is obtained as follows:

Substituting equation (1) into equation (2), we have
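Under the Poisson-arrival and exponential-service assumptions above, the TSDE behaves as a standard M/M/1 queue, so the two quantities can be computed with the textbook formulas. The sketch below assumes those standard forms (the symbol names lam and mu1 are ours, since the paper's symbols are not reproduced in this excerpt).

```python
def tsde_metrics(lam, mu1):
    """Standard M/M/1 quantities for the TSDE (assumed forms).

    lam: Poisson arrival rate of requests; mu1: TSDE service rate.
    """
    rho = lam / mu1                  # service intensity: mean arrivals per service time
    if rho >= 1:
        raise ValueError("TSDE queue is unstable (rho >= 1)")
    w_buffer = rho / (mu1 - lam)     # mean waiting time in the TSDE buffer
    return rho, w_buffer
```

For instance, with arrival rate 2 and service rate 4, the service intensity is 0.5 and the average buffer latency is 0.25 time units.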

4.2. Hot Pool Submodel

In this subsection, we focus on a single hot PM and build a queueing model, called the hot pool submodel, to study the performance of the hot pool. Let , be the capacity of the hot PM buffer. Let random variable be the number of requests in the hot PM buffer at instant . Let random variable , be the state of the RSE, indicating whether it is busy provisioning a VM or not . Each hot VM processes a request by loading a software environment (SE). Let random variable , be the number of hot VMs loaded with an SE at instant t. We call the system level, the system stage, and the system phase. constitutes a three-dimensional continuous-time stochastic process with the following state space:

We assume that a newly arriving request is randomly allocated to one of the hot PMs. The decomposition of a Poisson process yields multiple Poisson processes [19]. The request arrivals at each hot PM are supposed to follow a Poisson process with arrival rate . We have

We assume that the service time of a request processed by the hot RSE follows an exponential distribution with service rate . The service time of a request processed by the hot VM loaded with SE is supposed to follow an exponential distribution with service rate .

Based on these assumptions, the stochastic process can be regarded as a CTMC.

We define as the steady-state probability distribution of the hot PM for the system level being equal to , the system stage being equal to , and the system phase being equal to . is expressed as follows:where .

We define as the steady-state probability distribution vector of the system level being equal to . can be given as follows:

The steady-state probability distribution of the CTMC is composed of . is given as follows:

4.3. Warm Pool Submodel

In order to evaluate the performance of the warm pool, we focus on a warm PM to build a queue model as another submodel of the system called the warm pool submodel. We assume that the capacity of the warm PM buffer is infinite. Let be the number of requests in the warm PM buffer at instant t. Unlike the hot PMs, a synchronous sleep mechanism is introduced to each warm PM. The RSE and all the VMs on one warm PM will go to sleep synchronously if possible. Let , be the state of the warm RSE. means the warm RSE is asleep, means the warm RSE is idle, and means the warm RSE is busy with provisioning a VM for a request. Just like those in the hot pool, each warm VM also needs to load an SE for processing a request. Let , be the number of warm VMs loaded with an SE at instant t. We call the system level, the system stage, and the system phase. constitutes a three-dimensional continuous-time stochastic process with state space as follows:

The general input flow is split into two streams: one into the hot pool and the other into the warm pool. In Section 4.1, the aggregate request arrivals are assumed to follow a Poisson process, so the request arrivals at the warm pool also follow a Poisson process. We assume that a newly arriving request is randomly allocated to one of the warm PMs. The arrival rate of requests at each warm PM is given as follows:where is the arrival rate of requests at the TSDE submodel and is the probability that a newly arriving request is accepted by the hot pool. is calculated as follows:
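The Poisson decomposition described above can be sketched numerically. All names here (p_hot for the hot-pool acceptance probability, n_hot and n_warm for the pool sizes) are ours, introduced for illustration.

```python
def per_pm_arrival_rates(lam, p_hot, n_hot, n_warm):
    """Split the aggregate Poisson stream (decomposition property).

    lam: total arrival rate; p_hot: probability a request is accepted by
    the hot pool; n_hot, n_warm: numbers of hot and warm PMs.
    """
    lam_hot_pm = lam * p_hot / n_hot             # rate seen by each hot PM
    lam_warm_pm = lam * (1.0 - p_hot) / n_warm   # rate seen by each warm PM
    return lam_hot_pm, lam_warm_pm
```

Because a random thinning of a Poisson process is again Poisson, each per-PM stream retains the Poisson property with the reduced rate.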

We assume that the service time of a request processed by the warm RSE follows an exponential distribution with service rate . The service time of a request processed by the warm VM loaded with an SE is supposed to follow an exponential distribution with service rate . A sleep timer is used to control the time length of a sleep period. The time length of the sleep timer is assumed to follow an exponential distribution with sleep parameter .

Based on these assumptions, the stochastic process can be regarded as a CTMC.

We define as the steady-state probability distribution of the warm PM for the system level being equal to , the system stage being equal to , and the system phase being equal to . is given bywhere .

We define as the steady-state probability distribution vector of the warm PM for the system level being equal to . can be given as follows:

The steady-state probability distribution of the CTMC is composed of and given as follows:

5. Model Analysis

In this section, we construct the transition rate matrices of the CTMCs and derive the steady-state probability distributions of the hot PM and the warm PM, respectively.

5.1. Steady-State Probability Distributions of the Hot PM

Let be the one-step state transition rate matrix of the CTMC . Let be the one-step state transition rate submatrix of for the system level changing to , from . For convenience of expression, we denote as , as , and as .

(1) For the case of , there are no requests in the hot PM buffer.

If a new request arrives at the hot PM, the system state changes in one of the following two ways:
(a) When the number of hot VMs loaded with an SE is less than and the hot RSE is idle, the newly arriving request accesses the hot RSE immediately. The system level and the system phase remain unchanged, but the system stage increases by one. The system state transfers to from , with .
(b) When the number of hot VMs loaded with an SE is less than but the hot RSE is busy, or the number of hot VMs loaded with an SE is up to , the newly arriving request has to wait in the hot PM buffer. The system level increases by one, but the system stage and the system phase remain unchanged. The system state transfers to from , or to from , with .

If a request is completely processed by the hot RSE, one of the deployed hot VMs loads the SE and processes this request. The system level remains unchanged, the system stage decreases by one, and the system phase increases by one. The system state transfers to from , with .

If a request is completely processed by a hot VM and departs the system, the SE is removed. The system level and the system stage remain unchanged, but the system phase decreases by one. The system state transfers to from , or to from , with .

Otherwise, the system state remains fixed at , with , or at , with .

In summary, is a matrix given as follows:

is a matrix given as follows:

(2) For the case of , there is at least one request in the hot PM buffer, and the hot PM buffer is not full.

If a new request arrives at the hot PM, the newly arriving request has to wait in the hot PM buffer. The system level increases by one, but the system stage and the system phase remain unchanged. The system state transfers to from or to from , with .

If a request is completely processed by the hot RSE, one of the deployed hot VMs loads the SE and provides service for this request. The system state changes in one of the following two ways:
(a) When the number of hot VMs loaded with an SE is less than , the hot RSE processes the first request waiting in the hot PM buffer immediately. The system level decreases by one, the system stage remains unchanged, and the system phase increases by one. The system state transfers to from , with .
(b) When the number of hot VMs loaded with an SE is up to , the hot RSE becomes idle, and the requests in the hot PM buffer keep waiting. The system level remains unchanged, the system stage decreases by one, and the system phase increases by one. The system state transfers to from with .

If a request is completely processed by a hot VM and departs the system, the SE is removed. The system state changes in one of the following two ways:
(a) When the hot RSE is idle, the first request waiting in the hot PM buffer accesses the hot RSE immediately. The system level and the system phase decrease by one, and the system stage increases by one. The system state transfers to from with .
(b) When the hot RSE is busy, the requests in the hot PM buffer keep waiting. The system level and the system stage remain unchanged, but the system phase decreases by one. The system state transfers to from , with .

Otherwise, the system state remains fixed at with , or at , with .

In summary, is an matrix given as follows:

Let represent . is an matrix given as follows:

Let represent . is an lower triangular matrix given by

Let represent . is an diagonal matrix given as follows:

(3) For the case of , the hot PM buffer is full. Therefore, no new requests can join the hot PM.

If a request is completely processed by the hot RSE, one of the deployed VMs loads the SE and processes this request. The system state changes in one of the following two ways:
(a) When the number of hot VMs loaded with an SE is less than , the hot RSE processes the first request waiting in the hot PM buffer immediately. The system level decreases by one, the system stage remains unchanged, and the system phase increases by one. The system state transfers to from , with .
(b) When the number of hot VMs loaded with an SE is up to , no other hot VMs can be provisioned by the hot RSE, so the hot RSE becomes idle, and all the requests in the hot PM buffer keep waiting. The system level remains unchanged, the system stage decreases by one, and the system phase increases by one. The system state transfers to from with .

If a request is completely processed by a hot VM and departs the system, the SE is removed. The system state changes in one of the following two ways:
(a) When the hot RSE is idle, the first request waiting in the hot PM buffer is processed by the hot RSE immediately. The system level and the system phase decrease by one, and the system stage increases by one. The system state transfers to from with .
(b) When the hot RSE is busy, all the requests in the hot PM buffer keep waiting. The system level and the system stage remain unchanged, but the system phase decreases by one. The system state transfers to from , with .

Otherwise, the system state remains fixed at with , or at , with .

Obviously, is an matrix, and . is an lower triangular matrix given as follows:

At this point, we have obtained all the submatrices in the one-step state transition rate matrix . can be written as follows:

The steady-state probability distribution of the CTMC satisfies the following equilibrium equation and normalization condition:where is an vector with all elements being equal to 1.

By solving equation (23), we derive the steady-state probability distribution of the CTMC , where .

5.2. Steady-State Probability Distribution of the Warm PM

Let be the one-step state transition rate matrix of the CTMC . Let be the one-step state transition rate submatrix of for the system level changing to , from . We denote as , as , and as .

(1) For the case of , there are no requests in the warm PM buffer.

If a new request arrives at the warm PM, the system state changes in one of the following three ways:
(a) When the warm RSE and the warm VMs are asleep, the newly arriving request has to wait in the warm PM buffer until the sleep timer expires. The system level increases by one, but the system stage and the system phase remain unchanged. The system state transfers to from with .
(b) When the warm RSE and the warm VMs are awake and the number of warm VMs loaded with an SE is up to , no other warm VMs can be provisioned by the warm RSE. The newly arriving request has to wait in the warm PM buffer. The system level increases by one, but the system stage and the system phase remain unchanged. The system state transfers to from with .
(c) When the warm RSE and the warm VMs are awake and the number of warm VMs loaded with an SE is less than , at least one VM can be provisioned. If the warm RSE is busy, the newly arriving request has to wait in the warm PM buffer. The system level increases by one, but the system stage and the system phase remain unchanged. The system state transfers to from , with . If the warm RSE is idle, the newly arriving request accesses the warm RSE immediately. The system level and the system phase remain unchanged, but the system stage increases by one. The system state transfers to from , with .

If a request is completely processed by the warm RSE, one of the deployed warm VMs loads the SE and processes this request. The system level remains unchanged, the system stage decreases by one, and the system phase increases by one. The system state transfers to from , with .

If a request is completely processed by a warm VM and departs the system, the system state changes in one of the following two ways:
(a) When the warm RSE is idle and there is only one warm VM loaded with an SE, the used SE is removed, and the warm RSE and the warm VMs enter a sleep period immediately. The system level remains unchanged, but the system stage and the system phase decrease by one. The system state transfers to from with .
(b) When the warm RSE is idle and there are at least two warm VMs loaded with an SE, or the warm RSE is busy and there is at least one warm VM loaded with an SE, the used SE is removed. The system level and the system stage remain unchanged, but the system phase decreases by one. The system state transfers to from , or to from , with .

Otherwise, the system state remains fixed at with , at , with , or at , with .

In summary, is a matrix given as follows:

is a matrix given as follows:

(2) For the case of , there is at least one request in the warm PM buffer.

If there are no new request arrivals on the warm PM, while the warm RSE and the warm VMs are asleep, once the sleep timer expires, the warm RSE wakes up and processes the first request waiting in the warm PM buffer immediately. The system level decreases by one, the system stage increases by two, and the system phase remains unchanged. The system state transfers to from with .

If a new request arrives at the warm PM, the newly arriving request has to wait in the warm buffer. The system level increases by one, but the system stage and the system phase remain unchanged. The system state transfers to from , to from , or to from , with .

If a request is completely processed by the warm RSE, one of the deployed warm VMs loads the SE and provides service for this request. The system state changes in one of the following two ways:
(a) When the number of warm VMs loaded with an SE is less than , the warm RSE processes the first request waiting in the warm PM buffer immediately. The system level decreases by one, the system stage remains unchanged, and the system phase increases by one. The system state transfers to from , with .
(b) When the number of warm VMs loaded with an SE is up to , no other warm VMs can be provisioned by the warm RSE. Therefore, the warm RSE becomes idle and none of the requests waiting in the warm PM buffer can access the warm RSE. The system level remains unchanged, the system stage decreases by one, and the system phase increases by one. The system state transfers to from with .

If a request is completely processed by a warm VM and departs the system, the used SE is removed. The system state changes in one of the following two ways:
(a) When the warm RSE is idle, the first request waiting in the warm PM buffer accesses the warm RSE immediately. The system level and the system phase decrease by one, but the system stage increases by one. The system state transfers to from with .
(b) When the warm RSE is busy, none of the requests waiting in the warm PM buffer can access the warm RSE. The system level and the system stage remain unchanged, but the system phase decreases by one. The system state transfers to from , with .

Otherwise, the system state remains fixed at with , at with , or at , with .

In summary, is an matrix given as follows:

Let represent . is an upper triangular matrix given as follows:

Let represent . is an lower triangular matrix given by

Let represent . is an diagonal matrix given as follows:

At this point, we have obtained all the submatrices in the one-step state transition rate matrix . can be written as follows:

Based on the structure of the one-step state transition rate matrix , the three-dimensional CTMC of the warm PM can be regarded as a type of Quasi Birth-and-Death (QBD) process. Thus, we can apply the method of a matrix-geometric solution [20, 21] to derive the steady-state probability distribution of the CTMC , where .

First, we set up a matrix quadratic equation as follows:

Since must be nonsingular, from equation (31), we have

By further deriving from equation (32), we obtain the following, where and .

In order to compute the rate matrix , we present an iteration algorithm in Table 1.
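The iteration of Table 1 is not reproduced in this text, but the standard fixed-point scheme for the minimal nonnegative solution of the matrix quadratic equation can be sketched as follows; the block names A0 (upward), A1 (local), and A2 (downward) are our own labels for the level-transition submatrices, not necessarily the paper's notation:

```python
import numpy as np

def solve_rate_matrix(A0, A1, A2, tol=1e-10, max_iter=10_000):
    """Iterate R <- -(A0 + R^2 A2) A1^{-1} until convergence, yielding the
    minimal nonnegative solution of A0 + R A1 + R^2 A2 = 0 for a QBD."""
    A1_inv = np.linalg.inv(A1)   # A1 is nonsingular for a stable QBD
    R = np.zeros_like(A0)
    for _ in range(max_iter):
        R_next = -(A0 + R @ R @ A2) @ A1_inv
        if np.max(np.abs(R_next - R)) < tol:
            return R_next
        R = R_next
    raise RuntimeError("rate-matrix iteration did not converge")
```

As a scalar sanity check, an M/M/1 queue with arrival rate 1 and service rate 2 corresponds to A0 = [[1.0]], A1 = [[-3.0]], A2 = [[2.0]], and the iteration converges to R = [[0.5]], i.e., the utilization.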

Using the rate matrix obtained in Table 1, we further construct a square matrix as follows:

The steady-state probability distribution vectors and satisfy the following equation, where is a vector and is an vector, respectively, with all elements equal to 1.

By using the Gauss–Seidel method, we solve equation (35) to obtain and . Other steady-state probability distribution vectors , satisfy the matrix-geometric solution form as follows:

Up to this point, we can mathematically give the steady-state probability distribution of the CTMC .
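As an illustration of the matrix-geometric form and its normalization, the following toy check uses an M/M/1 queue (our own example, not the paper's warm-PM model): the tail vectors satisfy pi_k = pi_1 R^(k-1), and their sum is pi_1 (I - R)^(-1).

```python
import numpy as np

lam, mu = 1.0, 2.0                 # toy arrival and service rates
R = np.array([[lam / mu]])         # for M/M/1 the rate matrix is the scalar rho

# Boundary vectors from the boundary balance equations; for M/M/1,
# pi_0 = 1 - rho and pi_1 = pi_0 * R.
pi0 = np.array([[1.0 - lam / mu]])
pi1 = pi0 @ R

# Normalization: pi_0 e + pi_1 (I - R)^{-1} e must equal 1,
# since sum_{k>=1} pi_1 R^{k-1} = pi_1 (I - R)^{-1}.
e = np.ones((1, 1))
total = (pi0 @ e + pi1 @ np.linalg.inv(np.eye(1) - R) @ e).item()
print(total)  # -> 1.0
```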

6. Performance Measures

In this section, we present two performance measures of the cloud system: the average latency of requests and the energy-saving rate of the system.

The service intensity of the warm PM is given as follows, where is the arrival rate of requests at a warm PM, is the service rate of a request on a warm VM, and is the service rate of a request on the warm RSE.

For the proposed scheme, the service intensity of the system is given as follows, where is the service intensity of the TSDE.

The necessary and sufficient condition for the system to be stable is . We evaluate the average latency of requests and the energy-saving rate of the system under the condition that the service intensity .

We define the latency of a request as the time from the instant a request arrives at the cloud system to the instant it begins receiving service. In this paper, the average latency of requests in the cloud system includes the average latency of requests queueing in the TSDE buffer and the average latency of requests queueing in the hot PM buffer or the warm PM buffer.

In Section 4.1, the average latency of requests queueing in the TSDE buffer has already been obtained. Next, we need to compute the average latency of requests queueing in the hot PM buffer or the warm PM buffer.

Using the steady-state probability distribution of the CTMC given in Section 5.1, the average number of requests queueing in the hot PM buffer can be given by

For analytical convenience, we tag one of the hot PMs. Based on Little's law, the average latency of requests queueing in the buffer of the tagged hot PM is obtained as follows, where is the arrival rate of requests at the tagged hot PM.

We also tag one of the warm PMs. Using the steady-state probability distribution of the CTMC given in Section 5.2, the average number of requests queueing in the buffer of the tagged warm PM is given as follows. The quantity shown in equation (41) is a sufficiently large number satisfying the following equation, where , called the precision factor of the average number of requests in the warm PM buffer, controls the precision of this average. The smaller the value of is, the more precisely the average number of requests in the warm PM buffer is computed.
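The truncation with a precision factor can be sketched as follows; the stopping index plays the role of the sufficiently large number in equation (42), and the M/M/1 values of pi_1 and R below are stand-ins for the paper's actual warm-PM quantities:

```python
import numpy as np

def truncated_mean_queue(pi1, R, eps=1e-10, k_max=1_000_000):
    """Sum k * pi_k * e with pi_k = pi_1 R^(k-1), stopping once the k-th
    term falls below the precision factor eps."""
    e = np.ones((R.shape[0], 1))
    total, pi_k = 0.0, pi1
    for k in range(1, k_max):
        term = k * (pi_k @ e).item()
        total += term
        if term < eps:
            return total, k          # k is the truncation point
        pi_k = pi_k @ R              # advance to pi_{k+1}
    raise RuntimeError("series did not fall below eps")

# Stand-in M/M/1 values with rho = 0.5: the exact mean queue length
# is rho / (1 - rho) = 1.
rho = 0.5
pi1 = np.array([[(1 - rho) * rho]])
R = np.array([[rho]])
mean_len, K = truncated_mean_queue(pi1, R)
print(round(mean_len, 6))  # -> 1.0
```

A smaller eps pushes the truncation point K further out and tightens the sum, matching the remark that a smaller precision factor yields a more precise average.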

Accordingly, the average latency of requests queueing in the tagged warm PM buffer is obtained as follows, where is the arrival rate of requests at the tagged warm PM.

Combining equations (40) and (43), the average latency of requests queueing in the hot PM buffer or the warm PM buffer can be obtained as follows, where is the probability that a newly arriving request can be accepted by the hot pool.

In summary, the average latency of requests queueing in the cloud system can be derived by

Since the TSDE and the hot PMs are always running, their energy consumption is essentially constant. The energy-saving rate of the system is therefore measured as the energy conserved per unit time in the warm pool.

When the warm PMs are in the active state, energy is consumed normally, just as in the TSDE and the hot PMs. Let , be the energy consumption per second of the warm pool in the active state, and let , be the energy consumption per second of the warm pool in the sleep state. When the warm PMs are in the sleep state, less energy is consumed, so obviously .

For a sleeping warm PM, when a sleep period is about to expire, the RSE and the VMs need to monitor the PM buffer, which consumes additional energy. Let , be the energy consumption for each monitoring. Additional energy is also consumed when a warm PM wakes up from the sleep state. Let , be the energy consumption for each wake-up.

Therefore, in this paper, the energy-saving rate of the system is given as follows, where is the sleep parameter of the proposed dynamic sleep mechanism defined in Section 4.3.

The quantity , shown in equation (46), is a sufficiently large number satisfying the following equation, where , called the precision factor of the energy-saving rate of the system, controls the precision of this rate. The smaller the value of is, the more precisely the energy-saving rate of the system is computed.

7. Numerical Results

To numerically analyze the average latency of requests and the energy-saving rate of the system under the proposed scheme, we carry out experiments in MATLAB. All the experiments are run on a PC with an Intel(R) Core(TM) i7-4790 CPU @ 3.60 GHz, 8.00 GB RAM, and a 500 GB disk. The parameters used in the experiments are listed in Table 2.

7.1. Numerical Results for the Average Latency of Requests

Figure 2 illustrates how the average latency of requests changes with the sleep parameter for different numbers of hot PMs and warm PMs.

In Figures 2(a) and 2(b), we show the average latency of requests for the different service rates of a request on a hot VM and the different capacities of a hot PM buffer, respectively. In Figures 2(c) and 2(d), we show the average latency of requests for the different service rates of a request on a warm VM and the different service rates of a request on a warm RSE, respectively.

From Figure 2, we notice that, as the sleep parameter increases, the average latency of requests first decreases and then levels off.

When the sleep parameter is small, a newly arriving request has to wait longer in the buffer of a sleeping warm PM. As the sleep parameter grows, the waiting time of a request in the warm PM buffer shortens, so the average latency of requests shows a downtrend. This implies that the sleep mechanism has a greater influence on the response performance of the system when the sleep parameter is small.

When the sleep parameter grows beyond a certain value, the length of a sleep period is close to zero, so a warm PM has little chance to go to sleep. As a result, the average latency of requests levels off as the sleep parameter increases. This implies that the proposed sleep mechanism has little effect on the response performance of the system when the sleep parameter is large enough.

For the same sleep parameter in Figures 2(a) and 2(b), we notice that the average latency of requests goes up as the capacity of a hot PM buffer increases: the larger a hot PM's buffer capacity is, the longer requests wait in the hot PM buffer, which raises the average latency. We also notice that the average latency of requests decreases as the service rate of a request on a hot VM increases: the higher the service rate is, the less time a request occupies the hot VM.

Comparing Figure 2(a) with Figure 2(b), we find that, for the same capacity of a hot PM buffer, the same service rate of a request on a hot VM, and the same sleep parameter , the average latency of requests becomes lower as the number of hot PMs increases. The more PMs are deployed in the hot pool, the earlier the requests arriving at the hot pool receive service, so the average latency of requests shows a downtrend. In addition, we find that, when the sleep parameter is smaller, the downtrend in the average latency of requests becomes less pronounced as the number of hot PMs increases. This implies that the more PMs are deployed in the hot pool, the weaker the influence of the sleep mechanism on the response performance of the system becomes.

For the same sleep parameter in Figures 2(c) and 2(d), we observe that the average latency of requests rises as the service rate of a request on a warm VM increases. When this service rate is higher, the warm RSE and the warm VMs are more likely to be idle, so the warm PM is more likely to be asleep, which causes requests to wait longer in the warm PM buffer and raises the average latency. We also observe that the average latency of requests decreases as the service rate of a request on the warm RSE increases: the higher this service rate is, the less time a request occupies the warm RSE.

Comparing Figure 2(c) with Figure 2(d), we find that, for the same service rate of a request on a warm VM, the same service rate of a request on the warm RSE, and the same sleep parameter , a greater number of warm PMs gives a lower average latency of requests. The more PMs are deployed in the warm pool, the earlier the requests arriving at the warm pool receive service, so the average latency is reduced. In addition, we find that, when the sleep parameter is smaller, the downtrend in the average latency of requests becomes sharper as the number of warm PMs increases. This implies that the more PMs are deployed in the warm pool, the stronger the influence of the sleep mechanism on the response performance of the system becomes.

7.2. Numerical Results for the Energy-Saving Rate of the System

Figure 3 shows the trends of the energy-saving rate of the system with the sleep parameter for different numbers of hot PMs and warm PMs.

In Figures 3(a) and 3(b), we show the energy-saving rate of the system for the different service rates of a request on a hot VM and the different capacities of a hot PM buffer, respectively. In Figures 3(c) and 3(d), we show the energy-saving rate of the system for the different service rates of a request on a warm VM and the different service rates of a request on a warm RSE, respectively.

From Figure 3, we notice that the energy-saving rate of the system decreases as the sleep parameter increases. When the sleep parameter is small, the energy-saving rate of the system is high: the smaller the sleep parameter is, the longer a sleep period lasts. In this case, frequent listening and waking up of the warm RSE and the warm VMs are avoided, so the additional energy use is reduced.

As the sleep parameter grows, the energy-saving rate of the system decreases: the larger the sleep parameter is, the shorter a sleep period lasts. In this case, the warm RSE and the warm VMs listen to the buffer and wake up from sleep frequently, which causes additional energy consumption.

For the same sleep parameter in Figures 3(a) and 3(b), we notice that the energy-saving rate of the system goes up as the capacity of a hot PM buffer or the service rate of a request on a hot VM increases. The larger the capacity of a hot PM buffer is, the more requests the hot PM can accept; the higher the service rate of a request on a hot VM is, the less time a request occupies the hot VM. Either way, the processing capability of a hot PM becomes stronger. In this case, fewer requests are allocated to the warm pool, so the warm PMs are more likely to be in the sleep state and the energy-saving rate of the system is greater.

Comparing Figure 3(a) with Figure 3(b), we find that, for the same capacity of a hot PM buffer, the same service rate of a request on a hot VM, and the same sleep parameter , a larger number of hot PMs leads to a higher energy-saving rate of the system. The more PMs are deployed in the hot pool, the stronger the processing capability of the hot pool is. In this case, fewer requests are allocated to the warm pool, so the warm PMs are more likely to be in the sleep state, which increases the energy-saving rate of the system. In addition, we find that the more PMs are deployed in the hot pool, the closer the energy-saving rates become across different hot PM buffer capacities and different service rates of a request on a hot VM. This implies that the capacity of the hot PM buffer and the service rate of a request on a hot VM have less influence on the energy-saving rate of the system as the number of hot PMs rises.

For the same sleep parameter in Figures 3(c) and 3(d), we observe that the energy-saving rate of the system rises as the service rate of a request on a warm VM or the service rate of a request on the warm RSE grows. The higher the service rate on the warm RSE is, the less time a request occupies the warm RSE; likewise, the higher the service rate on a warm VM is, the less time a request occupies the warm VM. In this case, the warm PM is more likely to become idle and enter a sleep period, so the energy-saving rate of the system shows a growth trend.

Comparing Figure 3(c) with Figure 3(d), we find that, for the same service rate of a request on a warm VM, the same service rate of a request on the warm RSE, and the same sleep parameter , a greater number of warm PMs leads to a higher energy-saving rate of the system. The more PMs are deployed in the warm pool, the stronger the processing capability of the warm pool is. In this case, the probability of a warm PM being idle is higher, so the warm PM is more likely to be in the sleep state, which increases the energy-saving rate of the system. In addition, we find that the more PMs are deployed in the warm pool, the closer the energy-saving rates become across different service rates of a request on a warm VM and on the warm RSE. This implies that, when the number of warm PMs is greater, the energy-saving rate of the system is barely affected by the service rate of a request on a warm VM or on the warm RSE.

8. Performance Optimization

Based on the numerical results in Section 7, we find that, as the sleep parameter increases, the average latency of requests shows a downward trend, and the energy-saving rate of the system also decreases. This indicates that, when the sleep parameter tends to infinity, the average latency of requests is minimized while the energy-saving rate approaches zero; in this case, the energy-saving mechanism does not work at all. Conversely, when the sleep parameter tends to zero, the energy-saving rate is maximized, but the average latency of requests becomes unacceptably large; in this case, the cloud system cannot provide service normally. How to optimally set the sleep parameter is therefore an important issue in any energy-efficient resource management scheme. In this paper, the optimization criterion is to balance the different performance measures. To this end, we combine the average latency of requests and the energy-saving rate of the system and construct a cost function as follows, where and are the weighting factors of the average latency of requests and the energy-saving rate of the system, respectively, in the cost function. Note that the higher the cloud user's demand for response performance is, the larger should be set; the higher the cloud provider's demand for energy efficiency is, the larger should be set.

We note that it is difficult to express the average latency of requests and the energy-saving rate of the system in closed form, so we cannot easily determine the monotonicity of the cost function. To minimize the system cost and optimize the sleep parameter , we introduce a swarm-based algorithm, the SSA.

SSA is an intelligent search optimization algorithm inspired by the swarming behaviour of salps. In 2017, Mirjalili et al. first established a mathematical model of salp chains and presented the SSA for solving various optimization problems [22]. SSA has only one main control parameter, so it is simple and easy to implement. However, like other swarm-based algorithms, SSA suffers from low convergence precision and slow convergence when dealing with high-dimensional, complex optimization problems [23]. In the classical SSA optimization process, global exploration and local exploitation must be balanced; if they are not, the algorithm easily falls into a local optimum and convergence stagnates. Consequently, in this paper, we present an improved SSA that introduces logistic chaotic initialization and an adaptive inertia weight [24]. We call this improved SSA LA-SSA.

In LA-SSA, we first adopt a logistic chaotic mapping to generate the initial salp population. This enhances the diversity of the initial individuals and improves the convergence speed of the algorithm in the early stage. Second, we introduce an adaptive inertia weight into the follower position update. The inertia weight reflects the degree to which a follower inherits the position of the preceding salp; if a follower sits at a locally optimal solution, classical SSA easily gets trapped there and convergence stagnates. To improve the convergence precision and help SSA escape local optima, we therefore adopt a linearly declining inertia weight, which determines how strongly the previous individual influences the current one. As a result, the salp individuals retain a strong global search capacity early on, and relatively accurate results are obtained in the later stage.

Table 3 shows the main steps of the LA-SSA.
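The steps of Table 3 are not reproduced in this text; the sketch below is a minimal one-dimensional LA-SSA under our own reading of the description, with a stand-in quadratic cost in place of the paper's cost function, half of the chain treated as leaders (a common SSA variant; the split is not specified here), and a linearly declining inertia weight w in the follower update:

```python
import numpy as np

def la_ssa(cost, lb, ub, n_salps=30, n_iter=200, z0=0.7):
    """Minimize cost(x) on [lb, ub] with a logistic-chaos-initialized,
    inertia-weighted salp swarm (1-D LA-SSA sketch)."""
    # Logistic chaotic initialization: z <- 4 z (1 - z), mapped onto [lb, ub].
    z = np.empty(n_salps)
    z[0] = z0
    for i in range(1, n_salps):
        z[i] = 4.0 * z[i - 1] * (1.0 - z[i - 1])
    x = lb + (ub - lb) * z

    best_x = float(x[np.argmin([cost(v) for v in x])])  # food source
    best_f = cost(best_x)

    rng = np.random.default_rng(42)
    for t in range(1, n_iter + 1):
        c1 = 2.0 * np.exp(-((4.0 * t / n_iter) ** 2))   # classical SSA leader coefficient
        w = 0.9 - (0.9 - 0.4) * t / n_iter              # linearly declining inertia weight
        for i in range(n_salps):
            if i < n_salps // 2:                        # leaders move around the food source
                c2, c3 = rng.random(), rng.random()
                step = c1 * ((ub - lb) * c2 + lb)
                x[i] = best_x + step if c3 >= 0.5 else best_x - step
            else:                                       # followers, damped by the inertia weight
                x[i] = (w * x[i] + x[i - 1]) / 2.0
            x[i] = np.clip(x[i], lb, ub)
            f = cost(x[i])
            if f < best_f:                              # greedily track the best solution
                best_x, best_f = float(x[i]), f
    return best_x, best_f
```

For instance, `la_ssa(lambda v: (v - 3.0) ** 2, 0.0, 10.0)` returns a point near 3 with a cost near zero; for the proposed scheme, the cost function of Section 8 evaluated through the queueing model would take its place, with the sleep parameter as the decision variable.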

In addition to the parameters in Table 2, we set , , , , , , , , and as an example in the LA-SSA to optimize the dynamic energy-efficient resource management scheme proposed in this paper. For different capacities of a hot PM buffer and different service rates of a request on a warm VM, we obtain the optimal sleep parameter and the minimum cost in Table 4.

From Table 4, we observe that, for the same capacity of a hot PM buffer, the optimal sleep parameter maintains an upward trend as the service rate of a request on a warm VM increases. In contrast, the minimum cost shows a downward trend when the service rate of a request on a warm VM goes up.

9. Summary

Considering the large amount of energy consumed by cloud data centers, we proposed a dynamic energy-efficient resource management scheme under a multitier cloud architecture. To improve energy efficiency while maintaining the quality of experience for cloud users, we grouped the PMs into different resource pools and introduced a synchronous sleep mechanism in the warm pool. By establishing a Markov chain, we obtained the average latency of requests and the energy-saving rate of the system. In addition, we provided numerical results to study the influence of the sleep mechanism on the system performance. To balance the different performance measures, we constructed a system cost function. Moreover, we presented an improved SSA to obtain the optimal sleep parameters and the minimum costs.

In future work, we plan to study energy conservation in cloud systems with heterogeneous cloud users and PMs. Furthermore, we intend to analyze the system models under more general stochastic processes, such as the Markovian Arrival Process (MAP) and the Markovian Service Process (MSP).

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (nos. 61872311 and 61973261) and in part by MEXT and JSPS KAKENHI Grant JP17H01825, Japan.