A Microservice Resilience Deployment Mechanism Based on Diversity

Yu, Hang; Wang, Xiulei; Xing, Changyou; Xu, Bo

doi:https://doi.org/10.1155/2022/7146716

Security and Communication Networks

Research Article | Open Access

Volume 2022 | Article ID 7146716 | https://doi.org/10.1155/2022/7146716

Hang Yu,¹ Xiulei Wang,¹ Changyou Xing,¹ and Bo Xu¹

Academic Editor: Juhriyansyah Dalle

Received 25 Oct 2021

Revised 18 Feb 2022

Accepted 11 May 2022

Published 01 Jun 2022

Abstract

The microservice architecture has many advantages, such as technology heterogeneity, isolation, scalability, simple deployment, and convenient optimization. These advantages matter because diversity and redundancy can be used to improve the resilience of software systems. It is therefore necessary to study how to improve the resilience of software systems through diverse implementation and redundant deployment of core software components based on the microservice framework. Optimizing the diversity deployment of microservices is key to maximizing system resilience while making full use of resources. To solve this problem, an efficient microservice diversity deployment mechanism is proposed in this paper. Firstly, we define a load balancing indicator and a diversity indicator. Based on these, a diversified microservice deployment model is established to maximize the resilience and the resource utilization of the system. Secondly, a microservice deployment algorithm based on load balancing and diversity is proposed, which reduces the service’s dependence on the underlying image by enriching diversity and avoids the overreliance of microservices on a single node through decentralized deployment, while taking load balancing into account. Finally, we conduct experiments to evaluate the performance of the proposed algorithm. The results demonstrate that the proposed algorithm outperforms the other algorithms compared.

1. Introduction

The ability of the system to maintain the continuous running of services and ensure the completion of tasks is called system resilience [1]. In order to enhance system resilience, service providers usually adopt the idea of diversity and redundancy to deploy software [2]. Typically, the traditional monolithic architecture requires the software to be deployed as a whole, which makes implementing diversity and redundancy costly [3]. In fact, most user requests can be served by only some of the functional modules in the software system, and only a very small number of requests need all the functional modules to complete the response.

Microservice architecture [4] decouples software system functions into multiple microservice components, and these components can be developed using container images [5], which greatly improves their reusability. Redundant deployment under microservice architecture is more flexible and efficient than under monolithic architecture. At the same time, however, high reusability also means that multiple component containers deployed from the same image are homogeneous. Studies have shown that security vulnerabilities are prevalent in container images [6]. So once an attacker finds a vulnerability in a microservice component of the target system, the attacker can repeatedly exploit it to attack every container that uses the same image, severely affecting the system’s ability to provide services.

Currently, researchers generally agree that diversity [7] methods such as heterogeneity are an effective way to solve homogeneity problems [2]. For systems with diversity, attackers can succeed in an attack if and only if they obtain the vulnerabilities of every implementation version, which makes the attack very difficult. At the same time, because the system may switch among multiple implementation versions each time it provides services to users, even if the attacker succeeds temporarily, the attack is difficult to sustain. As a result, the more versions there are, the higher the diversity metrics are, and the more difficult it is for an attack to succeed.

This paper aims to solve the problem of software system diversity deployment under microservice architecture by using virtualization technology, which eliminates the impact of infrastructure platform inconsistency and realizes the deployment of heterogeneous containers on heterogeneous infrastructure. By further enriching the diversity of software systems, attackers cannot take advantage of the homogeneity of containers to destroy and strike software systems of microservice architectures on a large scale, thereby ensuring the resilience of software systems.

The main contributions of this paper are as follows:
(1) A load balancing indicator and a diversity indicator are defined. Based on these, a diversified microservice deployment model is established to maximize the resilience of the system.
(2) A microservice deployment algorithm based on load balancing and diversity is proposed, which reduces the service’s dependence on the underlying image by enriching diversity and avoids the overreliance of microservices on a single node through decentralized deployment, while taking load balancing into account.
(3) Experiments are designed to evaluate the performance of the proposed algorithm. The results demonstrate that the proposed algorithm outperforms the other comparison algorithms.

The remainder of this paper is organized as follows. In Section 2, an overview of the related work is provided. In Section 3, the typical scenarios are shown and the problem is analyzed. In Section 4, the load balancing indicator and the diversity indicator are given. Based on these two indicators, the optimization model of the problem is given, which is proved to be NP-hard. In Section 5, a heuristic algorithm, LB-Diversity, is proposed, and the performance of this algorithm is shown in Section 6. The application scenario is discussed in Section 7, followed by the conclusions of this study in Section 8 and the acknowledgement in Section 9.

2. Related Work

At present, researchers have proposed some practical solutions to the service deployment problem on microservice architecture, but these studies mainly focus on efficiency. Wan [8] proposed a resource allocation algorithm, EPTA (EC Placement and Task Assignment Algorithm), using the hierarchical characteristics of Docker containers. This algorithm significantly reduces deployment costs by balancing wake-up costs, installation costs, and communication costs. Fan [9] proposed latency-, reliability-, and load-balance-aware scheduling algorithms to determine container-based microservice deployments in edge computing. By improving and optimizing Particle Swarm Optimization (PSO), the LRLBAS algorithm is proposed.

As the software system is implemented based on microservice architecture, which has the advantage of loose coupling, it can be divided into several independent parts and each part works independently. Combined with container technology, individual parts can be packaged into different instances as components to build a complete software system. This provides great potential for resilience enhancement of software systems. Based on this, the application of redundancy technology can be realized to improve the availability of the system.

This redundancy strategy has been widely used in the deployment of Service Function Chains (SFC). Hmaity et al. [10] proposed three redundancy protection schemes for SFC, namely, (1) end-to-end redundancy, (2) virtual node redundancy protection, and (3) virtual link redundancy protection. Although these three schemes can protect nodes and links in a targeted manner, such extensive protection wastes resources, since the authors did not study how much redundancy is needed. In addition, the proposed virtual link protection scheme may map both the primary virtual link and the backup virtual link to the same physical link, which may weaken resilience. Han [11] conducted performance analysis experiments on workloads to obtain refined resource requirements and performed microservice deployment based on a greedy heuristic algorithm. The innovation of the algorithm lies in using the refined resource requirements derived from the analysis to account for application performance, but the algorithm considers only the case of a single service chain. Joseph [12] proposed a novel heuristic method, IntMA (Interaction-aware Microservice Allocation), which deploys microservices in an interaction-aware manner using the interaction information obtained from the interaction graph. However, none of the above work on deployment considers diversity.

Due to the extensive use of redundancy to ensure availability, most of the microservices (components) in the software system of the microservice architecture are homogeneous. Systems built on Docker container technology introduce the homogeneity brought by using the same base image, which leads to a large number of shared vulnerabilities being introduced into the system [6, 13]. Therefore, the system will be vulnerable to multi-step attacks [14, 15] that exploit the same vulnerability in multiple microservices. Diversity is considered as the most effective means of mitigating the shared vulnerability problem. After the introduction of diversity, how to deploy multi-version microservice components in a server cluster to make the system more secure and efficient has become a problem that researchers need to pay attention to.

At present, some research work is carried out in combination with diversity. Considering the security risks that may be caused by the homogeneity of operating systems, Zhang [16] proposed a diversity-based security deployment strategy for virtual machines using heterogeneous operating systems, which effectively reduced the attacker’s unit attack revenue; however, only the secure deployment of virtual machines is considered, and resource constraints and other issues are not. Xie [17] proposed a deployment method based on the heterogeneity of nodes, aiming to ensure the heterogeneity of nodes during redundancy backup and remapping. Torkura [18] applied Moving Target Defense (MTD) technology to the microservice architecture for the first time. Using the MTD mechanism, risk analysis was performed on microservices to detect and prioritize vulnerabilities, and then automatic code generation technology was used to transform the programming language and container images of microservices. Minimizing shared vulnerabilities through diversified deployment of security strategies brings uncertainty to attackers and reduces the attackability of systems under the microservice architecture, thereby overcoming the security risks brought by homogeneous microservices. Alleg [2] proposed a placement solution for Service Function Chains (SFC), modeled as a Mixed Integer Linear Program (MILP), designed to meet the availability level of the target SFC and reduce the cost escalation due to diversity and redundancy. This approach can allocate fewer resources to backup instances while avoiding service disruption. Qing [19] proposed an adaptive spatiotemporal diversity joint scheduling strategy, SASTD (Self Adaptive Spatiotemporal Diversity). The strategy combines spatiotemporal diversity and coarse-grained intrusion detection, overcomes the high defense cost and poor effect of a single strategy, and improves the defense capability of the system at a lower cost. However, the article does not discuss the impact of resource consumption, nor does it take the deployment location into account.

In addition to this, researchers have proposed various security strategies such as Address Space Layout Randomization (ASLR) [20], instruction-level [21], and basic-block-level [22] transformations to address such challenges.

The deployment of services under the microservice architecture can be understood as an optimization problem under certain constraints, and this problem is NP-hard. To solve for a deployment scheme, researchers usually choose heuristic algorithms such as the genetic algorithm [23], the simulated annealing algorithm [24], and the list search algorithm [25]. In addition, the Viterbi algorithm [17, 26] has also been mentioned by researchers.

We provide a comparison of the work mentioned in this section with ours in Table 1. As shown in Table 1, our proposed method considers both diversity and redundancy. Compared with the other methods, more factors are considered during optimization. In addition, the Docker containers we use are finer-grained and better suited to cloud-native environments.

3. Problem Statement

Existing research shows that the resilience and damage resistance of software systems can be effectively improved through diversity technology, but under the existing monolithic architecture, realizing software system diversity faces great challenges. The emergence of microservice technology provides a new way to solve this problem. This paper mainly studies how to establish a diversified microservice deployment model under resource constraints in order to maximize the resilience of the system.

3.1. Failure Scenario

3.1.1. Node Failure

Node failure is unavoidable. When a node in a cluster fails, all containers deployed on that node lose the ability to provide services, which greatly affects the delivery capability of the associated microservices.
(1) If the load of the failed node is too high, the failure leads to a significant decrease in the overall service capacity of the system.
(2) If the instances of a single microservice are concentrated on the failed node, the performance of that service decreases and it becomes a bottleneck of the whole system.

3.1.2. Shared Vulnerabilities Problem

When containers are used to build a software system, a microservice may use the same base image for a large number of its instances. If that image has a high-risk vulnerability and an attacker exploits it, the attacker can easily launch repeated attacks on all such containers in the system, making them lose service capability and greatly reducing the performance of the microservice.

3.2. Problem Analysis

To solve the node failure problem described in Section 3.1.1, there are two ideas. The first one requires the system to have good load balancing performance, which can avoid the load being too concentrated on a certain node. The second one requires every microservice to be deployed as evenly as possible among nodes, which is also referred to as the diversity of microservices deployment on nodes. Two illustrative examples are shown as follows.

Example 1. When a large number of instances of service X are concentrated on node N, once node N fails to work due to failure or attack, the service capability of service X will suffer huge losses, thus affecting the overall service performance.

Example 2. When node N hosts a large number of service instances of the system, once node N fails to work due to failures or attacks, the overall service capability of the system will be severely degraded.

To solve the shared vulnerability problem in Section 3.1.2, every microservice in the system must be deployed with multi-version instances to take full advantage of diversity. At the same time, the requests of the same microservice must be balanced between its different versions, that is, the deployment versions of each microservice must be diverse. This prevents a single service from depending too heavily on one version implementation, in which case an attack exploiting an image vulnerability of that version would cause a serious loss in the performance of the microservice. An illustrative example is shown as follows.

Example 3. Suppose a total of 20 instances of service X are deployed in the system, 18 of which use the same Alpine-based container template, while the other two instances are started from other container templates. If the Alpine container template has a fatal flaw that attackers exploit, then through lateral infection an attacker can cause service X to lose 90% of its service capacity.

According to this analysis, the problem can be decomposed as shown in Figure 1.

4. Indicator Design and System Model

4.1. Symbols

According to the analysis in Section 3.2, node failure can be addressed by load balancing, and shared vulnerability problems can be mitigated by version diversity. In order to analyze and model the problem, the relevant indicators, optimization objective, and constraints must first be defined.

The symbols required for modeling in this paper are shown in Table 2.

4.2. Load Balancing Indicator

For microservice MS_x, the CPU resource usage of the i-th instance is expressed in terms of two Boolean variables. The first indicates which image the instance MS_x,i uses: its value is 1 when the instance uses the k-th version image and 0 otherwise, as shown in formula (1). The second indicates where the microservice instance is deployed: its value is 1 when the instance is deployed on node N_j and 0 otherwise, as shown in formula (2).

Therefore, for a single deployment node in cluster, the node CPU resource usage is the sum of CPU resource usages for all instances deployed on that node, which can be expressed as shown in formula (3).

Similarly, the node memory resource usage is equal to the sum of memory resource usages for all instances deployed on the node and can be expressed as shown in formula (4).

The node load rate can then be written as the ratio of the current node load to the total number of node resources, as shown in formulas (5) and (6).
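The per-node sums and load rates described above can be sketched in Python. This is only an illustration under assumed conventions: the dictionary layout, the field names `node`, `cpu`, and `mem`, the function names, and all numeric values are hypothetical and are not taken from the paper's tables. The two Boolean indicator variables are represented implicitly by the `version` and `node` fields of each instance.

```python
from collections import defaultdict

def node_usage(instances):
    """Sum CPU and memory usage of the instances deployed on each node."""
    cpu = defaultdict(float)
    mem = defaultdict(float)
    for inst in instances:
        cpu[inst["node"]] += inst["cpu"]
        mem[inst["node"]] += inst["mem"]
    return dict(cpu), dict(mem)

def load_rate(used, capacity):
    """Ratio of the current load of each node to its total resources."""
    return {node: used.get(node, 0.0) / capacity[node] for node in capacity}

# Illustrative deployment: three instances on two nodes (made-up numbers).
instances = [
    {"ms": "Web", "version": "Ubuntu", "node": "N1", "cpu": 2.0, "mem": 4.0},
    {"ms": "Web", "version": "Debian", "node": "N2", "cpu": 1.0, "mem": 2.0},
    {"ms": "Database", "version": "CentOS", "node": "N1", "cpu": 2.0, "mem": 4.0},
]
cpu_used, mem_used = node_usage(instances)
cpu_rate = load_rate(cpu_used, {"N1": 8.0, "N2": 8.0})
```

With these made-up values, node N1 carries two instances and ends up with four times the CPU load rate of N2, which is the kind of imbalance the load balancing indicator is meant to penalize.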

In order to eliminate the influence of inconsistent scales, the calculation results are normalized by formula (7), which yields the normalized CPU and memory resource load balance degrees of the deployment nodes.

Define the node resource load standard deviations for CPU and memory, which represent the difference between the resource load rate on one deployment node and the average over all deployment nodes, as shown in formulas (8) and (9), where the remaining two symbols represent the CPU and memory resource load of all deployed nodes after normalization, respectively.

Because CPU and memory resource data differ greatly in scale, directly adding the two standard deviations cannot effectively reflect the load balancing level of the cluster as a whole. In order to eliminate the impact of the measurement scale of the two indicators, this paper uses the coefficient of variation. The total load deviation degree of the cluster is measured by the sum of the coefficients of variation of the node CPU and memory resources; the smaller this value, the smaller the load difference between the nodes and the better the load balancing effect. So that the cluster load balancing metric increases as load balancing performance improves, the metric is defined as the negative of the total load deviation degree of the cluster, as expressed in formula (10).
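A minimal sketch of this indicator, assuming it is the negative sum of the coefficients of variation of the node CPU and memory load rates. The use of the population standard deviation, the function names, and the sample load rates are assumptions for illustration; the normalization step of formula (7) is omitted because the coefficient of variation is already scale-free.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (0 if the mean is 0)."""
    mu = mean(values)
    return pstdev(values) / mu if mu else 0.0

def load_balance_indicator(cpu_rates, mem_rates):
    """Negative sum of the CPU and memory coefficients of variation."""
    return -(coefficient_of_variation(cpu_rates) + coefficient_of_variation(mem_rates))

# Perfectly even node loads versus uneven node loads (made-up rates).
balanced = load_balance_indicator([0.5, 0.5, 0.5], [0.4, 0.4, 0.4])
skewed = load_balance_indicator([0.9, 0.1, 0.5], [0.8, 0.2, 0.2])
```

Under this reading the indicator is at most zero, reaches zero only when every node carries the same load rate, and becomes more negative as the loads diverge.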

4.3. Diversity Indicator

Diversity is reflected in two main aspects: the deployment node diversity and the version diversity. In order to measure the richness of diversity, this paper follows related work [6, 14] and, by analogy with measures of biodiversity, proposes a measure of system diversity based on the Shannon formula.

4.3.1. Node Diversity

For the diversity of deployment nodes of a single microservice MS_x, the node diversity metric is given by the Shannon formula in formula (11). The sum runs over the n_N deployment nodes of the cluster, and the term for node N_j uses the proportion of the instances of microservice MS_x that are deployed on N_j.

4.3.2. Version Diversity

Similarly, for the diversity of deployment versions of a single microservice MS_x, the version diversity metric is given by formula (12), where m_x represents the number of implemented versions of microservice MS_x, and the term for version k uses the ratio of the number of deployments of version k to the total number of deployments of microservice MS_x.

In summary, the diversity indicator is given by formula (13). The diversity metric of a single microservice combines its deployment node diversity metric and its deployment version diversity metric. The overall diversity indicator E_D is the mean of the diversity degrees of all microservices.
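The Shannon-based measure can be illustrated with a short Python sketch. The function name and the label lists are hypothetical, and treating the two non-Alpine instances of Example 3 as a single second template is an assumption. The same function applies to node diversity when the labels are node names instead of version names.

```python
from collections import Counter
from math import log2

def shannon_diversity(labels):
    """Shannon diversity -sum(p * log2(p)) over the proportions of each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

# Example 3: 18 of 20 instances of service X share one container template.
concentrated = shannon_diversity(["alpine"] * 18 + ["other"] * 2)
# The same 20 instances spread evenly over two templates.
even = shannon_diversity(["alpine"] * 10 + ["other"] * 10)
```

The concentrated deployment scores roughly 0.47, against 1.0 for the even split, which matches the intuition that the 18-to-2 deployment depends too heavily on one image.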

4.4. The Goal of Optimization

Both the load balance degree E_LB and the diversity degree E_D are important means of solving the problems raised in this paper and of enhancing system resilience, so this paper combines the two into one resilience indicator E, as shown in formula (14).

Therefore, the problem raised in this paper can be modeled as an optimization problem: under certain constraints, the aim is to obtain the maximum value of the resilience indicator E. The constraints of the problem are described below.

4.5. Constraints

For any one microservice instance MS_{x, i}, it can only be deployed on one node, so constraint (15) can be obtained.

Similarly, microservice instance MS_x,i can use only one version image, which gives constraint (16).

In addition, the ratio of the number of instances deployed on N_j to the total number of deployments of microservice MS_x, as well as the ratio of the number of instances deployed with version k to the total number of deployments of microservice MS_x, must satisfy constraints (17) and (18).

From a resource perspective, both CPU and memory resource constraints must be met during deployment. Formulas (19) and (20) express these constraints: the sum of the CPU and memory resources consumed by the containers deployed on a node does not exceed the maximum resources owned by that node.

Formulas (15)–(20) form the set of constraints for model optimization problem solving in this paper.
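As a rough illustration of how such constraints might be checked for a candidate deployment, consider the following Python sketch. The data layout, the function name, and the numbers are assumptions. Because each instance record holds exactly one `node` and one `version`, the single-node and single-version constraints hold by construction, so only the capacity constraints are tested explicitly.

```python
def is_feasible(instances, cpu_cap, mem_cap):
    """Return True if no node exceeds its CPU or memory capacity."""
    cpu_used = {node: 0.0 for node in cpu_cap}
    mem_used = {node: 0.0 for node in mem_cap}
    for inst in instances:
        cpu_used[inst["node"]] += inst["cpu"]
        mem_used[inst["node"]] += inst["mem"]
    return all(
        cpu_used[n] <= cpu_cap[n] and mem_used[n] <= mem_cap[n] for n in cpu_cap
    )

# Two nodes with identical (made-up) capacities.
cpu_cap = {"N1": 4.0, "N2": 4.0}
mem_cap = {"N1": 8.0, "N2": 8.0}
# Both instances on N1 need 6 CPU units in total, more than N1 offers.
stacked = [
    {"node": "N1", "version": "Ubuntu", "cpu": 3.0, "mem": 2.0},
    {"node": "N1", "version": "Debian", "cpu": 3.0, "mem": 2.0},
]
# Moving the second instance to N2 satisfies both capacity constraints.
spread = [dict(stacked[0]), dict(stacked[1], node="N2")]
stacked_ok = is_feasible(stacked, cpu_cap, mem_cap)
spread_ok = is_feasible(spread, cpu_cap, mem_cap)
```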

5. Algorithm

5.1. Algorithmic Analysis

From the modeling process, it can be seen that this is a problem of finding the optimal combination of service version and deployment location, which is a typical NP-hard problem. To solve such problems, the academic community usually uses heuristic algorithms such as GA (genetic algorithm) and PSO (particle swarm optimization). However, such methods commonly suffer from high time complexity and long convergence time. Industry, on the other hand, has opted for a performance compromise: rather than maximizing the objective function, it simply uses strategy-based solutions such as Round Robin and Random to obtain shorter response delays.

The problem in this paper consists of two parts, namely, load balancing and diversity. Industry’s Round Robin strategy is a very mature solution to the load balancing problem: each deployment is placed on the next node in a fixed node order. Although such a strategy is easy to implement, it is coarse-grained. Because resource consumption differs between types and versions of microservices, the resulting load balancing is not precise. Decisions should instead be made based on the resource usage of the services already deployed on each node.
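The difference between the two placement styles can be seen in a toy Python comparison. Both functions and the instance sizes are illustrative assumptions, with a single abstract resource standing in for CPU and memory; this is not the paper's simulation code.

```python
def round_robin(sizes, n_nodes):
    """Place each instance on the next node in a fixed order, ignoring its size."""
    load = [0.0] * n_nodes
    for i, size in enumerate(sizes):
        load[i % n_nodes] += size
    return load

def least_loaded(sizes, n_nodes):
    """Place each instance on the node that currently carries the least load."""
    load = [0.0] * n_nodes
    for size in sizes:
        target = min(range(n_nodes), key=load.__getitem__)
        load[target] += size
    return load

# Alternating heavy and light instances (made-up resource demands).
sizes = [4.0, 1.0, 4.0, 1.0]
rr_load = round_robin(sizes, 2)
ll_load = least_loaded(sizes, 2)
```

Round Robin sends both heavy instances to the first node and ends with loads of 8 and 2, whereas the load-aware choice ends with 5 and 5, which is why the node decision should look at the resources already consumed.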

In the optimization of diversity, formula (13) shows that maximizing the diversity indicator E_D requires maximizing the diversity metric of every microservice. Moreover, formulas (11) and (12) show that the node diversity indicator and the version diversity indicator share the same form:

$$D = -\sum_{i=1}^{d} p_i \log_2 p_i \tag{21}$$

By the maximum-entropy property of the Shannon formula, the indicator D reaches its extreme value if and only if p₁ = p₂ = ⋯ = p_d = 1/d, and the maximum value is log₂d. Thus, when deploying based on redundancy and diversity, each microservice should come as close as possible to two “even distributions”:
(1) For any microservice MS_x, the deployed microservice instances are distributed as evenly as possible among the nodes, thus maximizing the node diversity indicator.
(2) For any microservice MS_x, the deployed microservice instances are distributed as evenly as possible among the versions, thus maximizing the version diversity indicator.
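As a worked check of this property, take D to be the Shannon formula over d = 3 versions. The proportions below are chosen only for illustration; the even split attains the maximum log₂3, while an uneven split falls below it.

```latex
\begin{align*}
D &= -\sum_{i=1}^{d} p_i \log_2 p_i, \\
D_{\text{even}} &= -3 \cdot \tfrac{1}{3}\log_2\tfrac{1}{3} = \log_2 3 \approx 1.585, \\
D_{\text{uneven}} &= \tfrac{1}{2}\log_2 2 + \tfrac{1}{4}\log_2 4 + \tfrac{1}{4}\log_2 4 = 1.5 < \log_2 3 .
\end{align*}
```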

5.2. Description of the Algorithm

According to the analysis in Section 5.1, a microservice deployment algorithm based on load balancing and diversity, LB-Diversity, is proposed. The algorithm distributes the instances of each microservice evenly across versions through a polling-like mechanism, then filters nodes using the current node deployment information so that services are distributed evenly between nodes, and finally selects the deployment node from the filtered candidates according to load, so as to achieve better load balancing.

The pseudocode of the deployment algorithm is given in Algorithm 1. Its input includes microservice information, node information, the microservice list, service version information, the node list, and the number of deployments; its output is the node deployment results and the microservice deployment results.

In Algorithm 1, line 4 reorders all versions of the microservice ms_name based on resource consumption. Line 6 indicates that, among versions with the same number of deployments, the less resource-intensive version is deployed first. Lines 7–11 indicate that, when selecting the deployment node, the algorithm first selects as candidates the nodes with the fewest current deployments of the service and then, among them, finds the node with the lowest load rate whose resources meet the deployment needs as the final deployment node. In this algorithm, the total number of microservices is s, each type of microservice has m_x versions generated from different images, each type of microservice needs to deploy n_deploy instances, and there are n_N deployment nodes in the cluster; the time complexity of the algorithm can then be expressed as shown in formula (22).

Input: microservices information ms_info
       nodes information node_info
       the list of microservices mslist
       the list of microservice versions image_versions
       the list of nodes nodelist
       the number of deployments deploy_num
Output: the deployment on the nodes node_deploy_info
        the deployment of microservices ms_deploy_info
(1)  node_deploy_info = [ ]
(2)  ms_deploy_info = [ ]
(3)  for ms_name in mslist:
(4)      sort the image_versions from small to large based on resource consumption according to ms_info ⟶ image_versions_sorted
(5)      for instance_seq in range(deploy_num):
(6)          choose the least used and minimal resource consumption version ⟶ deploy_version
(7)          node_candidates ← find the nodes with minimal deploy number of ms_name
(8)          deploy_node ← the node with minimal load in node_candidates
(9)          while the resource of deploy_node is not enough:
(10)             deploy_node ← find the next node with minimal load in node_candidates
(11)         end while
(12)         update the resource information of node_info
(13)         ms_deploy_info[ms_name][deploy_version] += 1
(14)         node_deploy_info[deploy_node][‘deployment’][ms_name] += 1
(15)     end for
(16) end for
(17) return node_deploy_info, ms_deploy_info
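The pseudocode can be turned into a runnable Python sketch along the following lines. Several details are assumptions rather than the authors' implementation: versions are ordered by the sum of CPU and memory demand, node load is taken as the mean of the CPU and memory load rates, nodes hosting more instances of the microservice are tried when no least-populated candidate has enough resources, and an error is raised when no node fits. The dictionary layouts and all numbers are likewise hypothetical.

```python
def lb_diversity(ms_info, node_info, deploy_num):
    """Greedy placement: balance versions first, then nodes, then load."""
    used = {n: {"cpu": 0.0, "mem": 0.0} for n in node_info}
    node_deploy = {n: {ms: 0 for ms in ms_info} for n in node_info}
    ms_deploy = {ms: {v: 0 for v in vs} for ms, vs in ms_info.items()}

    def load(n):
        # Assumed load measure: mean of the CPU and memory load rates.
        cpu_rate = used[n]["cpu"] / node_info[n]["cpu"]
        mem_rate = used[n]["mem"] / node_info[n]["mem"]
        return (cpu_rate + mem_rate) / 2

    for ms, versions in ms_info.items():
        # Versions sorted from small to large consumption (CPU + memory, assumed).
        ordered = sorted(versions, key=lambda v: versions[v]["cpu"] + versions[v]["mem"])
        for _ in range(deploy_num):
            # Least-used version; min() keeps the cheaper one on ties.
            version = min(ordered, key=lambda v: ms_deploy[ms][v])
            need = versions[version]
            # Fewest instances of this microservice first, then lowest load.
            ranked = sorted(node_info, key=lambda n: (node_deploy[n][ms], load(n)))
            for node in ranked:
                if (used[node]["cpu"] + need["cpu"] <= node_info[node]["cpu"]
                        and used[node]["mem"] + need["mem"] <= node_info[node]["mem"]):
                    break
            else:
                raise RuntimeError(f"no node can host {ms}:{version}")
            used[node]["cpu"] += need["cpu"]
            used[node]["mem"] += need["mem"]
            ms_deploy[ms][version] += 1
            node_deploy[node][ms] += 1
    return node_deploy, ms_deploy

# One microservice with three image versions on three equal nodes (made-up values).
ms_info = {"Web": {
    "Ubuntu": {"cpu": 1.0, "mem": 2.0},
    "CentOS": {"cpu": 2.0, "mem": 2.0},
    "Debian": {"cpu": 1.0, "mem": 1.0},
}}
node_info = {n: {"cpu": 8.0, "mem": 16.0} for n in ("N1", "N2", "N3")}
node_deploy, ms_deploy = lb_diversity(ms_info, node_info, deploy_num=6)
```

With six instances, three versions, and three nodes, this sketch places two instances of each version and two instances on each node, which is the pair of even distributions the algorithm aims for.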

Because the number of versions of each microservice type is limited, O(m_x∙log(m_x)) is obviously less than n_deploy∙(n_N∙O(n_N∙log(n_N))), so formula (22) can be simplified as shown in formula (23).

6. Experiment

First, a simple implementation of the system described in the model is built and part of its performance is verified. Then, in order to better evaluate the performance of the model, more complex conditions are simulated.

6.1. Implementation

In a software system network based on a Software-Defined Network (SDN), deployment control functions can be integrated into the SDN controller, making it the control center of the entire software system. As shown in Figure 2, the SDN controller can customize the deployment policy based on deployment tasks and resource constraints and distribute it to each server in the cluster. In the prototype system designed in this paper, there are two types of servers in the cluster, depending on whether virtual machines are enabled: servers that deploy microservice instances directly and therefore act as service deployment nodes themselves, such as Server A in Figure 2, and servers that launch virtual machines, which then act as the service instance deployment nodes.

Diversity can be achieved in many ways, such as code implementation in different languages, using different basic image containers, using different instruction sets for infrastructure environments, etc. Thanks to the development of container technology, using different base images has become the most convenient way. The diversity implementation used in this article is to use different images (such as Ubuntu, CentOS, Debian) as the basis for building template containers when building container templates. The diversity of functional components is obtained by running component scripts on different base containers.

We use Docker container technology to construct the system because Docker provides virtualization that lets the research ignore differences in the underlying platform. Thus, in this paper, hosts that directly deploy containers, such as Server A, VM1, and VM2, are considered deployment nodes. A business instance deployed on a node is a container that belongs to a microservice. Each microservice has multiple implementation versions, that is, container templates with the same business functionality built from different container images. For example, for containers that implement Web front-end functionality, the container builds use base images such as Ubuntu, Debian, and CentOS, and different implementation versions carry different resource overheads.

In the software system involved in this scenario, the system business is decoupled into several microservices, including Web front end, database query, and encryption components, and different microservices coordinate with each other to form a complete system service.

The prototype system includes three types of microservices: Web, Database, and Encrypt. Each type of microservice has three implementations based on Ubuntu, CentOS, and Debian. The resource consumption of 9 types of containers is shown in Table 3.

6.2. Evaluation

Our experiment was performed on the Dell T7920 workstation with Intel® Xeon Gold 6248R CPU with 3.00 GHz × 96 (dual processor), and a memory size of 512 GB, using the Ubuntu Server operating system. To verify policy performance, we used Python 3.7 for simulation.

The main simulation parameters covered in this paper are shown in Table 4.

We compare the deployment strategy used in this paper with three classic algorithms: LB-Greedy, Round-Robin, and Random. The LB-Diversity algorithm proposed in this paper is compared with these three classical algorithms in terms of load balancing, shared vulnerability, and node failure.

6.2.1. Adaptability

The Number of Deployments. As described above, the scenario in this paper involves the variable group {s, m_x, n_N, n_deploy}. In order to investigate the adaptability of the four deployment algorithms at different scales, we keep the first three variables unchanged and gradually increase the number of microservice instance deployments n_deploy; i.e., {s, m_x, n_N, n_deploy} = {5, 6, 10, n_deploy}.

Figure 3 shows the load balancing indicators of the four deployment algorithms under different deployment quantities. We find that the LB-Diversity algorithm proposed in this paper performs best overall. The load balancing indicator E_LB of LB-Diversity is always high, its volatility is smaller, and it is more stable than that of the other three algorithms. In addition, when n_deploy = 60, there is an abrupt change in the LB-Diversity and LB-Greedy algorithms, which show a load balancing indicator of E_LB = 0.0; this indicates that, under this condition, the configurations produced by both algorithms equalize the load across all nodes. This is because both algorithms balance load in a similar way, except that LB-Diversity adds a constraint to avoid deploying too many microservice components of the same type on a single node, which would otherwise make that node a point of vulnerability for a single microservice.

Figure 4 shows the diversity indicators of the four deployment algorithms under different deployment quantities, and two conclusions can be drawn from the data. Firstly, as the number of deployed microservice instances increases, the overall degree of diversity of each algorithm increases, which reflects the rationale for using multi-version implementations. Secondly, the LB-Diversity algorithm as a whole has better diversity performance than the other three algorithms.

In addition, Figure 4 shows that once the number of service deployments exceeds 20, the diversity index values obtained by most deployment algorithms begin to stabilize as the number of deployments continues to increase. Among them, the curves obtained with the LB-Diversity and Round-Robin algorithms begin to stabilize when the number of deployments is greater than 10. This has reference value for service operators.

Figure 5 shows the total resilience indicator of the four deployment algorithms under different deployment quantities; this indicator is contributed by the load balancing indicator E_LB and the diversity indicator E_D. LB-Diversity is the most prominent on both indicators and therefore also performs best on the overall resilience indicator.

The Number of Versions. Figures 6 to 8 show the indicators E_LB, E_D, and E under different numbers of microservice versions; i.e., {s, m_x, n_N, n_deploy} = {5, m_x, 10, 10}.

These figures again show that the LB-Diversity algorithm performs well. Moreover, according to Figure 7, the diversity index E_D improves significantly as the number of microservice implementation versions increases. Increasing the number of deployed microservice versions is therefore an important way to enhance overall diversity and improve system resilience.

We recommend that operators use as many versions of implementations as possible for each service within an acceptable development cost range to enhance the resilience of the software system to provide services.

6.2.2. Loss Evaluation of Shared Vulnerabilities

This scenario concerns an attacker exploiting a disclosed vulnerability: the vulnerability in an underlying mirror is exploited, and the service instances launched from that image are attacked. Therefore, the decline in the system's service delivery capacity is in fact related to the number of underlying images that become unavailable due to the vulnerability attack.
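To make this kind of loss concrete, the following Python sketch computes, for each service, the fraction of its instances that become unavailable when one base image is compromised. The text above only states that the loss is related to the unavailable images; measuring it as a per-service fraction, and representing an instance as a `(service, base_image, node)` tuple, are assumptions made here for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

# One deployed instance: (service, base_image, node). Hypothetical layout.
Instance = Tuple[str, str, int]


def shared_vulnerability_loss(instances: List[Instance],
                              compromised_image: str) -> Dict[str, float]:
    """Per-service fraction of instances lost when one base image is compromised."""
    total = Counter(service for service, _, _ in instances)
    lost = Counter(service for service, image, _ in instances
                   if image == compromised_image)
    return {service: lost[service] / total[service] for service in total}


# Three Web instances built from one image versus three different images.
homogeneous = [("web", "ubuntu", 0), ("web", "ubuntu", 1), ("web", "ubuntu", 2)]
diverse = [("web", "ubuntu", 0), ("web", "debian", 1), ("web", "centos", 2)]
```

Under this measure, compromising the `ubuntu` image removes all Web instances in the homogeneous deployment but only one third of them in the diverse one.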

As shown in Figure 9, Random has a higher loss than the other three algorithms. LB-Diversity, LB-Greedy, and Round-Robin have similar results, and the loss of LB-Diversity is slightly smaller than that of the other two.

6.2.3. Loss Evaluation of Node Failure

For problems caused by node failures, the loss is determined by the number of instances that have been deployed on the failed node. As shown in Figure 10, LB-Diversity is significantly more stable and has a smaller performance loss than the other three deployment methods.
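The node-failure loss can be sketched in the same spirit. The Python example below counts the instances placed on the failed node and reports them as a fraction of each service's instances; the fraction-based measure, the `(service, base_image, node)` tuple layout, and the function names are assumptions for illustration rather than the paper's exact definition.

```python
from collections import Counter
from typing import Dict, List, Tuple

# One deployed instance: (service, base_image, node). Hypothetical layout.
Instance = Tuple[str, str, int]


def node_failure_loss(instances: List[Instance],
                      failed_node: int) -> Dict[str, float]:
    """Per-service fraction of instances that were running on the failed node."""
    total = Counter(service for service, _, _ in instances)
    lost = Counter(service for service, _, node in instances
                   if node == failed_node)
    return {service: lost[service] / total[service] for service in total}


def worst_case_node_loss(instances: List[Instance]) -> float:
    """Largest per-service loss over all single-node failures."""
    nodes = {node for _, _, node in instances}
    return max(max(node_failure_loss(instances, n).values()) for n in nodes)


concentrated = [("web", "ubuntu", 0), ("web", "debian", 0), ("db", "centos", 1)]
spread = [("web", "ubuntu", 0), ("web", "debian", 1),
          ("db", "centos", 2), ("db", "ubuntu", 3)]
```

In the concentrated placement, losing node 0 takes out every Web instance, whereas in the spread placement no single node failure removes more than half of any service.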

6.3. Evaluation Summary

According to the above experiments, we can conclude that, compared with the other three methods, LB-Diversity has better performance in both load balancing and diversity. This is because LB-Diversity tries to ensure that the principle of “decentralized deployment” is enforced regardless of the deployment location or deployment version. The principle of “decentralized deployment” includes two main points: “decentralized among nodes” and “decentralized among versions.”

In addition, in the failure scenario of node loss, the practice of “centralizing the same service chain on the same node” can also allow as few users as possible to be affected.

7. Application

The microservice architecture has been widely used at present, and many excellent open source platforms have been developed in practice. Microservice architectures are usually deployed in containers [28]; especially with the rise of Docker container technology [29] in recent years, using Docker to build microservice-based software systems has become the de facto standard in the industry. The current mainstream microservice architecture platforms mainly include Kubernetes, Swarm, and Mesos. Among them, Kubernetes [30], launched by Google, makes the deployment of containerized services simpler and more efficient [27] thanks to its high degree of modularity, strong operability, and self-healing mechanism, and it has become the current benchmark in the industry.

Since Kubernetes adopts the Master-Slave mode, all control operations related to cluster state changes are completed on the Master node. The Kubernetes components on the Master node are highly modular, among which the Scheduler and the Controller Manager are the main components for cluster state control.

The Scheduler is responsible for managing the service instances to be scheduled (a service instance in Kubernetes is a Pod) and all available nodes. Using the appropriate scheduling algorithm and scheduling strategy, and according to the cluster load information stored in Etcd, it selects the best host for each service instance to be scheduled and then binds the service instance to that node. The Scheduler is a pluggable component that can be replaced with other schedulers.

The Controller Manager is responsible for managing the various controllers of Kubernetes and is a collection of controllers, including the Replication Controller, the Node Controller, etc. The controller is the core abstraction in Kubernetes. After the user declares the desired state through Kube-Apiserver, the controller continuously monitors the current state in Kube-Apiserver and reacts to the difference between the current state and the desired state, so that the current state of the cluster constantly approaches the user-declared desired state.
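The desired-state pattern described above can be illustrated with a deliberately simplified Python model. This is not Kubernetes code and does not call any Kubernetes API: replica counts are plain dictionaries, and the names `reconcile` and `control_loop` are hypothetical. It only shows the idea of repeatedly computing the difference between declared and observed state and acting on it.

```python
from typing import Dict


def reconcile(desired: Dict[str, int], current: Dict[str, int]) -> Dict[str, int]:
    """Replica delta per workload: positive means create, negative means delete."""
    names = sorted(set(desired) | set(current))
    return {name: desired.get(name, 0) - current.get(name, 0)
            for name in names
            if desired.get(name, 0) != current.get(name, 0)}


def control_loop(desired: Dict[str, int], current: Dict[str, int],
                 max_rounds: int = 10) -> Dict[str, int]:
    """Apply deltas until the observed state matches the declared state."""
    state = dict(current)
    for _ in range(max_rounds):
        delta = reconcile(desired, state)
        if not delta:
            break  # observed state already equals desired state
        for name, change in delta.items():
            state[name] = state.get(name, 0) + change
    # Workloads scaled down to zero are dropped from the observed state.
    return {name: count for name, count in state.items() if count > 0}
```

For instance, with a desired state of two `web-ubuntu` and one `web-debian` replicas and an observed state of three `web-ubuntu` replicas, the loop removes one `web-ubuntu` replica and creates one `web-debian` replica.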

When the total number of deployments and the available deployment versions are known, the method proposed in this paper mainly solves two problems: (1) how many containers should be configured for each version; (2) where each container should be deployed. To use this method in Kubernetes, two steps are needed. First, the calculation result of problem (1) is converted into a YAML file through automatic code generation for automatic deployment. Second, the scheduling strategy of the Scheduler is replaced by the method used in this paper to determine the location of each container deployment.
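As a rough illustration of this code-generation step, the Python sketch below renders one Kubernetes Deployment manifest per version from a per-version replica plan and sets `schedulerName` in the Pod template so that a replacement scheduler would place the Pods. The manifest fields are standard Deployment fields; the scheduler name `lb-diversity-scheduler`, the image references, and the label scheme are hypothetical placeholders, and the YAML is produced with a plain string template to keep the example free of third-party dependencies.

```python
from typing import Dict

# Minimal Deployment template; placeholder values are filled in per version.
TEMPLATE = """\
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {service}-{version}
spec:
  replicas: {replicas}
  selector:
    matchLabels:
      app: {service}
      version: {version}
  template:
    metadata:
      labels:
        app: {service}
        version: {version}
    spec:
      schedulerName: {scheduler}
      containers:
      - name: {service}
        image: {image}
"""


def render_manifests(service: str,
                     replicas_per_version: Dict[str, int],
                     images: Dict[str, str],
                     scheduler: str = "lb-diversity-scheduler") -> str:
    """Render a multi-document YAML string, one Deployment per used version."""
    docs = []
    for version, replicas in sorted(replicas_per_version.items()):
        if replicas <= 0:
            continue  # versions with no planned containers get no manifest
        docs.append(TEMPLATE.format(service=service, version=version,
                                    replicas=replicas, scheduler=scheduler,
                                    image=images[version]))
    return "---\n".join(docs)
```

Calling `render_manifests("web", {"ubuntu": 2, "debian": 1, "centos": 0}, images)` with a matching `images` mapping would yield two Deployment documents. For the Pods to be scheduled, a scheduler registered under the chosen name must actually be running in the cluster.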

In addition, in the auto-scaling scenario, whenever a microservice instance is to be added or deleted, we only need to add a module to the scheduling algorithm. This module determines the version and deployment location of the microservice instance to be added or removed by comparing the Total Resilience Indicator before and after the change, as shown in formula (13). The auto-scaling problem is also the next research point that our team is working on, and we will present it in detail in a future paper.
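One possible shape for such a module is sketched below in Python. It evaluates every candidate change and keeps the one with the largest change in the indicator. The indicator itself is passed in as a function, since its definition is not reproduced in this section; the `toy_indicator` used for demonstration simply counts distinct versions and distinct nodes and is a stand-in, not the paper's Total Resilience Indicator. All names are hypothetical.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

# A placement lists one (version, node) pair per deployed instance.
Placement = List[Tuple[str, int]]


def best_addition(placement: Placement, versions: Sequence[str],
                  nodes: Sequence[int],
                  resilience: Callable[[Placement], float]) -> Tuple[str, int]:
    """Pick the (version, node) whose addition changes the indicator the most."""
    base = resilience(placement)
    return max(product(versions, nodes),
               key=lambda cand: resilience(placement + [cand]) - base)


def best_removal(placement: Placement,
                 resilience: Callable[[Placement], float]) -> int:
    """Pick the index of the instance whose removal hurts the indicator least."""
    base = resilience(placement)
    return max(range(len(placement)),
               key=lambda i: resilience(placement[:i] + placement[i + 1:]) - base)


def toy_indicator(placement: Placement) -> float:
    """Stand-in score: number of distinct versions plus distinct nodes used."""
    return float(len({v for v, _ in placement}) + len({n for _, n in placement}))
```

With two `ubuntu` instances on nodes 0 and 1, candidate versions `ubuntu` and `debian`, and nodes 0 to 2, the toy indicator favors adding a `debian` instance on the unused node 2. Exhaustive evaluation is affordable here because the candidate set is only the product of versions and nodes.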

The above application cases demonstrate the strong adaptability of the proposed mechanism in practical applications. This shows the application value and promotion value of this mechanism in practice.

8. Conclusion

The application of microservice architecture can overcome many problems of the traditional monolithic architecture. However, microservices are usually deployed as instances through containers; this practice effectively improves productivity and the module reuse rate, but it also creates the problem of vulnerabilities shared between containers.

To mitigate shared vulnerabilities under the microservice architecture, we have established a diversified microservices deployment model to maximize the resilience of the system. Combined with the problem of load balancing, a microservice deployment algorithm based on load balancing and diversity is proposed, which reduces the service’s dependence on the underlying mirror by enriching diversity and avoids the overreliance of microservices on a single node through decentralized deployment, while taking into account load balancing [31].

Data Availability

The authors have pushed the source data they used and the intermediate results in the experiment to GitHub for open source. The URL of the project repository is https://github.com/goodboydan/Microservice-Deployment.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was supported by a research grant from the National Natural Science Foundation of China under Grant no. 61379149, the China Postdoctoral Science Foundation under Grant no. 2017M610286, and the National Science Foundation of Jiangsu Province under Grant no. SBK2020043435.

References

[1] D. Gesvindr, J. Davidek, and B. Buhnova, “Design of scalable and resilient applications using microservice architecture in PaaS cloud,” in Proceedings of the 14th International Conference on Software Technologies, pp. 619–630, Prague, Czech Republic, July 26, 2019.
[2] A. Alleg, T. Ahmed, M. Mosbah, and R. Boutaba, “Joint diversity and redundancy for resilient service chain provisioning,” IEEE Journal on Selected Areas in Communications, vol. 38, no. 7, pp. 1490–1504, 2020.
[3] F. Douglis and J. Nieh, “Microservices and containers,” IEEE Internet Computing, vol. 23, no. 6, pp. 5-6, 2019.
[4] L. A. Vayghan, M. A. Saied, M. Toeroe, and F. Khendek, “Microservice based architecture: towards high-availability for stateful applications with kubernetes,” in Proceedings of the International Conference on Software Quality, Reliability and Security (QRS), pp. 176–185, IEEE, Sofia, Bulgaria, June 2019.
[5] S. Wang, Z. Ding, and C. Jiang, “Elastic scheduling for microservice applications in clouds,” IEEE Transactions on Parallel and Distributed Systems, vol. 32, no. 1, pp. 98–115, 2021.
[6] T. Combe, A. Martin, and R. Di Pietro, “To docker or not to docker: a security perspective,” IEEE Cloud Computing, vol. 3, no. 5, pp. 54–62, 2016.
[7] R. Mijumbi, J. Serrat, J.-L. Gorricho, N. Bouten, F. De Turck, and R. Boutaba, “Network function virtualization: state-of-the-art and research challenges,” IEEE Communications Surveys & Tutorials, vol. 18, no. 1, pp. 236–262, 2016.
[8] X. Wan, X. Guan, T. Wang, G. Bai, and B. Y. Choi, “Application deployment using Microservice and Docker containers: framework and optimization,” Journal of Network and Computer Applications, vol. 119, pp. 97–109, 2018.
[9] G. Fan, L. Chen, H. Yu, and W. Qi, “Multi-objective optimization of container-based microservice scheduling in edge computing,” Computer Science and Information Systems, vol. 18, no. 1, pp. 23–42, 2021.
[10] A. Hmaity, M. Savi, F. Musumeci, M. Tornatore, and A. Pattavina, “Virtual network function placement for resilient service chain provisioning,” in Proceedings of the 2016 8th International Workshop on Resilient Networks Design and Modeling (RNDM), pp. 245–252, IEEE, Halmstad, Sweden, 2016.
[11] J. Han, Y. Hong, and J. Kim, “Refining microservices placement employing workload profiling over multiple kubernetes clusters,” IEEE Access, vol. 8, pp. 192543–192556, 2020.
[12] C. T. Joseph and K. Chandrasekaran, “IntMA: dynamic Interaction-aware resource allocation for containerized microservices in cloud environments,” Journal of Systems Architecture, vol. 111, pp. 101785–101799, 2020.
[13] R. Shu, X. Gu, and W. Enck, “A study of security vulnerabilities on docker hub,” in Proceedings of the Seventh ACM on Conference on Data and Application Security and Privacy, pp. 269–280, ACM, Scottsdale, Arizona, USA, October 2017.
[14] O. H. Alhazmi and Y. K. Malaiya, “Application of vulnerability discovery models to major operating systems,” IEEE Transactions on Reliability, vol. 57, no. 1, pp. 14–22, 2008.
[15] A. Nappa, R. Johnson, L. Bilge, J. Caballero, and T. Dumitras, “The attack of the clones: a study of the impact of shared code on vulnerability patching,” in Proceedings of the 2015 IEEE Symposium on Security and Privacy, pp. 692–708, IEEE, San Jose, California, November 2015.
[16] M. Zhang, X. Ji, J. Ai, W. Liu, H. Hu, and S. Huo, “Secure deployment strategy of virtual machines based on operating system diversity,” Chinese Journal of Network and Information Security, vol. 3, no. 10, pp. 35–43, 2017.
[17] J. Xie, P. Yin, Z. Zhang, C. Zhang, and Y. Gu, “Service function chain deployment scheme based on heterogeneous backup and remapping,” Chinese Journal of Network and Information Security, vol. 4, no. 6, pp. 23–35, 2018.
[18] K. A. Torkura, M. I. H. Sukmana, A. V. D. M. Kayem, and F. C. Cheng, “A cyber risk based moving target defense mechanism for microservice architectures,” in Proceedings of the 2018 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Ubiquitous Computing & Communications, Big Data & Cloud Computing, Social Computing & Networking, Sustainable Computing & Communications (ISPA/IUCC/BDCloud/SocialCom/SustainCom), pp. 932–939, IEEE, Melbourne, Australia, Feb 2018.
[19] Q. Tong, Y. Guo, S. Huo, Y. Wang, Y. Man, and K. Zhang, “Design of self-adaptive spatio-temporal diversity joint scheduling strategy,” Journal of Communication, vol. 41, pp. 1–14, 2021.
[20] S. Jajodia, A. K. Ghosh, V. Swarup, C. Wang, and X. S. Wang, Moving Target Defense: Creating Asymmetric Uncertainty for Cyber Threats, Springer Science & Business Media, USA, 2011.
[21] K. Z. Snow, F. Monrose, L. Davi, A. Dmitrienko, C. Liebchen, and A. Sadeghi, “Just-in-time code reuse: on the effectiveness of fine-grained address space layout randomization,” in Proceedings of the 2013 IEEE Symposium on Security and Privacy, pp. 574–588, IEEE, San Francisco, California, 2013.
[22] A. Bittau, A. Belay, A. Mashtizadeh, D. Mazieres, and D. Boneh, “Hacking blind,” in Proceedings of the 2014 IEEE Symposium on Security and Privacy, pp. 227–242, IEEE, San Jose, California, June 2014.
[23] Z. Wen and T. R. Lin, “Ga-par: dependable microservice orchestration framework for geo-distributed clouds,” IEEE Transactions on Parallel and Distributed Systems, vol. 31, no. 1, pp. 129–143, 2020.
[24] Y. J. Gu, Y. X. Hu, and J. C. Xie, “A spatial and temporal optimal method of service function chain orchestration based on overlay network structure,” Journal of Electronics and Information Technology, vol. 41, no. 11, pp. 2675–2683, 2019.
[25] Z. Ding, S. Wang, and M. Pan, “QoS-constrained service selection for networked microservices,” IEEE Access, vol. 8, pp. 39285–39299, 2020.
[26] C. X. Liu, G. Q. Lu, H. B. Tang, X. L. Wang, and Y. Zhao, “Adaptive deployment method for virtualized network function based on Viterbi algorithm,” Journal of Electronics and Information Technology, vol. 38, no. 11, pp. 227–235, 2016.
[27] B. Burns, B. Grant, D. Oppenheimer, E. Brewer, and J. Wilkes, “Borg, omega, and kubernetes,” Communications of the ACM, vol. 59, no. 5, pp. 50–57, 2016.
[28] A. R. Sampaio and H. B. Kadiyala, “Supporting microservice evolution,” in Proceedings of the 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME), pp. 539–543, IEEE, Shanghai, China, July 2017.
[29] D. Bernstein, “Containers and cloud: from lxc to docker to kubernetes,” IEEE Cloud Computing, vol. 1, no. 3, pp. 81–84, 2014.
[30] K. Xu, M. S. dissertation, and E. Software, Design and Implementation of A Scalable Distributed Resource Scheduler Based on Kubernetes, Xidian Univ, Xian, China, 2017.
[31] Q. Tong, Y. Guo, H. Hu, W. Liu, G. Cheng, and L.-s. Li, “A diversity metric based study on the correlation between diversity and security,” IEICE - Transactions on Info and Systems, vol. E102, no. 10, pp. 1993–2003, 2019.

Copyright

Copyright © 2022 Hang Yu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
