Abstract

Edge computing migrates cloud computing capacity to the edge of the network to reduce the latency caused by congestion and long propagation distances in the core network. Internet of Things (IoT) service requests with large data traffic submitted by users need to be processed quickly by the corresponding edge servers. The closer the edge computing resources are to the user's network access point, the better the user experience. On the other hand, the closer an edge server is to users, the fewer users access it simultaneously, and the utilization efficiency of the node decreases. Because the capital investment budget of edge resource providers is limited, the deployment of edge servers must trade off user experience against capital investment cost. In our study of edge server deployment, we summarize three critical issues, namely edge location, user association, and capacity at each edge location, through the research and analysis of edge resource allocation in a real edge computing environment. To address these issues, this study considers the user distribution density (load density), determines reasonable deployment locations for edge servers, and deploys an appropriate number of edge computing nodes at each location to improve resource utilization and minimize the deployment cost of edge servers. Based on an objective function that minimizes the construction cost and the total access delay cost, we formulate edge server placement as a mixed-integer nonlinear programming (MINP) problem and then propose an edge server deployment optimization algorithm, named Benders_SD, to seek the optimal solution. Extensive simulations and comparisons with three existing deployment methods show that our proposed method achieves the intended performance: it not only meets the low-latency requirements of users but also reduces the deployment cost.

1. Introduction

Due to long-distance network communication, data transmission has a long round-trip time, and serving delay-sensitive IoT applications [1-3] (e.g., Internet of Vehicles, intelligent industrial control, virtual reality/augmented reality, and online games) from remote clouds may lead to poor user satisfaction. To enable users to access service nodes in a timely manner and quickly meet user requirements, service content is distributed to appropriate access sites. In general, requests are preferably processed at access sites close to the user to reduce the transmission distance, and tasks are executed on multiaccess edge computing (MEC) servers [4]. MEC refers to processing, analyzing, and storing data closer to where it is generated to enable rapid, near real-time analysis and response, which alleviates the backhaul link pressure of the traditional network architecture. It can also offload a large number of complex computations to edge servers, reducing the cost of remote network communication and effectively meeting the requirements of low latency and bandwidth efficiency. Therefore, the operating costs of edge servers are significantly reduced [5].

A wireless metropolitan area network (WMAN) is a computer network, usually operated as a public utility, that provides wireless Internet coverage to mobile users in metropolitan areas. The core idea of mobile edge networks is to move network functions, content, and resources closer to end users. The network resources mainly include computing, storage or caching, and communication resources. A mobile edge network scenario mainly comprises four parts: MEC servers, base stations, terminal equipment, and core network equipment. When edge servers are deployed in a metropolitan area network, the expansion of the service area and the increase in deployed resources enlarge the scale of the problem to be solved and increase its computational complexity. With a large network, service providers can take advantage of economies of scale when providing edge services [6, 7]. In this work, we therefore focus on edge server placement in collaborative edge computing environments that provide wireless Internet coverage for mobile users in a large-scale metropolitan area. First, a large number of mobile users access edge servers because the metropolitan area covered has a high population density. Second, because of the size of the network, service providers can exploit economies of scale to make edge services more affordable to the general public. Therefore, how to place edge servers becomes a critical and meaningful research topic.

The deployment location of edge servers and the number of servers in each edge micro data center profoundly impact the costs and the performance of edge services and 6G networks [8], such as the end-to-end delay and resource utilization. On the one hand, if an edge server is deployed far away from the user, the user can only reach the nearest edge site through multiple forwarding hops. The deployment of servers thus affects transmission latency in scenarios where data must be analyzed in real time and used for precise control. In addition, the rental cost of a server deployment location varies with its geographic location, which significantly impacts deployment costs. Therefore, edge service providers should lay out server sites to achieve a high quality of service for low-latency applications. After the deployment locations are determined, each server group serving the surrounding users has limited transmission resources within its region. The resources available within a deployment group, such as the number of servers, should be adjusted according to the density of users in the region and the service requirements of those users. Deploying a relatively large number of servers in areas with low user density, or a relatively small number of servers in areas with high user density, is not a reasonable deployment strategy. Unreasonable deployment overloads or underloads the edge servers and causes the same problems in the transmission process. We therefore propose the Benders_SD algorithm to optimize the deployment of edge servers. The specific contributions of this study are threefold:
(1) Considering the user distribution density, the deployment cost, and the network access delay of each service area at the candidate locations, the edge server placement problem in a WMAN area is transformed into a MINP problem
(2) The Benders_SD optimization algorithm for sparse edge server deployment is proposed to minimize the total capital investment cost while meeting the low-latency requirements of users in collaborative edge computing environments
(3) Simulation results show that the presented Benders_SD optimization algorithm successfully solves the above problems, meeting users' delay requirements while reducing the deployment cost to the greatest extent

The remainder of this paper is organized as follows. Section 2 reviews the related work. Section 3 analyzes and models the edge server deployment problem, and Section 4 presents the Benders decomposition of the problem. Section 5 gives the implementation of the edge server deployment algorithm based on Benders decomposition. Section 6 shows an example of edge server deployment. Section 7 evaluates the algorithms through extensive simulations. Section 8 concludes the paper.

2. Related Work

The locations of edge servers have an important influence on user access delay and on the resource utilization of edge servers. Therefore, the strategic placement of edge servers can significantly improve the performance of edge computing systems. To advance MEC standardization, the European Telecommunications Standards Institute (ETSI) working group on mobile edge computing and Heavy Reading gathered typical use cases and deployment scenarios [9, 10]. However, compared with the research on edge computing resource scheduling, relatively few works focus on edge server placement [6]. In some of these studies, edge server placement is modeled as an optimization problem, such as a multiobjective constrained optimization problem [6, 11-15], an integer linear programming (ILP) problem [16-18], or a MINP problem [19, 20]. The most commonly used methods are k-means clustering [12, 21-23], heuristic algorithms [7, 11, 15-19], the branch and bound method [16, 20], and so on.

Edge server placement has been modeled as a multiobjective optimization problem in several works. Wang et al. [6] adopted mixed-integer programming (MIP) to find the optimal edge server placement that balances the workload among edge servers and minimizes the edge server access delay. Kasi et al. [11] used genetic algorithms and local search algorithms to find an edge server allocation strategy. Guo et al. [12] proposed an approximate approach that combines k-means clustering and mixed-integer quadratic programming to balance the workload between edge clouds and minimize the service communication delay of mobile users. Li et al. [13] studied the deployment of edge servers in a smart-city mobile edge computing environment, using mixed-integer programming to find the optimal solution that balances the workload of edge servers and minimizes the access delay between mobile users and edge servers. They also proposed an optimal deployment and allocation strategy, based on a queuing model and vector quantization, that optimizes the number and locations of edge servers and the allocation of mobile users in a given ultradense networking environment. Considering transmission delay, workload balancing, energy consumption, deployment cost, network reliability, and edge server quantity, Cao et al. [14] studied the placement of edge servers in the Internet of Vehicles (IoV). Considering the density of mobile users and the locations of cloudlets in the mobile edge computing environment, Fan and Ansari [15] studied an optimal cloudlet deployment strategy that balances the deployment cost and the end-to-end delay cost, and they used the mixed-integer programming (MIP) tool CPLEX to find a suboptimal solution.

The integer linear programming (ILP) model has also been used to formulate edge server positioning under constraints. Considering load balancing between edge servers, Li et al. [16] proposed a greedy algorithm combined with a genetic algorithm (GA) to solve the edge server placement problem. To minimize the access delay between mobile equipment and cloudlets, Xu et al. [17] proposed a heuristic greedy algorithm that yields an exact solution; to address the poor scalability of the ILP, efficient approximation algorithms for identical and different cloudlet capacities were proposed. To optimize the overall performance and cost of edge facilities, Yin et al. [18] proposed the Tentacle decision support framework and a flexible edge server deployment method. Considering the proximity between users and edge servers, the cost budget, the capacity of edge sites, the fault tolerance of edge sites, and other factors, their heuristic algorithm selects the ideal locations for edge server deployment and then finds the exact deployment location closest to each ideal location. Ahat et al. [19] proposed a MILP model that optimizes the design of a multilevel computing infrastructure to maximize the expected revenue of operators. The model considers operators' limited budgets and service requirements, and a heuristic approach based on Lagrangian relaxation was introduced to handle complexity, scalability, and very large instances. Finally, a greedy heuristic was proposed to reduce the computational time complexity.

To minimize the average access delay of user requests, Jia et al. [21] designed the Heaviest-AP First (HAF) placement strategy and a K-median algorithm: HAF places cloudlets at the base stations (BSs) with the heaviest workloads, while the K-median algorithm selects strategic positions. Xiang et al. [22] proposed an adaptive cloudlet placement method for mobile applications that maximizes the number of mobile devices covered by cloudlets, with the gathering areas of the mobile devices identified by the k-means algorithm. Lähderanta et al. [23] proposed the PACK algorithm, which places a fixed number of servers, minimizes the delay between users and edge servers, balances the system workload, and respects the lower and upper limits of server capacity. PACK can be viewed as a variant of k-means clustering with capacity constraints and is solved with a block coordinate descent algorithm that includes an integer programming step.

The above research on the cloudlet/edge server deployment problem is valid; however, these studies [11-13, 16, 17, 21] focused on access delay or workload balancing. Inspired by them, our solution comprehensively considers economic cost and delay cost from the perspective of both the edge service provider and user requests, in collaborative edge computing environments within metropolitan area networks. The edge server deployment problem in collaborative edge computing environments is modeled as a MINP problem, and the Benders algorithm is adopted to solve it, which can efficiently find the optimal trade-off between the economic cost of edge server deployment and low delay. Furthermore, based on our previous work [20], this paper deepens the Benders decomposition theory of the edge server deployment problem and extends the evaluation of the edge MDC deployment algorithm to different numbers of candidate edge locations. In addition, we show an example of edge server deployment based on the Benders_SD algorithm to illustrate the effectiveness of our work.

3. Analysis and Modeling of Edge Server Deployment Problems

3.1. Analysis of Edge Server Deployment Problems

In densely populated areas covered by metropolitan area networks, edge computing servers are deployed to provide edge services for many users, improving the benefits of edge services by making full use of edge computing resources [17-19]. In addition, edge computing infrastructure providers can use economies of scale to enable edge services to benefit more users. Therefore, the network environment selected for edge server deployment in this paper is a metropolitan area network. The closer the edge computing server is to the user's network access point, the better the user experience; but the closer it is to the user, the fewer users it serves, and the efficiency of the edge server decreases. For edge resource providers, the deployment budget is limited, so edge server deployment must balance user experience against server efficiency. At present, edge computing is usually deployed in small and medium-sized edge data centers at metropolitan aggregation points or below [9]. According to the specific network environment and business requirements, servers are often deployed close to the edge communication equipment at the user end, such as base stations; at cost-effective IP convergence points, such as the locations of routers or switches, to reduce network switching caused by user movement; or in computer clusters within schools or enterprises. One or more edge servers are placed at each location to form a small edge data center. Figure 1 shows an example of edge server deployment based on the wireless metropolitan area network architecture.

Considering the deployment cost of edge computing nodes and the sharing of edge computing resources, the deployment does not need to cover all network access points; sparse deployment is sufficient. The number of edge servers deployed increases in areas with high user density and decreases accordingly in regions with low user distribution density. Therefore, according to the distribution density of service users and the deployment cost at different geographic locations, it is vital to choose appropriate locations and reasonably deploy edge computing resources, satisfying users' low-latency application requirements while minimizing the deployment cost for service providers.

Optimizing deployment cost alone or low latency alone cannot meet the requirements of edge computing infrastructure providers. Therefore, under the constraint of satisfying users' low-latency applications, this paper considers the user distribution density (load density) based on the diverse scenario requirements of real edge services. This approach determines suitable edge server deployment locations and deploys an appropriate number of edge computing nodes at each location to achieve high resource utilization and minimize edge server deployment costs. This work mainly considers the locations of LTE macro base stations and multistandard base station convergence-point routers within the scope of a metropolitan area network. The balance between deployment cost and network access delay is optimized according to the user distribution density in each service area. This work needs to solve three key issues: (1) the edge location problem: select the ideal edge locations from the set of candidate locations; (2) the user association problem: determine which edge server provides the service for each user; and (3) the edge location capacity problem: according to the user distribution density (load density), determine the appropriate number of servers at each edge location. These factors are tightly coupled, resulting in a huge search space. This work comprehensively weighs these factors and searches for the best edge server deployment strategy under multiple constraints. A user refers to a user terminal that submits task requests to the local edge server. A candidate edge location refers to a wireless or wired network access point, which can be a base station, router, or gateway.

3.2. Problem Description

The deployment scope studied in our work is a WMAN. Base stations located close to the user equipment and router locations at data convergence points are selected as candidate deployment locations. The edge server deployment problem is described as follows [20]:

Within the WMAN coverage, we are given the set of potential edge server deployment locations and the set of service coverage areas, where a coverage area is the service range of a base station or the area within one hop of a router. Users connect to an edge micro data center through a base station, and the edge micro data center processes the requests and data offloaded by user terminals. Because the user distribution density and load differ among coverage areas, the rental cost of each potential location and the number of edge servers to be deployed also differ. The target of this work is to select suitable places from these potential edge locations to deploy edge servers so as to meet the low-latency requirements of applications, and to determine the number of nodes in each edge micro data center based on the user distribution density, so that the overall cost is minimized under the given constraints.

3.3. Model of Edge Server Deployment

Definition 1 (PNN [18]). The PNN measures the network proximity between a user and a candidate edge location and is computed from the distance between them: the greater the PNN, the greater the distance between the user and the location; conversely, the smaller the PNN, the closer the two are. The PNN between a user and a candidate edge position is defined with respect to the corresponding network access point, as given in equation (1).

Note that it is challenging and costly to directly measure the network distance (delay) between the user and the selected edge location. Therefore, it is a more critical issue in the deployment of edge servers to evaluate the network distance (delay) between the user and the edge location. Geographical coordinate (GC) [18] provides a lightweight network delay evaluation scheme. This work uses the delay level provided by GC to search for the ideal edge position. Like most coordinate systems, this work’s distance prediction between geographic coordinate positions is also based on the Euclidean distance calculation model.
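For illustration only, a minimal sketch of such a coordinate-based distance estimate, assuming user l is embedded at geographic coordinates (x_l, y_l) and candidate edge location i at (x_i, y_i) (our notation, not necessarily the exact form of the paper's equation (1)):

\[
\hat{d}(l, i) = \sqrt{(x_l - x_i)^2 + (y_l - y_i)^2}
\]

Under this scheme, a larger coordinate distance estimate corresponds to a larger PNN value and hence a longer expected network delay between the user and the edge location.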

Definition 2 (access point coverage area [15]). The coverage area of a base station is the range within which users can generally receive its transmitted signal. The designed coverage distance of a base station in urban areas is about 100-200 meters, while in the suburbs it can generally cover a radius of about 3 kilometers; the coverage area of a router or gateway is defined as the area within one hop. This study defines the maximum distance between users and an edge location that can be tolerated within the coverage area as the coverage radius of that area.

3.3.1. System Model

The wireless metropolitan area network is modeled as a graph whose vertex set contains the candidate edge server deployment locations and whose link set connects access points with the potential edge server locations. A separate set represents all user equipment, with the l-th user denoting one terminal. Users access edge computing resources through appropriate access points according to their own needs and geographical locations.

It is assumed that edge servers are collocated with base stations, network aggregation equipment (routers and switches), or other edge locations. The service coverage areas of the different network access points form a set of regions, and server clusters are deployed at selected edge locations according to the user distribution density of each region, with the j-th area denoting one such region. Figure 2 describes a server deployment example, including the location information of base stations, users, and edge servers. As shown in the figure, 11 base stations are used as candidate locations, and the dotted lines indicate the service area of each server. A user sends a request, and the local manager then distributes the load and tasks according to the available resources and user requirements in the edge cluster. The rental price of edge servers deployed in remote areas is relatively low compared with central locations, but the distribution density of users within one hop of a central service area is high.

Locations A, B, and C have a high user distribution density within one hop of the server, so multiple edge servers are deployed there to meet user needs. However, when the number of access points is large, there are many candidate locations, and choosing the optimal feasible solution under constraints such as delay and lease cost becomes a complex problem.

Note that a binary variable is used to indicate the deployment of edge servers: it equals 1 when an edge server is deployed at the i-th candidate edge position and 0 otherwise. Since users' requests in one area may be distributed to different edge MDCs for processing, a continuous variable denotes the load ratio allocated from an area to an edge MDC, and a nonnegative integer variable denotes the number of edge servers located at each edge position. Aiming at low delay and minimum deployment cost, and according to users' service requests in the different service areas, the method selects edge locations from the candidate set as the server deployment locations.

3.3.2. Cost of Edge Server Deployment

The overall cost of edge server deployment consists of two parts: the total user access delay and the edge MDC construction cost, where the construction cost is the cost of resource investment.

(1) Total User Access Delay. When a user's access request is distributed to an edge micro data center for processing, it first passes through the access point serving the user and then connects to the base station where the edge server is located. Therefore, the end-to-end delay between the user and the edge server consists of two parts: the access delay between the user and the access point, and the network delay between the access point and the base station hosting the edge server. Since the delay between the user and the access point is not affected by the edge server deployment location, this work only considers the delay between the access point and the edge location hosting the edge server [16]. Based on SDN technology, the network controller can monitor the delay [24, 25] between the access point of an area and the location where the edge server is deployed.

Due to user mobility, the load in different areas changes over time. The total user access delay is affected by the distribution of users, the rate of user requests, and the time users stay in an area. User distribution characteristics are described by the user density: the number of user terminals in each area is collected per time slot, and the average user density of the area is computed as in equation (2), where the per-slot count is the number of user terminals within the coverage area during that time slot.
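A minimal sketch of such an average, using our own symbols (N_j(t) for the terminal count of area j in slot t and T for the number of observed slots), which may differ from the paper's equation (2):

\[
\bar{\rho}_j = \frac{1}{T} \sum_{t=1}^{T} N_j(t)
\]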

The average number of user requests (load) in an area is then calculated, as in equation (3), from the average user density of the area, the request arrival rate per user, and the percentage of users that stay in the area.
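As an illustration, with our own symbols (average density \(\bar{\rho}_j\), per-user request arrival rate \(\alpha\), and stay percentage \(p_j\), which are assumptions and may not match the paper's notation), the average load of area j could be written as

\[
\lambda_j = \bar{\rho}_j \, \alpha \, p_j
\]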

The requests of different users in an area are distributed to the edge clusters at different locations. The unit delay is the delay incurred by accessing an object through the logical link from the access point of an area to the edge location where an edge server is deployed, and a continuous variable represents the ratio of the area's request load allocated to that edge position. The total user access delay is given in equation (4).
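A sketch consistent with this description, again with our own symbols (d_{ji} for the unit delay from area j to edge location i, x_{ji} for the allocated load ratio, and \(\lambda_j\) for the average load of area j):

\[
D_{\text{total}} = \sum_{j \in J} \sum_{i \in I} \lambda_j \, x_{ji} \, d_{ji}
\]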

(2) The Edge MDC Construction Cost. Deploying an edge server requires selecting an appropriate location and equipping it with infrastructure. The edge MDC construction cost therefore includes the cost of leasing the location and the cost of the equipment required for deployment: the fixed construction cost of an edge location (including leasing and other basic resource allocation), the server unit price, and the number of server nodes. The edge MDC construction cost is calculated as in equation (5).
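With hypothetical symbols f_i (fixed cost of opening edge location i), g (server unit price), n_i (number of servers at location i), and y_i (binary deployment indicator), a plausible form, not necessarily the paper's equation (5), is

\[
C_{\text{build}} = \sum_{i \in I} y_i \left( f_i + g\, n_i \right)
\]

Note that the product \(y_i n_i\) is what would make the overall objective nonlinear, consistent with the MINP formulation discussed below.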

3.3.3. Edge Server Deployment Modeling

To minimize the end-to-end delay, the nearest edge computing resources should serve each user's request. As the number of deployed edge micro data centers and servers increases, the end-to-end latency of user request processing decreases correspondingly, but the capital investment cost of edge server deployment increases. It is therefore crucial for the edge facility provider to balance the capital investment cost against the end-to-end network delay. This work proposes a strategy [20] that minimizes the capital investment cost while minimizing the end-to-end network delay experienced by users. The edge server deployment cost is the sum of the edge MDC construction cost and the total access delay cost, as defined in equation (6).

Note that an adjustment constant is used to balance the total access delay cost against the edge server investment cost; its definition is shown in equation (7).

In equation (7), the maximum delay of the farthest edge server in an area and the maximum number of servers at an edge location determine, respectively, the maximum total delay of all user requests in the area and the highest possible investment cost, and balance parameters weight the two cost terms. Therefore, the comprehensive cost minimization model of edge server deployment [25] can be described as in P1 (equation (8)).

Here, the available capacity of a single edge server relates to the user request load, and there is a maximum number of servers per edge location. The objective function in equation (8) minimizes these two kinds of cost within the metropolitan area network. Constraint (C1) restricts the assigned tasks from exceeding the maximum load of the edge server cluster at a location. Constraint (C2) ensures that all regional loads are distributed to the edge server clusters; the load ratio assigned from an area to an edge server cluster is a continuous variable between 0 and 1. Constraints (C3) and (C4) ensure that the total number of locations deployed in the metropolitan area network does not exceed the maximum limit. Constraints (C5)-(C7) define the value ranges of the variables: the deployment indicator is a binary decision variable, the number of servers is an integer decision variable, and the load ratio is a continuous decision variable. Since the objective function in equation (8) contains the product of two decision variables, the model is nonlinear. The discrete 0-1 integer variable increases the difficulty of the solution and is regarded as a "complicating variable." The analysis of this cost minimization model shows that the number of servers at a location is an integer variable while the load distribution of an area to an edge location is continuous, so the edge server deployment model is a MINP problem.
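To make the structure of P1 concrete, the following is a plausible formalization consistent with the constraint descriptions above, written with the illustrative symbols introduced earlier plus s (capacity of a single server), c (maximum number of servers per location), and K (maximum number of deployed locations); these symbols and the exact form are assumptions, not necessarily the paper's equation (8):

\[
\begin{aligned}
\min_{y,\,n,\,x}\quad & \theta_1 \sum_{j \in J}\sum_{i \in I} \lambda_j\, x_{ji}\, d_{ji} \;+\; \theta_2 \sum_{i \in I} y_i \left( f_i + g\, n_i \right) \\
\text{s.t.}\quad & \textstyle\sum_{j \in J} \lambda_j\, x_{ji} \le s\, n_i \quad \forall i \in I && \text{(C1)}\\
& \textstyle\sum_{i \in I} x_{ji} = 1 \quad \forall j \in J && \text{(C2)}\\
& n_i \le c\, y_i \quad \forall i \in I, \qquad \textstyle\sum_{i \in I} y_i \le K && \text{(C3, C4)}\\
& y_i \in \{0,1\}, \quad n_i \in \mathbb{Z}_{\ge 0}, \quad 0 \le x_{ji} \le 1 && \text{(C5-C7)}
\end{aligned}
\]

In this sketch, the product \(y_i n_i\) in the objective is the nonlinear term referred to in the text, which motivates the linearization of Section 3.4.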

According to the above analysis, edge server deployment mainly includes two parts: edge positioning based on the proximity between users and edge locations, and determination of the number of edge server nodes based on the user distribution and deployment costs. The multi-edge-server deployment problem studied in this paper therefore includes not only the users' association but also the spatial positioning of edge servers within the metropolitan area network. Several locations are selected among the candidate network access points and users are assigned to them, which can be summarized as a capacity-limited multifacility location problem in discrete space, that is, a capacitated facility location problem (CFLP). Since this problem involves thousands of network access points in the metropolitan area network, its scale is relatively large, and it is an NP-complete combinatorial optimization problem. Therefore, selecting an appropriate integer programming method is vital to ensure the accuracy and efficiency of the optimal deployment of edge servers. The symbols used in this work and their meanings are shown in Table 1.

3.4. Linearization Process of Original Problem

For the integer nonlinear programming problem P1, the original problem is transformed into a mixed-integer linear programming (MIP) problem P2 by a linear transformation of the product term.
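As an illustration of such a transformation (a standard linearization of a binary-integer product, stated here under the assumption that the nonlinearity is the term \(y_i n_i\) from the construction cost; the paper's exact substitution may differ): introduce \(z_i = y_i n_i\) and replace the product with the linear constraints

\[
z_i \le c\, y_i, \qquad z_i \le n_i, \qquad z_i \ge n_i - c\,(1 - y_i), \qquad z_i \ge 0,
\]

where c is the maximum number of servers per location. When \(y_i = 0\) these constraints force \(z_i = 0\), and when \(y_i = 1\) they force \(z_i = n_i\), so the resulting model P2 is a MIP equivalent to P1.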

Generally, an NP-complete problem can be solved with exact or approximate algorithms. Commonly used exact algorithms for MIP include the branch and bound method and the Benders decomposition method, while approximate algorithms include heuristic and intelligent optimization algorithms. The branch and bound method is a deterministic search-and-iteration algorithm with a large computational burden. The well-known commercial mathematical programming optimizer CPLEX, based on branch and bound combined with cutting planes, heuristics, and other techniques, can quickly solve mixed-integer linear programming problems and has been applied to facility location problems; however, CPLEX obtains the optimal solution only for small and medium-scale mixed-integer programs. The edge server deployment problem in a metropolitan area network environment is relatively large, so CPLEX takes too much time to obtain the optimal solution and may even fail to find a feasible solution. The Benders decomposition algorithm performs better in solving such MIP problems [26-29]. Thus, this work uses the Benders decomposition algorithm to solve the problem.

4. Benders Decomposition of Edge Server Deployment Problems

The Benders decomposition algorithm [26-29] is suitable for solving mixed-integer programming problems: according to the types of the variables, it decomposes the original problem into a main problem containing the complicating integer decision variables and subproblems containing only continuous variables, so it is well suited to the edge server deployment problem. The main problem and the subproblems are solved iteratively: the main problem provides a lower bound for the original problem, its integer solution is passed to the subproblem, and the subproblem provides an upper bound for the original problem and returns a Benders cut to the main problem. The algorithm alternates between the main problem and the subproblems until the upper and lower bounds coincide, at which point the optimal solution of the original problem is obtained.

4.1. Subproblems of Benders Decomposition Algorithm

Fixing the 0-1 integer variables of the problem yields the subproblem P3:

Define dual variables for constraints (C1), (C2), and (C3) of P3. Substituting the fixed integer solution into P3, the dual problem P4 of P3 is

Suppose P3 has a feasible solution; then, according to the duality principle, the dual problem P4 has a bounded solution, this bounded solution is an extreme point of the polyhedron formed by constraints (C1) and (C2), and an optimal Benders cut can be obtained. If P3 is infeasible, then the dual problem P4 is unbounded, and a feasible Benders cut can be obtained from an extreme ray. Thus, the optimal Benders cut of P3 is generated from an extreme point of the dual polyhedron, and the feasible Benders cut from an extreme ray of that polyhedron. With an auxiliary decision variable introduced into the Benders main problem, the optimal Benders cuts raise the lower bound of the main problem, and the feasible Benders cuts exclude integer solutions that make the subproblem infeasible, keeping the lower bound of the original problem valid. Since generating optimal Benders cuts speeds up the convergence of the Benders decomposition algorithm, producing more optimal Benders cuts and limiting the number of feasible Benders cuts is an effective way to accelerate the algorithm.
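For reference, the generic form of these two kinds of cuts in a standard Benders scheme, written with textbook notation rather than the paper's (\(\eta\) is the auxiliary variable of the main problem, y the vector of fixed integer variables, \(\bar{u}\) an extreme point and \(\bar{r}\) an extreme ray of the dual polyhedron, and \(b - By\) the right-hand side seen by the subproblem):

\[
\eta \ge \bar{u}^{\top} (b - B y) \quad \text{(optimal cut)}, \qquad \bar{r}^{\top} (b - B y) \le 0 \quad \text{(feasible cut)}.
\]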

4.2. Main Problem of Benders Decomposition

Based on the optimal and feasible Benders cuts, the main problem MP is

Although equation (17) in theory contains a very large number of linear constraints, only a small fraction of them are active at the optimal solution. Therefore, the main problem can be constructed in a relatively compact form by generating only the cuts corresponding to the extreme points and extreme rays that are actually needed.

5. Implementation of Edge Server Deployment Algorithm Based on Benders Decomposition

5.1. Algorithm Implementation

The edge MDC deployment algorithm based on Benders decomposition proposed in this work is shown in Algorithm 1 [25]. In line 2 of Algorithm 1, the upper bound is initialized to its maximum value and the lower bound to its minimum value, and a feasible initial deployment is selected. In the iterative process, the dual subproblem in line 7 provides an upper bound for the original problem and returns Benders cuts to the main problem MP, which constrain and update MP to form a new main problem for the edge server configuration. In line 9, solving the main problem MP provides a lower bound for the original problem. Since the upper bound candidate does not necessarily decrease at each iteration, line 8 keeps the smallest value seen so far as the upper bound; the lower bound is updated analogously in line 10. In addition, to avoid the main problem MP being unbounded in the first few iterations, a number of cuts generated from the initial feasible solution are added to MP at the start.

Input: I: the AP set of base stations; U: the user set; J: the area set
Output: Y: the set of deployment sites; the number of servers at each deployment site i
Begin
1: initialize g, the server price; f_i, the price of edge position i; s, the maximum load of a server; c, the largest number of servers a single edge position can accommodate
2: initialize UB = +infinity, LB = -infinity, k = 1;
3: do{
4:  Select the initial server deployment scenario
5:  In the first step, all nodes in region j that satisfy the delay condition are selected for initial deployment
6:  Initialize the main problem model MP (16)-(17)
7:  Compute the upper bound candidate Ck by (14) and (15)
8:  if (Ck < UB) UB = Ck
9:  Solve MP with the Benders cut constraints to get the lower bound lk
10:  if (lk > LB) LB = lk;
11:  if MP has no solution, the original problem has no solution and the algorithm ends
12:  update the fixed integer variables (deployment decisions) by MP's solution
13:  k = k + 1
14: } while (UB - LB > epsilon), where epsilon is the convergence tolerance
End
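For readers who prefer code, the bound bookkeeping of Algorithm 1 can be sketched in C++ as below. The solver calls are deliberately stubbed out: solveMasterProblem and solveSubproblemDual are replaced by caller-supplied toy bound sequences with a geometrically shrinking gap, loosely inspired by (but not reproducing) the bounds reported in Section 6. The snippet therefore only demonstrates the UB/LB update and termination logic of lines 7-14, not an actual MIP solve, and it is not the authors' implementation.

#include <algorithm>
#include <cmath>
#include <cstdio>
#include <functional>
#include <limits>

// Generic Benders-style bound bookkeeping; the LP/MIP solves are abstracted away.
// masterLB(k): lower bound returned by the main problem MP at iteration k.
// subUB(k): upper bound returned by the dual subproblem at iteration k.
double bendersBounds(const std::function<double(int)>& masterLB,
                     const std::function<double(int)>& subUB,
                     double eps) {
    double UB = std::numeric_limits<double>::infinity();
    double LB = -std::numeric_limits<double>::infinity();
    int k = 1;
    do {
        double Ck = subUB(k);        // line 7: upper bound candidate from the subproblem
        UB = std::min(UB, Ck);       // line 8: keep the smallest upper bound seen so far
        double lk = masterLB(k);     // line 9: lower bound from MP with the added cuts
        LB = std::max(LB, lk);       // line 10: keep the largest lower bound seen so far
        std::printf("iteration %d: UB = %.1f, LB = %.1f\n", k, UB, LB);
        ++k;                         // line 13
    } while (UB - LB > eps);         // line 14: stop when the bounds (almost) meet
    return UB;                       // objective value of the incumbent deployment
}

int main() {
    // Purely illustrative, converging bound sequences (no real optimization is solved here).
    auto subUB    = [](int k) { return 202125.0 + 50000.0 * std::pow(0.6, k); };
    auto masterLB = [](int k) { return 201999.0 - 50000.0 * std::pow(0.6, k); };
    bendersBounds(masterLB, subUB, 130.0);
    return 0;
}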
5.2. Algorithm Correctness Analysis

The edge server deployment model is a mixed-integer nonlinear programming (MINP) problem. The Benders decomposition algorithm (including generalized Benders decomposition) solves such problems by decomposing them according to duality theory. Following the data types of the variables, the approach first linearizes the nonlinear programming formulation of edge server deployment, then decomposes the resulting mixed-integer program into a main problem and subproblems, and solves them iteratively. The main problem MP determines the server deployment locations, while the subproblem SP determines the number of servers at each deployment location and the server resource allocation ratios. During the iterative solution process, the lower bound from the main problem MP and the upper bound from the subproblem SP are continually updated. According to the gap between the upper and lower bounds and the result of the subproblem, different Benders cut constraints are formed, added to MP, and used to refine it until the stopping condition is met and the optimal solution is obtained. The literature [26-28] has proved that the Benders decomposition algorithm converges in a finite number of steps. Therefore, the Benders decomposition algorithm can efficiently solve the optimal edge server deployment problem.

6. Example of Edge Server Deployment

Suppose there are five candidate locations for edge server deployment (such as base station or router locations), each corresponding to a coverage area as shown in Figure 3. The rental cost ($) of each location over 10 years and the number of user requests within 5 seconds in the corresponding coverage area are shown in Table 2. The size of each request content is 100 MB. The number of edge servers deployed at each edge location is no more than 30, and the unit price of each server is $2000. The maximum processing capacity of each edge server is 300 requests at a time. Table 3 shows the average unit access delay from the user set of each area to each edge site. There are three decision variables: the deployment indicator (a binary decision variable), the number of servers at each location, and the load allocation ratio.
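As a small illustration of the capacity arithmetic used in this example, the C++ sketch below computes how many servers a given area load requires and the corresponding equipment cost, using the parameters stated above (300 requests per server, $2000 per server, at most 30 servers per location). The request load passed to it is a hypothetical value, since Table 2 is not reproduced here, and serversNeeded is our own helper name, not part of the paper.

#include <cmath>
#include <cstdio>

// Number of servers needed for a given request load, capped by the per-location limit.
int serversNeeded(int requests, int perServerCapacity = 300, int maxPerLocation = 30) {
    int n = static_cast<int>(std::ceil(requests / static_cast<double>(perServerCapacity)));
    return n > maxPerLocation ? -1 : n;   // -1: load cannot be served by a single location
}

int main() {
    int load = 2900;                      // hypothetical request count for one coverage area
    int n = serversNeeded(load);
    if (n > 0)
        std::printf("servers needed: %d, equipment cost: $%d\n", n, n * 2000);
    return 0;
}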

Using Benders decomposition to solve this edge server deployment instance, the algorithm converges after 18 iterations: the upper bound of the objective function is 202125 and the lower bound is 201999. The changes of the upper and lower bounds during the execution are shown in Table 4. In the resulting deployment, 10 servers are deployed at the first location, and 100% of the load of areas 1 and 2 is distributed to edge MDC1. Four servers are deployed at the third location, and 100% of the load of area 3 is distributed to edge MDC2. Twenty-two servers are deployed at the fourth location, and 100% of the load of areas 4 and 5 is distributed to edge MDC3.

7. Performance Evaluation

The simulation [20] is carried out on a personal laptop equipped with an Intel(R) Core(TM) i7-3770 CPU @ 3.40 GHz, 12.0 GB of RAM, and 1 TB of hard disk space. The algorithm is programmed in C++. Each simulation was independently run 25 times under the same conditions, and the average value was taken.

7.1. The Setup of Simulation

This section evaluates the performance of the proposed algorithm on real and synthetic network topology data sets [17, 20, 21]. The amount of resources requested by each user is a random value in the range [50, 200] MHz. Each server can handle up to 50 requests, and the edge delay is randomly generated between 5 ms and 50 ms. Mobile users are assumed to stay in a few places for most of the day, such as home and work; therefore, this work assumes that the location of each user changes randomly within a specific area covered by five BSs. The maximum number of edge micro data centers is assumed to be 10% of the number of network access points. The main parameter settings are shown in Table 5.

7.1.1. Data Set

The simulation refers to the experimental settings of the Australian National University team [17], the Hong Kong Polytechnic University team of Cao et al. [21], and the literature [18]. The real data set comes from the network topology of the Hong Kong Metro (HKMTR), which includes 18 regions in Hong Kong corresponding to 18 potential edge locations. The number of requests in each area is directly proportional to the number of people in the AP coverage area. Figure 4 shows the Hong Kong subway map used as a WMAN template. Although the network topology of the Hong Kong area is not public, the subway map is used to infer the wired connections between the hubs of each region, representing the wired links from hubs in the WMAN to the edge. This paper also conducts a comparative analysis on related data sets to test the algorithm's adaptability, as shown in Table 6.

7.1.2. Comparison Algorithm

To evaluate the performance of the proposed Benders_SD algorithm, this paper selects the heaviest load priority placement algorithm (HAF [21]), the greedy algorithm [17], and the CPLEX solver for comparative analysis.

(1) Heaviest Load Priority Placement (HAF). HAF deploys edge micro data centers at the network access points with the heaviest user load. It first sorts the base station or edge router locations in descending order of accumulated user request arrival rate and then deploys edge computing resources at the top-ranked edge positions. However, the HAF algorithm has two main disadvantages: first, the access point with the heaviest workload is not always the closest to the user; second, assigning users to the nearest edge MDC can cause an uneven user distribution, which leads to load imbalance among some edge MDCs.

(2) Greedy Algorithm (Greedy). The greedy algorithm selects edge sites one by one from the candidate edge locations. In the first round it selects the site that minimizes the maximum user-to-server round-trip delay, and the remaining k-1 sites are selected analogously in the following k-1 rounds. According to this strategy, the selection process ends once the selected edge sites meet the bandwidth requirements of all users.

(3) CPLEX Mixed-Integer Programming Optimizer. IBM ILOG CPLEX implements the fundamental algorithms with high speed and reliability and provides a flexible, high-performance optimizer for problems such as mixed-integer programming.

7.1.3. Performance Parameters

Performance evaluation indicators include the edge MDC construction cost ($), the total user access delay (s), and the overall cost. The overall cost is derived from equation (6); since it is jointly determined by the delay (s) and the construction cost ($), it is denoted by S ($) in this work.

7.2. Results and Analysis
7.2.1. Sensitivity Test of Parameter

The delay sensitivity parameter takes values in {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9}; a larger value indicates greater sensitivity to delay. This simulation uses 200 candidate edge positions.

Figure 5 shows that as the adjustment parameter gradually increases, the construction cost of the edge MDC gradually increases, while Figure 6 shows that the total user access delay gradually decreases. A larger parameter makes the edge server deployment more sensitive to the end-to-end delay: as it increases, the proposed algorithm pays more attention to the delay cost, so more servers are needed to reduce the delay and the corresponding edge MDC construction cost rises. The adjustment parameter therefore has a significant impact on the algorithm results. Figure 7 depicts the change of the overall cost with the adjustment parameter; the overall cost reaches its minimum at a particular value of the parameter.

Therefore, to balance the benefits of edge computing service providers and users and minimize the comprehensive cost, this work sets the system adjustment parameter θ1 to the value at which the overall cost is minimized in Figure 7.

7.2.2. Performance Evaluation of the Edge MDC Deployment Algorithm in a Small Network Service Area

This group of simulations uses the real Hong Kong subway network (HKMTR) data set to evaluate the edge server deployment algorithm proposed in this work. There are eighteen AP access points, and three of them are selected as deployment locations. Based on the results of the parameter sensitivity simulations, the system adjustment parameter is set accordingly.

Figure 8 shows the creation cost of the edge MDCs under the four algorithms Benders_SD, HAF, Greedy, and CPLEX. Compared with HAF, Greedy, and CPLEX, Benders_SD reduces the MDC construction cost by an average of $200, $100, and $50, respectively. The Benders_SD algorithm proposed in this work thus has the lowest construction cost, outperforming the other three comparison algorithms, while the HAF algorithm has the highest server deployment cost.

Figure 9 depicts the total end-to-end access latency cost under the four algorithms. Compared with HAF, Greedy, and CPLEX, Benders_SD reduces the total user access delay by an average of 0.51 s, 0.21 s, and 0.33 s, respectively. At the same time, Benders_SD reduces the total cost by an average of 227.26, 111.23, and 67.64, respectively; as shown in Figure 10, the total cost of the Benders_SD algorithm is lower than that of the HAF, Greedy, and CPLEX algorithms. This shows that the Benders_SD algorithm proposed in this work outperforms the others.

Based on the above comparison of the four algorithms on the HKMTR data set, the Benders_SD algorithm proposed in this work performs best in three aspects: edge MDC creation cost, end-to-end delay, and total cost. It can minimize both the cost of edge computing infrastructure providers and the end-to-end delay of user access.

7.2.3. Evaluation of Edge MDC Deployment Algorithm under Different Number of Candidate Edge Positions

This group of simulations uses a synthetic network data set with a larger network scale: the number of candidate edge positions varies from 200 to 1000, and the number of user requests at each candidate edge location (AP access point) ranges over [50, 200].

Figure 11 shows that when the number of candidate edge positions grows from 200 to 1000, the creation cost of the edge MDCs gradually increases. The creation cost of the proposed Benders_SD deployment algorithm is significantly lower than that of HAF and CPLEX and, although close, still better than that of the Greedy algorithm. When the number of candidate edge positions is 800, the creation cost of the Benders_SD algorithm is $125,819 less than Greedy and $307,839 less than HAF; when the number of candidate edge positions is 1000, it is $309,072 less than CPLEX. This shows that the advantage of the Benders_SD algorithm grows as the number of candidate edge positions increases.

Figure 12 shows that as the number of candidate edge positions increases, the total end-to-end delay also increases, and Benders_SD outperforms the other algorithms; as the problem scale grows, the performance of the CPLEX algorithm deteriorates. Figure 13 shows the overall cost as the number of candidate edge positions increases with the network size: the behavior of the four algorithms is consistent with the trends of the edge MDC creation cost and the total end-to-end latency cost, and the Benders_SD algorithm performs best among the compared algorithms.

8. Conclusion

The emergence of edge computing plays a crucial role in low-latency IoT applications. A MAN contains a large number of base stations that serve as candidate deployment locations for edge servers, and selecting locations for edge servers and determining the number of servers at each location to achieve low latency and high node utilization is an urgent problem. This work proposes a cost-aware edge server deployment optimization method based on the Benders decomposition algorithm. An objective function is established to minimize the edge server deployment and access cost, using the resource allocation ratio, the regional average load, and the access delay between users and the edge locations serving them. Compared with traditional server deployment strategies, our optimal strategy can more accurately decide the locations of edge MDCs and the number of servers at each to ensure low latency and low deployment costs.

In further research, the optimal allocation and deployment of edge computing resources for complex and diverse Internet of Things services will be studied from multiple perspectives, including computation offloading, resource allocation, and cache content placement, to improve system performance, edge service quality, and user experience.

Data Availability

The data that support the findings of this study are available from the corresponding author Shao, upon reasonable request.

Disclosure

Any opinions, findings, and conclusions are those of the authors and do not necessarily reflect the views of the above agencies.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (NSFC) under Grant No. 62102200, the Henan Science and Technology Research Project (No. 222102210134), and the Cross Science Research Project of Nanyang Institute of Technology.