Abstract

Recently, edge-based mobile crowdsensing has become an important sensing technology. It uses mobile devices to collect information about their surroundings and relies on a group of mobile edge servers, deployed at the network edge as a link between users and the central server, for data filtering and aggregation. Each user may collect multiple data types in mobile crowdsensing. To facilitate data aggregation, the same data type carried by various users is assumed to be uploaded to the same mobile edge server. The main problem is determining which server should be activated to process each data type so as to reduce the overall cost. In this paper, the problem is formulated as a form of the uncapacitated multicommodity facility location problem. To solve this problem, two edge-server location strategies are proposed, which use a clustering method to divide the set of mobile users with data items into clusters and use the ant colony approach to select a mobile edge server for each data type in each cluster. Extensive simulations are conducted based on widely used real data sets. The simulation results show that the proposed strategies achieve better performance than the existing methods in terms of service and facility costs.

1. Introduction

In recent decades, a new sensing paradigm called mobile crowdsensing (MCS) has appeared, driven by the large number of mobile devices with efficient sensing and powerful capabilities in everyday life [1–4]. In MCS, mobile devices are used to collect sensing data from the surrounding environment [5, 6]. The collected data can be used to provide various services such as construction of radio environment maps [7], management of roadside parking [8], and assessment of road surfaces [9].

Traditionally, the architecture of MCS is centralized: a central server (CS) directly receives the sensing data uploaded by mobile users. The main drawback of this traditional MCS is that, in large-scale scenarios, the central server may receive a very large volume of data streams from mobile users, which creates a very high load on the CS and the network. In addition, the risk of leaking user privacy increases because all collected data are stored on the CS. Fortunately, owing to the rapid evolution of the Internet of Things (IoT) and 5G communication, the paradigm of mobile edge computing (MEC) [10–12] is very helpful in solving the problems of the centralized MCS architecture.

Mobile edge computing (MEC) [13] moves computation and processing tasks to mobile edge servers (MESs) located near the data source, instead of executing them on the CS [14]. Thus, the MCS architecture gains a new layer: a set of MESs distributed between the CS and the mobile users, acting as a bridge. In this strategy, mobile users upload sensing data to MESs instead of uploading to the CS directly, and this layer aggregates and processes the uploaded data. Based on the data types carried by mobile users, the MCS paradigm guides them to upload data to different edge servers. In other words, each type of data is aggregated on a single MES. Then, the aggregated and processed data are sent to the CS to provide the available MCS services. Aggregating data of the same type on a single MES filters out erroneous and redundant sensing data. This process reduces the size of the data sent to the CS, which decreases the computation load on the CS and the network traffic towards it.

In an edge-based MCS scenario for collecting crowdsensing data, the total cost (TC) comprises two costs: the service cost (SC) and the facility cost (FC). The SC represents the cost of the users' movement to upload data. The FC includes the server activation cost (SAC), which is the cost of activating an MES, and the data processing cost (DPC), which is the cost of processing the uploaded data. In daily life, mobile users usually stay for a long time in a few places such as workplaces and homes; they leave these places to upload data and then return to where they started. Therefore, the SC of a user is the total travelling distance from the initial location, through the corresponding MESs, and back to the initial location. As shown in Figure 1, the SC of a user is the sum of the costs of the individual legs of this round trip, and the FC of an edge server is the sum of its SAC and the DPC of each data type it processes. Thus, based on this scenario, the main problem is which MES should be activated to process each type of data so as to minimize the total cost (TC). This problem is called the mobile edge server activation problem (MESAP).

The edge-based MCS paradigm has been studied within various fields, such as vehicular crowdsensing [16], task allocation [17], and user recruitment [18, 19]. However, none of them takes into account the problem of minimizing the cost of data offloading in the edge-based MCS. Existing works regarding task offloading in MEC concentrate on minimizing the makespan [20, 21] of task execution or overhead [22, 23]. Thus, they do not take into account the movement cost of a user during the data offloading process, and they cannot be applied to the described edge-based MCS scenario directly. In addition, the existing works in MCS did not consider the overhead of data processing which represents the server view, but they focus on minimizing data uploading cost from the user view [24, 25]. The first research that takes into account the server view (facility cost, FC) and the user view (service cost, SC) was presented in [15].

In this paper, to solve the MESAP, the server and user perspectives are considered as presented in [15]. Based on these perspectives, the MESAP will be formulated based on the problem of uncapacitated multicommodity facility location, and two edge-server location strategies are proposed. Each proposed strategy uses a clustering method for dividing the mobile users set with data items into clusters. The first strategy uses the ant colony approach in the first tier to select MES for each data type in a cluster. Then, it merges all the selected sets of MESs and uses a simple heuristic method in the second tier to reallocate each data type to its appropriate mobile edge server, while the second strategy uses the ant colony approach in the two tiers to do the selection and reallocating processes.

The major contributions of this paper are described as follows:
(i) Formulating the MESAP as an uncapacitated multicommodity facility location problem
(ii) Using clustering to divide the set of mobile users with data items into clusters
(iii) Using the ant colony approach to select the appropriate MES for each data type
(iv) Proposing a new heuristic strategy, called the one-tier ant colony clustering-based strategy, for solving this problem
(v) Proposing a new heuristic strategy, called the two-tier ant colony clustering-based strategy, for solving this problem
(vi) Studying and simulating the performance of the proposed strategies using data sets that are widely used in the real world: epfl/mobility, roma/taxi, and geolife trajectory

The remainder of the paper is organized as follows. Section 2 reviews the related works, describes the mobile edge server activation problem (MESAP), and presents the proposed clustering ant-colony-based strategies. Section 3 reports the simulation settings and the results used to evaluate the performance of the proposed strategies. Finally, the last section concludes this paper.

2.1. Edge-Based Mobile Crowdsensing (EMCS)

Several approaches have been proposed for MCS based on distributed architectures. In [26], a new anonymized data-collection method was proposed for estimating data distributions. In [27], the authors studied the effect of correlation among sensing data on differential privacy for the protection of MCS systems and introduced two perturbation mechanisms for two different perspectives. From the protector's perspective, they introduced a mechanism that uses the standard definition of differential privacy to deduce the scale value, based on a Bayesian network that models the probabilistic relevance among sensing data. From the adversary's perspective, they proposed a mechanism that analyzes the importance of the maximum correlated group to compute the Bayesian differential privacy leakage, based on a Gaussian correlation model that describes the data correlation.

In [28], the authors proposed two strategies for managing privacy-preserving reputation and handling malicious participants in MCS based on edge computing. Marjanovic et al. [29] proposed an MEC paradigm for MCS to increase the quality of service in MCS. In [16], the authors introduced an edge-based framework for large-scale vehicular crowdsensing applications to minimize the energy consumption of participating vehicles in heterogeneous crowdsensing applications. The authors in [30] proposed an edge-based network selection scheme in vehicular crowdsensing and formulated the problem as an optimization problem with two objectives to maximize user satisfaction. In [31], based on edge computing, the authors proposed a distributed ledger framework in MCS for supporting decentralized incentives. In [32], the authors introduced an edge-assisted incentive mechanism in MCS that satisfies individual rationality and truthfulness. In [17], the authors proposed a fog-assisted task allocation method in MCS, together with a fog-assisted scheme for secure data deduplication that improves communication efficiency. Furthermore, some approaches concentrate on user recruitment in edge-aided MCS [18, 19, 33]. In [18], the authors investigated user recruitment for sparse data collection. In [19, 33], the authors proposed a mechanism for incentive-aware recruitment in edge-based MCS. In addition, in [15], the authors studied the edge-server location problem in MCS and proposed an edge-server location strategy that minimizes the crowdsensing cost by considering both the facility cost (server perspective) and the service cost (user perspective). However, the authors in [15] did not consider the load balance among MESs in the MCS scenario.

The aforementioned approaches explore edge-based MCS from different aspects. Nevertheless, none of them takes into account facility cost, service cost, and load balance together. Therefore, in this paper, new edge-server location strategies are proposed to solve the edge-server location problem by considering all of these issues.

2.2. The Problem of Facility Location (FLP)

The facility location problem (FLP) is a classic optimization problem that determines the best location for a warehouse or factory based on facility costs, transportation distance, and geographical demand. FLP aims to maximize the profit of a supplier based on the given location and demand of the customer. FLP has stirred the interest of numerous researchers. Based on the capacity of the facility, FLP can be categorized into capacitated facility location [34–36] and uncapacitated facility location [37–40]. In addition, there are numerous variants of the classical FLP. For example, the k-median problem is a kind of FLP with a limit on the number of opened facilities. In [41], the authors formulated the problem of minimizing the total movement of clients and facilities as a k-median problem. To solve the k-median problem, the authors in [42] proposed a greedy local search algorithm. The k-level uncapacitated FLP is another variant in which the demands must be moved between the facilities in a hierarchical order. For the 2-level FLP, some works, such as [37, 38], proposed approximation algorithms. In addition, [43] introduced a logarithmic approximation algorithm for the multilevel FLP. On the other hand, a client can request a subset of commodities, and in this case, the problem is called multicommodity facility location (MFL). In [39, 44], the authors proposed approximation algorithms for uncapacitated MFL, while the authors in [35] proposed a wide-ranging approach for capacitated MFL. Furthermore, some works take facility disruption into account, where some facilities may fail [45–47].

In this paper, the FLP is represented as an uncapacitated multicommodity facility location problem, as considered in [15]. This form differs from the classical MFL in that it takes two additional aspects into consideration: (1) each commodity can be served by only a single facility and (2) the travelling distance between the various facilities. These two constraints make the problem harder than the classical MFL, so most of the existing approaches are not directly applicable to it.

Based on the above-mentioned ideas, the main challenges are as follows: (1) even the simplest FLP is NP-hard; (2) there are various data types, and the travelling distance among mobile edge servers must be considered to determine a facility location strategy with minimum cost, so the problem is more difficult than the traditional FLP, where there is only one kind of data; and (3) the traditional combinatorial optimization approaches do not work well in the scenario with multiple data types.

2.3. Mobile Edge Server Activation Problem (MESAP)
2.3.1. Assumptions and Models

Here, the MCS scenario is described as follows: (1) firstly, moving users collect sensing data, (2) moving users upload data to the mobile edge servers, (3) mobile edge servers execute data filtering and aggregation, (4) aggregated data are sent to the central cloud server, and (5) finally, the central cloud server analyzes the data and then it generates the knowledge that will be used for providing the services of MCS.

To support the scenario described above, an edge-based MCS architecture is required. This architecture contains a cloud central server, $CS$, a set of mobile edge servers, $S = \{s_1, \dots, s_m\}$, where $m$ is the total number of mobile edge servers, and a set of mobile users, $U = \{u_1, \dots, u_n\}$, where $n$ is the total number of mobile users. In addition, there is a set of data types denoted as $T = \{t_1, \dots, t_r\}$, where $r$ is the total number of data types. Each mobile edge server $s_j$ is able to operate in any configuration $\sigma \subseteq T$, and the cost of processing this combination of data types is denoted as $f_j^{\sigma}$. Each mobile edge server $s_j$ has an activation cost $f_j$, and for each data type $t \in \sigma$, there is an incremental processing cost $c_j^{t}$. Therefore, the facility cost to activate edge server $s_j$ with configuration $\sigma$ is defined as follows:

$$f_j^{\sigma} = f_j + \sum_{t \in \sigma} c_j^{t}.$$

Each user $u_i$ in $U$ has a set of data items denoted as $D_i$, where each data item has a data type in $T$. So, each user can carry multiple data types, denoted as $T_i \subseteq T$. Here, the service cost of a mobile user represents the travelling distance of the user in the data uploading process: the user starts from an initial location $l_i$, moves to the corresponding mobile edge servers $s_{(1)}, \dots, s_{(p_i)}$ one after another, and finally returns to the initial location. This service cost is defined as follows:

$$SC_i = d\big(l_i, s_{(1)}\big) + \sum_{h=1}^{p_i - 1} d\big(s_{(h)}, s_{(h+1)}\big) + d\big(s_{(p_i)}, l_i\big),$$

where $d(\cdot, \cdot)$ represents the travelling distance cost between two different locations.
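As a rough illustration of the two cost components described above, the following Python sketch computes a facility cost (activation cost plus the incremental processing cost of each data type in a configuration) and a service cost (round trip from the initial location through the servers and back). The function names, the use of Euclidean distance, and the assumption that servers are visited in the order given are illustrative choices, not taken from the paper.

```python
import math
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float]


def facility_cost(activation_cost: float,
                  processing_cost: Dict[str, float],
                  config: Iterable[str]) -> float:
    """Activation cost plus the incremental processing cost of each type in config."""
    return activation_cost + sum(processing_cost[t] for t in config)


def service_cost(start: Point, servers: List[Point]) -> float:
    """Length of the round trip start -> servers (in the given order) -> start.

    Euclidean distance is an assumption; any distance function could be used.
    """
    if not servers:
        return 0.0
    route = [start, *servers, start]
    return sum(math.dist(p, q) for p, q in zip(route, route[1:]))


if __name__ == "__main__":
    print(facility_cost(10.0, {"noise": 2.0, "air": 3.0}, ["noise", "air"]))
    print(service_cost((0.0, 0.0), [(3.0, 0.0), (3.0, 4.0)]))
```

In this sketch the total cost of a solution would be the sum of `facility_cost` over activated servers and `service_cost` over users.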

The full set of all data items of all users, denoted as $D$, is described as follows:

$$D = \bigcup_{i=1}^{n} D_i.$$

As shown in Figure 2, a user who uploads data to three servers incurs a cost equal to the total distance travelled from the initial location, through the three servers in turn, and back to the initial location. Likewise, a user who uploads data to two servers incurs the total distance of the corresponding round trip. Specifically, both the cost of travelling between the initial location and a server and the cost of travelling between servers count as service cost. The main notations used in this paper are shown in Table 1.

2.3.2. Problem Formulation

To find a solution for determining which mobile edge servers will be activated and which data types are assigned to the activated mobile edge servers to minimize the total cost, the mobile edge server activation problem (MESAP) will be formulated.

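Based on the description of the service cost $SC_i$ for each mobile user $u_i$ and the facility cost for each mobile edge server $s_j$, a variable $y_j$ is introduced to indicate whether the mobile edge server $s_j$ is activated: $y_j$ is 1 when $s_j$ is activated and 0 otherwise. In addition, the variable $x_j^{t}$ is 1 if the mobile edge server $s_j$ processes data of type $t$, and 0 otherwise. The variable $z_{ij}^{t}$ is 1 if user $u_i$ with data type $t$ is assigned to server $s_j$ to upload data, and 0 otherwise. With $f_j$ denoting the cost of activating server $s_j$ and $c_j^{t}$ its incremental cost of processing data type $t$, the mobile edge server activation problem (MESAP) is formulated as follows:

$$\min \; \sum_{s_j \in S} \Big( f_j\, y_j + \sum_{t \in T} c_j^{t}\, x_j^{t} \Big) + \sum_{u_i \in U} SC_i$$

subject to

$$\sum_{s_j \in S} z_{ij}^{t} = 1, \quad \forall u_i \in U,\ \forall t \in T_i, \tag{15}$$

$$\sum_{s_j \in S} x_j^{t} = 1, \quad \forall t \in T, \tag{16}$$

$$z_{ij}^{t} \le x_j^{t}, \quad \forall u_i \in U,\ \forall s_j \in S,\ \forall t \in T_i, \tag{17}$$

$$x_j^{t} \le y_j, \quad \forall s_j \in S,\ \forall t \in T, \tag{18}$$

$$y_j,\ x_j^{t},\ z_{ij}^{t} \in \{0, 1\}. \tag{19}$$

Here $T_i$ is the set of data types carried by user $u_i$, and $SC_i$ is the travelling distance of the round trip of $u_i$ through the servers to which its data types are assigned.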

The first constraint (15) means that there exists an MES for processing each type of data carried by each user. The second constraint (16) means that each data type is processed by a single MES. The third constraint (17) means that a user can upload data to an MES only when that MES is able to process the corresponding data type. The fourth constraint (18) ensures that a mobile edge server can process data only when it is activated. The fifth constraint (19) states that the decision variables (server activation, data-type processing, and user assignment) take only the values 0 or 1. The aim of the proposed strategies is to find the best set of MESs that reduces the total cost and satisfies the abovementioned constraints.
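To make the formulation more concrete, the sketch below evaluates the total cost of one candidate solution. It encodes a solution as a mapping from each data type to exactly one server index, which satisfies the single-server-per-type constraint by construction; the activated servers and user assignments are then derived from that mapping. The data layout, the function name `total_cost`, Euclidean distances, and the rule that a user visits its servers in increasing index order are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]


def total_cost(assign: Dict[str, int],
               server_pos: List[Point],
               activation: List[float],
               processing: List[Dict[str, float]],
               users: List[Tuple[Point, Set[str]]]) -> float:
    """Facility cost of the activated servers plus service cost of all users."""
    # Every data type carried by some user must be processed by a server.
    needed: Set[str] = set().union(*(types for _, types in users))
    missing = needed - set(assign)
    if missing:
        raise ValueError(f"data types without a server: {sorted(missing)}")

    # Facility cost: a server is activated only if some type is assigned to it.
    active = set(assign.values())
    cost = sum(activation[j] for j in active)
    cost += sum(processing[j][t] for t, j in assign.items())

    # Service cost: round trip through the distinct servers a user needs
    # (visiting order by server index is an assumption of this sketch).
    for start, types in users:
        stops = sorted({assign[t] for t in types})
        route = [start] + [server_pos[j] for j in stops] + [start]
        cost += sum(math.dist(p, q) for p, q in zip(route, route[1:]))
    return cost
```

A search procedure such as the ant colony approach could call a function like this as its fitness evaluation when comparing candidate assignments.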

2.4. The Proposed Ant Colony Clustering-Based Strategies

In this section, to solve the MESAP, two edge-server location strategies are proposed. The first strategy is called one-tier ant colony clustering-based strategy (OTACS) and the second strategy is called two-tier ant colony clustering-based strategy (TTACS). In the rest of this section, the key idea of the proposed strategies will be introduced, then the two proposed strategies are described in detail.

2.4.1. Basic Idea

The basic idea of the proposed strategies consists of the following steps: (1) using clustering to divide the set of mobile users with data items into clusters, (2) using the ant colony approach to select the appropriate MES for each data type in each cluster, (3) merging the selected subsets of mobile edge servers of all clusters, and (4) reallocating a mobile edge server to each type of data for all users based on the merged set of mobile edge servers.

To satisfy the basic idea of the proposed strategies, firstly, an overview of ant colony approach will be introduced and then the two proposed strategies OTACS and TTACS will be described in detail.

2.4.2. Overview of the Ant Colony Approach

Ant colony optimization (ACO) is a population-based metaheuristic technique inspired by the foraging behavior of real ants, which forage for food and establish the shortest routes from their nest to the food source. ACO is a class of algorithms that construct their solutions based on the problem data, and it was introduced for discrete optimization problems. In a real environment, ants look for food sources randomly. When an ant discovers a food source, it carries some food back to the colony, leaving a chemical substance known as pheromone along the path as it moves. Each ant makes its decisions using pheromone trails as a communication mechanism. The intensity of the pheromone trail left on the ground depends on the quality of the solution (food source) found. Pheromone deposited by multiple ants accumulates on shorter paths at a higher density than on longer paths, which increases their attractiveness; a higher pheromone level therefore indicates a shorter path. Through evaporation, all pheromone trails are gradually reduced, which encourages exploration and prevents the search from staying in a local minimum. At the end of each iteration, the pheromone values are updated [48, 49]. The moving decision of an ant follows the rule

$$p_{ij}^{k} = \frac{[\tau_{ij}]^{\alpha}\,[\eta_{ij}]^{\beta}}{\sum_{l \in N_i^{k}} [\tau_{il}]^{\alpha}\,[\eta_{il}]^{\beta}}, \quad j \in N_i^{k},$$

where $p_{ij}^{k}$ is the probability that ant $k$ decides to move from node $i$ to node $j$. This decision depends on the level of pheromone and on the heuristic information. $N_i^{k}$ is the set of possible neighbors of node $i$ that have not yet been visited by ant $k$, $\eta_{ij}$ is a heuristic function, and $\tau_{ij}$ is the pheromone amount on the edge between $i$ and $j$. $\alpha$ and $\beta$ are parameters that determine the relative significance of the pheromone concentration and the heuristic information, respectively. The pheromone update can be formulated in the following manner:

$$\tau_{ij} \leftarrow \tau_{ij} + \sum_{k} \Delta\tau_{ij}^{k}.$$

The evaporation update process is given by

$$\tau_{ij} \leftarrow (1 - \rho)\,\tau_{ij},$$

where $\rho$ is the constant reduction factor of all pheromones. The amount deposited by ant $k$ on an edge it has traversed is $\Delta\tau_{ij}^{k} = Q / L_k$ (and 0 on other edges), where $L_k$ is the cost of the solution built by ant $k$ and $Q$ is a constant. The optimization process stops after a given number of iterations.
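The following Python sketch shows one common way to implement the ACO building blocks described above: the probabilistic choice of the next node from pheromone and heuristic values, followed by evaporation and pheromone deposit. It is a generic illustration rather than the authors' implementation; the function names, the default parameter values, and the directed-edge pheromone matrix are assumptions.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def transition_probs(i: int, candidates: Sequence[int],
                     tau: Matrix, eta: Matrix,
                     alpha: float = 1.0, beta: float = 2.0) -> List[float]:
    """Probability of moving from node i to each unvisited candidate node."""
    weights = [(tau[i][j] ** alpha) * (eta[i][j] ** beta) for j in candidates]
    total = sum(weights)
    if total == 0.0:
        # Fall back to a uniform choice when all weights vanish.
        return [1.0 / len(candidates)] * len(candidates)
    return [w / total for w in weights]


def update_pheromone(tau: Matrix, tours: Sequence[Sequence[int]],
                     costs: Sequence[float],
                     rho: float = 0.5, q: float = 1.0) -> None:
    """Evaporate all trails, then let each ant deposit q / cost on its edges."""
    for row in tau:
        for j in range(len(row)):
            row[j] *= (1.0 - rho)
    for tour, cost in zip(tours, costs):
        for a, b in zip(tour, tour[1:]):
            tau[a][b] += q / cost
```

An ant would typically sample its next node with `random.choices(candidates, weights=transition_probs(...))`, and `update_pheromone` would be called once per iteration after all ants have built their solutions.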

2.4.3. One-Tier Ant Colony Clustering-Based Strategy (OTACS)

The first proposed strategy is called the one-tier ant colony clustering-based strategy (OTACS). OTACS consists of four phases, as follows. Throughout, $S$, $U$, and $T$ denote the full sets of mobile edge servers, mobile users, and data types, and $D_i$ denotes the data items of user $u_i$.

Phase 1: clustering phase

In this phase, to reduce the number of candidate MESs for the selection process, OTACS uses a clustering approach (any existing clustering algorithm can be used) to divide the set of mobile users, together with their data items, into clusters based on the initial location of each mobile user, such that each cluster has at least one mobile edge server. Let the set of clusters be denoted as $C = \{C_1, \dots, C_K\}$, where $K$ is the total number of created clusters. Each cluster $C_c$ has the following sets:
(i) A set of MESs in the cluster, $S_c \subseteq S$, where $m_c = |S_c|$ is the number of mobile edge servers in this cluster
(ii) A set of mobile users in the cluster, $U_c \subseteq U$, where $n_c = |U_c|$ is the number of mobile users in this cluster
(iii) A set of data items of all mobile users in the cluster, $D_c = \bigcup_{u_i \in U_c} D_i$
(iv) A set of data types in the cluster, $T_c \subseteq T$, where $r_c = |T_c|$ is the number of data types in this cluster

Here, the K-means algorithm is used to create the required clusters, and the optimal number of clusters is obtained by using the silhouette method. Assume that there is a set of data points $X = \{x_1, \dots, x_N\}$; then the silhouette coefficient of a point $x_i$, $s(x_i)$, is a measure of cluster validity, which is defined as follows:

$$s(x_i) = \frac{b(x_i) - a(x_i)}{\max\{a(x_i),\, b(x_i)\}},$$

where $a(x_i)$ stands for the average distance of that point to all other points in the same cluster and $b(x_i)$ represents the average distance of that point to all points in the cluster closest to its own. The value of $s(x_i)$ ranges from −1 to +1. If the value is close to 1, the point is placed in the correct cluster. If the value is negative, the point is placed in the wrong cluster. If it is around 0, the point lies between clusters [50].

Then, the average silhouette value over all points in all clusters (assuming that the number of clusters is $K$ and the total number of data points is $N$) is as follows:

$$\bar{s}_K = \frac{1}{N} \sum_{i=1}^{N} s(x_i).$$

If the value of $\bar{s}_K$ is considerably high, then the number of clusters is optimal; in other words, the clustering structure with $K$ clusters is appropriate. On the other hand, if the value of $\bar{s}_K$ is very low or negative, then the clustering structure with $K$ clusters is not proper, and it may have either more or fewer clusters than the optimal number.

Figure 3 illustrates an example with five clusters: the first has one MES, three users, and four data types; the second has one MES, two users, and three data types; the third has two MESs, three users, and three data types; the fourth has one MES, two users, and four data types; and the fifth has one MES, two users, and three data types.

Phase 2: in-cluster allocating phase

In this phase, for each cluster, OTACS uses the ant colony (ACO) approach to select a mobile edge server for each type of data based on the facility cost of each mobile edge server and the service cost of each mobile user in the cluster. The objective of this phase is defined as follows:

$$\min \; \sum_{s_j \in S_c} \Big( f_j\, y_j + \sum_{t \in T_c} c_j^{t}\, x_j^{t} \Big) + \sum_{u_i \in U_c} SC_i,$$

subject to the constraints of the MESAP formulation restricted to the servers, users, and data types of the cluster. Here $f_j$ and $c_j^{t}$ are the activation cost of server $s_j$ and its incremental cost of processing data type $t$, $y_j$ and $x_j^{t}$ are the binary activation and processing indicators, and $SC_i$ is the service cost of user $u_i$. The output of this phase is a set of selected mobile edge servers, $S_c^{*}$, that represents the most appropriate MESs for filtering and aggregating all data items in this cluster, such that each data type is assigned to only one mobile edge server in $S_c^{*}$.

Phase 3: merging phase

In this phase, OTACS merges the selected sets of mobile edge servers of all clusters in $C$ into one whole set of selected mobile edge servers, $S^{*}$, which is defined as follows:

$$S^{*} = \bigcup_{c=1}^{K} S_c^{*}.$$

Phase 4: reallocating phase

In this phase, OTACS reallocates the mobile edge server for each data type $t$ for all users by using the whole set of selected mobile edge servers, $S^{*} = \{s_1^{*}, \dots, s_{m^{*}}^{*}\}$, where $m^{*}$ denotes the number of mobile edge servers in $S^{*}$. The objective of this phase is formulated as follows:

$$\min \; \sum_{s_j \in S^{*}} \Big( f_j\, y_j + \sum_{t \in T} c_j^{t}\, x_j^{t} \Big) + \sum_{u_i \in U} SC_i,$$

subject to the constraints of the MESAP formulation with the candidate servers restricted to $S^{*}$. Based on this objective, OTACS uses a simple heuristic approach to select the most appropriate mobile edge server for each type of data. OTACS calculates the overall cost for all mobile edge servers in the whole set of selected mobile edge servers, $S^{*}$, and then, for each data type $t$, OTACS selects the mobile edge server with the minimum overall cost, $s^{*}(t)$, such that

$$s^{*}(t) = \arg\min_{s_j \in S^{*}} \; \mathrm{cost}(t, s_j),$$

where $\mathrm{cost}(t, s_j)$ represents the overall cost of assigning data type $t$ to the mobile edge server $s_j$.
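The silhouette coefficient used in the clustering phase can be computed directly from its definition. The sketch below does so in plain Python for points in the plane; the function name, the Euclidean distance, and the convention that a point in a singleton cluster gets the value 0 are assumptions of this illustration. In practice, one would run K-means for several candidate numbers of clusters and keep the one with the highest average silhouette value.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def silhouette_values(points: Sequence[Point],
                      labels: Sequence[int]) -> List[float]:
    """Silhouette coefficient (b - a) / max(a, b) of every point."""
    clusters: Dict[int, List[int]] = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    def mean_dist(i: int, members: List[int]) -> float:
        return sum(math.dist(points[i], points[j]) for j in members) / len(members)

    values: List[float] = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own or len(clusters) < 2:
            values.append(0.0)  # convention for singleton clusters
            continue
        a = mean_dist(i, own)  # average distance within the own cluster
        b = min(mean_dist(i, m) for c, m in clusters.items() if c != lab)
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0.0 else 0.0)
    return values


def average_silhouette(points: Sequence[Point], labels: Sequence[int]) -> float:
    """Average silhouette value over all points."""
    vals = silhouette_values(points, labels)
    return sum(vals) / len(vals)
```

For two well-separated groups of points labelled consistently, the average is close to 1, whereas mixing the labels across the groups drives it below 0.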

2.4.4. Two-Tier Ant Colony Clustering-Based Strategy (TTACS)

In the first proposed strategy, OTACS, load balancing among the mobile edge servers, in terms of the number of users and the number of data items assigned to each activated server, is not taken into account. Therefore, the second proposed strategy, called the two-tier ant colony clustering-based strategy (TTACS), is used to tackle this issue. Like OTACS, TTACS consists of four phases, and its first three phases are the same as those of OTACS: the clustering phase, the in-cluster allocating phase, and the merging phase. In the fourth phase, the reallocating phase, instead of the simple heuristic used in OTACS, TTACS uses the ant colony (ACO) approach to select the most appropriate mobile edge server for each type of data. For each data type, TTACS uses a fitness function that depends on the overall cost for all mobile edge servers in the whole set of selected mobile edge servers, and then it constructs the best set of mobile edge servers to be activated for all data types in the edge-based MCS scenario, taking into account the improvement of load balancing on these activated servers.

Based on these four phases of OTACS and TTACS, they can select the most appropriate mobile edge server for each type of data to be activated in the edge-based MCS scenario. Figure 4 introduces an example of these four phases of the proposed strategies OTACS and TTACS.
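The merging phase and the simple reallocation heuristic of OTACS can be sketched in a few lines of Python. The per-cluster selections are merged by set union, and each data type is then given the server with the minimum overall cost. How the overall cost of assigning a data type to a server is computed is left to a caller-supplied function; that function, like the names `merge_selected` and `reallocate`, is a hypothetical placeholder for this illustration.

```python
from typing import Callable, Dict, Iterable, List, Set


def merge_selected(per_cluster: List[Set[int]]) -> Set[int]:
    """Merging phase: union of the servers selected in every cluster."""
    return set().union(*per_cluster)


def reallocate(data_types: Iterable[str],
               merged: Set[int],
               overall_cost: Callable[[str, int], float]) -> Dict[str, int]:
    """Reallocating phase: pick the minimum-cost server for each data type."""
    if not merged:
        raise ValueError("no selected servers to choose from")
    candidates = sorted(merged)  # fixed order makes tie-breaking deterministic
    return {t: min(candidates, key=lambda j: overall_cost(t, j))
            for t in data_types}
```

TTACS would replace `reallocate` with an ACO search over the same merged set, using a fitness function that also accounts for load balancing.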

2.4.5. Computational Complexity of OTACS and TTACS

Here, the computational complexity of the proposed strategies is described phase by phase. As shown in the previous two sections, OTACS and TTACS consist of four phases: the clustering phase, the in-cluster allocating phase, the merging phase, and the reallocating phase. So, the computational complexity of OTACS and TTACS depends on the complexity of each of these four phases, which are described as follows:
(i) Complexity of Clustering Phase. In this phase, OTACS and TTACS use the k-means algorithm to create k clusters, so the computational complexity of this phase depends on the maximum number of k-means iterations and on the numbers of mobile users and mobile edge servers.
(ii) Complexity of In-Cluster Allocating Phase. In this phase, OTACS and TTACS use the ACO approach to find the most appropriate set of mobile edge servers in each cluster, so the computational complexity of this phase depends on the maximum number of ACO iterations and on the numbers of mobile edge servers and data types in each cluster.
(iii) Complexity of Merging Phase. In this phase, OTACS and TTACS merge the selected sets of mobile edge servers of all clusters, so the computational complexity of this phase depends on the total number of selected mobile edge servers in the whole set.
(iv) Complexity of Reallocating Phase. In this phase, OTACS uses a heuristic algorithm to reallocate the set of selected servers, so its computational complexity depends on the total number of data types in the system and on the total number of selected mobile edge servers in the whole set. TTACS uses an ACO approach to reallocate the set of selected servers, so its computational complexity depends on the maximum number of ACO iterations and on the total number of selected mobile edge servers in the whole set.

As a result, the final computational complexities of OTACS and TTACS follow from combining the complexities of the four phases. Table 2 summarizes the final computational complexity of OTACS and TTACS.

3. Evaluation and Simulation Results

3.1. Simulation Settings and Data Preparations

To evaluate the proposed strategies, the performance of six strategies, namely low cost first (LF), minimum average distance (DIS), biogeography-based optimization with particle swarm optimization (BBO_PSO) [47], APX2 [15], CMSA [15], and random, is compared with that of the proposed strategies OTACS and TTACS. In the LF strategy, the mobile edge server with the lowest processing cost for a data type is selected for that data type. In DIS, the mobile edge server with the lowest average distance to the set of mobile users is selected for each data type. In BBO_PSO, a heuristic algorithm is used to select a mobile edge server which maximizes the total cost fitness for each data type. In the random strategy, a mobile edge server is selected randomly for each type of data. In APX2 [15], a mobile edge server is selected by using a relaxation-based approximation approach based on facility and service costs. In CMSA [15], a mobile edge server is selected by using a connected multiagent simulated annealing algorithm based on facility and service costs.

In addition, to evaluate the performance of the proposed strategies in minimizing the sensing cost, three widely used real-world data sets are employed: roma/taxi [51], epfl/mobility [52], and geolife trajectory [53]. The epfl/mobility data set records the GPS trajectories of approximately 500 taxis in the San Francisco Bay Area, USA, over 30 days. The roma/taxi data set contains the GPS mobility traces of approximately 320 taxis in Rome, Italy, collected over 30 days. The geolife trajectory data set was collected in the Geolife project by 182 users and includes 17,621 GPS trajectories with a total distance of 1.2 million kilometers.

In the simulation, a uniform distribution is used to generate the facility costs randomly, and the service cost of each mobile user is set to the distance travelled when uploading sensing data, computed from the real-world data set. Note that the first GPS position of a user's trajectory is selected as the initial position, and the POI positions are selected as candidate mobile edge server positions.

3.2. Evaluation Metrics

Here, the evaluation metrics used to evaluate all strategies are described as follows:
(i) Number of activated mobile edge servers (NAMESs) is the number of servers selected to be activated for processing all data types.
(ii) Total cost (TC) is the sum of the facility cost, FC, and the service cost, SC, over all activated MESs and mobile users.
(iii) Load balancing for server-user (LBSU) is the average load balance among the activated MESs based on the number of mobile users served by each activated mobile edge server. Each activated MES has a load-balance value in terms of users, and the average load balance for server-user, avgLBSU, is the mean of these values over the activated mobile edge servers.
(iv) Load balancing for server-data (LBSD) is the average load balance among the activated MESs based on the number of data items received from mobile users by each activated MES. Each activated MES has a load-balance value in terms of received data items, and the average load balance for server-data, avgLBSD, is the mean of these values over the activated mobile edge servers.

Note that lower values of avgLBSU and avgLBSD indicate a better load balance on the activated servers.

3.3. Effect of the Number of Data Types

Here, the effect of the number of data types is studied and discussed for all strategies when the number of mobile edge servers is 20 and the number of mobile users is 270 in the case of the epfl/mobility set and 130 in the case of the geolife trajectory and roma/taxi sets. Figures 5(a)–5(c) show the number of activated mobile edge servers for epfl/mobility, geolife trajectory, and roma/taxi, respectively. As illustrated in Figures 5(a)–5(c), the number of activated servers grows as the number of data types grows for most strategies. Additionally, the proposed strategies, OTACS and TTACS, achieved moderate values that are neither very low nor very high. This is because OTACS and TTACS use the k-means clustering approach to reduce the number of candidate servers, and they can select an appropriate number of activated servers based on their fitness function. Additionally, the number of activated servers of TTACS is greater than that of OTACS for most values of the number of data types. This is because TTACS uses the ant colony approach in both of its tiers to improve the load balancing of the activated servers, while OTACS uses the ant colony approach in one tier only.

Figures 6(a)–6(c) show the achieved total cost for epfl/mobility, geolife trajectory, and roma/taxi, respectively. As illustrated in Figures 6(a)–6(c), the achieved total cost increases as the number of data types increases for most strategies. Additionally, the proposed strategies OTACS and TTACS achieved the lowest total cost among all strategies. This is because OTACS and TTACS combine the k-means clustering approach and the ant colony approach with a fitness function that minimizes the facility and service costs together.

Figures 7(a)–7(c) show the average load balancing for server-user (avgLBSU) for epfl/mobility, geolife trajectory, and roma/taxi, respectively. As illustrated in Figures 7(a)–7(c), avgLBSU decreases as the number of data types increases for most strategies except the DIS strategy. This is because when the number of data items increases, the number of activated servers increases, and the load is distributed among them. Additionally, the avgLBSU values of OTACS and TTACS are lower than those of the other strategies. This is because OTACS and TTACS can distribute the server load in terms of users by using the k-means clustering and ant colony approaches. In the case of the DIS strategy, the avgLBSU values are less affected by changing the number of data types. This is because DIS uses only the travelling distance to select the servers to be activated.

Figures 8(a)–8(c) show the average load balancing for server-data (avgLBSD) for epfl/mobility, geolife trajectory, and roma/taxi, respectively. As illustrated in Figures 8(a)–8(c), avgLBSD decreases as the number of data types grows for most strategies except the DIS strategy. This is because when the number of data items increases, the number of activated servers increases, and the load is distributed among them. Additionally, the avgLBSD values of OTACS and TTACS are lower than those of the other strategies. This is because OTACS and TTACS can distribute the server load in terms of received data items by using the k-means clustering and ant colony approaches. In the case of the DIS strategy, the avgLBSD values are less affected by changing the number of data types. This is because DIS uses only the travelling distance to select the servers to be activated.

3.4. Effect of the Number of Servers

Here, the impact of the number of servers is studied and discussed for all strategies when the number of data types is 6 and the number of mobile users is 270 in the case of the epfl/mobility set and 130 in the case of the geolife trajectory and roma/taxi sets. Figures 9(a)–9(c) illustrate the number of activated mobile edge servers for epfl/mobility, geolife trajectory, and roma/taxi, respectively. As illustrated in Figures 9(a)–9(c), the number of activated servers is only slightly affected as the number of candidate servers grows for most strategies. Additionally, the proposed strategies, OTACS and TTACS, achieved higher values than the other strategies, except in the case of roma/taxi, where they achieved moderate values that are neither very low nor very high. This is because OTACS and TTACS use the k-means clustering approach to reduce the number of candidate servers, and they can select an appropriate number of activated servers based on their fitness function.

Figures 10(a)–10(c) show the achieved total cost for epfl/mobility, geolife trajectory, and roma/taxi, respectively. As illustrated in Figures 10(a)–10(c), the achieved total cost is only slightly affected as the number of candidate servers grows for most strategies. In addition, the proposed strategies OTACS and TTACS, together with the existing strategy APX2, achieved the lowest total cost among all strategies. This is because OTACS, TTACS, and APX2 all take into account both facility and service costs: OTACS and TTACS combine the k-means clustering approach and the ant colony approach with a fitness function that minimizes the facility and service costs together, and APX2 uses a relaxation-based approximation algorithm.

Figures 11(a)–11(c) show the average load balancing for server-user (avgLBSU) for epfl/mobility, geolife trajectory, and roma/taxi, respectively. As illustrated in Figures 11(a)–11(c), avgLBSU is only slightly affected as the number of candidate servers grows for most strategies. This is because when the number of candidate servers grows, the number of activated servers is only slightly affected, so the distribution of the load among them changes little. Additionally, the avgLBSU values of OTACS and TTACS are lower than those of the other strategies. This is because OTACS and TTACS can distribute the server load in terms of users by using the k-means clustering and ant colony approaches.

Figures 12(a)–12(c) show the average load balancing for server-data (avgLBSD) for epfl/mobility, geolife trajectory, and roma/taxi, respectively. As illustrated in Figures 12(a)–12(c), avgLBSD is only slightly affected as the number of candidate servers grows for most strategies. This is because when the number of candidate servers increases, the number of activated servers is only slightly affected, so the distribution of the load among them changes little. Additionally, the avgLBSD values of OTACS and TTACS are lower than those of the other strategies. This is because OTACS and TTACS can distribute the server load in terms of received data items by using the k-means clustering and ant colony approaches.

3.5. Effect of the Number of Users

Here, the impact of the number of users is studied and discussed for all strategies when the number of data types is 6 and the number of mobile edge servers is 20 for the epfl/mobility, geolife trajectory, and roma/taxi sets. Figures 13(a)–13(c) show the number of activated mobile edge servers for epfl/mobility, geolife trajectory, and roma/taxi, respectively. As illustrated in Figures 13(a)–13(c), the number of activated servers is only slightly affected as the number of users grows for most strategies. Additionally, the proposed strategies, OTACS and TTACS, achieved higher values than the other strategies, except in the case of roma/taxi, where they achieved moderate values that are neither very low nor very high. This is because OTACS and TTACS use the k-means clustering approach to reduce the number of candidate servers, and they can select an appropriate number of activated servers based on their fitness function.

Figures 14(a)–14(c) show the achieved total cost for epfl/mobility, geolife trajectory, and roma/taxi, respectively. As illustrated in Figures 14(a)–14(c), the achieved total cost grows as the number of users grows for most strategies. Additionally, the proposed strategies OTACS and TTACS, together with the existing strategy APX2, achieved the lowest total cost among all strategies. This is because OTACS, TTACS, and APX2 all take into account both facility and service costs: OTACS and TTACS combine the k-means clustering approach and the ant colony approach with a fitness function that minimizes the facility and service costs together, and APX2 uses a relaxation-based approximation algorithm.

Figures 15(a)–15(c) show the average load balancing for server-user (avgLBSU) for epfl/mobility, geolife trajectory, and roma/taxi, respectively. As illustrated in Figures 15(a)–15(c), avgLBSU is only slightly affected as the number of users increases for most strategies. This is because when the number of users increases, the number of activated servers is only slightly affected, so the distribution of the load among them changes little. Additionally, the avgLBSU values of OTACS and TTACS are lower than those of the other strategies. This is because OTACS and TTACS can distribute the server load in terms of users by using the k-means clustering and ant colony approaches.

Figures 16(a)–16(c) show the average load balancing for server-data (avgLBSD) for epfl/mobility, geolife trajectory, and roma/taxi, respectively. As illustrated in Figures 16(a)–16(c), avgLBSD is only slightly affected as the number of users grows for most strategies. This is because when the number of users increases, the number of activated servers is only slightly affected, so the distribution of the load among them changes little. Additionally, the avgLBSD values of OTACS and TTACS are lower than those of the other strategies. This is because OTACS and TTACS can distribute the server load in terms of received data items by using the k-means clustering and ant colony approaches.

In summary, the results of the conducted simulations show that in the first case, "changing number of data types," the proposed methods OTACS and/or TTACS outperform all existing methods in terms of total cost and load balancing for server-user and server-data, achieving lower values than the existing methods. In the second case, "changing number of servers," and the third case, "changing number of users," the proposed methods OTACS and/or TTACS outperform all existing methods except APX2 in terms of total cost. However, they outperform all existing methods, including APX2, in terms of load balancing for server-user and server-data, again achieving lower values. Therefore, the proposed methods are preferable to APX2: they achieve a comparable total cost while also balancing the load among the activated servers, which APX2 does not.

4. Conclusion

In this paper, two edge-server location strategies are proposed for minimizing the facility and service costs in mobile crowdsensing. These two strategies are called the one-tier ant colony clustering-based strategy (OTACS) and the two-tier ant colony clustering-based strategy (TTACS). Each proposed strategy uses a clustering method to divide the set of mobile users with data items into clusters. OTACS uses the ant colony approach in the first tier to select a mobile edge server for each data type in each cluster. Then, it merges all the selected sets of mobile edge servers and uses a simple heuristic method in the second tier to reallocate each data type to its appropriate mobile edge server. TTACS, in contrast, uses the ant colony approach in both tiers to perform the selection and reallocation processes. The proposed strategies were compared with six existing strategies. The conducted simulations were based on widely used real-world data sets: epfl/mobility, roma/taxi, and geolife trajectory. The simulation results show that the proposed strategies achieve better performance than the existing methods in terms of service cost, facility cost, and load balancing for server-user and server-data distribution. In future work, new issues will be considered, such as the energy consumption of mobile users, the use of different clustering algorithms, and the application to different MCS scenarios.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This project was funded by the Deanship of Scientific Research (DSR), King Abdul-Aziz University, Jeddah, under grant no. DF-868-130-1441. The authors, therefore, acknowledge DSR technical and financial support.