Abstract

Computing offloading based on mobile edge computing (MEC) for mobile devices (MDs) has received great attention in recent years. Strategy selection is a critical part of computing offloading, so making an optimal decision quickly and accurately during offloading is a key difficulty. Furthermore, MDs are likely to leak personal privacy when interacting with the edge cloud, and there is also a risk of commercial privacy leakage between different cloud service suppliers. In this paper, we propose the privacy-protected edge cloud computing offloading (EPCO) algorithm, based on online learning, to improve the efficiency of computing offloading while ensuring the privacy of system users. EPCO also allows different MDs to customize their privacy levels. We prove that adding the privacy protection mechanism has almost no effect on the convergence of the algorithm. Simulation results on a real-world dataset validate our conclusions.

1. Introduction

Mobile devices (MDs) have become extremely popular in recent years due to their mobility and convenience [1, 3]. Meanwhile, the functionality of applications for MDs has become increasingly powerful [3], which leads to a shortage of local resources on MDs, such as computing power, storage, and energy [4, 5]. To this end, computing offloading for MDs has emerged. Researchers proposed mobile cloud computing (MCC), in which resource starvation is resolved by sending the computing tasks of MDs to a remote cloud for execution [6, 7]. However, since cloud servers are often far from MDs, data must be transmitted over long distances, resulting in long response times. Much research in recent years has therefore focused on mobile edge computing (MEC) [8], which sends computing tasks to edge cloud servers (ECSs) [9]. ECSs are typically deployed near MDs, which shortens the physical distance between MDs and servers, resulting in lower latency [10]. The work of this paper is based on the edge cloud network.

Strategy selection is an important part of computing offloading between ECSs and MDs [1]. When a MD decides to offload its computing tasks to an ECS, it must first make a decision to select an optimal server for computing offloading. Researchers have used game theory in past research to solve the problem of selecting servers for computing offloading [11], which was significantly effective at the time. However, with users' increasing demands on the quality of network services and the challenges of big data [12, 13], most of the previous studies are outdated. In recent years, online learning algorithms have been greatly developed and used in various fields to help improve system performance [14–18]. Therefore, we consider using online learning algorithms to solve the strategy selection problem of computing offloading. Furthermore, not only should the efficiency of computing offloading be considered, but the privacy of system users should also be a concern. However, few studies address both of these aspects.

Privacy protection is an important part of computing offloading [19]. The privacy issues we consider include two parts: the privacy of the MDs and that of the service suppliers. On the one hand, a MD's privacy may be exposed during data transmission or forwarding if a malicious third party is involved. The malicious third party can infer the characteristics of the MD by accessing the computing offloading records. For example, a large amount of computation indicates the importance of the MD, and the distance item exposes the location of the MD. By combining several of these pieces of side information, it is possible to identify a user in the real world. For instance, He et al. [19] proposed a privacy-aware task offloading algorithm, which enabled low delay and energy consumption while maintaining an appropriate level of privacy. Min et al. [20] proposed a privacy-aware offloading algorithm that can improve offloading performance, save energy, and preserve the privacy of healthcare IoT devices. Although these studies focus on the privacy issue in offloading, they only avoid the possibility of privacy leakage through particular transmission methods, so the privacy protection these methods provide is limited. On the other hand, since there is commercial competition between service suppliers, the privacy between them should also be considered. Therefore, privacy protection is another important part of computing offloading. However, research that considers both strategy selection and privacy protection is barely known.

To overcome the above challenge, we consider introducing differential privacy into our computing offloading scenario. Differential privacy, which was proposed by Dwork et al. [21], has received great attention in the field of privacy protection in recent years. Differential privacy uses random noise to ensure that the private information of an individual will not be disclosed when query results are released. Zhang et al. and Hassan et al. [22, 23] use differential privacy techniques to address the risk of privacy leakage in their systems. This paper is motivated by the unresolved privacy risks in computing offloading scenarios. The privacy of mobile device users, such as location and device usage, may be leaked through data exchange during the offloading process, which poses a potential privacy breach risk for users. According to the research of He et al. [19] and Min et al. [20], privacy risk has indeed become an important issue in computing offloading. Although these works paid attention to the privacy issues in computing offloading, their algorithms only avoided the possibility of privacy leakage through a certain transmission method and did not fundamentally solve the problem of privacy risks. Therefore, we introduce differential privacy technology and propose an algorithm, EPCO, that protects the privacy of multiparty users in computing offloading. The system structure considered in this paper is shown in Figure 1. For instance, a healthcare device needs to compute a large amount of monitoring data, so part of the computing tasks should be offloaded to the edge cloud for computing. The device sends the perturbed offloading data to the edge cloud; each service supplier in the edge cloud then gives an optimal offloading plan through online learning algorithms, and finally, the device makes a decision.
Our main contributions are as follows:
(i) We propose the EPCO algorithm for MDs and the edge cloud to perform computing offloading based on online learning and differential privacy technology.
(ii) EPCO preserves the privacy of both MDs and service suppliers, and it allows different MDs to customize their own privacy protection levels. We prove this through theoretical derivation.
(iii) We verify the theory through simulation experiments. The results show that EPCO guarantees the efficiency of computing offloading while protecting the privacy of system users.

2. Related Work

In this section, we introduce related work from two aspects: optimal offloading and privacy management.

2.1. Optimal Offloading

Computing offloading alleviates the limitation of MDs' resources by sending computing tasks to the remote cloud for execution [24]. There has been a lot of research work on computing offloading in the past two decades [25–27]. Selecting ECSs for MDs is an indispensable part of computing offloading [28]. Previous decisions about server selection usually used game theory methods, which were mainly concerned with energy conservation, network environment perception, and so on. Jošilo and Dán [29] made the decision of choosing wireless access points for mobile users during the computing offloading process. They proved that there is a Nash equilibrium in the model they propose, which can maximize the benefits of all users. However, this solution requires multiple interactions between users and computing resources to obtain good results, which is very unfriendly to time-sensitive applications. Barrameda and Samaan [30] considered using execution dependency trees to enhance accuracy, which is an important technical indicator in computing offloading. However, this solution is designed for a central cloud with a large amount of computing resources: first, it needs a large amount of additional computation to run the algorithm, and second, the service response time cannot be guaranteed. Although these studies have solved some specific problems, they are gradually unable to adapt to more complicated situations such as the challenges of big data and the personalized needs of users.

In recent years, online learning has been used in various fields to help improve system performance. Shahrampour et al. [31] used online learning in object recognition to help identify the current object through historical information from other modes. Sakulkar and Krishnamachari [32] proposed two online learning algorithms to solve the power allocation problem modelled as a Markov decision process. Cao and Cai [33] used machine learning technology to solve the decision problems of MDs for achieving Nash equilibrium points in the noncooperative game that they proposed, which shows the prospect of machine learning in computing offloading research. These studies bring new ideas for the strategy selection of computing offloading: by applying online learning theory and technology to the computing offloading scenario, we can further improve the efficiency of computing offloading.

2.2. Privacy Management

As more and more people pay attention to personal privacy, the issue of user privacy protection should also be considered in computing offloading. Differential privacy, first proposed by Dwork [34], has been a popular research topic in privacy protection in recent years because it gives provable privacy guarantees. As people have realized the importance of privacy, differential privacy has been used in various research fields to protect users' privacy [35, 36]. Shin et al. [37] proposed a novel matrix factorization algorithm that guarantees per-user privacy under local differential privacy. In addition, they reduced the communication overhead between the server and users through dimensionality reduction. Piao et al. [38] proposed an algorithm that reduces query sensitivity and improves the effectiveness of published data. These works on differential privacy only provide the same level of privacy protection for all users, which is not practical in many applications. Dobbe et al. [39] proposed a customized local differential privacy mechanism to solve the privacy protection problem in multiagent distributed optimization. They proposed an approach for determining sensitivity and derived analytical bounds for some quadratic problems. We adopt this idea of customizability: in this paper, we allow different MDs to customize their privacy protection levels.

Since few researchers have previously paid attention to the privacy protection of computing offloading, related research contributions are scarce, and we will continue to follow the research progress in this area.

3. System Model

3.1. Computing Offloading

Consider an edge cloud network with a set of mobile devices and an edge cloud manager (ECM) that manages a set of service suppliers. Each supplier has a set of servers that can provide computing services. Each mobile device has a task that has been determined to be offloaded. As shown in Figure 1, at each slot , a MD sends a -dimensional context, denoted by , to all suppliers, where Laplace noise is added to according to the privacy protection requirements of different MDs. Upon receiving , the ECM first broadcasts it to all suppliers. Each supplier then selects an optimal ECS and sends the information of this ECS to the ECM. The ECM provides the MD with the optimal ECS in this network, denoted by . Subsequently, the MD decides whether to perform computing offloading.

Each has a two-dimensional vector denoted by , where and denote the reward in the price and reliability for performing computing offloading, respectively. and are given by and , respectively, where , denotes the expected reward of selecting ECS in given context . , is a random noise, which satisfies . We assume that , is conditionally 1-sub-Gaussian. Formally, this means that
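The formal condition was lost in extraction; the standard 1-sub-Gaussian condition it refers to can be reconstructed as follows (a hedged reconstruction: the noise symbol $\eta_t$ and the history $\mathcal{F}_{t-1}$ are placeholder names we introduce, since the original symbols were stripped):

```latex
\mathbb{E}\!\left[\, e^{\lambda \eta_t} \,\middle|\, \mathcal{F}_{t-1} \right]
  \;\le\; \exp\!\left( \frac{\lambda^2}{2} \right),
  \qquad \forall \lambda \in \mathbb{R} .
```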

Let and denote the expected reward of an ECS in the price and the reliability for context , respectively. Let denote the optimal ECS for the context .

Assumption 1. We assume that for all , and , , satisfy the following condition: where , . Assumption 1 means that if the offloading price and reliability of two ECSs are similar, their expected offloading costs are also similar.
Initially, the MD does not know the reward of any ECS; it learns the rewards of ECSs over time. In order to evaluate the performance of our method, we define the 2D regret of the ECS as the tuple , where . When and , we consider the 2D regret to be .
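The per-objective regret stripped from the definition above can be written in the standard contextual-bandit form (a hedged reconstruction; the symbols $a_t$, $a^*(x_t)$, and $\mu_d$ are placeholders we introduce, since the originals were lost in extraction):

```latex
R_d(T) \;=\; \sum_{t=1}^{T}
  \Big( \mu_d\big(a^*(x_t),\, x_t\big) - \mu_d\big(a_t,\, x_t\big) \Big),
  \qquad d \in \{\text{price},\ \text{reliability}\} .
```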

3.2. Differential Privacy

Definition 2 (differential privacy). An algorithm satisfies differential privacy if, for every pair of datasets that differ in only one entry and every set of outcomes , the probability that the output falls in changes by at most a bounded multiplicative factor.

The definition above applies only when all suppliers use an identical level of privacy protection. We now consider that each supplier in our system specifies its own privacy level .
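The Laplace mechanism underlying these definitions can be sketched as follows. This is a minimal illustration of the standard mechanism (not the paper's exact implementation): a query with sensitivity Δ is released with ε-differential privacy by adding noise drawn from Laplace(0, Δ/ε).

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon, rng=None):
    """Release `value` with epsilon-differential privacy by adding
    Laplace noise with scale sensitivity/epsilon (Dwork et al.)."""
    if rng is None:
        rng = np.random.default_rng()
    scale = sensitivity / epsilon
    # Noise shape matches the query output (scalar or vector).
    return value + rng.laplace(loc=0.0, scale=scale, size=np.shape(value))
```

A smaller ε gives a larger noise scale and hence stronger privacy; in the per-supplier setting of Definition 3, each node simply calls this with its own ε.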

Definition 3 (local differential privacy). An algorithm has nodes in the system, and we say that the algorithm is locally private for node if for any it satisfies that

And we say that the algorithm is -differentially private, if is -differentially locally private for all suppliers, where .

Definition 4 (sensitivity). The sensitivity of the function is as follows:

Definition 5 (sensitivity). The sensitivity of the function is as follows:

3.3. The Learning Algorithm

In this section, we detail our proposed EPCO as shown in Algorithm 1 (EPCO(1)), Algorithm 2 (EPCO(2)), and Algorithm 3 (EPCO(3)). Since the computing offloading decision of each ECS for different MDs follows a stochastic distribution, we let our proposed system learn an ECS's performance by an online learning method. Service suppliers learn the performance of each ECS by updating the sample mean reward of each ECS for the same context vector. To make EPCO easier to understand, we divide it into three algorithms, named EPCO(1), EPCO(2), and EPCO(3), respectively. As shown in Figure 2, the MDs run EPCO(1) to customize their privacy protection level. The ECM runs EPCO(2) to interact with the MD and send the best option among all agents to the MD for decision. Service suppliers run EPCO(3) to select the optimal ECS and interact with the ECM.

First, we analyse the privacy problem of MDs. Since different MDs have different requirements for privacy protection, MDs are allowed to customize the privacy level of each user. In order to protect personal privacy when MDs send computing information, the information is added with a noise which is drawn from the Laplace distribution in EPCO(1). Then, we discuss the privacy issues of the suppliers. When suppliers have selected optimal ECSs for MDs, they send the information to ECM. Since any supplier can access to this information in ECM, the Laplace mechanism is used in EPCO(2) to protect the privacy of service suppliers.

Input:.
Output:.
1: MD is ready for computing offloading;
2: ;
3: Set ;
4: Send to the edge cloud;
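The perturbation step of EPCO(1) can be sketched as below. This is a hedged sketch under our own assumptions: the function name, the unit sensitivity, and the clipping of the perturbed context back into [0, 1] (the range from which contexts are drawn in our simulations) are illustration choices, not the paper's exact code.

```python
import numpy as np

def epco1_perturb_context(context, epsilon, sensitivity=1.0, rng=None):
    """EPCO(1) sketch (hypothetical names): a MD perturbs its
    d-dimensional context with Laplace noise scaled to its own
    privacy level epsilon before sending it to the edge cloud."""
    if rng is None:
        rng = np.random.default_rng()
    noise = rng.laplace(0.0, sensitivity / epsilon, size=len(context))
    # Keep the noisy context inside the assumed [0, 1]^d context space.
    return np.clip(np.asarray(context, dtype=float) + noise, 0.0, 1.0)
```

Each MD picks its own epsilon, which is exactly how the customizable per-user privacy level of EPCO(1) is realized.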

In this paper, the context space is divided into identical hypercubes with side length . Let denote the subspace of the context space . For ECS and each , EPCO maintains a counter recording the number of times that was selected for contexts belonging to . When a MD needs to perform computing offloading, it first sends a message to the ECSs containing information about the computing task. To protect its privacy, the MD adds Laplace noise to this information in EPCO(1). Upon arrival of each MD's context, the suppliers first identify to which subspace the context belongs. Then, each service supplier calculates the indices for the rewards (line 5 in EPCO(3)), which are given as follows: where estimates the sample mean of the reward for the selection of in subspace . and denote the price objective and reliability objective, respectively. , where denotes the uncertainty of the reward estimate, which is commonly used to trade off exploration and exploitation in online learning [40]. is a random noise obeying the Gaussian distribution. Then, the Upper Confidence Bound (UCB) for is for context in subspace , where denotes the uncertainty due to context partitioning. Its main purpose is to inflate the reward of ECSs that are seldom selected, which encourages exploring potentially more suitable servers rather than only those with high estimated rewards.
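The two computations described above, mapping a context to its hypercube and forming a noisy UCB-style index, can be sketched as follows. This is a hedged illustration: the function names, the exploration-bonus formula, and the Laplace perturbation of the index are our own assumptions standing in for the stripped formulas.

```python
import numpy as np

def subspace_index(context, h):
    """Map a context in [0,1]^d to the hypercube (side 1/h) it lies in."""
    return tuple(min(int(c * h), h - 1) for c in context)

def ucb_index(mean, count, t, epsilon_supplier, rng=None):
    """Hypothetical index: sample-mean reward plus an exploration
    bonus, perturbed with Laplace noise for supplier privacy."""
    if rng is None:
        rng = np.random.default_rng()
    if count == 0:
        return float("inf")  # force exploration of never-selected arms
    bonus = np.sqrt(2.0 * np.log(t + 1) / count)   # exploration term
    noise = rng.laplace(0.0, 1.0 / epsilon_supplier)
    return mean + bonus + noise
```

Arms selected rarely get a large bonus (or an infinite index if never tried), which is the inflation of seldom-selected ECSs described above.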

Input:.
Output:.
1: Receive from the MD;
2: Broadcast to each service supplier;
3: Receive optimal ECSs from all service suppliers;
4: Set ;
5: Send to the MD;
6: Observe the decision of the MD and send it to the service suppliers;

Input:.
Output:.
1: Initialize: ;
2: Receive from the ECM;
3: for do
4: Compute via (4);
5: 
6: if then
7:  Set ;
8: else
9:  Find the candidate optimal set of ECSs via (5);
10:  Set ;
11: end if
12: Send and to the ECM;
13: Receive ;
14: 
15: 
16: 
17: end for

We add Laplace noise to the index function to protect the privacy of ECSs. When , it means that the confidence of is high, and EPCO(3) calculates the candidate set of the optimal ECSs, which is given as follows:

When , EPCO(3) simply sets to improve the confidence of (lines 6-7 in EPCO(3)). Simultaneously, an optimal ECS is selected by the exponential mechanism (lines 12-15 in EPCO(2)). We use to denote the total comparable reward of an ECS, which is given as follows:

where represents a MD's preference and is adjusted according to the actual needs of the MD. For example, if a MD has strict requirements on service payment, the value of is larger, whereas if it has strict requirements on reliability, the value of is relatively small. We select the ECS with the highest total reward and send it to the MD (line 12 in EPCO(3)). Finally, according to the computing offloading decision of the MD (line 7 in EPCO(2)), the service suppliers update the sample mean reward and the counter (lines 14-16 in EPCO(3)).
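The scalarization and selection just described can be sketched as follows. This is a hedged sketch under our own naming and scoring choices: we combine the two objectives with a preference weight alpha and then pick the ECS via the standard exponential mechanism (sensitivity assumed to be 1), so that the selection itself is differentially private.

```python
import numpy as np

def select_ecs_exponential(price_rewards, reliability_rewards,
                           alpha=0.5, epsilon=1.0, rng=None):
    """Hypothetical sketch: weight the two objectives by the MD's
    preference alpha, then sample an ECS with the exponential
    mechanism (utility sensitivity assumed to be 1)."""
    if rng is None:
        rng = np.random.default_rng()
    total = (alpha * np.asarray(price_rewards, dtype=float)
             + (1.0 - alpha) * np.asarray(reliability_rewards, dtype=float))
    scores = epsilon * total / 2.0            # exponential-mechanism scores
    probs = np.exp(scores - scores.max())     # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(total), p=probs))
```

A large epsilon makes the choice concentrate on the highest-reward ECS; a small epsilon randomizes the choice more, hiding which supplier's server was truly best.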

4. Regret Analysis

In this section, we prove that the 2D regret of EPCO is sublinear functions of . The regret is due to selecting suboptimal ECSs from by time .

Let

denote the regret for objective in round . The best fixed ECS is denoted by , , and . Then, the total regret for selecting suboptimal ECSs is

Then, the corresponding expected regret is given as follows:

Let and denote lower and upper bounds for , respectively. Then, and are the lower and upper confidence bounds, respectively. Let

denote the event that the service supplier is not confident about the reward estimate by time with the context in subspace . Then, we partition the regret into the following terms and bound them respectively: where denotes the complement of event and is the maximum difference between the expected reward of the optimal server and that of any other server for objective . We use to denote the server selected by the EPCO(3) algorithm, to denote the optimal server, and to denote the server with the highest index. Next, we bound the terms in Equation (15), starting with .

Lemma 6. For any , we have the following:

Proof. Let denote the random reward of server in objective in round . We know

We define upper and lower bounds of the random reward as follows:

Let

Then, we have the following:

Since when , , so below, we only focus on the case of , which can be expressed as follows:

From the Hölder continuity, we have the following derivation:

Then, integrating the above derivation, we have the following:

Using Equation (23) and Equation (24), the question can be expressed as follows:

Thus, plugging Equation (23) and Equation (24) into Equation (21), we obtain the following:

Using Equation (26), we have the following:

Using the concentration inequality, the right side of the above inequality is bounded as follows:

Using the union bound, we have the following:

Using the result of Lemma 10, and can be bounded as follows:

Lemma 7. Under Assumption 1, and are generated by EPCO(3) algorithm. On event , we have the following:

Proof. There are two cases here. When , we have the following: When , we have the following: According to the above two cases, we obtain the following: On event , we have the following: Combined with Equations (34)–(37), we obtain the following:

Lemma 8. Under Assumption 1, and are generated by EPCO(3) algorithm. On event , we have the following:

Proof. We know that when holds, all servers are in interval . Then, we show that also satisfies this condition. On event , we have the following: By Equations (40)–(42), we obtain the following:

Since the selected server is in , we have . Using this result, we obtain the following:

From Equation (44) and Equation (45), we have the following:

Lemma 9. Under Assumption 1, are generated by EPCO(3) algorithm, and the upper limit of the number of rounds for is as follows:

Proof. Since

Each such event increases the value of by one, so the number of rounds for is bounded by . Summing over all servers gives the final result.

Lemma 10. Under Assumption 1, is generated by EPCO(3) algorithm. On event , we have the following: where

Lemma 11. Under Assumption 1, is generated by EPCO(3) algorithm. On event , we have for all

More detailed proof of Lemma 10 and Lemma 11 is presented in [41] (see Lemma 10 and Lemma 11 in [41]).

Theorem 12. Under Assumption 1, is generated by EPCO(3) algorithm, and we have for any as follows: where

Proof. Combining Equation (12), Lemma 10, and Lemma 11, we have the following:

Theorem 13. Under Assumption 1, is generated by EPCO(3) algorithm, and and satisfy and , and we have the following:

Proof. According to Theorem 12 and Equation (15), is bounded as follows: Then, we obtain the following: When , we have the following: The result shows that not only is the regret of EPCO sublinear, namely , but also that the added differential privacy mechanism does not affect its convergence.

5. Differential Privacy Analysis

The privacy protection mechanism applied in this paper is differential privacy, which was originally introduced by Dwork et al. [21].

Theorem 14. EPCO(1) preserves -differential privacy for MD , where the noise of each MD is independently drawn from the Laplace distribution with density function for and .
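The density function referenced in Theorem 14 is the standard Laplace density used by the Laplace mechanism; a hedged reconstruction, with the per-MD scale written as $b_i = \Delta/\epsilon_i$ (our notation, since the original symbols were stripped), is:

```latex
f(x \mid b_i) \;=\; \frac{1}{2 b_i} \exp\!\left( -\frac{|x|}{b_i} \right),
  \qquad b_i \;=\; \frac{\Delta}{\epsilon_i} .
```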

Proof. We first show that is locally -differentially private for MD . We start by studying the quantity of interest of MD . We use a random variable , for , to denote the output of with input and to denote the output of with input .

According to the definition above, we can rewrite the issue as follows:

We use to denote the tuple . Then, we have the following:

We form this process into a Markov chain where the random vector denotes the Lagrangian , which is presented in [39] (see Theorem 3.1 in [39]). Then, we have the following relationship:

With in Definition 5, we obtain the following:

This result proves the privacy guarantee in Equation (58).

Thus, Theorem 14 proves that our proposed EPCO(1) can guarantee the MD’s privacy and different MDs have different privacy protection levels.

Theorem 15. EPCO(1) preserves -differential privacy for the contextual information of ECSs.

Proof. Let denote the true information of a supplier and denote a dataset that differs from in only one entry. We define the noisy reward as . Then, for different suppliers and , we have the following: Employing Theorem 3.6 in [42] (see page 32 in [42]), Theorem 15 is proved.

Therefore, Theorem 15 shows that a service supplier cannot extract information about other suppliers' ECSs in the ECM from the rewards. In summary, Theorems 14 and 15 prove that EPCO preserves the privacy of MDs and service suppliers simultaneously. In addition, EPCO supports different MDs customizing their privacy protection levels.

6. Simulation Results

In order to verify the efficiency and privacy of our proposed algorithm, we conducted the following simulation experiments with a real-world dataset [43]. We compare EPCO with the P-UCB1, S-UCB1 [44], and MOC-MAB [41] algorithms. We implemented these algorithms in Python 3.6 and ran them on an Acer computer with an Intel(R) Core(TM) i5-4460 @ 3.2 GHz and 8 GB RAM, running Windows 10 Professional. We set , , and to 1. We give the sets and . We let the time horizon and vary across experiments. We take , and the context is chosen randomly from at each round. Each algorithm uses 6 arms and is run 20 times; we report the average result of the simulation experiments.

6.1. Analysis of Regret

Figure 3 shows how the regret of each algorithm in the price and reliability objectives changes over time. The simulation results show that the regret of EPCO remains low for both objectives. However, it is not the lowest: it is slightly higher than that of the MOC-MAB algorithm, because the privacy protection mechanism added to EPCO affects convergence, although this effect is almost negligible.

6.2. Analysis of Rewards

We compare the rewards of these algorithms for the two objectives, and the results are shown in Figure 4. Compared with the other multiobjective optimization algorithms, EPCO performs relatively well. Figure 4(a) shows that EPCO performs well in the price objective, outperforming the other algorithms except MOC-MAB, to which it is slightly inferior. Figure 4(b) shows that EPCO also performs well in the reliability objective. Although our algorithm is highly efficient, the added privacy protection mechanism introduces a small error, which slightly affects the reward for each objective.

6.3. Analysis of Privacy Protection

The effect of different privacy protection levels on the performance of the EPCO algorithm is shown in Figures 5 and 6. We set the privacy parameter to 0.25, 0.5, 0.75, and 1, respectively, representing different levels of privacy protection. Figure 5 shows the relationship between different privacy levels and regret. Obviously, when the value of is larger, the regret of the EPCO algorithm is smaller. This is because as the value of increases, the availability of data increases, resulting in a smaller regret. Similarly, in Figure 6, as the value of increases, the availability of data also increases, resulting in a larger reward for the EPCO algorithm.

6.4. Discussion

The simulation results show that EPCO's regret is at a low level among P-UCB1, S-UCB1, and MOC-MAB, because P-UCB1 and S-UCB1 are not suitable for multiobjective optimization. The regret of MOC-MAB is lower than that of EPCO; this is due to the extra work done by EPCO to protect the privacy of users. With set to 0.25, 0.5, 0.75, and 1.0, respectively, EPCO also produced different regret and reward values. We observed that a larger privacy parameter yields smaller regret but larger reward and larger privacy leakage. This is because a larger has little effect on the computational performance of the algorithm. Although the privacy leakage values differ numerically, this difference is almost impossible to detect in real scenarios; that is, regardless of whether the privacy parameter is set to 0.1 or 0.5, user privacy is almost impossible to leak.

6.5. Lessons Learned

The purpose of our simulation experiments is to verify the performance of EPCO from three perspectives: regret, reward, and privacy. We implemented the EPCO algorithm in Python and evaluated it on real-world datasets. By comparing it with three multiobjective optimization algorithms, namely, P-UCB1, S-UCB1, and MOC-MAB, in terms of regret and reward, we conclude that EPCO performs well on both. Meanwhile, by measuring privacy leakage, we conclude that EPCO can also effectively protect the privacy of users in the system. This is because the differential privacy mechanism integrates well with online learning algorithms. Whether is set to 0.1 or 1.0, there is no trending change in the performance of EPCO. That is to say, in theory, the value of can be adjusted to be as large as possible within a certain range, which not only ensures the privacy of users but also minimizes the impact of the privacy protection mechanism on the computational performance of the algorithm.

7. Conclusion

We have proposed a privacy-protected computing offloading algorithm, EPCO, for two types of users of computing offloading: MDs, whose privacy is protected by a customizable differential privacy mechanism, and ECS suppliers, whose privacy is protected by a common differential privacy mechanism. We proved that our privacy protection mechanisms have no significant impact on the performance of the EPCO algorithm. Simultaneously, an online learning algorithm is introduced to improve computing offloading efficiency, and a detailed theoretical proof of the method is given. The simulation results verify the effectiveness of the EPCO algorithm.

Data Availability

The data that support the findings of this study are found in “More Google cluster data” [43].

Conflicts of Interest

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant Nos. 62002102 and 62176113, in part by the Leading Talents of Science and Technology in the Central Plain of China under Grant No. 224200510004, and in part by the Luoyang Major Scientific and Technological Innovation Projects under Grant No. 2101017A.