Abstract
Future autonomous electric vehicles (EVs) are equipped with numerous IoT sensors, smart devices, and wireless adapters, thus forming an Internet of Vehicles (IoVs). These intelligent EVs are envisioned as a promising solution for improving transportation efficiency, road safety, and driving experience. Vehicular fog computing (VFC) is an evolving technology that allows application-related vehicular tasks to be offloaded to nearby computing nodes and processed quickly. A major challenge in a VFC system is the design of energy-efficient task offloading algorithms. In this paper, we propose an optimal energy-efficient algorithm for task offloading in a VFC system that maximizes an expected reward function derived from the total energy and time delay the system incurs in computing a task. We use parallel computing and formulate the optimization problem as a semi-Markov decision process (SMDP). The Bellman optimality equation is used in the value iteration algorithm (VIA) to obtain an optimal scheme by selecting, for the current state, the action that maximizes the energy-based reward function. Numerical results show that the proposed scheme outperforms the greedy algorithm in terms of energy consumption.
1. Introduction
Recently, autonomous and connected electric vehicles, also known as the Internet of Vehicles (IoVs), have gained extensive attention as a promising technology that brings convenience to society by addressing traffic issues such as accidents [1], congestion, and environmental pollution [2]. By 2035, it is estimated that around 25% of autonomous vehicles will be on the road [3]. Nowadays, a large number of sensors, smart devices, and controllers are deployed in vehicles to support drivers and passengers with autonomous driving, infotainment, and natural language processing [4].
According to an estimate in 2020, every day around 4000 GB of data is produced by the vehicles [5]. Due to these smart sensors and controllers, the IoVs consume an enormous amount of power for the processing of data generated by the smart devices [6, 7]. As the vehicles have limited resources of energy and computational power, the computation of smart applications that cannot be managed locally needs to be offloaded to the helping nodes [8].
With developments in computation and communication technology [9], vehicles can act as both task producer and service provider nodes. This concept brings computation capabilities into the proximity of task producer vehicles with the help of vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication [10].
Vehicular fog computing is a novel technique for computing tasks and uses computational resources of both moving and parked vehicles [11]. The basic concept behind VFC is to install resource units (RUs) on connected vehicles so that these vehicles act as fog nodes and deliver their services of communication and computing, according to the requirements [12]. To enable communication of tasks to the vehicular fog nodes, vehicles are armed with different types of network interface components. Vehicular fog nodes communicate with devices and the Internet via cellular network or IEEE 802.11p [13].
VFC is thus an efficient method for low-latency tasks and smart IoV applications. However, vehicular fog nodes have limited RU computation capability and communication bandwidth; hence, the VFC cannot process all tasks while satisfying their computation latency requirements. To solve this issue, vehicles are also connected to cloud servers in the form of a remote cloud (RC) using cellular communications [14]. Vehicles can transfer tasks to the RC and take advantage of its powerful computational resources. However, the long distance between the RC and the vehicles leads to high latency and high power consumption when tasks are transferred to the RC. To ensure reliable computing services for the EVs, a three-tiered VFC architecture is considered in this paper. This architecture includes task producer devices, vehicular fog nodes, and the remote cloud, as shown in Figure 1 [15].
In this paper, we propose an optimal energy-efficient algorithm for task offloading in a VFC system that minimizes the energy consumption of the system. We propose an energy-based reward function for the considered problem and utilize parallel offloading. A vehicle with a task divides it into several parts and offloads them to neighboring vehicles for processing. One part of the task is computed locally, and the remaining parts are offloaded to cooperative vehicles such that the energy efficiency of the whole system is maximized.
The main contributions of our work are summarized as follows:
(i) We propose a novel reward function that considers the energy of the system for parallel task offloading.
(ii) We formulate the task offloading problem as a semi-Markov decision process (SMDP) that considers (1) the arrival of a task, (2) the departure of a task, (3) the arrival of a vehicle, and (4) the departure of a vehicle.
(iii) We analyze and define the state space, actions, reward, and transition probabilities of the VFC system to obtain the optimal policy, which determines the best task offloading action for each state.
(iv) We use an iterative algorithm to solve the optimal task offloading problem and increase the long-term expected reward in the form of saved energy and time.
(v) We compare the proposed technique with a greedy algorithm and show significant performance gains.
The rest of the paper is organized as follows. Section 2 provides the related literature review. In Section 3, we describe the system model of the VFC system. In Section 4, we formulate the problem as an SMDP optimization problem. The solution to the optimization problem is given in Section 5 as an iterative algorithm. Section 6 presents the numerical results and analysis of the performance. Finally, in Section 7, conclusion and future work are presented.
2. Related Work
In recent years, a few works have investigated the task offloading problem in vehicular fog computing. Wu et al. [16] proposed a task offloading policy for VFC that uses the IEEE 802.11p protocol for the transmission of tasks. Tasks are classified by priority according to their delay requirements. The problem is formulated as an SMDP, and a task offloading scheme is presented that maximizes the long-term reward in the form of reduced processing time for prioritized tasks. An iterative algorithm is used to solve this problem.
In [17], the authors improve the efficiency of application-aware offloading by proposing a VFC system in which public vehicles such as buses are used as fog servers. A priority queuing system is applied to model the VFC for application-aware delay requirements. The problem is formulated as an SMDP, and an application-aware task offloading policy is proposed to obtain the maximum long-term reward of the VFC model.
In [18], a novel offloading scheme has been proposed to minimize the cost of energy consumption, failure in offloading, and service latency of the VFC network. At first, the overloaded cloudlet node has been determined, and then, an offloading policy has been introduced to determine which task will be offloaded and for the selection of vehicular node to place the offloaded task.
A game-theoretic approach can be used to overcome the demand-resource mismatch by minimizing energy usage and reducing the response time. The proposed resource allocation model outperformed state-of-the-art VFC models, improving performance for the vehicles while reducing energy consumption [19].
A distributed scheduling scheme for minimizing the energy consumed by computing has been presented in [20]. The proposed model maximizes the efficiency of the system while maintaining the quality of service. In [21], the authors show that the long-term reward of the VFC system can be increased by using an optimal task offloading strategy that accounts for the computation and transmission delays and the available RUs. The problem is formulated as an SMDP and solved with an iterative algorithm.
The work in [22] proposed a VFC concept in which electric vehicles are used as fog nodes to save energy for mobile devices. The authors formulated the resource allocation problem as a Markov decision process and solved it with dynamic programming. In [23], the authors proposed an efficient incentive mechanism based on a contract-matching approach for task offloading and computational resource allocation in the VFC system. With this approach, the base stations offload tasks to nearby vehicles and minimize task delays by using the underutilized resources of the vehicles.
The work in [24] proposed a deep dynamic reinforcement learning algorithm by exploiting the Markov decision process. The authors used reinforcement learning to obtain an offloading decision that minimizes the cost of the VFC system, which consists of energy consumption and service delay. In [25], Liu et al. presented a three-layered architecture for service offloading in the VFC system consisting of vehicular fog, fog server, and central cloud. They formulated a probabilistic task offloading problem to minimize the energy consumption payment cost and the execution delay and solved it by an iterative coordination process.
In [26], a task offloading policy for the VFC system is proposed that considers the task priority, vehicle mobility, and the availability of service by the vehicles. The priority-based task offloading policy was formulated as an MDP and solved by a soft actor-critic-based deep reinforcement learning algorithm that maximizes both the entropy of the policy and the reward.
To realize the benefits of vehicular fog computing (VFC), the authors in [27] presented a three-layer VFC model that minimizes the response time of the vehicles. The problem was formulated as a real-time optimization problem for the effective management of decentralized traffic.
Different from the existing literature on task offloading schemes in VFC systems, we develop an energy-efficient task offloading scheme that maximizes the long-term reward by using parallel computing: one part of the task request is computed locally, and the remaining parts are offloaded to the vehicular fog nodes.
We use the vehicular fog computing model in [21] as the basis of our work; however, there are two major differences from that work. The first is that we consider local computing at the task-generating vehicle as well as remote cloud computing, whereas [21] only considers resources from other cooperative vehicles. The second is that our work focuses on energy efficiency, and hence, we propose a new reward function that considers the energy consumption of the vehicular computing node.
3. System Model
In this section, we present the system model, inspired by [21] and shown in Figure 2. VFC is a recent computing paradigm that supports computation-intensive and time-critical applications by offering computing resources for their processing. Every vehicle in the VFC has a processor with an RU and is also a source of tasks; i.e., the vehicles can offload their computation tasks to each other. Since the arrival and departure of vehicles are random, the computation resources in the VFC system change randomly. We assume that all vehicles have the same virtualized RUs and that they are aware of the available RUs in the system through real-time communication with other vehicles. When a task request arrives at the system, the system must decide whether to accept the request or transfer it to the remote cloud (RC), according to the availability of resources. If the request is accepted, the VFC system decides how many RUs to allocate according to the available RUs.
For illustration, consider the example presented in Figure 2. A vehicle generates a task, which is accepted by the VFC system because sufficient resources are available. The task is then divided into four equal subtasks: one part is computed by the requesting vehicle itself, and the remaining three parts are offloaded to three RUs on other vehicles. If there are no available RUs, the task is transferred to the RC for computation. After the computation, the result is fed back to the requesting vehicle. Vehicles arrive at and depart from the system according to Poisson processes, each with its own rate. The VFC system can handle a fixed maximum number of vehicles, and the number of available RUs fluctuates with the arrival and departure of vehicles; the number of RUs in the VFC system cannot surpass the maximum number of vehicles. Task arrivals and task service also follow Poisson processes, each with its own rate. When only one RU processes the task, the computing service rate equals the service rate of a single RU.
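The task splitting in this example can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `split_task` and `parallel_service_time` are hypothetical, and the sketch assumes that all RUs work at the same rate and that the parts are processed fully in parallel.

```python
def split_task(task_size: float, offloaded_rus: int) -> list[float]:
    """Split a task into equal parts: one local part plus one per offloaded RU."""
    if offloaded_rus < 0:
        raise ValueError("offloaded_rus must be non-negative")
    parts = offloaded_rus + 1  # the requesting vehicle computes one part itself
    return [task_size / parts] * parts


def parallel_service_time(task_size: float, offloaded_rus: int, ru_rate: float) -> float:
    """Completion time when all parts run in parallel (assumption: identical RUs)."""
    part_size = split_task(task_size, offloaded_rus)[0]
    return part_size / ru_rate


# Example from the text: three offloaded RUs give four equal subtasks.
subtasks = split_task(8.0, 3)
```

With zero offloaded RUs the whole task stays on the requesting vehicle, which corresponds to purely local computing.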
4. Problem Formulation
In this section, we formulate the task offloading problem as an SMDP. The number of available RUs in the VFC system changes with the arrival and departure of tasks and vehicles. When a task request from a vehicle arrives at the system, the system either allocates a number of RUs to it or transfers it to the RC for processing. As a result of the offloading decision, the system obtains a reward that depends on the energy saved and the computation time needed to process the task.
In this SMDP model, the state consists of the numbers of available and occupied RUs together with the current event. An action indicates the decision taken in a given state. The reward reflects the benefit to the system, in terms of energy and time, of taking a given action in a given state. The transition probabilities describe the probability of moving from one state to another under each action. The notations used in this section are summarized in Table 1.
4.1. States
The system state indicates the resources present in the VFC system in the form of RUs, the number of requests being processed by each possible number of RUs, and the current event concerning requests and vehicles [21]. A state therefore has three components: the number of RUs in the system, the number of task requests currently served by each possible number of RUs, and a particular event from the event set.
The event set contains the arrival of a task, the departure of a task that was served and completed by a given number of RUs (up to the maximum number of RUs the VFC system can assign to one task), the arrival of a vehicle, and the departure of a vehicle. The maximum number of vehicles must be no smaller than the number of RUs in the system. The number of RUs currently allocated to tasks is obtained by summing, over every allocation size, the allocation size multiplied by the number of requests served with that many RUs; this total can never exceed the number of RUs in the system. The remaining number of available RUs is the difference between the two.
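To make the state definition concrete, the following sketch enumerates a small state space. It is an illustration under stated assumptions, not the paper's code: the tuple layout `(total_rus, counts, event)`, the event labels `"A"`, `"D1"`, `"V+"`, and `"V-"`, and the rule that an event is only listed when it can actually occur are all choices made here for readability.

```python
from itertools import product


def enumerate_states(max_vehicles: int, max_rus_per_task: int):
    """List feasible states (total_rus, counts, event).

    counts[i-1] is the number of tasks currently served by i RUs.
    Event labels are hypothetical: "A" task arrival, "D<i>" departure of a
    task served by i RUs, "V+" vehicle arrival, "V-" vehicle departure.
    """
    states = []
    for total_rus in range(max_vehicles + 1):
        for counts in product(range(total_rus + 1), repeat=max_rus_per_task):
            occupied = sum(i * n for i, n in enumerate(counts, start=1))
            if occupied > total_rus:
                continue  # allocated RUs can never exceed the RUs in the system
            events = ["A"]
            # A task served by i RUs can only depart if one is in service.
            events += [f"D{i}" for i, n in enumerate(counts, start=1) if n > 0]
            if total_rus < max_vehicles:
                events.append("V+")
            if total_rus > 0:
                events.append("V-")
            states.extend((total_rus, counts, event) for event in events)
    return states


small_space = enumerate_states(max_vehicles=1, max_rus_per_task=1)
```

Even for modest limits the state space grows quickly, which is one reason an iterative solver is used rather than an explicit closed-form solution.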
4.2. Actions
An action in the VFC system indicates the decision that the system can take in response to the specific event of the current state [21]. The action taken in a state belongs to an action set with three kinds of elements. The first applies when a task is completed and departs the system, or when a vehicle arrives or departs: no RUs are allocated, and the VFC system only updates its information about the available RUs. When a task request arrives, there are two options: the request is either accepted or transferred to the RC. The second kind of action transfers an arriving task to the RC when there are no available RUs in the system, and the third accepts the task and allocates a given number of RUs to process it. Allocation and transfer actions are therefore available only on task arrival, and every other event leads to the no-allocation action.
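The mapping from events to admissible actions can be sketched as follows. The numeric encoding is an assumption made for this example (-1 for no allocation, 0 for transfer to the RC, and a positive integer for the number of RUs allocated), as is the choice to keep the RC transfer available on every task arrival so that an optimizer can weigh it against accepting the task.

```python
def feasible_actions(total_rus: int, counts: tuple, event: str,
                     max_rus_per_task: int) -> list[int]:
    """Return the admissible actions for a state (hypothetical encoding).

    -1: no allocation, 0: transfer to the remote cloud, i >= 1: allocate i RUs.
    """
    if event != "A":
        # Departures and vehicle arrivals only update the RU bookkeeping.
        return [-1]
    occupied = sum(i * n for i, n in enumerate(counts, start=1))
    free = total_rus - occupied
    # Assumption: the RC transfer stays available even when RUs are free.
    return [0] + list(range(1, min(free, max_rus_per_task) + 1))
```

For instance, with five RUs of which three are occupied, a task arrival admits a transfer to the RC or an allocation of one or two RUs.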
4.3. Rewards
The reward reflects the benefit to the VFC system of the actions taken in the various states. As the main purpose of the system is to reduce the energy consumption and execution time of tasks by saving power and increasing the processing speed, the reward comprises both the total income and the cost of the system [21]. When an action is performed in a specific state, the system earns an instant income.
The system remains in that state for a certain time until the next event occurs, at which point it transitions to the next state. The expense incurred over this time is the cost of the system. The reward is the difference between the income and the cost.
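In symbols, this relationship can be written as below. The letters are assumptions chosen for this sketch, since the original notation is not recoverable here: $s$ is the state, $a$ the action, $k(s,a)$ the instant income, $g(s,a)$ the expected discounted cost until the next event, and $r(s,a)$ the reward.

```latex
\[
  r(s,a) \;=\; k(s,a) \;-\; g(s,a)
\]
```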
The income and cost of the system are derived below.
4.3.1. Income
The income of the system depends on the event and the action, because the state changes when events occur. The income of the VFC system is defined case by case as follows.
(1) Task arrival with RUs allocated. When a task request arrives and the resources are sufficient, the system accepts it and assigns RUs to complete it; one part is computed locally by the requesting vehicle so that no energy is wasted in idling. The instant income earned by the VFC system combines the energy saved and the time saved during the processing of the task in the VFC system. The saved energy and the saved time are weighted according to the purpose at hand, with predefined weights, and each is multiplied by a price per unit of saved energy or time that converts it into revenue. The cost of transferring the task to the VFC system and receiving the result from it, known as the transfer expense, is deducted. The energy consumed depends on the computation power, the transmitting power, and the number of RUs assigned to the task request; one more RU than the number assigned processes the task, since the requesting vehicle also processes one part. The total service time for processing the task likewise depends on the number of RUs assigned.
(2) Task arrival with transfer to the RC. When a task request arrives and there are not sufficient resources in the VFC system, the request is not accepted by the system and is transferred to the RC for processing. The immediate reward earned by the system in this case consists of transfer expenses, including the cost of transferring the task to the RC and receiving the result from the cloud. The remote cloud has a very large computation capability, so its computation energy and time are not considered and do not affect the energy of the VFC system. However, the delay is very large, so transferring the task to the RC is generally not a wise decision.
(3) Vehicle arrival or task departure. For the arrival of a vehicle and the departure of a task, there is no income, as the system takes no action.
(4) Vehicle departure with free RUs. The system does not gain any reward when a vehicle departs the system and there are still enough RUs to be allocated.
(5) Vehicle departure with all RUs occupied. When a vehicle departs the VFC system while all the RUs are already occupied with processing tasks, the departure disturbs the processing, and the system has to pay a penalty.
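The five income cases can be collected into a single function, sketched below. Everything about its interface is hypothetical: the parameter names, the default weights, and the event and action encoding are assumptions made for illustration, and the saved energy and time are passed in as inputs because their exact formulas are not reproduced here. In particular, treating the RC case as a pure transfer expense is a simplification of this sketch.

```python
def income(event: str, action: int, *,
           energy_saved: float = 0.0, time_saved: float = 0.0,
           w_energy: float = 0.5, w_time: float = 0.5,
           price_energy: float = 1.0, price_time: float = 1.0,
           transfer_cost_vfc: float = 0.0, transfer_cost_rc: float = 0.0,
           free_rus: int = 1, penalty: float = 0.0) -> float:
    """Instant income for one (event, action) pair (illustrative sketch).

    Events: "A" task arrival, "D<i>" task departure, "V+"/"V-" vehicle
    arrival/departure. Actions: -1 none, 0 remote cloud, i >= 1 allocate i RUs.
    """
    if event == "A" and action >= 1:
        # Case 1: weighted, priced savings minus the transfer expense.
        return (w_energy * price_energy * energy_saved
                + w_time * price_time * time_saved
                - transfer_cost_vfc)
    if event == "A" and action == 0:
        # Case 2: only the expense of using the remote cloud (assumption).
        return -transfer_cost_rc
    if event == "V-" and free_rus == 0:
        # Case 5: a departing vehicle interrupts a task in progress.
        return -penalty
    # Cases 3 and 4: no action is taken, so there is no income.
    return 0.0
```

Keeping the weights and prices as separate parameters makes it easy to shift the emphasis between energy and delay without touching the case logic.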
4.3.2. Cost
To formulate the long-term cost, the discounted cost model from [28] is used. The expected discounted cost of the system is accumulated over the period in which the system, having taken an action, transitions from the current state to the next state.
It depends on the expected service time between taking the action in the current state and the transition to the next state; this time is assumed to be exponentially distributed according to [28].
It also depends on the cost rate of the expected service time for the given state and action, which is characterized as a function of the total number of occupied RUs.
The remaining quantities are the discount factor and the expected event rate of the system for the given state and action, which is calculated by adding all the rates of arrival and departure of vehicles and task requests. The arrival and departure rates of vehicles are fixed, while the arrival rate of task requests and the departure rate of tasks depend on the events and actions of the system, as described below.
(1) Task arrival. With the arrival of a task request in the VFC system, the system assigns RUs for the task processing. The task arrival rate enters the event rate, and the rate of departure for the new task request is determined by the number of RUs allocated under the chosen action.
(2) Task departure. The system takes no action when a task that was allocated a given number of RUs departs from the system. The event rate comprises the task request arrival rate and the departure rate of the tasks that remain in service.
(3) Vehicle departure. When a vehicle leaves the VFC system, no action is taken by the system, but the number of available vehicles decreases by one; the task request arrival rate and the departure rate of tasks are adjusted accordingly.
(4) Vehicle arrival. The system takes no action when a vehicle arrives, but the number of available vehicles increases by one; the task request arrival rate and the departure rate of tasks are adjusted accordingly.
The expected event rate for each event and action is obtained by summing the rates described above.
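A sketch of how the event rate and the discounted cost might be computed is given below. It rests on assumptions that should be checked against the original equations: a task served by i RUs is taken to depart at i times the single-RU service rate, the cost rate is taken to equal the number of occupied RUs after the action, boundary effects at the vehicle limits are ignored, and the closed form c/(alpha + sigma) is the standard expected discounted cost over an exponentially distributed sojourn time. All function and parameter names are hypothetical.

```python
def busy_after(counts: tuple, event: str, action: int) -> int:
    """Occupied RUs once the event and action have taken effect."""
    busy = sum(i * n for i, n in enumerate(counts, start=1))
    if event == "A" and action >= 1:
        busy += action            # the new task occupies `action` RUs
    elif event.startswith("D"):
        busy -= int(event[1:])    # the finished task releases its RUs
    return busy


def event_rate(counts, event, action, lam_p, lam_v, mu_v, mu_p) -> float:
    """Total rate of all events that can occur next (assumed model)."""
    return lam_p + lam_v + mu_v + busy_after(counts, event, action) * mu_p


def discounted_cost(counts, event, action, alpha,
                    lam_p, lam_v, mu_v, mu_p) -> float:
    """Expected discounted cost until the next event: c / (alpha + sigma)."""
    cost_rate = busy_after(counts, event, action)  # assumption: one unit per busy RU
    sigma = event_rate(counts, event, action, lam_p, lam_v, mu_v, mu_p)
    return cost_rate / (alpha + sigma)
```

The denominator shows why a larger discount factor or a busier system shortens the effective horizon over which the occupancy cost accumulates.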
4.4. Transition Probability
The transition probability is the probability of going from the current state to a future state after an action has been taken [21]. We use the generalized SMDP transition probabilities of [21]; however, the proposed work has an additional state for remote cloud computing.
The transition probability in the VFC system is given by the ratio of the rate of the next event to the sum of the rates of all events. It is formulated separately for each of the five event and action cases.
The state transition diagram is shown in Figure 3, which illustrates how the current state transitions to the possible next states under different actions and events, together with the corresponding transition probabilities [21].
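The ratio rule for transition probabilities can be sketched as follows. The rate model is an assumption made for this example (a task served by i RUs departs at i times the single-RU rate, and vehicle arrivals and departures are switched off at the limits of the system), and the names and event labels are hypothetical. The `counts` argument describes the tasks in service after the current action has been applied.

```python
def next_event_probabilities(total_rus: int, counts: tuple, max_vehicles: int,
                             lam_p: float, lam_v: float,
                             mu_v: float, mu_p: float) -> dict:
    """Probability of each possible next event: its rate over the total rate."""
    rates = {"A": lam_p}                      # task arrival
    if total_rus < max_vehicles:
        rates["V+"] = lam_v                   # room for another vehicle
    if total_rus > 0:
        rates["V-"] = mu_v                    # some vehicle can leave
    for i, n in enumerate(counts, start=1):
        if n > 0:
            rates[f"D{i}"] = i * n * mu_p     # departures of tasks on i RUs
    total = sum(rates.values())
    return {event: rate / total for event, rate in rates.items()}


probs = next_event_probabilities(2, (1, 0), 4,
                                 lam_p=4.0, lam_v=1.0, mu_v=1.0, mu_p=2.0)
```

Because every probability shares the same denominator, the values for one state and action always sum to one.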
5. Solution
This section presents the solution of the above SMDP problem, which maximizes the long-term reward of the VFC system and thereby saves energy and minimizes the processing time. To solve the problem, the value iteration algorithm is adopted, which is based on the Bellman optimality equation [21, 29]. In every iteration, the maximum value function over all actions is calculated for every state. When the value function of every state has converged, the iteration is terminated. The Bellman optimality equation expresses the value of a state as the maximum, over the available actions, of the reward plus the discounted expected value of the next state.
The discounting in this equation is given by a discount function.

To transform the continuous-time semi-Markov decision process into a discrete-time process, a new parameter is defined to normalize the reward, the discount factor, and the transition probabilities.
After normalization, the Bellman optimality equation keeps the same form, with the normalized reward, discount factor, and transition probabilities in place of the original ones.
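For orientation, the equations of this section typically take the following form in SMDP models of this kind. This is a sketch of the standard uniformization technique, not a transcription of the paper's numbered equations, and every symbol is an assumption: $r(s,a)$ is the reward, $p(j \mid s,a)$ the transition probability, $\sigma(s,a)$ the expected event rate, $\alpha$ the discount factor, and $y$ a normalization constant chosen so that $y \ge \sigma(s,a)$ for all states and actions.

```latex
% Bellman optimality equation with a state- and action-dependent discount function
\[
  v(s) = \max_{a}\Big\{ r(s,a) + \lambda(s,a) \sum_{j} p(j \mid s,a)\, v(j) \Big\},
  \qquad
  \lambda(s,a) = \frac{\sigma(s,a)}{\alpha + \sigma(s,a)}
\]

% Normalized (uniformized) quantities
\[
  \tilde{r}(s,a) = r(s,a)\,\frac{\alpha + \sigma(s,a)}{\alpha + y},
  \qquad
  \tilde{\lambda} = \frac{y}{y + \alpha}
\]
\[
  \tilde{p}(j \mid s,a) =
  \begin{cases}
    1 - \dfrac{\big(1 - p(s \mid s,a)\big)\,\sigma(s,a)}{y}, & j = s \\[2ex]
    \dfrac{p(j \mid s,a)\,\sigma(s,a)}{y}, & j \neq s
  \end{cases}
\]

% Normalized Bellman optimality equation
\[
  \tilde{v}(s) = \max_{a}\Big\{ \tilde{r}(s,a) + \tilde{\lambda} \sum_{j} \tilde{p}(j \mid s,a)\, \tilde{v}(j) \Big\}
\]
```

The normalization replaces the state-dependent discount function with a single constant, which is what allows an ordinary discrete-time value iteration to be applied.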
Initially, the value function of all states is set to zero. Then, by Equation (18), the normalized value function of each state is calculated using the values from the previous iteration; that is, the maximum value function in a given iteration is computed from the value function of the preceding iteration. To find the optimal policy, the absolute difference between two succeeding iterations is computed for each state. The algorithm stops when the maximum absolute difference is lower than the threshold, at which point the optimal scheme is obtained.
If the maximum absolute difference is still greater than the threshold, the algorithm proceeds to the next iteration and continues until the optimal offloading scheme is obtained. The pseudocode of the VIA is presented in Algorithm 1.
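The iteration described above can be sketched as a generic value iteration routine. This is a minimal illustration on a hypothetical two-state problem, not a reproduction of Algorithm 1: the dictionary layout, the function names, and the toy rewards and transitions are all assumptions, and the inputs are taken to be already normalized so that a single constant discount applies.

```python
def q_value(state, action, values, rewards, transitions, discount):
    """Reward plus discounted expected value of the next state."""
    expected_next = sum(p * values[nxt]
                        for nxt, p in transitions[state][action].items())
    return rewards[state][action] + discount * expected_next


def value_iteration(rewards, transitions, discount,
                    threshold=1e-9, max_iters=10_000):
    """Iterate the Bellman update until successive values differ by < threshold."""
    values = {s: 0.0 for s in rewards}           # start from zero everywhere
    for _ in range(max_iters):
        new_values = {
            s: max(q_value(s, a, values, rewards, transitions, discount)
                   for a in rewards[s])
            for s in rewards
        }
        delta = max(abs(new_values[s] - values[s]) for s in rewards)
        values = new_values
        if delta < threshold:                    # stopping rule from the text
            break
    policy = {
        s: max(rewards[s],
               key=lambda a, s=s: q_value(s, a, values, rewards,
                                          transitions, discount))
        for s in rewards
    }
    return values, policy


# Hypothetical toy problem: state 1 pays more, so state 0 should move there.
rewards = {0: {"stay": 0.0, "go": 1.0}, 1: {"stay": 2.0}}
transitions = {0: {"stay": {0: 1.0}, "go": {1: 1.0}}, 1: {"stay": {1: 1.0}}}
values, policy = value_iteration(rewards, transitions, discount=0.5)
```

In the toy problem the fixed point can be checked by hand: state 1 is worth 2 / (1 - 0.5) = 4, and state 0 is worth 1 + 0.5 × 4 = 3 under the action `"go"`.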
6. Simulation Results
In this section, the performance of the proposed task offloading scheme is evaluated experimentally. We compare the proposed scheme with the greedy algorithm (GA), which always tries to assign the maximum number of RUs to offload the task for processing [28]. We used MATLAB R2019b for the experiments.
The parameters used in the evaluation are listed in Table 2. The maximum number of RUs that can be assigned to a task for processing is three; i.e., 1, 2, or 3 RUs can be assigned to a task request according to the availability of resources in the VFC system.
Action 1, action 2, and action 3 denote the number of RUs assigned to a task, while action 0 is the special scenario of sending the service request to the RC. For the evaluation, different parameters are varied, for example, the maximum number of vehicles in the VFC system, the arrival rate of the task requests, and the service rate.
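The greedy baseline used for comparison can be expressed in a couple of lines. The sketch below is an assumption about how such a baseline might be coded, with a hypothetical function name and the same action encoding as the text (0 for the RC, a positive integer for the number of RUs assigned).

```python
def greedy_action(free_rus: int, max_rus_per_task: int = 3) -> int:
    """Assign as many RUs as possible; fall back to the remote cloud (0)."""
    if free_rus <= 0:
        return 0  # nothing is free, so the request goes to the RC
    return min(free_rus, max_rus_per_task)
```

Unlike the SMDP policy, this rule looks only at the current number of free RUs and ignores how the allocation affects future requests.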
Figure 4 illustrates the relationship between the maximum number of vehicles supported by the VFC system and the transition probabilities of the different actions taken by the system, for a fixed request arrival rate and service rate. The figure shows that when the number of vehicles in the VFC system is low, the transition probabilities of the actions that allocate more RUs are lower than those of the actions that allocate fewer. This is because, with fewer vehicles, fewer RUs are available and the VFC system allocates a smaller number of RUs to each task request. As the number of vehicles in the system increases, the transition probabilities of actions 0, 1, and 2 decrease and that of action 3 increases. The system then mostly assigns three RUs to arriving requests to enhance the expected long-term reward.
Figure 5 compares the proposed SMDP-based algorithm with the GA scheme for a fixed task arrival rate and service rate. The expected reward of the VFC system increases with the maximum number of vehicles, since more vehicles allow more tasks to be completed and the reward increases as a result. As the figure shows, the SMDP-based scheme achieves a 19% higher reward than the GA. This is because the GA tries to allocate the maximum number of RUs without taking into account the expected reward of the VFC system.
Figure 6 depicts the relationship between the task arrival rate and the transition probabilities of the actions taken by the system, for a fixed maximum number of vehicles and service rate. The figure shows that the transition probability of action 3 is higher than that of all other actions when the task arrival rate is low, because there is abundant computation capability in the VFC. The system tries to assign the largest number of RUs to the requested task to increase the long-term expected reward. As the task arrival rate increases, the transition probabilities of the more conservative actions start increasing while that of action 3 decreases, because the VFC system starts taking conservative decisions according to the available resources; by assigning fewer RUs per task, it completes more tasks without transferring requests to the RC and so obtains the maximum reward. The transition probability of action 2 behaves non-monotonically: it first increases, then decreases, and finally increases again, in accordance with the optimal policy for improving the expected reward.
Figure 7 shows the relationship between the task arrival rate and the long-term reward of the SMDP-based scheme and compares it with the GA, for a fixed maximum number of vehicles and service rate. The long-term expected reward increases with the task arrival rate, because a larger number of task requests leads to more completed tasks and, as a result, a higher long-term expected reward. Moreover, the figure shows that the proposed method outperforms the GA.
Figure 8 compares the long-term reward of the system when the service rate changes from 16 to 8. The figure shows that the expected reward for a service rate of 8 is lower than that for a service rate of 16, because fewer tasks are computed when the service rate is low. Moreover, when the service rate is high, RUs process tasks faster; as a result, the number of available RUs increases, more tasks can be offloaded, and more reward is gained.
Figure 9 compares the proposed SMDP-based algorithm with the GA scheme under a further setting of the arrival rate and service rate. The proposed scheme again outperforms the GA and shows a trend similar to that in Figures 5 and 7.
In this section, we have presented the performance of our work with the help of different experiments. The proposed algorithm performs better than the GA and gains more reward under all tested parameters, i.e., for varying maximum numbers of vehicles in the VFC system, different service rates, and different task request rates.
7. Conclusion and Future Work
In this paper, we propose an optimal energy-aware task offloading technique for the Internet of Vehicles. When a vehicle with a task request arrives, the system decides on the allocation of computational resources, i.e., RUs, and divides the task according to the decision. We use parallel computing to save energy and minimize the time delay of the VFC system and formulate the problem as an infinite-horizon SMDP. The Bellman optimality equation is used in the value iteration algorithm to obtain an optimal policy that maximizes the long-term expected reward, expressed as saved energy and time. The numerical results show that the proposed scheme improves on the greedy algorithm. In the future, we aim to consider the mobility of the vehicles and dynamic wireless connectivity in task offloading.
Data Availability
Data is available from the corresponding author on request.
Conflicts of Interest
The authors declare no conflicts of interest.
Acknowledgments
The authors extend their appreciation to the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University for funding this work through Research Group no. RG210706.