Abstract

In this work, we propose a channel allocation and power control algorithm for energy harvesting (EH) device-to-device (D2D) communication based on nonorthogonal multiple access (NOMA). The algorithm considers users’ quality of service (QoS) and the energy causality constraint to maximize the total capacity of D2D groups. The optimal offline allocation of channel and power is obtained first. Then, the offline optimization results are used as the training dataset for a neural network that models the optimal transmission power, and an online power allocation optimization algorithm is built on this model. Simulation results show that the offline algorithm improves the total capacity of D2D groups and that the performance of the online algorithm is close to that of the offline algorithm.

1. Introduction

Device-to-device (D2D) communications can establish direct communication links between adjacent users without passing through the base station (BS) or other core networks [1]. This reduces the traffic load of the BS and the transmission power of D2D users [2]. Nonorthogonal multiple access (NOMA) technology allows a transmitter to send multiple signals at the same frequency through power superposition, which can improve spectrum efficiency. Combining D2D communication with NOMA technology therefore suits future network deployments and allows more users to connect to the network.

D2D communication based on NOMA has recently attracted researchers’ attention. In [3], the authors analyze the ergodic capacity of a system in which D2D users communicate while acting as relays that forward the information of the BS, and NOMA technology is adopted in both phases of the information transmission. In [4], the BS sends information to multiple cellular users through NOMA technology, and the total rate of D2D users is maximized under the minimum rate requirement of cellular users. The authors of [5] analyze the rate of D2D users when NOMA technology is used to transmit information and propose a channel allocation algorithm to maximize the total rate of the system. A NOMA-enhanced D2D communication system is considered in [6], where the subchannel and power allocation is optimized to maximize the system sum rate. These papers do not consider the energy supply of D2D users and implicitly assume that the energy of the D2D transmitter (DT) is infinite. In practice, the energy of DTs is limited or the DTs need to be recharged, so the assumption of infinite energy is not always realistic [7].

Energy harvesting (EH) technology can solve the energy supply problem of low-energy-consumption users and realize green communication [8]. Users can harvest energy by wireless power transfer or from the surrounding environment. The authors of [9–13] investigate the resource allocation issue when D2D transmitters harvest energy by wireless power transfer. However, when the users are far from the power station, wireless power transfer wastes a large amount of energy because of the path loss.

In [14–17], the D2D users harvest energy from the surrounding environment. The authors of [14] utilize the Pareto-optimal boundary and a subcarrier allocation method to allocate power and channels for D2D users under the energy causality constraint and the battery overflow constraint. The many-to-many matching problem between cellular users and D2D users is investigated in [15], where the transmission power of D2D users and the transmission time are optimized under the energy harvesting constraint. In [16], the authors propose a low-complexity energy-aware space matching method to solve the channel and power allocation problems. In [17], an iterative algorithm based on Dinkelbach and Lagrangian constrained optimization is proposed to maximize the average energy efficiency of D2D users. However, the authors of [14–17] assume that the full system information is available before the transmission process and propose offline algorithms. In practice, there is no prior knowledge of the harvested energy and the system information. The authors of [18] assume that the harvested energy is causally known and propose an offline joint power control and channel allocation algorithm, with the online transmission power allocated by dynamic programming. However, because of the high complexity of the online algorithm in [18], the online transmission power cannot be obtained instantaneously when the harvested energy changes.

In order to quickly determine the online transmission power, this paper uses a neural network for online power allocation. A neural network is a nonlinear, adaptive information processing system composed of a large number of interconnected processing units, and it can find optimal solutions at a high speed. We assume that the DTs harvest energy from the surroundings and send the superposed signal to multiple D2D receivers (DRs) with NOMA technology by reusing the downlink channels of cellular users. The main contributions of this paper are as follows:
(1) In order to optimize the online transmission power of DTs, the offline transmission power is first optimized based on Lagrange constrained optimization by considering the QoS of users and the energy causality constraint, and an offline channel allocation and power control algorithm is proposed.
(2) An online power allocation optimization algorithm is proposed with causal system information. The optimal transmission power of the offline power allocation algorithm and the system parameters affecting the transmission power are taken as the training data to train the neural network. The optimal online transmission power can then be obtained from the neural network model.
(3) The simulation results demonstrate that the proposed resource allocation algorithm improves the total capacity of D2D groups compared with the algorithm in [18]. In addition, the performance of the online power allocation optimization algorithm is close to that of the offline power allocation optimization algorithm.

The rest of this paper is organized as follows. The system model and EH model are described in Section 2. Then, in Section 3, we establish the optimization problem model and solve the optimization problem. We also propose the offline channel allocation and power control algorithm, as well as the online power allocation optimization algorithm. Simulation results are provided and analyzed in Section 4 to show the performance of the proposed algorithm. Finally, conclusions are drawn in Section 5.

2. System Model

We consider a hybrid single-cell scenario, as illustrated in Figure 1, which contains a BS located in the center of the cell, M cellular users, and N D2D groups. Each D2D group includes one DT and two DRs. The receivers in each D2D group are randomly distributed within a circle of radius Rd centered on the corresponding transmitter. We use CU = {CU1, CU2, …, CUM} and D = {D1, D2, …, DN} to indicate the sets of cellular users and D2D groups, where CUm and Dn represent cellular user m and D2D group n, respectively. The cellular users occupy orthogonal downlink channels for traditional cellular communication with the BS. Different from traditional D2D communication, the DT in each D2D group can use the NOMA transmission mechanism to send messages to both DRs at the same time. Each D2D group multiplexes the channel of one cellular user for communication, and each channel can be multiplexed by at most one D2D group.

We assume that the DTs harvest energy from the surrounding environment at fixed time intervals, so the energy arrival instants are equally spaced, and that the harvested energy is uniformly distributed in [0, Emax]. The time interval between two consecutive energy arrival instants is termed an epoch [18]. The energy arrives instantaneously at the beginning of an epoch and is stored in the user’s battery for later information transmission. The capacity of the battery is much larger than the harvested energy, so the capacity limitation of the battery is not taken into account. The storage and retrieval of energy from the battery are assumed to be lossless [14]. If the energy arrives K times in the total transmission time T, the number of epochs is K.

In order to distinguish the two receivers in the same D2D group, we assume hn,1 < hn,2, where hn,1 (hn,2) is the channel gain between the 1st (2nd) receiver and the transmitter in Dn. According to the NOMA protocol, the transmission power for the 1st receiver is larger than the transmission power for the 2nd receiver. The receivers use successive interference cancellation (SIC) to detect signals. The basic principle of SIC is to remove the signals in decreasing order of signal power. That is, the 1st receiver directly decodes its own signal sn,1 from the received superposed signal, whereas the 2nd receiver first decodes the signal of the 1st receiver, removes it, and then decodes its own signal sn,2 [19].

When Dn multiplexes the channel of CUm, the signal-to-interference-plus-noise-ratio (SINR) of CUm in epoch k is given by

The SINR of the weaker receiver 1 in Dn in epoch k is as follows:

The SINR of the stronger receiver 2 in Dn in epoch k is as follows:

The capacity of Dn in epoch k is given by equation (4). In equations (1)–(4), the transmission powers of the BS and of DTn enter together with the following quantities: hBm is the channel gain between the BS and CUm, hnm is the channel gain between DTn and the CUm sharing the same channel, and hBn,1 (hBn,2) is the channel gain between the BS and the 1st (2nd) receiver in Dn. The power allocation factor is the one applied to signal sn,1 in Dn, and the noise term is the spectral density of the additive white Gaussian noise.
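As a minimal numerical sketch of the quantities described in this section, the following Python function evaluates the three SINRs and the capacity of one D2D group for a single epoch. The expressions assume the usual downlink-reuse NOMA form implied by the definitions above (the DT interferes with the cellular user, the BS interferes with both DRs, and the 2nd receiver removes sn,1 by SIC). The argument names, the unit bandwidth, and the treatment of the noise as a power are illustrative assumptions rather than the paper's notation.

```python
import math

def d2d_noma_epoch(p_bs, p_dt, a, h_bm, h_nm, h_n1, h_n2, h_bn1, h_bn2, noise):
    """Return (sinr_cu, sinr_r1, sinr_r2, capacity) for one epoch.

    a is the share of the DT power given to the weaker receiver 1
    (assumed notation); capacity is in bit/s/Hz for unit bandwidth.
    """
    # Cellular user: desired BS signal, interference from the DT.
    sinr_cu = p_bs * h_bm / (p_dt * h_nm + noise)
    # Weaker receiver 1 decodes directly; receiver 2's signal is interference.
    sinr_r1 = a * p_dt * h_n1 / ((1 - a) * p_dt * h_n1 + p_bs * h_bn1 + noise)
    # Stronger receiver 2 first removes receiver 1's signal by SIC.
    sinr_r2 = (1 - a) * p_dt * h_n2 / (p_bs * h_bn2 + noise)
    capacity = math.log2(1 + sinr_r1) + math.log2(1 + sinr_r2)
    return sinr_cu, sinr_r1, sinr_r2, capacity
```

For example, `d2d_noma_epoch(1.0, 1.0, 0.75, 1.0, 1.0, 1.0, 4.0, 0.0, 0.0, 1.0)` describes a group whose receivers see no BS interference; the numbers are arbitrary and only meant to exercise the formulas.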

3. Resource Allocation Algorithm for NOMA-Enhanced D2D Communication with EH

In this section, we analyze the optimal channel matching between cellular users and D2D groups, the optimal transmission power of DTs, and the power allocation scheme of the stronger and weaker receivers in the D2D group. Taking the QoS demands and the limited transmission power of DTs into consideration, the optimization problem (P1) of maximizing the total capacity of D2D groups is established in equations (5)–(12). Here, xnm is the channel reuse indicator of the D2D group: xnm = 1 means that Dn reuses the channel of CUm, and xnm = 0 means that it does not. The problem also involves the minimum SINR thresholds of the cellular users and of the DRs, the harvested energy En,k of DTn at epoch k, and the maximum transmit power of the DTs. Equation (5) is the objective function of maximizing the total capacity of D2D groups. Equation (6) guarantees the QoS requirements of cellular users. Equations (7) and (8) guarantee the QoS requirements of the two DRs. Equation (9) is the energy causality constraint, which means that the energy consumed for transmitting signals cannot exceed the harvested energy. Equation (10) limits the transmission power of the DTs. Equation (11) states that one D2D group can reuse at most one channel and that one channel of a cellular user can be reused by at most one D2D group. Equation (12) states that the power allocated to the weaker receiver is larger than that allocated to the stronger receiver but smaller than the total power of DTn.
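For orientation, one way to write a problem with the structure described above is sketched below. This is a reconstruction from the verbal description of equations (5)–(12), not the paper's own typesetting: the symbols \(C_{nm}^{k}\) (capacity of \(D_n\) on the channel of \(CU_m\) in epoch \(k\)), \(P_n^{k}\) (DT power), \(a_n^{k}\) (power allocation factor), \(\tau\) (epoch length), \(\gamma_{\mathrm{th}}^{c}\), \(\gamma_{\mathrm{th}}^{d}\) (SINR thresholds), and \(P_{\max}\) are assumed names.

```latex
\begin{aligned}
\text{(P1)}\quad \max_{x_{nm},\,P_n^{k},\,a_n^{k}}\ & \sum_{n=1}^{N}\sum_{m=1}^{M}\sum_{k=1}^{K} x_{nm}\, C_{nm}^{k} \\
\text{s.t.}\quad & \gamma_{m}^{k} \ge \gamma_{\mathrm{th}}^{c}, \\
& \gamma_{n,1}^{k} \ge \gamma_{\mathrm{th}}^{d}, \qquad \gamma_{n,2}^{k} \ge \gamma_{\mathrm{th}}^{d}, \\
& \sum_{i=1}^{k} \tau P_n^{i} \le \sum_{i=1}^{k} E_{n,i}, \qquad k = 1,\dots,K, \\
& 0 \le P_n^{k} \le P_{\max}, \\
& \sum_{m=1}^{M} x_{nm} \le 1, \qquad \sum_{n=1}^{N} x_{nm} \le 1, \qquad x_{nm}\in\{0,1\}, \\
& \tfrac{1}{2} < a_n^{k} < 1 .
\end{aligned}
```

The binary variables \(x_{nm}\) together with the nonlinear capacity terms are what make the problem a mixed-integer nonlinear program.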

The optimization problem P1 is a mixed-integer nonlinear programming problem, which is NP-hard [20], and finding its exact solution has exponential complexity. Therefore, we decompose P1 into three subproblems. The first subproblem optimizes the power allocation factor of the DRs. The second subproblem uses the Kuhn–Munkres (KM) algorithm to allocate channels to D2D groups with the goal of maximizing the total capacity of D2D groups. The third subproblem optimizes the transmission power of DTs under the limitations of the harvested energy and the maximum transmission power.

3.1. Offline Channel Allocation and Power Control Algorithm
3.1.1. Optimization of Power Allocation Factor

There is no interference between D2D groups, so the maximum capacity of a single D2D group over K epochs can be solved first. When Dn reuses the channel of CUm and the transmission power of the DT is fixed, the optimal power allocation factor of the two receivers in the D2D group is calculated to maximize the capacity of Dn. In this case, the total capacity of Dn over the K epochs is given by equation (13) and its accompanying shorthand definitions.

From equations (7) and (8), we can obtain the following equation:

The Lagrange function of (13) with respect to equations (14) and (15) is formulated in equation (16), where λ1 and λ2 are the Lagrange multipliers associated with the constraints of equations (14) and (15). The Karush–Kuhn–Tucker conditions of equation (16) with respect to the power allocation factor are expressed in equation (17) and the conditions that follow it. Because it is not known in advance whether hBn,1 or hBn,2 is larger, the sign of the derivative term in equation (17) is also unknown. The first term of equation (17) is the first-order partial derivative of the capacity with respect to the power allocation factor.

If this derivative is positive, the capacity is a monotonically increasing function of the power allocation factor, so as much power as possible is allocated to the 1st receiver, subject to satisfying the SINR requirement of the 2nd receiver. In this case, equations (18) and (19) give λ1 = 0 and λ2 > 0, and the power allocation factor of the 1st receiver is obtained in closed form from equation (21).

If the derivative is negative, the capacity is a monotonically decreasing function of the power allocation factor, so as much power as possible is allocated to the 2nd receiver, subject to satisfying the SINR requirement of the 1st receiver. In this case, equations (18) and (19) give λ1 > 0 and λ2 = 0, and the power allocation factor of the 1st receiver is obtained in closed form from equation (20).
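The case analysis above can be sketched numerically as follows. The sketch assumes the SINR expressions of the standard two-user NOMA model (receiver 1 treats the signal of receiver 2 as interference, receiver 2 applies SIC), a single SINR threshold `sinr_min` for both DRs, and interference-plus-noise terms `i1` and `i2` supplied by the caller; all of these names are hypothetical. Under these assumptions the capacity is monotone in the factor `a`, with a slope whose sign equals that of `h_n1 * i2 - h_n2 * i1`, so the optimum sits at one end of the feasible interval.

```python
def power_allocation_factor(p_dt, h_n1, h_n2, i1, i2, sinr_min):
    """Pick the share a of the DT power given to the weaker receiver 1.

    i1, i2: interference-plus-noise power at receivers 1 and 2 (assumed
    inputs). Returns None when no a satisfies both SINR thresholds.
    """
    # The threshold of receiver 1 gives a lower bound on a.
    a_low = sinr_min * (p_dt * h_n1 + i1) / (p_dt * h_n1 * (1 + sinr_min))
    # The threshold of receiver 2 gives an upper bound on a.
    a_high = 1 - sinr_min * i2 / (p_dt * h_n2)
    # NOMA ordering: the weaker receiver gets at least half of the power
    # (the boundary value 0.5 is allowed here for simplicity).
    a_low = max(a_low, 0.5)
    if a_low > a_high:
        return None
    # Increasing capacity: push a up to the bound set by receiver 2;
    # otherwise pull a down to the bound set by receiver 1.
    return a_high if h_n1 * i2 > h_n2 * i1 else a_low
```

Because the sign test involves only the channel gains and interference levels, the choice between the two closed forms does not depend on the transmission power itself in this simplified model.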

3.1.2. Optimization of Channel Allocation

Considering the objective function (5) and the constraints (6) and (11), the channel allocation optimization problem can be regarded as the optimum matching problem of a weighted bipartite graph. As shown in Figure 2, the D2D group set and the cellular user set represent the two mutually disjoint vertex sets of the bipartite graph. If Dn multiplexes the channel of CUm, vertex n is connected with vertex m by an edge. The weight on this edge is the capacity of Dn when it reuses the channel of CUm.

The KM algorithm is an effective bipartite matching algorithm, in which the maximum weight matching problem is transformed into a complete matching problem by giving each vertex a label [21]. Therefore, we use the KM algorithm to allocate channels to D2D groups. The detailed steps are as follows:
(1) Initialize the cellular user set CU, the D2D group set D, and the candidate cellular user set of each D2D group.
(2) For each D2D group Dn and each cellular user CUm, calculate the SINR of CUm with equation (1).
(3) If this SINR is no less than the minimum SINR threshold of cellular users, put CUm into the candidate set of Dn, calculate the optimal power allocation factor with equation (23) or (24), and calculate the capacity of Dn with equation (13). Otherwise, CUm is not a candidate for Dn.
(4) Take the capacity as the weight between Dn and CUm, use the KM algorithm [21] to achieve the optimal matching of D2D groups and cellular users’ channels, and obtain the channel allocation matrix X.
(5) If the candidate cellular user set of Dn is empty, no cellular user channel can be reused by Dn, that is, xnm = 0 for any m.
(6) Update X and end the algorithm.
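The matching step can be illustrated with a small, self-contained sketch. Note that the code below is not the KM algorithm: it is an exhaustive search over assignments, which returns the same maximum-weight matching on small instances and is only meant to make the inputs and outputs of step (4) concrete. The function name and the convention that a zero weight marks a non-candidate pair are assumptions made for the example.

```python
from itertools import permutations

def best_channel_matching(weights):
    """weights[n][m]: capacity of D2D group n on channel m (0 if not allowed).

    Returns the 0/1 allocation matrix X that maximizes the total weight,
    with at most one channel per group and one group per channel.
    """
    n_groups, n_channels = len(weights), len(weights[0])
    transposed = n_groups > n_channels
    if transposed:  # always enumerate assignments of the smaller side
        weights = [list(col) for col in zip(*weights)]
    rows, cols = len(weights), len(weights[0])
    best_total, best_perm = -1.0, None
    for perm in permutations(range(cols), rows):
        total = sum(weights[r][c] for r, c in enumerate(perm))
        if total > best_total:
            best_total, best_perm = total, perm
    x = [[0] * n_channels for _ in range(n_groups)]
    for r, c in enumerate(best_perm):
        if weights[r][c] > 0:  # zero weight means the pair is not allowed
            n, m = (c, r) if transposed else (r, c)
            x[n][m] = 1
    return x
```

With `weights = [[5, 4], [4, 1]]`, a greedy choice of the largest entry would pair group 0 with channel 0 for a total of 6, whereas the optimal matching pairs group 0 with channel 1 and group 1 with channel 0 for a total of 8. The exhaustive search grows factorially with the problem size, which is why a polynomial-time method such as KM is used in practice.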

3.1.3. Optimization of the DTs’ Power

In the previous section, we assigned the channels to D2D groups. Assuming that Dn multiplexes the channel of CUm, the subproblem of P1 regarding the transmission power of Dn over K epochs is given by

The first partial derivative of equation (25) with respect to the transmission power is as follows:

All parameters in equation (26) are positive, so the objective function is monotonically increasing in the feasible domain. In order to maximize the capacity, all the energy harvested in the K epochs should be used up. The following theorem characterizes the transmission power at the maximum.

Theorem 1. When the maximum capacity of Dn is obtained, the transmission power of DTn in epoch k (1 ≤ k < K) is no more than the transmission power in the next epoch. That is, the transmission power of DTn is nondecreasing from epoch k to epoch k + 1 for any epoch k (1 ≤ k < K).

The proof is provided in the Appendix.

According to Theorem 1 and the optimization results of power allocation factor and channel allocation, an offline channel allocation and power control algorithm is proposed. The detailed steps are summarized in Algorithm 1.

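Algorithm 1: Offline channel allocation and power control algorithm.
Input: the number of epochs K, the length of an epoch, the maximum number of iterations Titer, hn,1, hn,2, hBn,1, hBn,2, En,k, PBS, the minimum SINR thresholds, and the maximum transmission power.
Output: the channel allocation matrix X, the offline transmission power, and the power allocation factor.
(1) Initialization: tnm = 1; set the initial transmission power of each epoch so that the energy harvested in that epoch is used within the epoch.
(2) Allocate channels for D2D groups, and obtain the channel allocation matrix X.
(3) For n = 1 : N
(4)  While tnm < Titer
(5)   For k = 1 : K − 1
(6)    If the transmission power of epoch k is larger than that of epoch k + 1
(7)     Set the transmission power of epoch k to the average of the two powers;
(8)     Set the transmission power of epoch k + 1 to the same average;
(9)    End if
(10)   End for
(11)   tnm = tnm + 1;
(12)  End while
(13)  For k = 1 : K
(14)   Limit the transmission power of epoch k to the maximum transmission power;
(15)  End for
(16)  Calculate the power allocation factor with equation (20) or (21);
(17) End for
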
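The power-smoothing loop of Algorithm 1 can be sketched in Python as follows. The sketch follows the averaging rule stated in the Appendix (when the power of an epoch exceeds that of the next epoch, the two are averaged) and makes two assumptions that are not spelled out in the text: the starting point spends each epoch's energy within that epoch, and the maximum power is applied as a final clipping step. The names `energy`, `tau`, `p_max`, and `max_iter` are illustrative.

```python
def offline_power(energy, tau, p_max, max_iter=200):
    """Smooth per-epoch powers so that they become non-decreasing.

    energy[k] is the energy harvested at the start of epoch k and tau is
    the epoch length, so energy[k] / tau is a power.
    """
    # Assumed starting point: spend each epoch's energy within that epoch.
    power = [e / tau for e in energy]
    for _ in range(max_iter):
        for k in range(len(power) - 1):
            if power[k] > power[k + 1]:
                # Defer energy to the next epoch: both epochs get the average.
                avg = (power[k] + power[k + 1]) / 2
                power[k] = power[k + 1] = avg
    # Enforce the maximum transmission power of the DT.
    return [min(p, p_max) for p in power]
```

Averaging only moves energy from an earlier epoch to a later one, so the energy causality constraint is preserved at every step; energy is never borrowed from the future. For instance, harvested energies `[4, 0, 2, 6]` with a unit epoch length lead to the powers `[2, 2, 2, 6]` when the power limit is not active.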
3.2. Online Power Allocation Optimization Algorithm

In Section 3.1, we assume that the energy arrivals of all epochs are known before the signal transmission, and we obtain the offline optimal transmission power of DTs and the offline power allocation factor of the two receivers in the D2D group. In a real scenario, however, only the energy harvested in and before epoch k is known at epoch k, and the energy arrivals after epoch k are unknown. In this section, an online power allocation optimization algorithm is proposed in which a neural network is used to optimize the transmission power of DTs. The power allocation factors of the two receivers in the D2D group are then obtained.

A neural network simulates the working method of the human brain and has strong adaptability and learning ability. Supervised neural networks require a training dataset composed of a large number of known input vectors and the corresponding output vectors. A nonlinear mapping between input and output vectors is obtained through learning. After learning is complete, the output vector corresponding to a new input vector can be calculated from the mathematical model of the neural network. Since the harvested energy after epoch k is unknown, it cannot be decided directly whether to store energy for later epochs in order to maximize the capacity of D2D groups over the total transmission time T. By learning from the training dataset produced by the offline optimization algorithm, the neural network obtains a mathematical model between the input vector and the transmission power. The online optimal transmission power can then be obtained with only the system parameters available in epoch k.

The neural network model is a multilayer feedforward network trained by supervised learning and composed of an input layer, a hidden layer, and an output layer. In the training process, the error between the output generated by the neural network and the actual output is calculated, and this error is minimized by adjusting the weight vector of the neural network [22]. In this paper, the mean-square error (MSE) is minimized:

$$\mathrm{MSE} = \frac{1}{Q}\sum_{q=1}^{Q}\bigl(t(q) - y(q)\bigr)^{2},$$

where Q represents the number of epochs, y(q) represents the output of the neural network, and t(q) is the actual output.

The training process ends when any of the stopping criteria reaches its set value: the minimum mean-square error, the maximum number of training epochs, the minimum gradient, or the maximum number of validation failures.

In this paper, a two-layer feedforward network is considered. That is, the number of hidden layers is one, because a neural network with a single hidden layer is sufficient to approximate any function to any given precision [23]. The Matlab neural network toolbox provides four back propagation (BP) algorithms for training neural networks, namely, the resilient BP algorithm (trainrp), the conjugate gradient BP algorithm (traincgf), the gradient descent BP algorithm (traingd), and the Levenberg–Marquardt BP algorithm (trainlm). Considering the number of epochs to convergence, the convergence time, and the MSE of the trained network, the trainlm algorithm is selected after testing. The performance of the different training algorithms is presented in Table 1.

In addition, when trainlm algorithm is adopted to train the neural network, considering the MSE and complexity of neural network structure, the number of hidden layer neurons is set to 4 after testing. Figure 3 shows the variation of MSE with the number of hidden layer neurons.

The specific structure of the neural network is shown in Figure 4. The input vector is composed of hn,1, hn,2, hBn,1, hBn,2, the total energy harvested in the k − 1 epochs before epoch k, the total energy consumed in the k − 1 epochs before epoch k, and En,k. The output is the online transmission power of epoch k. Before training, the input parameters are normalized. Normalization ensures that each input parameter has a comparable influence in the neural network, makes the training process more stable, and speeds up convergence.
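A forward pass through a network of this shape can be sketched in a few lines of plain Python. The sketch assumes min-max scaling of each input to [-1, 1], which is one common choice and is not confirmed by the text, and it uses `tanh` as the symmetric sigmoid of the hidden layer with a single linear output neuron. The weights and the feature ranges `lo` and `hi` are placeholders; in the paper they would come from training on the offline results.

```python
import math

def minmax_normalize(x, lo, hi):
    """Map each feature from [lo_i, hi_i] to [-1, 1] (assumed scaling)."""
    return [2 * (v - l) / (h - l) - 1 for v, l, h in zip(x, lo, hi)]

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One hidden tanh layer followed by a single linear output neuron."""
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out
```

For the structure described here, `x` would hold the seven normalized inputs and `w_hidden` would have four rows of seven weights each, one row per hidden neuron.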

When offline power allocation is carried out by Algorithm 1, input vectors and output vectors required to train the neural network can be obtained. Through training neural network, an optimization model to maximize the total capacity of D2D groups is obtained. The online transmission power of the current epoch can be determined by the optimization model. The online power allocation optimization algorithm is shown in Algorithm 2.

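Algorithm 2: Online power allocation optimization algorithm.
Input: the channel allocation matrix X, the number of epochs K, the length of an epoch, hn,1, hn,2, hBn,1, hBn,2, the harvested energy En,k, PBS, the minimum SINR thresholds, and the maximum transmission power.
Output: the online transmission power and the power allocation factor.
(1) Initialization: k = 1; set the total harvested energy and the total consumed energy before epoch 1 to zero.
(2) For m = 1 : M
(3)  n = find(X(:, m) == 1);
(4)  While k ≤ K
(5)   Input the input vector to the neural network to obtain the online transmission power of epoch k;
(6)   Limit the transmission power to the maximum transmission power;
(7)   Calculate the power allocation factor with equation (20) or (21);
(8)   k = k + 1;
(9)   Update the total harvested energy before epoch k;
(10)   Update the total consumed energy before epoch k;
(11)  End while
(12) End for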
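The per-epoch loop of the online algorithm can be sketched as follows for a single D2D group. The callable `predict` is a stand-in for the trained neural network and is purely hypothetical, as are the argument names. The sketch also clamps the proposed power to the energy actually available in the battery, which is an assumption added here so that the energy causality constraint holds regardless of the network output.

```python
def online_power(energy, tau, p_max, predict):
    """Run the per-epoch online loop for one D2D group.

    predict(e_k, harvested_before, consumed_before) stands in for the
    trained network and returns a proposed transmission power.
    """
    harvested, consumed, powers = 0.0, 0.0, []
    for e_k in energy:
        proposal = predict(e_k, harvested, consumed)
        # Energy in the battery at the start of this epoch.
        available = harvested + e_k - consumed
        # Keep the proposal within the energy causality and power limits.
        p_k = max(0.0, min(proposal, available / tau, p_max))
        powers.append(p_k)
        # Running totals that form part of the next epoch's input vector.
        harvested += e_k
        consumed += p_k * tau
    return powers
```

Each epoch costs one network evaluation and a constant number of arithmetic operations, which is consistent with a running time that is linear in the number of epochs for each group.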

According to Algorithm 2, the complexity of the online power allocation optimization algorithm is O(LK), where L = min(N, M) represents the number of D2D groups that perform D2D communication and K represents the total number of epochs. In [18], the power allocation algorithm uses the bisection search method to optimize the Lagrange multipliers in two nested loops. Its complexity is therefore O(LK log n), where n represents the product of the numbers of elements in the two bisection intervals.

4. Simulation Results and Analysis

In this section, we present the simulation results of the proposed offline and online algorithms. We analyze the influence of the distance Rd between DT and DR, the number of D2D groups N, the maximum value of the energy arrival distribution Emax, and the SINR threshold of the CU on the total capacity of D2D groups. The two-layer feedforward neural network is implemented with the MATLAB neural network toolbox. The transfer functions of the hidden layer and the output layer are the symmetric sigmoid transfer function and the linear transfer function, respectively. The maximum number of training epochs is 1500, the minimum MSE is 10e−5, the minimum gradient is 10e−7, and the maximum number of validation failures is 15. The channel gain consists of large-scale fading based on path loss and small-scale fading based on Rayleigh fading, and it is expressed as the product of the path loss term β d^(−α) and the Rayleigh fading term, where β and α represent the path loss constant and the path loss exponent, respectively, d is the distance between transmitter and receiver, and the Rayleigh fading term follows the exponential distribution with unit mean [24]. Other simulation parameters and their specific values are given in Table 2.
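A channel gain of this kind can be drawn with the Python standard library as sketched below. The function assumes the product form of path loss and fading described above; the numerical values passed for `beta` and `alpha` in any call are placeholders, since the actual values are listed in Table 2 and are not reproduced here.

```python
import random

def channel_gain(distance, beta, alpha, rng=random):
    """Path loss beta * d^(-alpha) times a unit-mean exponential fading gain."""
    # The power gain of a Rayleigh-faded channel is exponentially distributed.
    fading = rng.expovariate(1.0)
    return beta * distance ** (-alpha) * fading
```

Passing a seeded generator such as `random.Random(0)` as `rng` makes a simulation run repeatable. Averaged over many draws, the gain approaches the pure path loss `beta * distance ** (-alpha)` because the fading term has unit mean.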

Figure 5 shows the impact of the distance between DT and DR on the total capacity of D2D groups. In the simulation, it is assumed that there are 4 D2D groups in the cell, the maximum value of the energy arrival distribution Emax is 80 mJ, and the SINR threshold of the CU is 2 dB. The figure shows that the total capacity of D2D groups decreases as the distance between DT and DR increases. This is because the channel gain decreases with this distance, which reduces the total capacity of D2D groups. Meanwhile, the proposed algorithm is superior to the algorithm in [18], because the proposed power allocation algorithm takes the energy harvested over the whole time T into consideration. The algorithm of Gupta et al. [18] uses the binary search method to optimize the transmission power in a certain period and ignores the energy arrivals after the current epoch when determining the search interval.

Figure 6 presents the total capacity of D2D groups as the number of D2D groups increases. In the simulation, it is assumed that the distance between DT and DR is 20 m, the maximum value of the energy arrival distribution Emax is 80 mJ, and the SINR threshold of the CU is 2 dB. As can be seen from Figure 6, the total capacity of D2D groups increases with the number of D2D groups. When the number of D2D groups is less than the number of cellular users, each additional D2D group can reuse the channel of a cellular user for communication, which increases the total capacity of D2D groups. When the number of D2D groups is larger than the number of cellular users, channels are assigned to the D2D groups with better channel gain, which also increases the total capacity. Additionally, the proposed online power allocation optimization algorithm achieves 98.6% of the performance of the offline power allocation algorithm. The offline algorithm achieves the optimal power allocation under the assumption that the harvested energy of all epochs is available. Online power allocation cannot maximize the total capacity of D2D groups over the whole transmission time because the harvested energy after the current epoch is unknown. However, because the training dataset of the neural network consists of the optimal results of the offline power allocation algorithm, the performance of the online algorithm is only slightly lower than that of the offline algorithm.

Figure 7 shows the impact of the maximum value of the energy arrival distribution on the total capacity of D2D groups under different SINR thresholds of the CU. In the simulation, we assume that there are 4 D2D groups in the cell and that the distance between DT and DR is 20 m. For a given SINR threshold, the total capacity of D2D groups increases with the maximum value of the energy arrival distribution. This is because a larger maximum value increases the average energy harvested in each epoch. The average transmission power of the DT in each epoch then increases, which raises the total capacity of D2D groups. In addition, for a given SINR threshold, as the maximum value of the energy arrival distribution increases, the transmission power of the DT in each epoch gets closer to the preset maximum transmission power. As a result, the performance difference between the offline and online algorithms gradually decreases. The figure also shows that the total capacity of D2D groups with a cellular user SINR threshold of 2 dB is larger than that with a threshold of 8 dB. This is because, as the SINR threshold of cellular users increases, the number of candidate cellular users of each D2D group is reduced, possibly to zero. D2D groups can then reuse fewer channels or cannot communicate at all, which causes the total capacity of D2D groups to decrease.

5. Conclusions

This paper considers a scenario where DTs harvest energy from the surrounding environment and use NOMA technology to send information to two DRs at the same time. We investigate the problem of assigning the optimal transmission power, power allocation factor, and channels to D2D groups. The total capacity of D2D groups is maximized while satisfying the QoS requirements of users and the energy causality constraints. First, an offline channel allocation and power control algorithm is proposed based on the assumption that all harvested energy information is known. Then, the optimal transmission power of the offline algorithm is taken as the output vector and the system parameters that affect the transmission power are taken as the input vector; together they make up the training dataset of the neural network, from which the optimization model of the transmission power is obtained. Given the system parameters in a certain epoch, the neural network provides the optimal transmission power of the current epoch. An online power allocation optimization algorithm is proposed on this basis by considering the maximum transmission power limitation. Simulation results show that the proposed offline algorithm effectively improves the total capacity of D2D groups and that the online power allocation optimization algorithm approaches the upper bound provided by the offline power allocation algorithm.

Appendix

Proof of Theorem 1

Considering epoch k and epoch k + 1, let C1 denote the capacity sum of the two epochs, which is the special case K = 2. Because the capacity of the battery is large enough, when the transmission power of epoch k is larger than that of epoch k + 1, some energy of epoch k can be stored for epoch k + 1 so that both epochs transmit with the average of the two powers. Let C2 denote the capacity sum of the two epochs in this case. We now prove that C1 < C2:

Dividing both sides of equation (A.1) by a common factor and rearranging gives

The second partial derivative of the capacity with respect to the transmission power is given by

The second derivative of the capacity with respect to the transmission power is negative, so the first derivative of the capacity with respect to the transmission power is a decreasing function. The first derivative evaluated at a larger transmission power is therefore smaller than the first derivative evaluated at a smaller transmission power; that is, the right side of equation (A.2) is negative, and, given the assumed ordering of the two powers, we obtain C1 < C2.

When the transmission power of epoch k is larger than that of epoch k + 1, if storing energy of epoch k for epoch k + 1 makes the transmission power of epoch k smaller than that of epoch k + 1, it can likewise be concluded that the capacity of the two epochs is smaller than C2. Therefore, when the transmission power of epoch k is larger than that of epoch k + 1, the transmission powers of the two epochs are averaged. When the transmission power of epoch k is less than that of epoch k + 1, the transmission powers of the two epochs cannot be adjusted to their average value because of the energy causality constraint. Extending the two epochs to K epochs, Theorem 1 is proved.
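The core of the argument can be summarized compactly. The notation below is assumed for this summary only: \(C(P)\) denotes the capacity of one epoch as a function of the transmission power with all other parameters held fixed, and \(\bar{P}\) is the average of the two powers. Both allocations consume the same total energy, so the comparison is fair.

```latex
\begin{aligned}
C_1 &= C(P_k) + C(P_{k+1}), \qquad P_k > P_{k+1},\\
C_2 &= 2\,C(\bar{P}), \qquad \bar{P} = \tfrac{1}{2}\,(P_k + P_{k+1}),\\
C''(P) < 0 \;&\Longrightarrow\; C(\bar{P}) > \tfrac{1}{2}\,\bigl[C(P_k) + C(P_{k+1})\bigr] \;\Longrightarrow\; C_1 < C_2 .
\end{aligned}
```

The middle implication is the strict form of Jensen's inequality for a strictly concave function evaluated at two distinct points.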

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was funded by the National Natural Science Foundation of China (nos. 61971239 and 61631020).