Abstract
In millimeter wave (mmWave) communication systems, beamforming-enabled directional transmission and network densification are commonly used to offset high path loss and improve signal coverage quality. Combining the two approaches complicates radio resource allocation, especially when terminals move frequently. Existing works present effective solutions for resource allocation in dense mmWave cellular networks, but they assume that terminals move infrequently, so they cannot be directly applied to dense mmWave cellular networks where terminals move frequently. In this paper, based on the results of the existing beamforming training (BFT) information-aided radio resource allocation algorithm, we propose a relay selection method that selects a set of reasonable relays to take over the terminals whose performance deteriorates due to movement, so that the performance of each selected relay is as close as possible to the original performance of the corresponding moved terminal. Then, the resource allocation problem for the Device to Device (D2D) links from the selected relays to the corresponding moved terminals is formulated as a potential game model. With a suitably designed utility function, the resource allocation results on the D2D links converge to a Nash equilibrium solution. The simulation results show that the proposed scheme adapts to the scenario with frequent terminal movement, restrains the sharp performance decline caused by terminal movement, and outperforms the existing related algorithms in terms of average energy efficiency and throughput per link.
1. Introduction
As mobile edge computing and artificial intelligence technologies are increasingly integrated into rich application scenarios [1–4], the demand for spectrum resources for applications is rising sharply. Also, as the spectrum resources in the traditional frequency band will be used up, the huge bandwidth resources in the millimeter wave (mmWave) band have become the main spectrum supply sources to meet the increasing demand for radio network capacity. However, the signal transmission in the mmWave band suffers from high propagation loss, so beamforming is usually adopted to overcome it by providing directional gain. Also, the signal transmission in the mmWave band is sensitive to blockage, so the construction of dense mmWave cellular networks is an effective way to reduce the blocking probability and improve the coverage quality [5].
In dense mmWave cellular networks, the beams of user terminals and base stations must be aligned to ensure high directional gains, so an effective beamforming training (BFT) mechanism is needed to ensure accurate beam selection and timely updating of beam pairs. Mature standards for the BFT mechanism exist in mmWave wireless local area networks (WLANs) [6], and several improved BFT mechanisms for WLANs are proposed in [7–9]. However, none of these BFT mechanisms is suitable for dense mmWave cellular networks, since such networks include various network access points (e.g., macro/micro base station (MBS), small base station (SBS), and access point (AP)) and a large number of user equipment (UE), which leads to high BFT overhead.
On the other hand, when non-orthogonal multiple access (NOMA) and mmWave techniques are combined to enhance concurrent service capability in cellular networks, the amount of signaling information exchanged for beam training increases dramatically. Fortunately, random beamforming (RBF) is an effective way of reducing the overhead and latency in mmWave-NOMA cellular networks [10–12]. However, to take full advantage of NOMA, which allows multiple users to access the same time-frequency resource, cellular networks must support superposition coding and successive interference cancellation (SIC), which places greater cost demands on cellular networks.
We proposed a BFT mechanism for dense mmWave cellular networks in [13], which does not require such capabilities but also does not consider the mobility of user terminals. When a terminal that has been aligned with an access point through the BFT mechanism changes its location, either a beam tracking scheme or the BFT mechanism is used to realign the beam pair. Compared with the BFT mechanism, the beam tracking scheme is limited by more factors, such as the range of movement and frequency band reuse. When a terminal does not move out of its original coverage area, the beam tracking scheme is easy to apply, but the original frequency band may not be reusable: staying in the same coverage area ensures that the terminal can still reuse the original beam, while the beam direction at the new location may overlap with other beam directions, which prevents reuse of the same frequency band.
Although the BFT mechanism has no such limitations, it must be performed whenever any terminal moves to ensure that the performance of each pair of beams remains optimal. Frequent BFT execution causes significant overhead, especially when only a few terminals move frequently. Therefore, a direct and effective way to reduce the BFT overhead is to reduce the BFT execution frequency. In addition, within the interval between two consecutive BFT executions, radio resources must be properly allocated to compensate for the beam-pair performance degradation of the terminals that change their locations. Therefore, unlike the work in [13], which assumes that no user terminal moves, we put forward an efficient radio resource allocation scheme considering terminal mobility in dense mmWave cellular networks. The main contributions are listed below.
(1) Different from existing related studies (e.g., the works in [7, 13]), we address the radio resource allocation under the network energy efficiency constraint, which is aimed at maximizing the number of concurrent connections.
(2) When a terminal is assigned a beam and gets a chance to communicate with an AP, it is allowed to select one of its neighbor terminals as its Device to Device (D2D) relay if the alignment of the beam pair deviates due to movement and communication performance thus degrades.
(3) We propose a relay selection method to select a set of reasonable relays to take over the terminals whose performance deteriorates due to movement, so that the performance of each selected relay is as close as possible to the original performance of the corresponding moved terminal.
(4) We formulate the resource allocation problem for the D2D links from the selected relays to the corresponding moved terminals as a potential game model, where the resource allocation results converge to a Nash equilibrium solution when the utility function is designed reasonably.
(5) The resources allocated to each D2D link ensure that the data received by the selected relays from the corresponding AP are forwarded to the corresponding moved terminals in time. The simulation results show that the proposed scheme adapts to the scenario with frequent terminal movement, restrains the sharp performance decline caused by terminal movement, and outperforms the existing related algorithms in terms of average energy efficiency and throughput per link.
The rest of the paper is organized as follows. Section 2 reviews the works related to radio resource allocation, while Section 3 introduces the system model. Section 4 describes problem formulation, while Section 5 details the radio access resource allocation in dense mmWave cellular networks with frequent terminal mobility. Section 6 analyzes the simulation results, while Section 7 summarizes this paper.
2. Related Work
In this section, we mainly review the existing studies on radio resource allocation in mmWave networks. For mmWave WLANs, it was believed that a distributed network architecture leads to very low resource allocation efficiency, while a centralized network architecture is conducive to improving it [14–16]. Usually, a controller is adopted to manage and control multiple APs in dense mmWave WLANs, and they are connected to each other through directional mmWave links [17], which differs from the wired interconnection schemes used in traditional low-frequency bands.
It was also believed that the centralized control mode based on the cloud radio access network (CRAN) is suitable for mmWave WLANs [6, 7]. Furthermore, the authors of [7] explored mmWave beam management and interference coordination problem in dense mmWave WLANs. The authors of [8] focused on how to reduce beam alignment latency in mmWave WLAN, and they took advantage of the correlation structure among beams to identify the optimal beam, which is aimed at avoiding searching the entire beam space. The authors of [9] considered hybrid beamforming to address the user selection problem for an uplink multiuser transmission in mmWave WLAN, which is aimed at lowering system complexity by avoiding the collection of perfect channel state information from all potential users.
Unlike the above works, some typical works considered the BFT problem in cellular networks [10–13]. The works in [10–12] discussed the combined use of NOMA and mmWave to support massive connectivity. The authors of [10] focused on the use of RBF to support machine-to-machine (M2M) communications, while the authors of [11] paid special attention to the distribution of mobile terminals in similar application scenarios. Furthermore, the authors of [12] addressed the problem of jointly optimizing power allocation and RBF to improve the system performance. As mentioned, the advantages of NOMA come at the expense of higher network construction costs.
For dense mmWave cellular networks without NOMA, the authors of [13] discussed the BFT mechanism and proposed an effective resource allocation scheme based on a priority strategy in terms of energy efficiency. The authors of [18] addressed the radio resource allocation in terms of transmission power and transmission duration for self-backhauling dense mmWave cellular networks, where the considered problem is formulated as a noncooperative game with a common utility function. The authors of [19] focused on the multibeam concurrent transmission problem, where reasonable beam pair selection is critical. They showed that the proposed method is superior to other beam pair selection methods in terms of the network sum rate and the convergence speed of beam pair selection. The authors of [20] studied the radio resource allocation problem with joint power control and user association in dense mmWave heterogeneous networks and proposed a reinforcement learning framework to solve it.
The authors of [21] investigated the radio resource allocation in terms of transmission power and spectrum for dense mmWave cellular networks with an integrated access and backhaul architecture to optimize network capacity under data rate constraints. The authors of [22] focused on the multi-connectivity of mmWave networks, which involves optimal user association and power allocation. Their optimization objectives are the overall energy-efficiency maximization and the balance between the achievable rates of all users and the load of all mmWave base stations. The authors of [23] focused on the radio resource allocation in multitier heterogeneous networks with a disparate spectrum (i.e., microwave and mmWave) and proposed a coordinated approach for user association and spectrum allocation by using noncooperative game theory. The authors of [24] believed that efficient and fast beam management in initial access is essential since the connections may drop frequently due to the high blockage susceptibility of mmWave links. They proposed a deep contextual bandit-based approach to perform fast and efficient initial access.
The above works do not treat terminal mobility as a major concern. However, the authors of [25] reviewed the existing works on beam and mobility management in mmWave cellular networks. The authors of [26] presented a sensitivity study on the stability of the selected beams, measured by the time-of-stay of beams, and on the mmWave link quality under different operational conditions with respect to UE mobility, propagation environment, and operational bands. However, they did not discuss how to ensure the stability of the beam performance. The authors of [27] focused on the misaligned mmWave beam problem in mobile environments. By using deep learning to learn the mobility information, they proposed an adaptive beam management scheme to address this problem, where a sender and its corresponding receiver execute a handoff before connectivity is lost, based on the prediction result.
The authors of [28] focused on beam management for mmWave unmanned aerial vehicle (UAV) networks and proposed a data-driven beam pattern selection scheme to achieve fast beam tracking. The authors of [29] studied a systematic beam management strategy, where the beam tracking overhead is reduced by extending the beam coverage to accommodate user movement. Unlike the works in [27–29], which are based on beam tracking, the problem background in [13] is closest to that of our paper. As noted above, the work in [13] is not suitable for the scenario where terminals move frequently. In order to effectively alleviate the system performance degradation caused by terminal movement in [13], we propose an efficient radio resource allocation scheme considering terminal mobility in dense mmWave cellular networks.
3. System Model
3.1. Network Architecture
The network scenario for our concern is described as follows. SBSs are overlapped in a microcell, and UEs are randomly distributed in the same microcell, where MBS is located at the center of this microcell. The set of SBSs is denoted as , while the set of UEs is denoted as . In addition, APs are overlapped in the small cells, and the set of APs denoted as . For brevity without loss of generality, a small cell structure in the MBS coverage area is shown in Figure 1, where UE ’s original beam cannot provide good transmission performance after it moves and then it selects UE as its D2D relay.
3.2. mmWave Signal Propagation Theory and Model
In this section, we briefly describe the theory and model of mmWave signal propagation. According to the commonly used switchbased analog beam pattern in [30], the normalized beamforming gain is given by where and are the beam width of the main lobe and the beam offset angle to the main lobe in radian, respectively, while is the side lobe gain and . For brevity without loss of generality, we assume that each of MBS, SBS, AP, and UE has the same limited number of beams, where the beams do not overlap each other and each beam covers a specific orientation. If we use to denote the maximum beam width of each of MBS, SBS, AP, and UE, the minimum number (i.e., ) of beams of each of MBS, SBS, AP, and UE is estimated by
When the beams between UE and AP are aligned, the directional transmission gain and directional receiving gain are estimated by where and are the beam width values of the transmitter and the receiver, respectively. On the basis of [7, 31, 32], the channel between UE and AP is expressed as follows:
In (4), represents the number of paths while the superscript “()” denotes the th path. From equation (3), we know that is the directional transmission gain of the th path between UE and AP and is directional receiving gain of the same path. In addition, represents the amplitude of the th path, while represents the Dirac delta function. represents the propagation delay of the th path, which is expressed as follows:
In (5), represents the distance of the th path between UE and AP , while represents the speed of light. According to [15], the line-of-sight (LoS) path (i.e., ) always exists, while the rest of the paths (i.e., from to ) are non-line-of-sight (NLoS). According to [32], the amplitude (i.e., ) of the LoS path is expressed as follows:
In (6), represents the wavelength and as well as represents the carrier frequency. represents the distance of the LoS path between UE and AP . According to [32], the amplitude of each NLoS path is related to both path loss and reflection coefficients, which are expressed as follows:
In (7), represents the reflection coefficient of the th reflection for the th path, while represents the number of reflections of the path. Due to very high reflection loss in the mmWave band [5], only one reflection of a given path (i.e., ) is considered in this paper, so formula (7) is rewritten as follows:
On the basis of [33], the channel gain can be expressed as follows:
We use to denote the directional beam’s transmission power from AP to UE . On the UE side, the receiving power (i.e., ) from AP can be estimated by
The interference power (i.e., ) perceived by UE from other APs (e.g., AP () and UEs (e.g., UE () can be estimated by
In (11), is the directional transmission gain from the th path between UE and AP , while is the directional receiving gain from the same path. On the basis of formula (1), the directional transmit-receive gain (i.e., ) of each path can be estimated by
In (12), is the beam offset angles from the AP ’s (AP transmits to UE ) transmitting beam direction to the position of UE , while is the beam offset angles from the UE ’s (UE receives from AP ) receiving beam direction to the position of AP ; Condition A represents both and ; Condition B represents both and ; Condition C represents both and ; Condition D represents both and .
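The basic quantities of this section can be sketched numerically. The following minimal Python sketch assumes that the switch-based pattern takes the commonly used sectored form (main-lobe gain inside the beam width, side-lobe gain outside, with unit average gain), that the minimum beam count is the smallest number of non-overlapping beams covering the full circle, and that the path delay is the path distance divided by the speed of light. The function names are illustrative and do not come from the paper.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def min_beam_count(theta_max):
    """Fewest non-overlapping beams of width <= theta_max (radians) covering 2*pi."""
    # The small tolerance guards against floating-point noise when
    # 2*pi/theta_max is mathematically an integer.
    return math.ceil(2 * math.pi / theta_max - 1e-9)


def sectored_gain(theta, phi, eps):
    """Assumed sectored pattern: main-lobe gain for |phi| <= theta/2, else eps."""
    if abs(phi) <= theta / 2:
        return (2 * math.pi - (2 * math.pi - theta) * eps) / theta
    return eps


def path_delay(distance_m):
    """Propagation delay of a path of the given length, in seconds."""
    return distance_m / C


def wavelength(carrier_hz):
    """Carrier wavelength in meters."""
    return C / carrier_hz
```

With this pattern, a narrower main lobe gives a higher main-lobe gain, which is the trade-off that the beam width optimization in Section 5 exploits.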
4. Problem Formulation
In this paper, it is assumed that UEs have no multi-connectivity capability, and thus, UE can connect to only one AP at a time in a dense mmWave cellular network. In addition, is the binary association variable, where is equal to 1 if UE is connected to AP ; otherwise, is equal to 0. On the basis of formulas (10) and (11), the signal-to-interference-plus-noise ratio (SINR) between AP and UE is expressed by formula (13) if UE connects to AP :
In (13), is the SINR between AP and UE and is the bandwidth of the mmWave band, while is the background noise power spectrum density. The throughput of the link from AP to UE is expressed by
The throughput (i.e., ) of the entire mmWave network (only the throughput from APs to UEs) is estimated by
The power consumption (i.e., ) of the entire mmWave network (only the power consumption from APs to UEs) is estimated by
In (16), is the power consumed by a radio frequency (RF) chain, where is set to 0.0344 Watts in [34]. The energy efficiency (i.e., ) of the entire mmWave network (only the energy efficiency from APs to UEs) is
In this paper, we aim to maximize the number of concurrent access links under the premise that the network energy efficiency is not lower than the preset threshold. Because the path loss of mmWave signal propagation is large and the transmission power is constrained by the maximum power in practical scenarios, each UE may not receive signals from all the APs in the network. In other words, each UE may only connect to its neighboring APs. Each AP may only get a set of candidate UEs based on the BFT mechanism in [13].
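The link-level and network-level metrics used in this objective can be sketched as follows. This is a minimal sketch under stated assumptions: the SINR is the received power over interference plus noise power (noise spectral density times bandwidth), the throughput follows the Shannon capacity expression, and the network power consumption is taken as the transmission power plus one RF chain per active link. The last point is an assumption about the form of (16); only the RF chain value of 0.0344 W is taken from the text.

```python
import math

P_RF = 0.0344  # W consumed by one RF chain, value quoted in the text from [34]


def sinr(rx_power, interference, bandwidth_hz, noise_psd):
    """SINR of one link; noise power = noise_psd * bandwidth_hz."""
    return rx_power / (interference + noise_psd * bandwidth_hz)


def link_throughput(bandwidth_hz, sinr_value):
    """Shannon-type throughput of one link in bit/s."""
    return bandwidth_hz * math.log2(1 + sinr_value)


def network_energy_efficiency(links, bandwidth_hz, noise_psd):
    """Total throughput over total power for the active AP-to-UE links.

    links: list of dicts with keys tx_power, rx_power, interference (Watts).
    Assumption: each active link costs its transmission power plus P_RF.
    """
    total_rate = 0.0
    total_power = 0.0
    for link in links:
        gamma = sinr(link["rx_power"], link["interference"], bandwidth_hz, noise_psd)
        total_rate += link_throughput(bandwidth_hz, gamma)
        total_power += link["tx_power"] + P_RF
    return total_rate / total_power if total_power > 0 else 0.0
```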
By observing the receiving bit error rate (BER), we can know whether there is a direct mmWave link between a UE (e.g., ) and an AP (e.g., ). A scheme for estimating the success rate of bit transmission (i.e., . Here, denotes UE ’s bit transmission success rate, while denotes the corresponding SINR value) is described in [35]. Based on it, we can derive an approximate relation expression between the acceptable BER value (e.g., ) and the corresponding SINR value (e.g., ), which is given by
To make sure that the BER level of UE ’s receiving data from AP is not more than , AP ’s transmission power should not be less than , which is given by
If is not smaller than , we think that there is a direct mmWave link between the UE and the AP . For a set of candidate UEs of AP (), the directional link’s quality (e.g., ) between each UE () and AP should be higher than a preset threshold (e.g., ), in which the set of candidate UEs of AP is represented as . So, the radio access resource allocation problem for downlink unicast communications can be transformed into the following problem:
In (20), represents a matrix with element , represents a matrix with element , and represents a matrix with element or . In (21), Constraint means that each UE can only connect to one AP; Constraint indicates that the number of UEs that an AP can serve is not more than ; Constraint ensures that the power consumption for each AP cannot exceed ; Constraint means that the beam width is limited to [, ]; Constraint indicates that, for an AP, the sum of beam widths used for serving the connected UEs cannot exceed , since each node’s beams are assumed to be nonoverlapping in Section 3.2; Constraint means that the network energy efficiency cannot fall below the threshold .
5. Radio Access Resource Allocation in Dense mmWave Cellular Network
The optimization problem is combinatorial and nonconvex. The combinatorial nature comes from constraint , and the nonconvexity is caused by constraints . Although a brute force approach can be used to obtain the optimal solution, it is computationally infeasible for a dense mmWave network. Therefore, we propose an approximate algorithm for solving problem in this section. In addition, when the working mmWave links between UEs and APs are interrupted due to UEs’ mobility or blockage, we should redo the BFT processes in [13] to find other suitable links. However, before redoing the BFT processes, the working mmWave links with degraded performance should be recovered as much as possible.
Therefore, we propose an efficient radio resource allocation scheme considering terminal mobility in dense mmWave cellular networks, which mainly includes three components: (i) a BFT information-aided radio access resource allocation algorithm to solve the optimization problem in (20) and (21), whose purpose is to maximize the number of concurrent access links under the premise that the network energy efficiency is not lower than the preset threshold; (ii) a D2D relay selection algorithm for the UEs of the working mmWave links with degraded performance, which recovers these UEs’ access performance as much as possible; and (iii) a set of schemes based on the potential game to solve the resource allocation problem between D2D links and relay links.
5.1. Radio Access Resource Allocation considering Terminal Mobility
The overall control flow of radio resource allocation is described in Algorithm 1. First, the number of the UEs that all the APs can concurrently serve is accumulated (see line 1), and the counting variables (e.g., is the counting variable of UE ) of these UEs are initialized to 0, indicating that none of them has yet acted as a relay (see line 2). Next, after the invoked BFT information-aided radio resource allocation algorithm returns (see line 3), the timing interval is set to trigger the next call to this algorithm (see line 4). Finally, as long as the timer has not expired, the D2D relay selection algorithm and the game-based resource allocation algorithm are invoked repeatedly to prevent serious degradation of beam performance (see lines 5~8); once it expires, the number of times each UE has acted as a relay is accumulated, and the BFT information-aided radio resource allocation algorithm is invoked again (see lines 9~14).
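The control flow described above can be sketched in Python as follows. This is only an illustration: the three callables stand in for the BFT information-aided allocation, the D2D relay selection, and the game-based allocation; the timer is modeled as a fixed number of ticks; the loop is bounded by `rounds` rather than running indefinitely; and each UE's counter is increased once per interval in which it acted as a relay. All of these are assumptions made for the sketch.

```python
def control_loop(ues, bft_allocate, select_relays, game_allocate,
                 interval_ticks, rounds):
    """Alternate BFT-aided allocation with D2D recovery inside each interval."""
    relay_count = {ue: 0 for ue in ues}   # no UE has acted as a relay yet
    log = []
    for _ in range(rounds):
        bft_allocate(relay_count)         # BFT information-aided allocation
        log.append("bft")
        used = set()
        for _ in range(interval_ticks):   # the timer has not expired yet
            relays = select_relays()      # D2D relay selection
            game_allocate(relays)         # game-based D2D resource allocation
            used.update(relays)
            log.append("d2d")
        for ue in used:                   # timer expired: update relay counters
            relay_count[ue] += 1
    return relay_count, log
```

Passing the counters to the allocation step reflects the fact that UEs that have acted as relays are later prioritized for access.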

5.2. Beamforming Training Information-Aided Radio Access Resource Allocation Scheme
Each SBS (e.g., ) can obtain all the signal-to-noise ratio (SNR) values (i.e., ) in its coverage by using the BFT mechanism in [13]. However, directly using the SNR values to measure mmWave channel quality is not always a reasonable choice. A better choice is to take the data rate per unit of power consumption (DRPC) as the mmWave channel quality measurement, since it reflects the goal of high energy efficiency pursued by high-capacity wireless networks. The DRPC is estimated by
In (22), is the energy efficiency of the mmWave link from AP to UE . To simplify the analysis, we keep all the DRPC values into in descending order. Then, there is a onetoone match between the elements of and those of , in which , and denotes the number of elements in a set and is the set of all the UEs associated with all the APs. Moreover, the number of elements in , , and is .
Because of the combinatorial and nonconvex properties, the problem described in (20) and (21) cannot be solved directly. Therefore, this problem is decomposed into three subproblems. Firstly, and are initialized to get the optimized . Then, the optimized and initialized are used to find the optimized . Finally, the optimized and are adopted to solve the optimized . In this way, we can obtain a suboptimal solution to the original problem.
Algorithm 2 describes the above solution process. and are firstly initialized to the average transmission power (i.e., ) and the maximum beam width (i.e., ), respectively, and then, the optimized is obtained. After that, is initialized as a vector with zero, and then, Algorithm 2 chooses the AP and UE pair corresponding to , computes energy efficiency according to the formulas (see lines 15~17), and sets to 1. Next, Algorithm 2 chooses the AP and UE pair corresponding to and computes energy efficiency . If and is a positive number that is not greater than 1, Algorithm 2 sets to 1; otherwise, it sets to 0.
Through repeating the above steps, Algorithm 2 chooses the AP and UE pair corresponding to and computes energy efficiency . If , is set to 1; otherwise, is set to 0. When all the binary association variables are set, Algorithm 2 can obtain the optimized after each is mapped to . indicates which AP each UE should connect to, so each beam direction is determined.
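The association step can be illustrated with a greedy sketch. It assumes that candidate AP-UE pairs are visited in descending DRPC order and that a pair is accepted only if the UE is still unserved, the AP has spare capacity, and the network energy efficiency with the pair added stays at or above the threshold. The exact acceptance condition of Algorithm 2 is not reproduced here, and `energy_efficiency` is a hypothetical callable supplied by the caller.

```python
def greedy_association(drpc, max_ues_per_ap, ee_threshold, energy_efficiency):
    """Greedy AP-UE association in descending DRPC order.

    drpc: dict mapping (ap, ue) to its DRPC value.
    energy_efficiency: callable taking a list of (ap, ue) pairs.
    Returns the list of accepted (ap, ue) pairs.
    """
    accepted = []
    served_ues = set()
    load = {}
    ranked = sorted(drpc.items(), key=lambda kv: kv[1], reverse=True)
    for (ap, ue), _ in ranked:
        # Each UE connects to one AP only, and each AP serves a bounded number of UEs.
        if ue in served_ues or load.get(ap, 0) >= max_ues_per_ap:
            continue
        trial = accepted + [(ap, ue)]
        if energy_efficiency(trial) >= ee_threshold:
            accepted = trial
            served_ues.add(ue)
            load[ap] = load.get(ap, 0) + 1
    return accepted
```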
After the optimized is obtained, it together with the vector with the maximum beam width is used to solve the power allocation vector . Algorithm 2 starts from the directional mmWave link with the highest DRPC value determined by , , and and then decreases the transmission power from the average power in . If the current energy efficiency is not less than the preset threshold and the current transmission power is more than , Algorithm 2 continues to reduce the transmission power ; otherwise, it optimizes the transmission power of the mmWave link with the second-highest DRPC value. Through repeating the above steps, the transmission power of each working mmWave link will be optimized. After each transmission power is mapped to , the vector will be obtained.
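The one-by-one decreasing search for the transmission power can be sketched as below. The fixed step size, the per-link minimum power dictionary, and the `energy_efficiency` callable are assumptions introduced for illustration; the sketch keeps the last power value for which both the energy efficiency threshold and the minimum power requirement still hold.

```python
def reduce_powers(links, p_avg, p_min, step, ee_threshold, energy_efficiency):
    """Lower each link's power from p_avg while the constraints still hold.

    links: link ids ordered by descending DRPC.
    p_min: dict mapping link id to its minimum admissible power.
    energy_efficiency: callable taking a dict {link id: power}.
    """
    power = {link: p_avg for link in links}
    for link in links:
        while power[link] - step >= p_min[link]:
            trial = dict(power)
            trial[link] = power[link] - step
            if energy_efficiency(trial) < ee_threshold:
                break  # a further reduction would violate the threshold
            power = trial
    return power
```

The same loop structure applies to the beam width search, with the beam width bounds in place of the power bounds.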
After the optimized and are obtained, the optimized can be solved by applying them into a process similar to the above. In other words, Algorithm 2 starts from the directional mmWave link with the highest DRPC value determined by , , and and then decreases beam width from . If the current energy efficiency is not less than the preset threshold and the current beam width is more than , Algorithm 2 continues to reduce the current beam width; otherwise, it optimizes the beam width of the mmWave link with the second-highest DRPC value. Through repeating the above steps, the beam width of each working mmWave link will be optimized. After each beam width is mapped to or , the vector will be obtained.
It is worth noting that is more than in Algorithm 2, which means that the number of elements in is smaller than that of . If the value of is 0, should be filled with 0, so that the number of elements in is the same as that of . Likewise, the number of elements in must be the same as that of . If the value of is 0, should also be filled with 0 to meet the requirement of the same number of elements (i.e., ). In particular, the algorithm prioritizes access services for the UEs that have acted as relays: the more times a UE has acted as a relay, the higher its priority (see line 6). At the same time, when a UE that has acted as a relay obtains an access service, the cumulative number of times it has acted as a relay is decremented by one (see line 11).

The above algorithm invokes the following algorithm based on the idea of decreasing search one by one.

5.3. D2D Relay Selection Algorithm
The set of candidate relays selected by a mobile UE should be limited to its LoS range, which helps to guarantee the D2D link performance between the mobile UE and any member of the candidate set. In view of the complexity of the mathematical representation of mmWave signal propagation characteristics, we adopt the free space model for traditional frequency band signal propagation to derive an approximate upper bound of the communication distance between a mobile UE and its any relay. For a D2D transmission link from relay UE to mobile UE , its maximum communication distance (i.e., ) is estimated by where is the maximum transmission power of relay UE ; are the transmission and receiving antenna’s gains, respectively, while and are the wavelength of the signal carrier and the system loss coefficient, respectively; and is the SINR of the D2D transmission link from UE to UE , which is estimated by formula (18) when is given in advance. According to [36], if the free space model is adopted, the maximum communication distance should not be greater than the crossover distance . Therefore, the upper bound of the communication distance (i.e., ) is estimated by where is estimated by where is the height of the transmitting antenna on the ground while is the height of the receiving antenna on the ground. In order to facilitate the adjustment of the range of relays that a mobile UE can choose, we construct the following estimation formula: where is a relay selection range regulation coefficient, which is a positive value no more than 1, is the effective communication distance threshold actually adopted by the mobile UE, and the distance between the mobile UE and the selected relay does not exceed .
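The distance bound can be illustrated numerically. The sketch below assumes the standard free-space (Friis) relation for the maximum distance, the usual two-ray crossover distance 4πh_t h_r/λ, and a minimum required receiving power in place of the SINR-based threshold used in the text. These forms and all function names are assumptions for illustration and may differ from the paper's exact expressions.

```python
import math


def friis_max_distance(p_tx, g_tx, g_rx, lam, loss, required_rx_power):
    """Largest distance at which free-space received power meets the requirement."""
    return (lam / (4 * math.pi)) * math.sqrt(
        p_tx * g_tx * g_rx / (loss * required_rx_power)
    )


def crossover_distance(h_tx, h_rx, lam):
    """Assumed two-ray crossover distance for antenna heights h_tx and h_rx."""
    return 4 * math.pi * h_tx * h_rx / lam


def relay_range(p_tx, g_tx, g_rx, lam, loss, required_rx_power, h_tx, h_rx, eta):
    """Effective distance threshold: eta times the smaller of the two bounds."""
    if not 0 < eta <= 1:
        raise ValueError("eta must be in (0, 1]")
    bound = min(
        friis_max_distance(p_tx, g_tx, g_rx, lam, loss, required_rx_power),
        crossover_distance(h_tx, h_rx, lam),
    )
    return eta * bound
```

A smaller regulation coefficient shrinks the neighborhood in which a mobile UE looks for relays, trading candidate availability for D2D link quality.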
The relay selected by the mobile UE should reuse the beam of the mobile node as perfectly as possible. Since the mobile UE can get the beam after the BFT process, it shows the superiority of its original position. Therefore, if the relay selected by the mobile UE is in its original location, it is the ideal case. In fact, their locations are hard to overlap. In order to easily and quickly select the appropriate relay, a mobile UE should first select the several relays in ascending order in terms of the angle between the line from its original location () to the associated AP and the line from the relay to the same AP, and then, the one closest to the original location is selected from these candidate relays. When the coordinates (e.g., ) of a mobile UE’s relay (e.g., ) and the AP (e.g., ) are known, the angle (e.g., ) with the AP as the center is computed by
The MBS performs Algorithm 4 to coordinate the relay selection operation, where the D2D link set and the D2D relay set are initialized to empty to prepare for recording candidate D2D relay links and D2D relay UEs during the interval (see lines 1~2), and then, the MBS only accepts relay selection declarations during the first half of the interval so that the selected relay UEs have enough time to play their roles (see lines 3~14).

Each mobile UE that wants to select a relay will execute Algorithm 5, where it initializes the candidate relay set to empty to prepare for recording its own candidates and sends D2D relay selection declaration to the MBS when it feels that its communication performance is declining (see lines 1~4), and then, it selects the appropriate UE from its neighboring area to act as a relay (see lines 5~18).
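The selection rule of Section 5.3 (smallest angles first, then the candidate closest to the original location) can be sketched as follows. The 2-D coordinates, the candidate list size `k`, and the function names are assumptions for illustration; the sketch does not reproduce the signaling with the MBS in Algorithms 4 and 5.

```python
import math


def angle_at_ap(ap, p, q):
    """Angle in radians at vertex ap between the rays ap->p and ap->q."""
    a1 = math.atan2(p[1] - ap[1], p[0] - ap[0])
    a2 = math.atan2(q[1] - ap[1], q[0] - ap[0])
    diff = abs(a1 - a2) % (2 * math.pi)
    return min(diff, 2 * math.pi - diff)


def select_relay(ap, original_pos, current_pos, neighbours, max_range, k=3):
    """Pick a relay for a UE that moved from original_pos to current_pos.

    neighbours: dict mapping UE id to its (x, y) position.
    Returns the chosen UE id, or None if no neighbour is within max_range.
    """
    # Keep only neighbours within the effective D2D distance threshold.
    in_range = {n: pos for n, pos in neighbours.items()
                if math.dist(pos, current_pos) <= max_range}
    if not in_range:
        return None
    # Shortlist the k neighbours whose direction from the AP deviates least
    # from the direction of the original location.
    by_angle = sorted(
        in_range, key=lambda n: angle_at_ap(ap, original_pos, in_range[n])
    )[:k]
    # Among the shortlist, choose the one closest to the original location.
    return min(by_angle, key=lambda n: math.dist(in_range[n], original_pos))
```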

5.4. Resource Allocation Algorithm Based on Potential Game for D2D Links
When outband D2D communication is adopted, is the set of outband D2D links while is the set of outband D2D relays. Since the outband D2D frequency band is shared by all the members of the set , any outband D2D link will receive the interferences coming from the other outband D2D links of the set , which can be formulated as
The SINR of outband D2D link is given by where is the mmWave WiFi bandwidth shared by all the outband D2D links. Then, the throughput and energy efficiency of outband D2D link can be given by
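As a rough illustration of the three quantities just introduced, the sketch below computes aggregate interference, SINR, Shannon throughput, and energy efficiency for one outband D2D link. All names are hypothetical, the gains are generic linear-scale link gains, and energy efficiency is taken here as throughput divided by transmission power (consistent with the “bps/Watt” unit used later); the paper's exact expressions may include further terms.

```python
import math

def aggregate_interference(powers, cross_gains, i):
    """Interference at the receiver of link i from all other D2D links.

    powers[j]: transmission power of link j (W)
    cross_gains[j][i]: linear gain from transmitter j to receiver i (assumed given)
    """
    return sum(powers[j] * cross_gains[j][i]
               for j in range(len(powers)) if j != i)

def sinr(p_tx, gain, interference, noise_psd, bandwidth):
    """Signal power over interference plus noise power (linear scale)."""
    return p_tx * gain / (interference + noise_psd * bandwidth)

def throughput(bandwidth, sinr_value):
    """Shannon rate in bit/s for the shared bandwidth (Hz)."""
    return bandwidth * math.log2(1.0 + sinr_value)

def energy_efficiency(rate, p_tx):
    """Throughput per unit of transmission power (bps/Watt)."""
    return rate / p_tx
```

Because every link's interference depends on the powers chosen by the others, raising one relay's power improves its own SINR while lowering that of its neighbors, which is what motivates the game formulation that follows.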
For mobile UE with outband D2D relay UE in AP coverage, in order to improve the energy efficiency of outband D2D link , we formulate each outband D2D link’s transmission power adjustment problem as : where constraint provides a set of the available transmission power levels for each relay, where and is the cardinality of the set ; constraint is aimed at guaranteeing that any relay UE’s outband D2D link throughput is not less than its cellular mmWave relay throughput.
According to the BFT mechanism, we know that the relay UE reusing the beam of mobile UE cannot achieve higher performance. Therefore, must be more than , where is the performance of mobile UE at its original location . Since has been obtained by formula (14) during the execution of Algorithm 2, we can change constraint to “” to reduce the computation.
The optimization problem is a combinatorial and nonconvex problem. The combinatorial property comes from constraint , while the nonconvexity arises from the optimization objective and constraint . Although can be solved by exhaustive search, we model it as a potential game, which yields only an approximately optimal solution but scales to a larger number of outband D2D mmWave links.
In the potential game model used to approximately solve the optimization problem , the D2D relay UEs compete with one another for their desired transmission powers in order to optimize their D2D link energy efficiency. Since each D2D relay UE acts selfishly and aims to obtain as much individual profit as possible, it is a player in the game. Thus, we formulate the following utility function for its decision: where and are nonnegative penalty scalars in “bps/Watt.” is the penalty function described in [19, 37], which meets that if ; otherwise, ; is a weight coefficient that represents the ratio of the actual individual earnings of player to its utility value, and it is greater than 0 and less than 1.
The first term in (32) is the part of the total utility that comes from the player ’s own outband D2D link: the first term in parentheses is the player ’s energy efficiency of outband D2D link, while the second term in parentheses is the player ’s relaying constraint, which means that the player is punished if it chooses a game strategy that does not follow constraint .
The second term in (32) is the part of the total utility that comes from all the outband D2D links’ average energy efficiency: the first term in parentheses is all the outband D2D links’ average energy efficiency, while the second term in parentheses is all the relay UEs’ relaying constraints, which means that any relay UE is punished if it chooses a game strategy that does not follow constraint . According to the theory in [38], we have the following.
Definition 1. Game is an Ordinal Potential Game (OPG), if , , there exists a potential function such that
The player ’s game strategy is represented by , while a game strategy profile of all the players in the set except for r is represented by . When a game strategy of any player r, and an alternative game strategy are given, and the game strategies of other players remain unchanged, we have
According to Definition 1, the formulated game model is a potential game with the potential function . Without the constraint , each Nash equilibrium (NE) of the formulated game model can be considered a suboptimal solution. However, because of constraint , it is uncertain whether every NE satisfies the constraint. Therefore, it is necessary to discuss the existence of a feasible NE in the formulated game model.
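For reference, the ordinal potential game condition invoked by Definition 1 is commonly stated as below. The notation is assumed here for illustration and is not taken from the paper: R denotes the set of players (relay UEs), p_r and p_r' are two strategies (transmission powers) of player r, p_{-r} is the strategy profile of the other players, U_r is the utility of player r, and Φ is the potential function.

```latex
\begin{equation*}
U_r(p_r', p_{-r}) - U_r(p_r, p_{-r}) > 0
\;\Longleftrightarrow\;
\Phi(p_r', p_{-r}) - \Phi(p_r, p_{-r}) > 0,
\qquad \forall r \in \mathcal{R},\; \forall p_r, p_r',\; \forall p_{-r}.
\end{equation*}
```

In words, a unilateral change of strategy raises a player's utility exactly when it raises the potential, so on a finite strategy set a sequence of unilateral improvements cannot cycle and must stop at a Nash equilibrium.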
If and , there are some feasible game strategies for each player that meet the constraints and ; the problem of optimizing and the problem (31) have the same suboptimal solution. Under these conditions, any player’s game strategy that does not follow constraint will make be smaller than 0, which means that the game strategy profiles violating constraint will not improve . We propose Algorithm 6 to solve the optimization problem .

Algorithm 6 is executed jointly by the MBS and all the relay UEs. On the MBS side, the executor first checks whether the membership of the D2D relay set has been updated. If so, it performs the operations in lines 2~20; otherwise, it executes those in lines 21~23. Lines 2~20 update the policy consisting of the powers of all the relay UEs: the set is initialized to empty to record the relay UEs that have finished adjusting their transmission powers, and the variable is initialized to 0 to count the relay UEs that have reported their desired transmission powers during the current round of power adjustment (see lines 3~4).
The variable is incremented whether or not the power value received from a relay UE has changed (see lines 7 and 10). However, the MBS adds only the relay UEs that report old power values to the set (see line 11), while it updates by using each new power value (see line 8). After receiving the power values from all the relay UEs (i.e., is equal to , see line 13), the MBS broadcasts a packet containing “end” to all the relay UEs if no relay UE has changed its power (see line 14); otherwise, it starts another round of power updates after broadcasting to all the relay UEs (see lines 17~20).
On the relay UE side, if the executor receives a “power adjustment” packet from the MBS, it performs the operations in lines 2~14; otherwise, it executes those in lines 15~17. Lines 2~14 readjust the executor’s power, where is initialized to the minimum power in and reported to the MBS. After receiving from the MBS the updated power values of the other relay UEs (see line 3), each relay UE first calculates its utility value at the current power (see line 4) and then solves for the optimal utility in this round (see lines 5~6).
If the utility value can be increased, the relay UE reports the updated power to the MBS and then waits for feedback from the MBS to start the next round of power updates (see lines 7~9); otherwise, it reports its old power value to the MBS and checks whether it receives “end” from the MBS (see line 11). If it receives “end” from the MBS, it goes to line 1 (see line 14), where it starts the next round of power updates after receiving the “power adjustment” packet from the MBS. Otherwise, it goes to line 3 (see line 14), where it continues to update its power value in this round after receiving from the MBS.
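The round structure described above (every relay starts at minimum power, reports either a better power or its old one, and the process ends once nobody changes) can be sketched in a single process as follows. This is a simplified illustration, not Algorithm 6 itself: the message exchange with the MBS is replaced by a shared list, relays update one after another within a round (an assumption made here so that a potential game cannot cycle), the utility is passed in as a callable, and `toy_utility` is a hypothetical stand-in rather than the utility in (32).

```python
def best_response_rounds(levels, utility, n_players, max_rounds=100):
    """Round-based power update over a discrete set of power levels.

    levels: available transmission power levels
    utility: callable (player_index, power_profile_tuple) -> float
    Returns (final_profile, converged_flag).
    """
    profile = [min(levels)] * n_players      # every relay starts at minimum power
    for _ in range(max_rounds):
        unchanged = 0                        # relays that reported their old power
        for r in range(n_players):
            best_p = profile[r]
            best_u = utility(r, tuple(profile))
            for p in levels:                 # search this relay's power levels
                trial = profile[:r] + [p] + profile[r + 1:]
                u = utility(r, tuple(trial))
                if u > best_u:               # only strict improvements count
                    best_p, best_u = p, u
            if best_p == profile[r]:
                unchanged += 1
            else:
                profile[r] = best_p          # the new power value is recorded
        if unchanged == n_players:           # nobody changed: the "end" case
            return tuple(profile), True
    return tuple(profile), False

def toy_utility(r, profile):
    """Hypothetical utility for demonstration only (not the paper's (32))."""
    return profile[r] * (10 - sum(profile))

final_profile, converged = best_response_rounds([1, 2, 3, 4, 5], toy_utility, 2)
```

With the toy utility and two players, the loop stops at a profile from which neither player can gain by changing power alone, which is the Nash equilibrium property the algorithm relies on. The `max_rounds` cap is a safeguard for utilities that do not admit a potential function.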
6. Performance Evaluation
6.1. Simulation Parameter Settings
We adopt a simulation environment similar to that in [13], which is a circular plane with the MBS as the center and a radius of 300 meters. In this simulation area, all the UEs are randomly distributed in the ring-shaped area with a radius ranging from 100 m to 300 m centered on the MBS, all the SBSs are evenly deployed on a circular ring with a radius of 200 m centered at the MBS, and all the APs are evenly deployed on a set of circular rings with a radius of 60 m centered at each SBS. For simplicity and without loss of generality, we only consider one MBS, four SBSs, and sixteen APs.
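The deployment just described can be sketched as below. Several details are assumptions made for illustration: the MBS is placed at the origin, the sixteen APs are split evenly as four per SBS, UEs are drawn uniformly by area from the annulus, and all function names are hypothetical.

```python
import math
import random

def ring_points(center, radius, count, phase=0.0):
    """`count` points evenly spaced on a circle of `radius` around `center`."""
    cx, cy = center
    return [(cx + radius * math.cos(phase + 2 * math.pi * k / count),
             cy + radius * math.sin(phase + 2 * math.pi * k / count))
            for k in range(count)]

def random_ue(rng, r_min=100.0, r_max=300.0):
    """One UE drawn uniformly by area from the annulus around the MBS."""
    radius = math.sqrt(rng.uniform(r_min ** 2, r_max ** 2))
    theta = rng.uniform(0.0, 2 * math.pi)
    return (radius * math.cos(theta), radius * math.sin(theta))

def build_layout(num_ues, seed=0):
    """Return (mbs, sbss, aps, ues) as (x, y) positions in meters."""
    rng = random.Random(seed)
    mbs = (0.0, 0.0)
    sbss = ring_points(mbs, 200.0, 4)                  # four SBSs, 200 m ring
    aps = [ap for sbs in sbss                          # four APs per SBS (assumed)
           for ap in ring_points(sbs, 60.0, 4)]
    ues = [random_ue(rng) for _ in range(num_ues)]
    return mbs, sbss, aps, ues
```

Sampling the squared radius uniformly, rather than the radius itself, keeps the UE density constant over the annulus instead of crowding UEs toward the inner edge.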
To simulate mobility in the simulation environment, we divide the simulation time into a set of time slices of equal length. At the beginning of each time slice, each UE decides whether to move with the probability . If a UE decides to move, it changes its x-axis and y-axis coordinates, each by a value chosen at random from to . The interval between two consecutive BFT processes is not less than one time slice. Unless otherwise specified, the simulation parameters are listed in Table 1, most of which follow the values of the simulation parameters in [7, 13].
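One time slice of this mobility model can be sketched as follows. The displacement range is not recoverable from the text, so the symmetric bound `max_shift` is a hypothetical parameter, as are the function name and the independent draws for the two coordinates.

```python
import random

def step_mobility(positions, move_prob, max_shift, rng):
    """Advance one time slice: each UE moves with probability `move_prob`.

    positions: list of (x, y) UE coordinates in meters
    max_shift: assumed bound on the per-axis displacement (hypothetical)
    rng: a random.Random instance, so that runs are reproducible
    """
    updated = []
    for x, y in positions:
        if rng.random() < move_prob:
            # Independent random displacement on each axis.
            x += rng.uniform(-max_shift, max_shift)
            y += rng.uniform(-max_shift, max_shift)
        updated.append((x, y))
    return updated
```

Calling this function once per time slice and re-running the BFT process only every few slices reproduces the situation studied here, in which beams trained for old positions are still in use after UEs have moved.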
The schemes in [7, 13] are the most similar to ours, but the scheme in [7] adopts only two steps for the BFT process when it is applied to the simulation environment. For ease of reference, we call the scheme in [7] the two-step BFT throughput optimized radio resource allocation scheme (2BFTTHO) and the scheme in [13] the three-step BFT throughput optimized radio resource allocation scheme based on priority policy in terms of energy efficiency (3BFTTPP). Likewise, we call the scheme in this paper the three-step BFT radio resource allocation scheme based on priority policy in terms of energy efficiency, which is aimed at maximizing the number of concurrent access links (3BFTMAL).
6.2. Simulation Results and Analysis
We conduct a total of five groups of simulation experiments. In the first group, we set the decreasing step size for power and beam width to 1% and the interval between BFT executions to four time slices. With these settings, we compare the performance of 3BFTMAL in this paper with those of 3BFTTPP and 2BFTTHO by changing the number of UEs in a given area.
From Figures 2(a) and 2(b), we can observe that, as the number of UEs increases, the average energy efficiency and throughput per link of the three schemes increase slightly. This is because, when the number of concurrent connections provided by each AP is fixed, a larger number of UEs gives the BFT procedure more chances to find better concurrent connections. Since the scheme in this paper suppresses the significant decrease in beam performance caused by terminal movement by selecting D2D relays in the interval between two consecutive BFT processes, it performs significantly better than 3BFTTPP in terms of the two performance metrics. Meanwhile, because 3BFTTPP prioritizes radio access link selection based on link energy efficiency, 3BFTTPP is in turn superior to 2BFTTHO in these two performance metrics.
Figure 2: Average energy efficiency and throughput per link versus the number of UEs, panels (a) and (b).
In the second group of simulation experiments, we set the interval between BFT executions to four time slices. With this setting, we compare the performance of 3BFTMAL in this paper with those of 3BFTTPP and 2BFTTHO by changing the decreasing step size for power and beam width. From Figures 3(a) and 3(b), we can see that the step size has only a slight influence on the average energy efficiency and throughput per link of the three schemes, and that this influence is most visible when the step size is large. This is mainly because the accuracy of network parameter adjustment decreases as the step size grows.
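Figure 3: Average energy efficiency and throughput per link versus the decreasing step size for power and beam width, panels (a) and (b).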
In addition, from Figure 3(b), we see that the curve of 2BFTTHO has a turning point when the decreasing step size is 6%. This is because a larger decreasing step size increases the probability that the convergence point performs poorly, although it does not guarantee a worse result. The turning point of 2BFTTHO at the step size of 6% is such a case and is a manifestation of randomness.
In the third group of simulation experiments, we set the decreasing step size for power and beam width to 1% and the interval between BFT executions to four time slices. With these settings, we compare the performance of 3BFTMAL in this paper with those of 3BFTTPP and 2BFTTHO under different values of ambient noise power density. From Figures 4(a) and 4(b), we can see that the average energy efficiency and throughput per link decrease as the ambient noise power density increases. The main reason is that, when the transmission power is fixed, a very low ambient noise power density produces a very high SNR and thus, according to the Shannon theorem, a significantly higher throughput.
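Figure 4: Average energy efficiency and throughput per link versus the ambient noise power density, panels (a) and (b).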
In the fourth group of simulation experiments, we set the decreasing step size for power and beam width to 1% and the interval between BFT executions to four time slices. With these settings, we compare the performance of 3BFTMAL in this paper with those of 3BFTTPP and 2BFTTHO under different probabilities that each UE moves its position. From Figures 5(a) and 5(b), we can observe that the probability that each UE moves its position has some effect on the average energy efficiency and throughput per link, and that this effect is slight when the probability is small.
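Figure 5: Average energy efficiency and throughput per link versus the probability that each UE moves its position, panels (a) and (b).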
In the fifth group of simulation experiments, we set the decreasing step size for power and beam width to 1%. With this setting, we compare the performance of 3BFTMAL in this paper with those of 3BFTTPP and 2BFTTHO under different intervals between BFT executions. From Figures 6(a) and 6(b), we can observe that the interval between BFT executions has a significant effect on the average energy efficiency and throughput per link: both performance metrics of the three schemes decrease significantly as the interval increases. The main reason is that a longer interval between two consecutive BFT processes leads to a larger deviation of beam performance from the optimal value due to terminal movement.
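Figure 6: Average energy efficiency and throughput per link versus the interval between BFT executions, panels (a) and (b).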
7. Conclusion
In this paper, we proposed a radio resource allocation scheme, under a network energy efficiency constraint and for scenarios in which terminals move frequently, that is aimed at maximizing the number of concurrent connections. First, we adopted the strategy in [13] to solve the basic resource allocation problem in dense mmWave cellular networks. Then, we proposed a relay selection method to select a set of appropriate D2D relays to take over the terminals whose performance deteriorates due to movement. Finally, based on the D2D relay selection results, we designed a resource allocation algorithm based on potential game theory to solve the resource allocation problem among D2D relay links. When the terminals move frequently, the scheme in this paper achieves better average energy efficiency and throughput per link than the scheme in [13] and the scheme in [7].
Data Availability
The simulation data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that there is no conflict of interest regarding the publication of this article.
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (No. 61873352).