Abstract

This paper considers joint precoding and transmit antenna selection in the downlink of multiuser multiple-input multiple-output systems with limited feedback, with the aim of reducing the hardware cost associated with antennas, such as radio-frequency chains. Finding the optimum joint solution requires an exhaustive search at the transmitter over all possible combinations and permutations, which results in extremely high computational complexity. To reduce the computational load while still maximizing channel capacity, the cross-entropy (CE) method is adopted to determine a suboptimum solution. Simulation results show that, compared with the conventional genetic algorithm and the random search method, the CE method provides better performance under the same computational complexity.

1. Introduction

Studies have shown that the capacity of multiple-input multiple-output (MIMO) systems equipped with multiple antennas at both the transmitter and receiver sides increases almost linearly with the minimum number of transmit and receive antennas [1, 2]. Although employing a large number of transmit and receive antennas can improve performance, two drawbacks are associated with such systems. First, the system requires as many radio-frequency (RF) chains as there are antennas in use, which typically increases the complexity and cost of the system hardware. Second, in a multiuser scenario, a base station (BS) communicates simultaneously with several cochannel users in the same frequency and time slots, and the resulting cochannel interference degrades system performance [3].

A promising way to reduce the hardware complexity of RF chains is antenna selection [4, 5], in which only the best subset of antennas is used and the remaining antennas are left idle, thus reducing the number of required RF chains. Although antenna selection schemes can lower the hardware cost while retaining many of the diversity benefits of a MIMO system, they pose a new challenge: optimal antenna selection requires an exhaustive search (ES) over all candidate combinations to achieve the best system performance. As a result, the computational load required for optimal selection grows exponentially with the total number of antennas available. To reduce computational complexity, several suboptimal algorithms for antenna selection have been proposed, a summary of which can be found in [4, 5].

To mitigate multiuser cochannel interference in a multiuser MIMO (MU-MIMO) scenario, transmit precoding is usually employed to achieve higher channel capacity [6]. However, transmit precoding generally requires accurate channel state information (CSI) at the transmitter, which necessitates a huge amount of feedback from the receivers. Unfortunately, the feedback rate is usually small in a practical MIMO system. Moreover, the precoding optimization problem has been shown to be nondeterministic polynomial-time hard (NP-hard) [7–9]. To address these issues in a practical way, limited feedback is commonly used to convey the CSI to the transmitter. Rather than sending a quantized version of the estimated CSI from the receiver back to the transmitter, the receiver selects a precoding vector, based on predefined criteria, from a predetermined finite set of precoding vectors referred to as the "codebook." As both the transmitter and the receiver know the codebook, only the index of the selected code word is delivered to the transmitter, which reduces the feedback rate.

To lower the hardware complexity and achieve superior performance in MU-MIMO systems simultaneously, the overall system is expected to benefit greatly from combining precoding with transmit antenna subset selection [10–13]. However, in the conventional approaches, for example [10–12], the transmit antenna subset and the precoding vectors are selected separately, leading to performance loss. To close this performance gap, Huang et al. [13] first formulated the joint selection of the optimal Grassmannian precoders and the transmit antenna subset in MU-MIMO systems with limited feedback as a combinatorial optimization problem. The genetic algorithm (GA) [14] is then applied to solve the problem. GA is a metaheuristic search method that is suitable for solving optimization problems. It encodes each candidate solution (called an individual) into a bit string (called a chromosome) and then associates it with an objective function. The general scheme for GA consists of five main procedures: initialization, evaluation, selection, crossover, and mutation. The first procedure, initialization, randomly generates a genetic pool of $N_{\mathrm{GA}}$ individuals to represent the $N_{\mathrm{GA}}$ initial candidate solutions. The next procedure, evaluation, measures the fitness of each individual in the population and assigns it a score. In the selection procedure, a proportion of the existing population with higher fitness is then selected to yield a new generation. In the crossover procedure, the selected individuals are crossed over with the crossover probability. Finally, the mutation procedure randomly alters the selected individuals with the mutation probability.
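The five GA procedures described above can be sketched on a toy problem as follows. This is only a minimal illustration and not the implementation of [13]: the toy objective `fitness` (the number of ones in the chromosome), the truncation-style selection, the one-point crossover, and all parameter values are assumptions chosen for brevity.

```python
import random

def fitness(chromosome):
    # Hypothetical toy objective: number of ones (a stand-in for capacity).
    return sum(chromosome)

def genetic_algorithm(n_bits=20, n_ga=30, n_iter=40, keep=0.5,
                      p_cross=0.9, p_mut=0.02, seed=0):
    rng = random.Random(seed)
    # Initialization: random genetic pool of n_ga individuals.
    pool = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_ga)]
    best = max(pool, key=fitness)
    for _ in range(n_iter):
        # Evaluation and selection: keep the fitter fraction as parents.
        pool.sort(key=fitness, reverse=True)
        parents = pool[:max(2, int(keep * n_ga))]
        children = []
        while len(children) < n_ga:
            a, b = rng.sample(parents, 2)
            child = a[:]
            # Crossover: one-point, applied with probability p_cross.
            if rng.random() < p_cross:
                cut = rng.randint(1, n_bits - 1)
                child = a[:cut] + b[cut:]
            # Mutation: flip each bit independently with probability p_mut.
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pool = children
        # Track the best individual seen so far.
        best = max(pool + [best], key=fitness)
    return best

if __name__ == "__main__":
    solution = genetic_algorithm()
    print(solution, fitness(solution))
```

Note that each new individual is built from one or two parent chromosomes only, which is the property the CE method replaces with population-wide statistics.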

Although the GA-assisted joint selection approach proposed in [13] outperforms the traditional separate scheme in terms of average capacity and bit error rate (BER) performance, its performance can still be improved. Rubinstein proposed a metaheuristic approach called the cross-entropy (CE) method for global optimization [15]. The main idea of the CE method is to maintain a distribution over possible solutions and to update this distribution iteratively. Based on the importance sampling strategy, CE updates the distribution by minimizing the cross-entropy, or Kullback-Leibler divergence, to find the density closest to the optimal importance sampling density. Both GA and CE are population-based stochastic search methods. Unlike GA, however, CE uses the statistics of the entire population rather than individual solutions to produce the next population, which improves its ability to locate global optima relative to GA. In addition, through the use of a smoothing parameter, CE avoids getting stuck at a local optimum. CE is also quite robust with respect to initial conditions and sampling errors, in contrast to other metaheuristics such as simulated annealing. Most importantly, CE is robust and efficient in practice and possesses asymptotic convergence properties. Based on the above points, the present paper proposes the application of the CE method to the problem described in [13]. Simulation results show that the proposed algorithm is superior to the GA method in terms of average capacity and BER performance under the same complexity.

2. System Model and Problem Definition

2.1. System Model

As shown in Figure 1, a downlink MU-MIMO system with limited feedback is considered, in which the BS communicates with $K$ active users over flat fading channels. Similar to [13], our MU-MIMO system is based on the following assumptions: (1) the BS is equipped with $N_T$ transmit antennas, whereas each user has $N_R$ receive antennas; (2) both the BS and the users know a codebook of $2^L$ precoding vectors, where $L$ is the number of feedback bits; (3) each user has perfect knowledge of its own CSI, which it uses to select the best precoding vector from the Grassmannian codebook [16] and to find its index.

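Based on the above-mentioned assumptions, the signals received by the $k$th user can be described as

$$\mathbf{y}_k = \mathbf{H}_k \mathbf{F} \mathbf{s} + \mathbf{n}_k, \tag{1}$$

where $\mathbf{y}_k \in \mathbb{C}^{N_R \times 1}$ is the received vector of the $k$th user, $\mathbf{H}_k \in \mathbb{C}^{N_R \times N_T}$ is the channel matrix from the BS to the $k$th user, and $\mathbf{s} = [\sqrt{p_1}\, s_1, \sqrt{p_2}\, s_2, \ldots, \sqrt{p_K}\, s_K]^T$ is the transmitted vector from the BS, where $(\cdot)^T$ denotes the transposition, and $s_k$ and $p_k$ are the binary phase-shift keying (BPSK) modulation symbol and the transmitted signal power for the $k$th user, respectively. Hence, the total transmit power can be expressed as $\sum_{k=1}^{K} p_k = p_0$. $\mathbf{F} = [\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_K] \in \mathbb{C}^{N_T \times K}$ is the precoding matrix, where $\mathbf{f}_k \in \mathbb{C}^{N_T \times 1}$ is the precoding vector for the $k$th user, which can be obtained based on the Grassmannian line-packing criterion [16], and $\mathbf{n}_k \in \mathbb{C}^{N_R \times 1}$ is a Gaussian noise vector with zero mean and covariance matrix $N_0 \mathbf{I}_{N_R}$ for the $k$th user, where $\mathbf{I}_{N_R}$ denotes an $N_R \times N_R$ identity matrix.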

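At the receiver side, linear minimum mean square error (MMSE) MIMO detection is employed for each user to obtain better performance. After performing MMSE MIMO detection, the detected information signal for the $k$th user is given by

$$\hat{s}_k = \mathbf{G}_k \mathbf{y}_k. \tag{2}$$

Here, $\mathbf{G}_k$ is the linear MMSE decoding matrix of the $k$th user, which can be expressed as

$$\mathbf{G}_k = \bar{\mathbf{h}}_k^H \left( \bar{\mathbf{H}}_k \bar{\mathbf{H}}_k^H + \frac{K N_0}{p_0} \mathbf{I}_{N_R} \right)^{-1}, \tag{3}$$

where $(\cdot)^{-1}$ and $(\cdot)^H$ denote the inverse of the matrix and the conjugate transpose operation, respectively; $\bar{\mathbf{H}}_k = \mathbf{H}_k \mathbf{F}$ is the equivalent channel matrix for the $k$th user after precoding, and $\bar{\mathbf{h}}_k \overset{\mathrm{def}}{=} [\bar{\mathbf{H}}_k]_k = \mathbf{H}_k \mathbf{f}_k$, where $[\cdot]_k$ denotes the $k$th column of the matrix.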

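Given the above system model, the system capacity achieved by the $K$ users can be expressed as

$$\mathrm{cap}\left(\mathbf{F}, \mathbf{H}_1, \ldots, \mathbf{H}_K\right) = \sum_{k=1}^{K} \log\left(1 + \mathrm{SINR}_k\left(\mathbf{F}, \mathbf{H}_k\right)\right), \tag{4}$$

where $\mathrm{SINR}_k$ is the output signal-to-interference-plus-noise ratio (SINR) obtained with the linear MMSE detection for the $k$th user, which is given by [17]

$$\mathrm{SINR}_k\left(\mathbf{F}, \mathbf{H}_k\right) = \frac{p_k \left|\mathbf{G}_k \mathbf{H}_k \mathbf{f}_k\right|^2}{\sum_{i=1,\, i \neq k}^{K} p_i \left|\mathbf{G}_k \mathbf{H}_k \mathbf{f}_i\right|^2 + \left\|\mathbf{G}_k\right\|^2 N_0}. \tag{5}$$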
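To make the chain from the MMSE decoder (3) to the SINR (5) and the sum capacity (4) concrete, the following pure-Python sketch evaluates them for small matrices stored as nested lists. It is an illustration under stated assumptions rather than the authors' simulation code: equal power allocation $p_k = p_0/K$ and a base-2 logarithm are assumed, and the helper names `solve` and `sum_capacity` are hypothetical.

```python
import math

def solve(a, b):
    """Solve the (complex) linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for c in range(n):
        # Partial pivoting for numerical stability.
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= factor * m[c][j]
    x = [0j] * n
    for r in reversed(range(n)):
        acc = sum(m[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (m[r][n] - acc) / m[r][r]
    return x

def sum_capacity(channels, precoder, p0, n0):
    """channels[k] is H_k (N_R x N_T); precoder is F (N_T x K).

    Assumes equal power per user (p_k = p0 / K) and log base 2.
    """
    k_users = len(channels)
    p = p0 / k_users
    total = 0.0
    for k, h in enumerate(channels):
        n_r, n_t = len(h), len(h[0])
        # Equivalent channel after precoding: H_k F (N_R x K).
        hb = [[sum(h[r][t] * precoder[t][i] for t in range(n_t))
               for i in range(k_users)] for r in range(n_r)]
        # A = (H_k F)(H_k F)^H + (K N0 / p0) I, which is Hermitian.
        a = [[sum(hb[r][i] * hb[c][i].conjugate() for i in range(k_users))
              + (k_users * n0 / p0 if r == c else 0.0)
              for c in range(n_r)] for r in range(n_r)]
        # G_k = h_k^H A^{-1}; since A is Hermitian, G_k^H = A^{-1} h_k.
        x = solve(a, [hb[r][k] for r in range(n_r)])
        g = [v.conjugate() for v in x]
        # |G_k H_k f_i|^2 for every user i.
        gains = [abs(sum(g[r] * hb[r][i] for r in range(n_r))) ** 2
                 for i in range(k_users)]
        interference = p * (sum(gains) - gains[k])
        noise = n0 * sum(abs(v) ** 2 for v in g)
        total += math.log2(1.0 + p * gains[k] / (interference + noise))
    return total
```

As a sanity check, with two users, identity channels, an identity precoder, $p_0 = 2$, and $N_0 = 1$, there is no interference and each user sees an SINR of $p_k/N_0 = 1$, so the sketch returns a sum capacity of 2 under the base-2 assumption.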

2.2. Problem Statement

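To reduce both the multiuser interference and the hardware cost in the MU-MIMO downlink with limited feedback, a joint optimal precoding and transmit antenna selection scheme is proposed in [13] to maximize the system capacity as follows:

$$\arg\max_{\boldsymbol{\omega} \in \Omega} \mathrm{cap}\left(\mathbf{F}_{\boldsymbol{\omega}_p}, \mathbf{H}_{1,\boldsymbol{\omega}_t}, \ldots, \mathbf{H}_{K,\boldsymbol{\omega}_t}\right), \tag{6}$$

where $\boldsymbol{\omega}_t$ is the selected subset of transmit antenna indices, represented as a binary vector whose elements are either 0 or 1 to indicate whether a given antenna is selected. $\mathbf{H}_{k,\boldsymbol{\omega}_t}$ is an $N_R \times n_T$ subblock matrix of $\mathbf{H}_k$ (i.e., the channel matrix associated with the selected transmit antennas), in which $n_T \le N_T$ is the desired number of transmit antennas to be selected; $\boldsymbol{\omega}_p$ is the binary vector used to represent the selected subset of precoding vector indices. Thus, the numbers of possible combinations for $\boldsymbol{\omega}_t$ and $\boldsymbol{\omega}_p$ are $\mathcal{A}_t = C^{N_T}_{n_T}$ and $\mathcal{A}_p = P^{2^L}_{K}$, respectively, where $C^{\alpha}_{\beta} = \alpha!/[\beta!(\alpha-\beta)!]$ denotes the binomial coefficient and $P^{\alpha}_{\beta} = \alpha!/(\alpha-\beta)!$ is the number of permutations of $\alpha$ distinct objects taken $\beta$ at a time. For ease of presentation, we use $\boldsymbol{\omega} = (\boldsymbol{\omega}_p, \boldsymbol{\omega}_t)$ to denote the joint precoding and antenna selection, so that the total number of possible combinations is $\mathcal{A} = \mathcal{A}_t \times \mathcal{A}_p$. In addition, the set containing all the combinations of $\boldsymbol{\omega}$ is defined as $\Omega = \{\boldsymbol{\omega}^{(1)}, \ldots, \boldsymbol{\omega}^{(\mathcal{A})}\}$, where $\boldsymbol{\omega}^{(q)}$ denotes a particular combination, that is, $\boldsymbol{\omega}^{(q)} \in \Omega$. Furthermore, we define $\mathrm{cap}(\boldsymbol{\omega}) \triangleq \mathrm{cap}(\mathbf{F}_{\boldsymbol{\omega}_p}, \mathbf{H}_{1,\boldsymbol{\omega}_t}, \ldots, \mathbf{H}_{K,\boldsymbol{\omega}_t})$ for ease of notation.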
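To give a sense of scale, the size of the search space can be computed directly from the binomial and permutation counts defined above. The short sketch below uses the parameter values reported in the simulation section ($N_T = 16$, $n_T = 4$, $L = 6$, $K = 2$); the function name `search_space_size` is a hypothetical helper.

```python
import math

def search_space_size(n_t_total, n_t_selected, feedback_bits, n_users):
    # Number of antenna subsets: binomial coefficient C(N_T, n_T).
    antenna_subsets = math.comb(n_t_total, n_t_selected)
    # Number of ordered precoder assignments: P(2^L, K).
    precoder_choices = math.perm(2 ** feedback_bits, n_users)
    return antenna_subsets, precoder_choices, antenna_subsets * precoder_choices

print(search_space_size(16, 4, 6, 2))  # (1820, 4032, 7338240)
```

The product, 7,338,240, matches the number of searches quoted for the ES method in the simulation results.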

The choice of $\boldsymbol{\omega}$ that maximizes (6) is a combinatorial problem, which can be solved by ES. However, ES requires a large amount of computation to search all possible candidates, which hinders its practical implementation. To alleviate the computational load, suboptimal solutions, such as the GA approach [13], may be used. However, the performance of GA still has room for improvement. Inspired by the success of the CE method in solving complex combinatorial optimization problems, we propose the application of the CE method to solve (6).

3. The Proposed CE-Based Approach

The CE method, proposed by Rubinstein and Kroese [15], was originally an importance sampling technique for estimating the probabilities of rare events. It was later modified to minimize the cross-entropy (also known as the Kullback-Leibler distance) with respect to the optimal importance sampling distribution, and it evolved into the CE optimization method [15]. The strength of the CE method lies in its ability to provide a systematic and efficient way to solve continuous and combinatorial optimization problems.

In principle, the CE method is an iterative algorithm in which each iteration contains two main phases: first, a set of samples is generated according to a set of dynamic parameters; second, the set of dynamic parameters, which regulates the generation of random samples, is updated based on the selected elite samples in order to produce better new samples. To apply the CE method to the joint precoding and transmit antenna selection problem, a family of probability densities $\{f(\cdot, \mathbf{v})\}$ has to be determined. This family is parameterized by $\mathbf{v}$, and its support contains $\Omega$. Similar to the typical CE applications for discrete optimization problems, we can define a family of Bernoulli probability mass functions associated with the joint precoding and antenna selection vector $\boldsymbol{\omega} = \{\omega_i\}_{i=1}^{N_T + K \times L}$, $\omega_i \in \{0, 1\}$, as given by

$$f(\boldsymbol{\omega}, \mathbf{v}) = \prod_{i=1}^{N_T + K \times L} v_i^{\mathbb{1}_i(\boldsymbol{\omega})} \left(1 - v_i\right)^{1 - \mathbb{1}_i(\boldsymbol{\omega})}, \tag{7}$$

where $\mathbf{v} = \{v_i\}_{i=1}^{N_T + K \times L}$ denotes the probability vector we need to update and the indicator function $\mathbb{1}_i(\boldsymbol{\omega}) \in \{0, 1\}$ indicates whether the $i$th element of $\boldsymbol{\omega}$ is selected. In short, $\omega_i$ is generated according to the $\mathrm{Ber}(v_i)$ distribution.
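The product-of-Bernoulli family in (7) is simple to sample from and to evaluate. The following minimal sketch shows both operations; the function names `sample_bits` and `pmf` are hypothetical, and the vectors used in the example call are arbitrary illustrative values.

```python
import random

def sample_bits(v, rng):
    # Draw each bit independently: omega_i ~ Ber(v_i).
    return [1 if rng.random() < vi else 0 for vi in v]

def pmf(omega, v):
    # Product of v_i for ones and (1 - v_i) for zeros, as in (7).
    prob = 1.0
    for bit, vi in zip(omega, v):
        prob *= vi if bit else 1.0 - vi
    return prob

rng = random.Random(0)
omega = sample_bits([0.5] * 8, rng)
print(omega, pmf(omega, [0.5] * 8))
```

With $v_i = 1/2$ for all $i$, every bit string of a given length is equally likely, which is why this choice is a natural uninformative starting point.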

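In each iteration $\tau$, $N_{\mathrm{CE}}$ samples $\{\boldsymbol{\omega}^{(j,\tau)}\}_{j=1}^{N_{\mathrm{CE}}}$ are drawn from the density $f(\cdot, \mathbf{v}^{(\tau-1)})$. From these samples, we calculate their fitness values $\{\mathrm{cap}(\boldsymbol{\omega}^{(j,\tau)})\}_{j=1}^{N_{\mathrm{CE}}}$ using (4) and sort them in descending order such that $\mathrm{cap}^{(1,\tau)} \ge \cdots \ge \mathrm{cap}^{(N_{\mathrm{CE}},\tau)}$, where $\mathrm{cap}^{(j,\tau)}$ represents the $j$th order statistic of the sequence $\mathrm{cap}(\boldsymbol{\omega}^{(1,\tau)}), \mathrm{cap}(\boldsymbol{\omega}^{(2,\tau)}), \ldots, \mathrm{cap}(\boldsymbol{\omega}^{(N_{\mathrm{CE}},\tau)})$. Then, the elite samples, which achieve the best performance in the current set, are collected according to the performance criterion; that is, the samples for which $\mathrm{cap}(\boldsymbol{\omega}^{(j,\tau)}) \ge \gamma^{(\tau)}$ are selected. Here, $\gamma^{(\tau)}$ is the threshold that determines the elite samples, which is set to

$$\gamma^{(\tau)} = \mathrm{cap}^{(\lceil \rho \times N_{\mathrm{CE}} \rceil,\, \tau)}, \tag{8}$$

where $\rho \in (0, 1)$ denotes the fraction of all samples that are retained, and $\lceil \cdot \rceil$ is the ceiling operation. Thus, these elite samples constitute an elite set $\Phi^{(\tau)}$, which is given by $\Phi^{(\tau)} = \{\boldsymbol{\omega}^{(j,\tau)} : \mathrm{cap}(\boldsymbol{\omega}^{(j,\tau)}) \ge \gamma^{(\tau)}\}$.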
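The threshold rule (8) and the resulting elite set can be sketched in a few lines. The helper name `elite_set` and the toy scores below are illustrative assumptions; in the actual algorithm the scores would be the capacity values of the sampled selections.

```python
import math

def elite_set(samples, scores, rho):
    # The ceil(rho * N)-th largest score is the threshold gamma, as in (8).
    n_elite = math.ceil(rho * len(samples))
    gamma = sorted(scores, reverse=True)[n_elite - 1]
    # Keep every sample whose score reaches the threshold.
    elite = [s for s, c in zip(samples, scores) if c >= gamma]
    return gamma, elite

gamma, elite = elite_set(["a", "b", "c", "d", "e"],
                         [3.0, 9.0, 1.0, 7.0, 5.0], rho=0.4)
print(gamma, elite)  # 7.0 ['b', 'd']
```

Because the comparison is non-strict, ties at the threshold can make the elite set slightly larger than $\lceil \rho \times N_{\mathrm{CE}} \rceil$.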

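The goal of the CE method is to construct a sequence of parameter vectors $\mathbf{v}^{(1)}, \mathbf{v}^{(2)}, \ldots, \mathbf{v}^{(\tau)}, \ldots$ such that a sample $\boldsymbol{\omega}^{(\tau)}$ approaching the global optimum $\boldsymbol{\omega}^{*}$ can be obtained from $f(\cdot, \mathbf{v}^{(\tau)})$ as $\tau$ increases. Specifically, the CE method generates a sequence of couples $(\gamma^{(\tau)}, \mathbf{v}^{(\tau)})$ that steer the search towards a neighborhood of the optimal couple $(\gamma^{*}, \mathbf{v}^{*})$. To achieve this goal, the CE method uses as its update criterion the minimization of the cross-entropy between the updated random mechanism and the probability distribution of the selected elite samples. According to [15], minimizing the cross-entropy to the optimal distribution $f(\cdot, \mathbf{v}^{*})$ leads to the following program, which can be solved analytically:

$$\mathbf{v}^{(\tau)} = \arg\max_{\mathbf{v}} \frac{1}{N_{\mathrm{CE}}} \sum_{j=1}^{N_{\mathrm{CE}}} \mathbb{I}_{\{j \in \Phi^{(\tau)}\}} \ln f\left(\boldsymbol{\omega}^{(j,\tau)}, \mathbf{v}\right), \tag{9}$$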

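where $\mathbb{I}_{\{j \in \Phi^{(\tau)}\}}$ is an indicator variable that marks whether the $j$th sample belongs to the elite set, defined by

$$\mathbb{I}_{\{j \in \Phi^{(\tau)}\}} = \begin{cases} 1, & \text{if } j \in \Phi^{(\tau)}, \\ 0, & \text{otherwise}. \end{cases} \tag{10}$$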

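The maximum of the CE program is determined by setting the first derivative of (9) with respect to $\mathbf{v}$ equal to zero, where

$$\frac{\partial}{\partial v_i} \ln f\left(\boldsymbol{\omega}^{(j,\tau)}, \mathbf{v}\right) = \frac{\mathbb{1}_i\left(\boldsymbol{\omega}^{(j,\tau)}\right) - v_i}{v_i \left(1 - v_i\right)}. \tag{11}$$

Thus,

$$\frac{\partial}{\partial v_i} \sum_{j=1}^{N_{\mathrm{CE}}} \mathbb{I}_{\{j \in \Phi^{(\tau)}\}} \ln f\left(\boldsymbol{\omega}^{(j,\tau)}, \mathbf{v}\right) = \frac{1}{v_i \left(1 - v_i\right)} \sum_{j=1}^{N_{\mathrm{CE}}} \mathbb{I}_{\{j \in \Phi^{(\tau)}\}} \left(\mathbb{1}_i\left(\boldsymbol{\omega}^{(j,\tau)}\right) - v_i\right) = 0. \tag{12}$$

Solving for $v_i$ in (12) yields the estimate of the optimal probability density function (pdf) parameter. We have

$$v_i = \frac{\sum_{j=1}^{N_{\mathrm{CE}}} \mathbb{I}_{\{j \in \Phi^{(\tau)}\}} \mathbb{1}_i\left(\boldsymbol{\omega}^{(j,\tau)}\right)}{\sum_{j=1}^{N_{\mathrm{CE}}} \mathbb{I}_{\{j \in \Phi^{(\tau)}\}}} \tag{13}$$

for $i = 1, \ldots, N_T + K \times L$.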
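In words, (13) says that each updated probability is the fraction of elite samples in which the corresponding bit equals 1. A minimal sketch, with the hypothetical helper name `update_probabilities`:

```python
def update_probabilities(elite):
    # v_i = (number of elite samples with bit i set) / (number of elite samples).
    n = len(elite)
    return [sum(bits) / n for bits in zip(*elite)]

print(update_probabilities([[1, 0, 1], [1, 1, 0]]))  # [1.0, 0.5, 0.5]
```

A bit that is set in every elite sample receives probability 1 and would then be fixed in all later samples, which is one motivation for smoothing the update.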

Remarks 1. (1) Samples generated from $\mathbf{v}^{(\tau)}$ are not guaranteed to be feasible, because each sample must satisfy the given constraints (i.e., the restricted number of selected antennas). Therefore, we may need to randomly add or remove 1s in each generated sample to meet the constraints.

(2) To prevent the CE method from converging prematurely, a smoothing procedure is typically adopted to update the probability vector $\mathbf{v}$ as follows:

$$\mathbf{v}^{(\tau)} = \theta \times \mathbf{v}^{(\tau)} + (1 - \theta) \times \mathbf{v}^{(\tau-1)}, \tag{14}$$

where $\theta \in (0, 1]$ is the smoothing parameter and $\mathbf{v}^{(\tau)}$ on the right-hand side is the estimate obtained from (13). The smoothed $\mathbf{v}^{(\tau)}$ is then used in the next iteration for generating samples.

(3) The parameters $\rho$ and $N_{\mathrm{CE}}$ determine how many elite samples ($\lceil \rho \times N_{\mathrm{CE}} \rceil$) are chosen to produce better performing samples in the next iteration. The choice of $\rho$ depends on the sample size $N_{\mathrm{CE}}$ and the objective function of the considered problem. If the sample size $N_{\mathrm{CE}}$ is large, then $\rho$ can be small, because the elite samples still provide enough data to produce better performing samples for the next iteration. On the other hand, if the sample size $N_{\mathrm{CE}}$ is not large enough, then $\rho$ cannot be small; otherwise, the proposed algorithm is easily trapped in local solutions. Generally, when we apply the CE method to the considered problem, the sample size $N_{\mathrm{CE}}$ is determined first, and the parameter $\rho$ is chosen afterwards. In practice, $\rho$ is typically chosen from the range $[0.01, 0.1]$.
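The repair step of remark (1) and the smoothing step of remark (2) can each be sketched in a few lines. The helper names `repair` and `smooth` are hypothetical, and `repair` covers only a cardinality constraint of the kind that applies to the antenna part of the selection vector (exactly $n_T$ ones); the precoder part would need its own feasibility rule, which is not shown here.

```python
import random

def repair(bits, n_ones, rng):
    # Randomly remove or add ones until exactly n_ones entries equal 1.
    bits = list(bits)
    ones = [i for i, b in enumerate(bits) if b == 1]
    zeros = [i for i, b in enumerate(bits) if b == 0]
    if len(ones) > n_ones:
        for i in rng.sample(ones, len(ones) - n_ones):
            bits[i] = 0
    elif len(ones) < n_ones:
        for i in rng.sample(zeros, n_ones - len(ones)):
            bits[i] = 1
    return bits

def smooth(v_new, v_prev, theta):
    # Smoothed update (14): convex combination of new and previous vectors.
    return [theta * a + (1.0 - theta) * b for a, b in zip(v_new, v_prev)]

rng = random.Random(0)
print(repair([1] * 16, 4, rng))
print(smooth([1.0, 0.0], [0.5, 0.5], theta=0.8))
```

With $\theta < 1$, a probability that (13) drives to exactly 0 or 1 is pulled back towards its previous value, so the corresponding bit can still change in later iterations.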

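The procedure for the proposed CE-assisted joint selection algorithm is summarized as follows.
(1) Set the iteration counter $\tau = 1$ and initialize the probability vector $\mathbf{v}^{(0)} = [v_1^{(0)}, \ldots, v_{N_T + K \times L}^{(0)}]$ with $v_i^{(0)} = 1/2$.
(2) Draw $N_{\mathrm{CE}}$ random samples $\{\boldsymbol{\omega}^{(j,\tau)}\}_{j=1}^{N_{\mathrm{CE}}}$ from the density function $f(\cdot\,; \mathbf{v}^{(\tau-1)})$.
(3) Randomly add or remove the necessary 1s in each sample $\boldsymbol{\omega}^{(j,\tau)}$ to ensure that $\boldsymbol{\omega}^{(j,\tau)} \in \Omega$.
(4) Calculate the objective values $\{\mathrm{cap}(\boldsymbol{\omega}^{(j,\tau)})\}_{j=1}^{N_{\mathrm{CE}}}$.
(5) Use (8) to obtain $\gamma^{(\tau)}$ and determine the elite set $\Phi^{(\tau)}$.
(6) Update $\mathbf{v}^{(\tau)}$ using (13).
(7) Obtain the smoothed $\mathbf{v}^{(\tau)}$ using (14).
(8) Set $\tau = \tau + 1$ and repeat Steps (2)-(7) until the predefined number of iterations is reached.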
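The eight steps above can be put together in a compact sketch. To stay self-contained it optimizes a toy objective instead of the capacity in (4): choose exactly `n_ones` of `n_bits` positions to maximize a weighted sum. The function name `cross_entropy_search`, the toy objective, and the default parameter values are assumptions for illustration, not the authors' implementation; only a cardinality constraint is repaired.

```python
import math
import random

def cross_entropy_search(score, n_bits, n_ones, n_ce=100, rho=0.1,
                         theta=0.8, n_iter=20, seed=0):
    rng = random.Random(seed)
    v = [0.5] * n_bits                      # Step 1: v_i^(0) = 1/2
    best, best_score = None, -math.inf
    for _ in range(n_iter):
        samples = []
        for _ in range(n_ce):
            # Step 2: draw each bit from Ber(v_i).
            bits = [1 if rng.random() < vi else 0 for vi in v]
            # Step 3: randomly remove or add ones to meet the constraint.
            ones = [i for i, b in enumerate(bits) if b]
            zeros = [i for i, b in enumerate(bits) if not b]
            for i in rng.sample(ones, max(0, len(ones) - n_ones)):
                bits[i] = 0
            for i in rng.sample(zeros, max(0, n_ones - len(ones))):
                bits[i] = 1
            samples.append(bits)
        # Step 4: evaluate the objective for every sample.
        scores = [score(s) for s in samples]
        # Step 5: threshold (8) and elite set.
        gamma = sorted(scores, reverse=True)[math.ceil(rho * n_ce) - 1]
        elite = [s for s, c in zip(samples, scores) if c >= gamma]
        # Step 6: update (13). Step 7: smoothing (14).
        v_hat = [sum(col) / len(elite) for col in zip(*elite)]
        v = [theta * a + (1.0 - theta) * b for a, b in zip(v_hat, v)]
        # Keep the best sample seen so far across iterations.
        top = max(range(n_ce), key=scores.__getitem__)
        if scores[top] > best_score:
            best, best_score = samples[top], scores[top]
    return best, best_score

if __name__ == "__main__":
    weights = list(range(16))  # toy weights; the optimum picks the 4 largest
    toy_score = lambda bits: sum(w * b for w, b in zip(weights, bits))
    print(cross_entropy_search(toy_score, n_bits=16, n_ones=4))
```

Replacing `toy_score` with a capacity evaluation and extending the repair step to the precoder indices would give the structure of the joint selection algorithm; the total number of objective evaluations is `n_ce * n_iter`.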

4. Simulation Results

We compare the proposed CE method with the GA-based [13] and random search methods in terms of average system capacity, BER, and computational load. To determine the performance gap between the true optimal performance and the compared methods, we also adopt the ES algorithm as the reference bound. The simulation scenario follows [13]. A downlink MU-MIMO system with limited feedback is considered, with the data model described in (1). Here, we assume that i.i.d. BPSK modulated symbols are simultaneously sent from the transmit antennas of the BS to every user with equal energy through i.i.d. Rayleigh fading channels. The number of users $K$, the number of transmit antennas at the BS $N_T$, the number of selected transmit antennas $n_T$, and the number of receive antennas for each user $N_R$ are 2, 16, 4, and 2, respectively. In addition, the six-bit ($L = 6$) Grassmannian codebook, which has 64 precoding vectors [16], is used for the precoding. For the proposed CE method, the smoothing parameter $\theta$ is 0.8 and $\rho$ is 0.1, and the algorithm is stopped when the iteration number exceeds the predetermined value. The choice of $\theta = 0.8$ is based on the recommendation of [15], which suggests the range $0.7 < \theta < 1$, and on our simulation experience with this problem: we found empirically that $\theta = 0.8$ gives good results.

In the first simulation, we fix the signal-to-noise ratio (SNR) at 6 dB and determine how the capacity performance of the various algorithms is affected by the number of searches $\mathcal{S}$. Here, the number of searches for GA and CE is the population size multiplied by the maximum number of iterations. The simulation results are shown in Figure 2, in which the population sizes for the CE and GA are $N_{\mathrm{CE}} = N_{\mathrm{GA}} = 840$. The average capacity is evaluated over 10,000 independent trials. As expected, the average capacity increases as the number of iterations increases. It can be seen that the convergence speed of CE is low at the initial stage (i.e., when the total number of searches $\mathcal{S}$ is small) because there are not yet enough elite samples to produce better samples in the next iteration. Consequently, CE performs slightly worse than GA when the total number of searches is small. After this initial stage, the proposed algorithm is superior to the GA and random search methods under the same number of searches. When the number of searches reaches approximately 16,800 (i.e., 20 iterations with a population size of 840), the average capacity no longer improves, and CE and GA converge to a reasonable solution. Therefore, the aforementioned three methods are realized with $\mathcal{S} = 16{,}800$ searches in the conducted simulations.

Figures 3 and 4 show the capacity and average BER performances, respectively, of the various algorithms at different SNRs. The following can be observed from the simulation results: (1) both the capacity and the average BER performance improve as the SNR increases for all algorithms; (2) the proposed CE method always performs better than the GA and random search methods regardless of the SNR region; and (3) both the capacity and the average BER performance achieved by the optimum ES algorithm are always better than those obtained by the proposed CE method, but at the cost of a much larger computational load, as the ES method requires 7,338,240 searches to obtain the optimum solution. A detailed complexity comparison between the CE method and the other algorithms is shown in Table 1.
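The gap in computational load can be quantified from the figures quoted in this section. As a worked check, with a population size of 840, $\mathcal{S} = 16{,}800$ searches, and 7,338,240 searches for ES:

```latex
\begin{align*}
\mathcal{S} &= 840 \times 20 = 16{,}800, \\
\frac{\mathcal{S}}{\mathcal{A}} &= \frac{16{,}800}{7{,}338{,}240} \approx 0.0023 .
\end{align*}
```

In other words, the CE, GA, and random search methods each evaluate roughly 0.23% of the candidates examined by ES.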

5. Conclusion

This paper presents a CE-based scheme that realizes joint precoding and transmit antenna selection in the downlink of MU-MIMO systems with limited feedback, in order to reduce the interference effectively and to lower the number of required RF chains. With the aid of the CE method, the large number of searches required can be substantially reduced. The simulations demonstrate that the proposed CE method provides better capacity and BER performance than the conventional GA method under the same number of searches, while requiring far fewer searches than ES.

Acknowledgments

The author wishes to thank the Guest Editor and all the anonymous reviewers for their valuable comments and suggestions for enhancing the readability and technical quality of the paper. The work of J.-C. Chen was supported in part by the National Science Council (NSC) of Taiwan under Grant NSC 100–2221-E-017-006-MY3. The work of C.-P. Li was supported by the Metal Industries Research and Development Centre under Contract no. 100-EC-17-A-01-01-1010.