Abstract

In this work, we consider a multiple-input multiple-output (MIMO) system with a large-scale antenna array, which creates unintended multiuser interference and increases power consumption due to the large number of radio frequency (RF) chains. An antenna-selective symbol-level precoding design is developed by minimizing the symbol error rate (SER) under a limit on the number of available RF chains. The ℓ0-norm-constrained nonconvex problem is approximated as an ℓ1-minimization problem, which is then solved by the alternating direction method of multipliers (ADMM). The basic ADMM scheme is mapped into an iterative construction process in which the optimum solution is obtained by taking a deep learning network as the building block. Moreover, because the standard ADMM algorithm is sensitive to the selection of hyperparameters, we further introduce a back-propagation process to train these parameters. Simulation results show that the proposed deep learning ADMM scheme achieves a significantly lower SER with a small activated subset of transmit antennas.

1. Introduction

With the severe spectrum shortage in conventional cellular networks, large-scale antenna systems in the millimeter-wave (mmWave) bands have been considered a potential solution to the constantly growing user demand for higher data rates. The mmWave systems are capable of accommodating a large number of half-wavelength-spaced antennas while maintaining a compact form factor. High-gain multiple-input multiple-output (MIMO) beamformers are advocated to compensate for the substantial path loss. There are many challenges in implementing conventional fully digital beamforming in large-scale antenna systems, such as complexity, energy consumption, and cost [1, 2]. Therefore, hybrid beamforming structures and symbol-level precoding (SLP) with a limited number of radio frequency (RF) chains have been developed as alternatives.

Hybrid beamforming, which comprises a linear network of variable phase shifters in the RF domain, is typically combined with baseband digital beamforming. For the multiuser downlink mmWave communication system, interuser interference is suppressed by zero-forcing- (ZF-) based hybrid beamforming algorithms [3, 4] and regularized ZF hybrid beamforming [5, 6]. Sum-rate maximization is addressed in [7], where the analog beamformer is optimized via an alternating optimization approach. Maximization of the minimum total transmit rate with optimal power allocation is further developed via low-complexity path-following algorithms [8]. However, the limitation on the number of RF chains makes the design of hybrid beamforming considerably challenging due to the unit-modulus constraints imposed on the analog beamformers. Unless the number of users is less than half the number of RF chains, a highly nonconvex design problem can result in computational intractability [9].

On the other hand, SLP has been developed by tracking the interference symbol by symbol and turning the induced interference into constructive interference [10]. With knowledge of the data and channel state information (CSI), early work on SLP focused on adaptations of linear precoding methods, such as the ZF-SLP beamformer [11, 12], signal-to-interference-and-noise ratio (SINR) SLP [13-15], and minimum mean square error (MMSE) SLP [16]. Recently, SLP schemes have been widely combined with optimization to achieve further performance improvements [17-19]. Additional studies on constructive precoding include per-antenna power constraints [20], noise-robust SLP [21], symbol error rate (SER) minimization [22], and nonlinear channels [23]. Note that the implementation of existing techniques is usually based on standard optimization tools with high computational complexity. Considering the limited computational capabilities of communication systems, their implementation is often infeasible, especially in large-scale antenna array systems. In order to reduce the power consumption of digital RF chains, the SLP technique assisted with antenna selection (AS-SLP) has been designed via the orthogonal matching pursuit (OMP) [24] and coordinate descent (CD) algorithms [25].

In this work, we consider a mmWave communication system with a large-scale antenna array in which only a subset of the transmit antennas is activated. In order to reduce the power consumption of the RF chains, the proposed AS-SLP design minimizes the achievable SER by using a subset of activated transmit antennas. The underlying combinatorial optimization problem is further transformed into a regularized ℓ1-norm form, and the optimum solution is obtained via the alternating direction method of multipliers (ADMM) algorithm. However, improper regularization parameters of the ℓ1-norm optimization may degrade the SER performance, and specifying the hyperparameters is a challenge. To overcome this difficulty, a deep architecture dubbed ADMM-Net is introduced to link the iterative algorithm to a deep learning architecture. The hyperparameters become learnable and are optimized via back propagation by minimizing the corresponding loss function with a gradient algorithm. The main contributions of this work can be summarized as follows.

(i) The SLP design jointly selects the optimum subset of RF chains and minimizes the SER between the desired and received symbols over a large-scale MIMO system. Instead of using exhaustive search, the optimization problem is approximated as a regularized ℓ1-norm problem and solved by the ADMM algorithm.

(ii) Different from traditional iteration-based or optimization-based algorithms, the ADMM process is mapped into a deep ADMM-Net with a data flow graph, in which each iteration corresponds to one stage of the data flow and three types of operations are mapped into three types of nodes. Under this structure, the proposed ADMM-Net is unfolded without any extreme learning machine, and the optimum solution can be iteratively recovered by constructive blocks.

(iii) A back-propagation process with a standard gradient-based optimizer is introduced to overcome the difficulty of determining the hyperparameters. The hyperparameters are learned through back propagation by minimizing the loss function via its gradient. Assisted with back propagation, the proposed ADMM-Net further improves the SER performance.

(iv) The mean square error (MSE) and SER performance of the proposed ADMM-Net algorithm are compared with state-of-the-art AS-SLP algorithms. Simulation results demonstrate that the proposed ADMM-Net scheme reduces the achievable MSE and SER considerably and consumes low transmit power with an optimum subset of transmit antennas.

The remainder of the paper is organized as follows. Section 2 presents the AS-SLP model and formulates the optimization problem. Section 3 presents the details of the proposed ADMM-Net. The training procedure based on back propagation is established in Section 4. Simulation results are discussed in Section 5. Finally, Section 6 concludes the paper.

Notations: Matrices and vectors are typeset in slanted bold uppercase and lowercase letters, respectively. I denotes an identity matrix, C^(M×N) denotes the space of M×N complex matrices, and CN(μ, Σ) denotes a complex Gaussian distribution with mean μ and covariance Σ. For a vector x, ‖x‖2 denotes the Euclidean norm.

2. System Model and Problem Formulation

In this work, we consider a downlink multiuser MIMO system in which the base station (BS), equipped with N transmit antennas, communicates with K single-antenna user terminals. Considering the high cost and power consumption of RF components, each transmission activates only a subset of L transmit antennas (L < N). The active transmit antennas are selected by switches that connect to the available RF chains. A geometric mmWave channel is considered [26], that is, H = √(NK/P) Σ_{p=1}^{P} α_p a_r(φ_p) a_t(θ_p)^H, where P is the total number of transmission paths and α_p is the complex gain of the pth path, i.i.d. CN(0, 1). The variables φ_p and θ_p are the azimuth angles of arrival and departure (AoAs/AoDs) of the pth path at the receiver and transmitter, respectively, and a_r(·) and a_t(·) are the antenna array response vectors at the receiver and transmitter, respectively. The received signal can be equivalently written as y = Hx + n, where n is additive white Gaussian noise (AWGN) following a circularly symmetric complex Gaussian distribution CN(0, σ²I). In this work, we adopt the SLP scheme in which the precoded transmitted signal x is designed on a symbol-by-symbol basis as a function of the instantaneous CSI, that is, x = f(s, H), where s is the vector of intended symbols drawn from a prespecified constellation and f(·) denotes the conventional SLP mapping.
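As a concrete illustration of the geometric channel model above, the sketch below generates one realization assuming half-wavelength uniform linear arrays and azimuth angles drawn uniformly from [0, 2π); the function names and the normalization factor are illustrative choices, not the paper's exact implementation.

```python
import numpy as np

def ula_response(n, angle):
    # Steering vector of a half-wavelength-spaced uniform linear array (ULA).
    return np.exp(1j * np.pi * np.arange(n) * np.sin(angle)) / np.sqrt(n)

def mmwave_channel(n_rx, n_tx, n_paths, rng):
    # Sum of n_paths rank-one components: each path has an i.i.d. CN(0, 1)
    # gain and azimuth AoA/AoD drawn uniformly at random.
    H = np.zeros((n_rx, n_tx), dtype=complex)
    for _ in range(n_paths):
        gain = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
        aoa = rng.uniform(0, 2 * np.pi)
        aod = rng.uniform(0, 2 * np.pi)
        H += gain * np.outer(ula_response(n_rx, aoa),
                             ula_response(n_tx, aod).conj())
    # Normalize so that E[||H||_F^2] = n_rx * n_tx.
    return np.sqrt(n_rx * n_tx / n_paths) * H

rng = np.random.default_rng(0)
H = mmwave_channel(n_rx=4, n_tx=16, n_paths=3, rng=rng)
```

With single-antenna users, each row of H plays the role of one user's channel vector.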

Suppose the intended symbol is the received constellation point; the upper bound of the SER can then be expressed via the Q-function [27] in terms of the number of minimum-distance neighbors of the intended symbol, the minimum distance between two closest neighboring constellation points, and the distance between the intended symbol and the noiseless received symbol Hx. Because the Q-function is decreasing, the SER is determined by this distance. Considering the hardware complexity of large antenna array systems, we strive to activate a subset of transmit antennas and minimize the achievable SER by minimizing the average Euclidean distance between the signal received by each user and the desired information symbols, subject to a transmit signal-to-noise ratio (SNR) constraint and an ℓ0-norm constraint on the precoding vector x. Note that the ℓ0-norm constraint determines the number of activated transmit antennas; therefore, the optimum solution implicitly performs antenna selection. However, this optimization problem is NP-hard due to the nonconvexity of the ℓ0-norm constraint. A straightforward approach would compute the precoding vector for every possible antenna combination by minimizing the objective function (5a). Once the optimum subset of activated transmit antennas is determined, the optimum precoded signal can be designed via conventional SLP schemes, such as the ZF algorithm, applied to the effective channel matrix, which is acquired from H by replacing the columns corresponding to zero elements of x with null vectors. However, with N available antennas, there are C(N, L) possible subsets of activated antennas, so the exhaustive search becomes computationally inefficient for large-scale antenna array systems. This motivates us to develop an efficient approach that determines the optimum activated antennas with low complexity.
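To make the cost of the exhaustive approach concrete, the following sketch enumerates all C(N, L) antenna subsets and applies a ZF symbol-level precoder to each effective channel. Since ZF reproduces the desired symbols exactly whenever the subset size exceeds the number of users, transmit power is used here as the illustrative selection criterion; all names and sizes are hypothetical.

```python
import numpy as np
from itertools import combinations

def zf_precoder(H, s, active):
    # ZF symbol-level precoder restricted to the selected antenna subset:
    # zero-force the intended symbols through the effective channel.
    x = np.zeros(H.shape[1], dtype=complex)
    x[list(active)] = np.linalg.pinv(H[:, list(active)]) @ s
    return x

def exhaustive_selection(H, s, n_active):
    # Brute-force search over all C(N, L) subsets: optimal, but
    # computationally infeasible for large arrays.
    best, best_cost = None, np.inf
    for subset in combinations(range(H.shape[1]), n_active):
        x = zf_precoder(H, s, subset)
        cost = np.linalg.norm(x) ** 2   # transmit power as the criterion
        if cost < best_cost:
            best, best_cost = x, cost
    return best

rng = np.random.default_rng(1)
K, N, L = 3, 8, 4
H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
s = np.array([1 + 1j, -1 + 1j, 1 - 1j]) / np.sqrt(2)
x = exhaustive_selection(H, s, L)
```

Already at N = 8, L = 4 this loop visits 70 subsets; at N = 64 it would visit over 600,000.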

3. ADMM-Net for SLP Model

3.1. ADMM Solver

Due to the ℓ0-norm constraint, the optimization problem is NP-hard; it can be reformulated, and the optimum solution can be obtained by recently developed algorithms such as OMP and CD. In this work, ℓ1-norm regularization is introduced as an effective approach to jointly optimize multiple variables, where the ℓ1-norm serves as a convex surrogate of the ℓ0-norm and the regularization parameter controls the trade-off between noise sensitivity and signal sparsity. The augmented Lagrangian method in (9) brings robustness compared with the dual ascent method and achieves convergence even when the underlying problem is not strictly convex. However, the augmented Lagrangian cannot be minimized separately over the variables. Therefore, the ADMM algorithm, which blends the decomposability of dual ascent with the superior convergence properties of the method of multipliers, is introduced to solve this problem. By introducing an intermediate variable z as a bridge to establish consensus with the precoding vector x, the problem (9) can be reformulated as a constrained problem whose augmented Lagrangian over x and z involves a dual variable and a positive penalty parameter. Using the scaled form of the dual variable, the ADMM algorithm alternately optimizes the variables by solving a sequence of subproblems at each iteration, with an update parameter for the Lagrange multiplier. The variables x and z are iteratively updated in alternating directions to complete the joint minimization, and the dual variable is updated accordingly.
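A minimal sketch of the ADMM iterations, assuming the ℓ1-regularized least-squares form min_x (1/2)‖Hx − s‖² + λ‖z‖₁ subject to x = z; variable names, the fixed penalty, and the iteration count are illustrative.

```python
import numpy as np

def soft_threshold(v, tau):
    # Complex soft-thresholding: zero entries with |v_i| <= tau,
    # shrink larger entries toward zero by tau while keeping their phase.
    mag = np.abs(v)
    return np.where(mag > tau, (1 - tau / np.maximum(mag, 1e-12)) * v, 0)

def admm_l1(H, s, lam=0.5, rho=1.0, n_iter=100):
    # ADMM for min_x (1/2)||H x - s||_2^2 + lam * ||z||_1  s.t.  x = z.
    N = H.shape[1]
    z = np.zeros(N, dtype=complex)
    u = np.zeros(N, dtype=complex)        # scaled dual variable
    A = np.linalg.inv(H.conj().T @ H + rho * np.eye(N))
    b = H.conj().T @ s
    for _ in range(n_iter):
        x = A @ (b + rho * (z - u))             # x-update (least squares)
        z = soft_threshold(x + u, lam / rho)    # z-update (shrinkage)
        u = u + x - z                           # dual update
    return z

rng = np.random.default_rng(2)
K, N = 4, 16
H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
s = np.array([1, 1j, -1, -1j], dtype=complex)
x_sparse = admm_l1(H, s)
```

The z-update's exact zeros are what make the returned vector act as an implicit antenna selector.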

3.2. Deep ADMM-Net Architecture

Based on the iterations of the ADMM solver, we map the standard ADMM iterative process (13) into a deep ADMM-Net with a data flow graph, which comprises nodes and connecting lines as shown in Figure 1. More specifically, each iteration of the ADMM algorithm corresponds to the data flow in one stage, with stages connected by solid lines. Three types of operations are mapped into three types of nodes: the reconstruction layer, the nonlinear transform layer, and the multiplier update layer. Each layer is discussed as follows.
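One unrolled stage of this data flow can be sketched as three node functions chained in order, with a stage-specific triple (rho, lam, eta) playing the role of the learnable hyperparameters; this is a schematic of the mapping under the ℓ1 least-squares formulation above, not the paper's exact layer definitions.

```python
import numpy as np

def soft_threshold(v, tau):
    mag = np.abs(v)
    return np.where(mag > tau, (1 - tau / np.maximum(mag, 1e-12)) * v, 0)

def stage(H, s, z, u, rho, lam, eta):
    # One unrolled stage: reconstruction node, nonlinear transform node,
    # and multiplier update node; rho, lam, eta are the stage-specific
    # learnable hyperparameters.
    N = H.shape[1]
    x = np.linalg.solve(H.conj().T @ H + rho * np.eye(N),
                        H.conj().T @ s + rho * (z - u))   # reconstruction
    z = soft_threshold(x + u, lam / rho)                  # nonlinear transform
    u = u + eta * (x - z)                                 # multiplier update
    return x, z, u

rng = np.random.default_rng(3)
H = (rng.standard_normal((4, 16)) + 1j * rng.standard_normal((4, 16))) / np.sqrt(2)
s = np.array([1, -1, 1j, -1j], dtype=complex)
z = np.zeros(16, dtype=complex)
u = np.zeros(16, dtype=complex)
for rho, lam, eta in [(1.0, 0.3, 1.0)] * 5:   # five stages, shared init
    x, z, u = stage(H, s, z, u, rho, lam, eta)
```

Unlike the plain solver, each stage here can carry its own parameter triple, which is what training later exploits.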

3.2.1. Reconstruction Layer

This layer reconstructs the symbol-level precoder in (12). Given the outputs of the previous layers in a stage, the output of this layer is defined by the least-squares update, in which the penalty parameter is learnable in each stage. The output of this layer is the input for the subsequent multiplier update layer and nonlinear transform layer in the same stage. Note that, for the first stage, the reconstruction layer is defined directly from the input channel matrix H and the intended symbol vector s.

3.2.2. Nonlinear Transform Layer

This layer performs a nonlinear transform via the shrinkage (soft-threshold) function, which controls the sparsity of the output by adjusting the threshold. The inputs of this layer in each stage come from the two preceding layers. Based on (12), the nonlinear transform layer is defined element-wise: each element of the input is compared with the threshold in sequence, and hence, the number of nonzero elements of the output depends on the threshold value.
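A small numeric illustration of this element-wise comparison, assuming the standard complex soft-threshold form:

```python
import numpy as np

def shrink(v, tau):
    # Entries with |v_i| <= tau are set to zero; larger entries are
    # shrunk toward zero by tau while keeping their phase.
    mag = np.abs(v)
    return np.where(mag > tau, (1 - tau / np.maximum(mag, 1e-12)) * v, 0)

v = np.array([0.2, -0.8, 1.5 + 0j, 0.1j])
out = shrink(v, 0.5)   # first and last entries fall below the threshold
```

Here the first and last entries are zeroed while the others shrink by 0.5 in magnitude, so a larger threshold directly means fewer activated antennas.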

3.2.3. Multiplier Update Layer

This layer performs the update of the dual variable associated with the consensus between x and z. Given the outputs of the reconstruction and nonlinear transform layers together with the previous dual variable, the output of this layer in each stage is defined by the scaled dual update, in which the update step size is a learnable parameter in each stage.

3.2.4. Network Parameters

The deep ADMM-Net architecture is designed to update the hyperparameters: the penalty parameter in the reconstruction layer, the regularization and threshold parameters in the nonlinear transform layer, and the update step size in the multiplier update layer. All of these parameters are taken as the weights of the neural network to be learned.

4. Network Training

In the conventional ADMM solver, the parameter set, with one group of parameters per stage, is initialized randomly or determined empirically. Different initializations affect the accuracy of the recovered transmitted signal vector and the achievable objective function in (9). Network training is therefore introduced, in which the hyperparameters become learnable variables through back propagation with a gradient-based algorithm. In the training phase, we first sample a set of channel matrices, each generated according to (1). The noise variance is randomly sampled according to the SNR requirement. The generated training data set consists of pairs of transmitted signal vectors and observed symbol vectors. After determining the loss function, the training data are employed to optimize the parameter set, which is randomly initialized at the beginning of the algorithm. In the testing phase, given a new observed signal and its corresponding channel matrix, we feed the trained ADMM-Net with the learned hyperparameters to obtain the optimum precoder. More details are presented as follows.

4.1. Loss Function

In this work, the AS-SLP design is obtained based on knowledge of the intended symbols and the channel matrix. Hence, the normalized mean square error (NMSE) between the intended symbols and the estimated symbols is chosen as the loss function. Given a series of training data, the loss between the network output and the intended transmitted signal vector is defined accordingly, where the estimated transmitted signal vector is the network output obtained from the parameter set and the training data. The parameters of the deep ADMM-Net can be learned by minimizing this loss using the gradient-based L-BFGS algorithm [28].
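A toy end-to-end sketch of the training loop, assuming the unrolled forward pass of Section 3 and an NMSE-style loss; finite-difference gradient descent stands in here for both L-BFGS and the analytic back-propagation rules of Section 4.2, and all sizes, stage counts, and initial values are illustrative.

```python
import numpy as np

def soft_threshold(v, tau):
    mag = np.abs(v)
    return np.where(mag > tau, (1 - tau / np.maximum(mag, 1e-12)) * v, 0)

def admm_net(H, s, params):
    # Unrolled forward pass; params holds one (rho, lam, eta) per stage.
    N = H.shape[1]
    z = np.zeros(N, dtype=complex)
    u = np.zeros(N, dtype=complex)
    for rho, lam, eta in params:
        x = np.linalg.solve(H.conj().T @ H + rho * np.eye(N),
                            H.conj().T @ s + rho * (z - u))
        z = soft_threshold(x + u, lam / rho)
        u = u + eta * (x - z)
    return z

def nmse_loss(params, data):
    # Normalized MSE between desired and reconstructed received symbols.
    return np.mean([np.linalg.norm(H @ admm_net(H, s, params) - s) ** 2
                    / np.linalg.norm(s) ** 2 for H, s in data])

rng = np.random.default_rng(4)
data = []
for _ in range(8):
    H = (rng.standard_normal((3, 12)) + 1j * rng.standard_normal((3, 12))) / np.sqrt(2)
    s = rng.choice(np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]), size=3) / np.sqrt(2)
    data.append((H, s))

params = np.full((4, 3), [1.0, 0.3, 1.0])   # 4 stages x (rho, lam, eta)
lr, eps = 0.05, 1e-4
for _ in range(15):
    base = nmse_loss(params, data)
    grad = np.zeros_like(params)
    for idx in np.ndindex(*params.shape):
        p = params.copy()
        p[idx] += eps
        grad[idx] = (nmse_loss(p, data) - base) / eps
    params = np.maximum(params - lr * grad, 1e-2)   # keep parameters positive
```

The analytic gradients of Section 4.2 replace the finite-difference loop in the actual scheme; the structure of the outer descent loop is the same.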

4.2. Back Propagation

To compute the gradients of the loss, the parameter set is updated via back propagation. In the forward pass, the data of each stage flow through the reconstruction layer, the nonlinear transform layer, and the multiplier update layer in order; we compute the gradients in the reverse order in the backward pass. For each stage, the gradient computation for each layer is briefly introduced as follows.

4.2.1. Multiplier Update Layer

Figure 2 shows that this layer has three inputs: the reconstruction output, the nonlinear transform output, and the previous dual variable. Its output feeds the computations in the next layers. The gradient of the loss with respect to the learnable step-size parameter of this layer is the summation of the gradients along the three dashed red arrows, and the gradients of the loss with respect to the three inputs follow from the chain rule.

4.2.2. Nonlinear Transform Layer

Figure 2 shows that the nonlinear transform layer has two inputs and two parameters. The gradients of the loss with respect to the two parameters are computed via the chain rule, where the partial derivatives of the soft-threshold function are evaluated element by element: each element of the input either lies within the threshold region, where the derivative vanishes, or outside it, where the shrinkage is affine. The gradients of the loss with respect to the two inputs of this layer follow similarly.
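The piecewise (sub)gradients of the shrinkage function can be checked numerically; the sketch below uses the real-valued soft threshold for simplicity and compares the analytic expressions against central finite differences.

```python
import numpy as np

def shrink(v, tau):
    # Real-valued soft threshold: sign(v) * max(|v| - tau, 0).
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def shrink_grads(v, tau):
    # Analytic (sub)gradients used in the backward pass (real case):
    # d shrink / d v   = 1        on the active set |v| > tau, else 0;
    # d shrink / d tau = -sign(v) on the active set,           else 0.
    active = (np.abs(v) > tau).astype(float)
    return active, -np.sign(v) * active

v, tau, eps = np.array([0.2, -0.9, 1.4]), 0.5, 1e-6
dv, dtau = shrink_grads(v, tau)
num_dv = (shrink(v + eps, tau) - shrink(v - eps, tau)) / (2 * eps)
num_dtau = (shrink(v, tau + eps) - shrink(v, tau - eps)) / (2 * eps)
```

Away from the kink at |v| = tau the function is piecewise affine, so the numeric and analytic gradients agree exactly up to rounding.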

4.2.3. Reconstruction Layer

In Figure 2, the reconstruction layer has two inputs and one learnable parameter. The gradient of the loss with respect to this parameter is computed via the chain rule through the least-squares update, and the gradients of the loss with respect to the two inputs of this layer are written similarly.

5. Simulation

In this work, we consider a massive MIMO system over a mmWave channel that includes a single BS and multiple user terminals. For each channel matrix realization, the AoAs and AoDs are both selected uniformly at random from the given interval. Quadrature amplitude modulation (QAM) is considered. The proposed AS-SLP design via deep ADMM-Net with back propagation (abbreviated as ADMM-Net) is compared with the AS-SLP design via the basic ADMM solver (abbreviated as ADMM) (12), the AS-SLP design via OMP (abbreviated as OMP), the AS-SLP design via CD (abbreviated as CD), and ZF SLP with random antenna selection (abbreviated as ZF-random) [29]. Note that all the SLP schemes are designed by the ZF method once the sparse pattern of the precoder is determined. In the ADMM algorithm, unless otherwise indicated, the hyperparameters are initialized to fixed values.

Figure 3 illustrates the convergence performance of the ADMM, ADMM-Net, and CD algorithms for fixed numbers of transmit and activated antennas at a given SNR. The groups of solid and dashed curves were initialized with two different parameter settings. The achievable objective function in (5a) is defined as the average Euclidean distance between the received symbols and the desired symbols. Given the same input parameter set, the objective function of the proposed deep ADMM-Net algorithm converges to the lowest value; because of the back-propagation process, the learnable hyperparameters improve the objective function. On the other hand, the convergence speed of the CD algorithm is faster than that of the ADMM framework. However, the CD algorithm is sensitive to the parameter initialization, whereas the proposed ADMM-Net is more stable.

Figures 4 and 5 demonstrate the MSE performance for a fixed number of transmit antennas and different numbers of activated antennas, where the MSE is computed between the intended transmitted signal and its estimate. As expected, the MSE decreases with increasing SNR (as shown in Figure 4). Because of the ℓ0-norm constraint, the OMP algorithm achieves a higher MSE than the ADMM algorithm in the low-SNR regime. By learning the hyperparameters, the ADMM-Net algorithm achieves the lowest MSE among the ADMM, CD, and OMP algorithms over the whole SNR range. Moreover, Figure 5 shows that when the number of activated antennas increases, the MSE obtained by all algorithms becomes larger, which indicates that sparser signals favor the AS-SLP design.

Figure 6 presents the average SER of the ADMM-Net, ADMM, CD, OMP, and ZF-random algorithms under QAM modulation. The proposed ADMM-Net algorithm achieves the lowest SER among all algorithms over the whole SNR range; the improvement of ADMM-Net comes from the learnable hyperparameters trained via back propagation. Moreover, under the same number of available transmit antennas, the more transmit antennas are activated, the better the performance achieved in the high-SNR regime. Furthermore, under the same number of activated antennas, the ADMM framework with more available transmit antennas obtains a lower SER, because of the higher degree of freedom for antenna selection.

Figure 7 demonstrates the average SER performance versus the number of activated transmit antennas for a fixed array size, modulation order, and SNR. It explores the effect of the number of activated transmit antennas on the average SER. Generally, the achievable SER is reduced as the number of activated antennas increases for most of the aforementioned algorithms, except for the ZF-random approach; this is because the probability of the ZF-random approach selecting an improper activated subset increases with the number of activated antennas. Moreover, because of the back-propagation process, the ADMM-Net scheme provides the lowest SER among all algorithms under the different settings.

6. Conclusion

In this work, we developed the AS-SLP design to reduce the power consumption of the RF chains by jointly minimizing the achievable SER and the number of activated transmit antennas. In the SLP scheme, the optimization problem is formulated as the minimization of the average Euclidean distance between the received symbols and the desired symbols under a constraint on the number of activated transmit antennas. Due to the nonconvex ℓ0-norm constraint, the underlying optimization problem is transformed into a regularized ℓ1-norm problem and solved effectively via the ADMM algorithm. By taking the deep learning network as the building block, the conventional ADMM process is mapped into an iterative constructive process composed of the reconstruction layer, the nonlinear transform layer, and the multiplier update layer. Furthermore, considering the effect of parameter initialization on the SER performance, the hyperparameters are learned by the ADMM-Net via back propagation. Simulation results demonstrate that the proposed deep ADMM-Net scheme achieves considerably lower MSE and reduces the SER significantly with low RF-chain power consumption.

Data Availability

All experiments in this work were conducted in MATLAB (R2017b), and all parameter settings and results are presented in the simulation section of the paper, so no additional data are provided.

Conflicts of Interest

The authors declare that they have no conflicts of interest.