Abstract

In recent years, extensive research has sought better detection performance by combining massive multiple-input multiple-output (MIMO) signal detection with deep neural networks (DNNs). However, spatial correlation and channel estimation errors significantly degrade the performance of DNN-based detection methods. In this study, we apply the conditional generative adversarial network (CGAN) model to massive MIMO signal detection. First, we propose a preset conditional generative adversarial network (PC-GAN). We construct the dataset with the channel state information (CSI) preset as a condition in the received signal and train the detector without direct involvement of CSI, which effectively resists the impact of imperfect CSI on the detection performance. Then, we propose a noise removal and preset conditional generative adversarial network (NR-PC-GAN) suited to low signal-to-noise ratio (SNR) communication scenarios, in which the noise in the received signal is removed before detection to improve detection performance. Numerical results show that PC-GAN performs well in spatially correlated channels and under imperfect CSI, and that NR-PC-GAN clearly outperforms the other algorithms considered in low-SNR scenarios.

1. Introduction

Massive multiple-input multiple-output (MIMO) technology in wireless communications can significantly improve spectral efficiency and link reliability. Specifically, the total throughput is improved by using a large number of transmit and receive antennas simultaneously for multistream communication. However, signal detection in massive MIMO systems is a challenging problem [1]. Among existing detectors, maximum likelihood (ML) detection achieves the best performance, but it must consider all combinations of transmitted symbols, which makes its complexity infeasible in practice. The common linear detectors based on the zero-forcing (ZF) [2] and minimum mean square error (MMSE) [2] criteria require matrix inversion during detection, which becomes very costly in massive MIMO systems with a large number of antennas. There are also suboptimal algorithms whose detection accuracy decreases significantly with the number of antennas, such as sphere decoding (SD) [3] and semidefinite relaxation (SDR) [4]. Approximate message passing (AMP) [5] and orthogonal AMP (OAMP) [6] exhibit a sharp increase in complexity in massive MIMO systems.

Deep learning (DL) has been successfully applied in areas such as computer vision, automatic speech recognition, and natural language processing [7], owing to a powerful learning capability that lets it approach the objective function step by step with nonlinear operations and neural networks. Recently, detection methods based on the deep neural network (DNN) framework have also been proposed for massive MIMO detection to pursue performance enhancement [8]. DetNet, which unfolds the projected gradient descent method and was proposed in [9, 10], achieves promising results under the i.i.d. (independent and identically distributed) Rayleigh fading channel. However, due to its complexity, it requires a long training time to achieve detection performance comparable to SDR. A sparsely connected neural network (ScNet), a simplification of DetNet, was proposed in [11]; it has better detection capability than DetNet and dramatically reduces the network complexity. However, the detection performance of ScNet degrades significantly in high-order modulation scenarios, and the multisegment mapping network (MsNet) proposed in [12] uses the sigS step function to solve this problem. Both LcgNet [13] and the DL-based detector [14] combine conjugate gradient descent [15] with a DNN, and they perform similarly with large-scale antennas. Because these models have a low degree of nonlinearity, their performance decreases in spatially correlated channels. OAMP-Net [16], which fuses OAMP with a DNN, has performance advantages among existing algorithms and achieves considerable performance under both i.i.d. Rayleigh fading and spatially correlated channels. However, its detection process is extremely complex, making it suitable only for small systems and not for massive MIMO with a large number of antennas.
The detection algorithm MMNet proposed in [17] is suitable for online training and, owing to its large number of trainable variables, has some adaptability to spatially correlated channels. The authors in [18] combined deep learning and SD, proposing sparsely connected sphere decoding (SC-SD) with lower complexity. The authors in [19] combined deep reinforcement learning (DRL) with Monte Carlo tree search (MCTS) to obtain the DeepMcTs detector with better detection performance. Detection methods that combine the DNN framework with traditional algorithms can rely on the DNN's powerful data-driven learning and nonlinear expression ability, demonstrating superior performance over traditional algorithms. However, these algorithms are limited either by the distribution of the training data or by the original algorithm model, and their detection performance decays significantly when the communication environment becomes complex, for instance, with increased spatial correlation, difficulty in obtaining accurate channel state information (CSI), or high noise power.

The generative adversarial network (GAN) [20] is effective at learning data distributions. Currently, GANs are widely used in computer vision, for example in image restoration [21] and image super-resolution reconstruction [22]. GAN models are also gradually being applied in the field of communication. The authors in [23] used GAN models for modulated signal classification; the authors in [24] proposed conditional generative adversarial network (CGAN)-based end-to-end communication for unknown channels; and the authors in [25] and [26] solved the channel estimation problem with CGAN and achieved good performance gains.

Motivated by the existing work, we propose a preset conditional generative adversarial network (PC-GAN) based on CGAN for uplink massive MIMO signal detection and apply an improved U-Net [27] structure in the generator to enhance the network learning capability. In addition, to adapt to low signal-to-noise ratio (SNR) communication scenarios, we design a noise removal and preset conditional generative adversarial network (NR-PC-GAN) detection method with a denoising function. Our contributions are summarized as follows:

(1) First, we bring the image processing method of the CGAN model to MIMO signal detection and propose a PC-GAN detection method that generates a probability distribution similar to that of the transmitted signal. The strong nonlinear capability of PC-GAN enables it to relieve the influence of spatial correlation on the detection accuracy to a certain extent. We treat the CSI as a condition and preset it so that it is no longer directly involved in the detection process. This reduces the dependence of the detection method on CSI and effectively resists the impact of imperfect CSI on detection performance.

(2) The generator adopts an improved U-Net structure, which aims to improve the detection accuracy without affecting the network convergence speed. It consists of an encoder and a decoder that contain a small number of convolution and deconvolution layers, respectively. Compared with the original U-Net, its computational complexity is reduced while the overfitting problem is addressed and the upper limit of detection accuracy is raised. In addition, the improved U-Net increases the number of feature maps of the decoder, which enhances the feature reconstruction capability of the network and preserves its convergence speed.

(3) An NR-PC-GAN detection method suitable for high-noise-power scenarios is proposed. This detection method multiplexes the same network for noise removal and signal detection and improves detection accuracy at the expense of complexity. Simulation results show that NR-PC-GAN exhibits good noise removal capability and has superior detection performance under low-SNR conditions.

(4) The detection accuracy, complexity, and robustness of the proposed PC-GAN are evaluated. The detection accuracy of PC-GAN is compared with that of the other detectors, and the results show that the proposed PC-GAN has advantages over OAMPNet and MMNet both in spatially correlated channels and in imperfect channels. In the online training mode, the computational complexity of PC-GAN is thousands of times lower than that of the DNN-based detection methods. In addition, PC-GAN shows good robustness when the SNR and the channel gain are mismatched. We call it SNR mismatch when the SNR conditions of training and testing are inconsistent, and channel gain mismatch when the channel gains of training and testing are inconsistent.

2.1. The MIMO Signal Detection Problem

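In a massive MIMO system with $N$ receive antennas and $M$ transmit antennas, the received signal is given by

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{w}, \tag{1}$$

where $\mathbf{x} \in \mathbb{C}^{M \times 1}$ is the transmitted signal, $\mathbf{H} \in \mathbb{C}^{N \times M}$ denotes the channel matrix, and $\mathbf{w} \in \mathbb{C}^{N \times 1}$ is the additive white Gaussian noise during signal transmission, with zero mean and variance $\sigma^{2}$.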
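The signal model above can be sketched numerically. The snippet below is a minimal illustration, not the authors' simulation code: it assumes an i.i.d. Rayleigh channel with unit-variance entries, unit-energy QPSK symbols, and 16 users with 64 receive antennas, and the function name `simulate_received` is hypothetical.

```python
import numpy as np

def simulate_received(N, M, noise_var, rng):
    """Draw one realisation of y = Hx + w for an uplink with N receive and M transmit antennas."""
    # i.i.d. Rayleigh fading: complex Gaussian entries with unit variance (assumption).
    H = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
    # Unit-energy QPSK symbols, one per transmit antenna (assumption).
    bits = rng.integers(0, 2, size=(M, 2))
    x = ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)) / np.sqrt(2)
    # Zero-mean complex Gaussian noise with variance noise_var per receive antenna.
    w = np.sqrt(noise_var / 2) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
    return H @ x + w, H, x

rng = np.random.default_rng(0)
y, H, x = simulate_received(N=64, M=16, noise_var=0.1, rng=rng)
```

Here `y` has one entry per receive antenna, and a detector is expected to recover `x` from `y` (and, for conventional detectors, from `H`).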

2.1.1. Linear Detection

A linear detection algorithm obtains the final detection result by multiplying the received signal $\mathbf{y}$ by a receive filter. The MMSE detector can be expressed as follows:

$$\hat{\mathbf{x}} = \left(\mathbf{H}^{H}\mathbf{H} + \sigma^{2}\mathbf{I}\right)^{-1}\mathbf{H}^{H}\mathbf{y},$$

where $\mathbf{I}$ denotes the $M \times M$ identity matrix, $\mathbf{H}^{H}$ represents the conjugate transpose of $\mathbf{H}$, and $\sigma^{2}$ is the noise variance. It can be seen that the MMSE detection process involves a matrix inversion. As the number of antennas in a massive MIMO system increases, the computational complexity of MMSE detection becomes very high.
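As a concrete illustration of linear MMSE detection, the sketch below applies the standard filter $(\mathbf{H}^{H}\mathbf{H} + \sigma^{2}\mathbf{I})^{-1}\mathbf{H}^{H}$ to a simulated received vector. It assumes unit-energy QPSK symbols (so the regularization term is simply the noise variance) and a high-SNR setting; the helper names `mmse_detect` and `qpsk_slice` are illustrative, not taken from the paper.

```python
import numpy as np

def mmse_detect(y, H, noise_var):
    """Linear MMSE estimate x_hat = (H^H H + noise_var * I)^-1 H^H y."""
    M = H.shape[1]
    gram = H.conj().T @ H + noise_var * np.eye(M)
    # Solve the M x M linear system instead of forming the inverse explicitly.
    return np.linalg.solve(gram, H.conj().T @ y)

def qpsk_slice(x_hat):
    """Map soft estimates to the nearest unit-energy QPSK point."""
    return (np.sign(x_hat.real) + 1j * np.sign(x_hat.imag)) / np.sqrt(2)

rng = np.random.default_rng(0)
N, M, noise_var = 64, 16, 1e-3          # assumed configuration and noise level
H = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
x = (rng.choice([-1.0, 1.0], M) + 1j * rng.choice([-1.0, 1.0], M)) / np.sqrt(2)
w = np.sqrt(noise_var / 2) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
x_detected = qpsk_slice(mmse_detect(H @ x + w, H, noise_var))
```

Solving the system avoids an explicit inverse, but the cost still grows roughly cubically with $M$, which is the complexity concern noted above.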

2.1.2. DNN-Based Detection Algorithm

Based on the projected gradient descent method, the authors in [9] used DL for MIMO detection and proposed the DetNet detection algorithm. This detection algorithm shows good performance under i.i.d. Rayleigh fading channels and achieves higher detection accuracy under low-order modulation schemes than that under high-order modulation. The DetNet detection algorithm can be described by the following equations:where is the number of layers of the network, is a segmented symbolic mapping function similar to the hyperbolic tangent, and are trainable variables.

2.2. CGAN Model

GAN was proposed as a machine learning method in [20] and consists of two networks, the generator and the discriminator. The generator attempts to fool the discriminator by generating data similar to the real sample distribution, and the discriminator tries to distinguish the sample sources correctly. During the training process, the generator and the discriminator compete with each other, and their generation and discrimination abilities gradually improve until they reach a Nash equilibrium [28]. The basic GAN generator synthesizes realistic samples from random noise; because this training process is unsupervised, the properties of the generated samples cannot be controlled. To solve this problem, an improved CGAN was proposed in [29], which adds conditional information to the basic GAN to generate samples with specific attributes. The structure of the CGAN network is shown in Figure 1, where $z$ denotes the input noise, $x$ is the generation target, and $G(z)$ is the output of the generator. The discriminator receives $x$ and $G(z)$ and judges $x$ to be real and $G(z)$ to be fake. The condition $c$ can be a category label or another type of data that controls the attributes of the generated samples. The condition $c$ is input as an independent item to both the generator and the discriminator.

3. Proposed PC-GAN

This section proposes a PC-GAN detection method for massive MIMO detection based on CGAN. Unlike CGAN, we preset the condition before entering it into the network for PC-GAN. We preprocess the transmitted signal, transform the transmitted signal matrix and received signal matrix into images, and then perform MIMO signal detection with an image-to-image conversion. In this section, we first introduce the system model, the signal image processing and the precondition construction process. Then, we detail the detection working process of PC-GAN and its structure.

3.1. System Model

As shown in Figure 2, we convert the signals to images and solve the signal detection problem with a CGAN. We consider an uplink massive MIMO system with $N$ antennas at the base station (BS) and $M$ single-antenna users at the transmitter. The transmitted signal is preprocessed before it passes through the channel to construct the transmitted signal matrix $\mathbf{X} \in \mathbb{C}^{M \times P}$ ($P$ is the number of upsampling points), which is then transmitted through the channel to the BS to give the received signal matrix $\mathbf{Y} \in \mathbb{C}^{N \times P}$. Thus, Equation (1) can be written as follows:

$$\mathbf{Y} = \mathbf{H}\mathbf{X} + \mathbf{W}, \tag{4}$$

where $\mathbf{H} \in \mathbb{C}^{N \times M}$ is the channel matrix and $\mathbf{W} \in \mathbb{C}^{N \times P}$ represents the additive white Gaussian noise. Then, we use the CGAN-based detection network to obtain the transmitted signal from the received signal and complete the signal detection.

3.2. Image Processing and Condition Presetting

In the massive MIMO detection problem, the original transmitted signal $\mathbf{x}$ is usually considered as a modulated signal of dimension $M \times 1$. The principle of PC-GAN for MIMO signal detection is image feature extraction and reconstruction. An image is essentially a matrix, and the characteristics of the image are closely tied to the relationships between the elements of the matrix. However, the transmitted signals are $M$ mutually independent symbols, so to establish a connection between them we preprocess $\mathbf{x}$. First, we upsample $\mathbf{x}$ with $P$ sampling points to get a sampled signal matrix of dimension $M \times P$. Then, with the help of a high-frequency sinusoidal carrier, we obtain the transmitted signal matrix $\mathbf{X}$, whose elements are closely correlated. This correlation can be seen directly in the real-part image of the signal matrix in Figure 3. The image of the preprocessed transmitted signal matrix has clear texture characteristics, and we can complete the signal detection based on these texture characteristics.
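One possible reading of this preprocessing step is sketched below. The exact upsampling rule and carrier frequency are not specified in this section, so both choices here are assumptions: each symbol is repeated over the upsampling points and then multiplied by a sampled cosine carrier. The function name `preprocess_symbols` and the parameter `cycles` are hypothetical.

```python
import numpy as np

def preprocess_symbols(x, P, cycles=8):
    """Turn M symbols into an M x P matrix: repeat each symbol P times, then
    multiply by a sampled sinusoidal carrier (both choices are assumptions)."""
    x = np.asarray(x, dtype=complex)
    upsampled = np.repeat(x[:, None], P, axis=1)       # shape (M, P)
    p = np.arange(P)
    carrier = np.cos(2 * np.pi * cycles * p / P)       # `cycles` carrier periods over P samples
    return upsampled * carrier[None, :]

X = preprocess_symbols([1 + 1j, -1 - 1j], P=16, cycles=2)
```

Each row of `X` is then a smooth oscillation scaled by one symbol, which gives neighbouring matrix elements the kind of structured relationship that the text describes as image texture.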

The channel matrix in MIMO communication is the bridge between the received signal and the transmitted signal. In the PC-GAN detection method, we treat the channel matrix $\mathbf{H}$ as conditional information that controls the process of obtaining the transmitted signal from the received signal. The intuition is that the conditional input term in CGAN can steer the network to generate samples with specific properties. Existing MIMO detection methods, both traditional and DNN-based, calculate the transmitted signal from the received signal and the channel matrix. In MIMO communication, the signal received by the BS is obtained by passing the transmitted signal through the channel, which means that the received signal already depends on the channel matrix. Based on this observation, we propose a detection network that requires only the received signal as input. To train a detection network that no longer needs the channel matrix as a separate input, we perform conditional presetting when constructing the training set. First, we simulate the transmitted signal matrix $\mathbf{X}$, collect the conditional information (the channel matrix $\mathbf{H}$), and then preset the condition in the received signal $\mathbf{Y}$ using Equation (4). We transform $\mathbf{X}$ and $\mathbf{Y}$ into image tensors $\mathbf{X}_{\mathrm{im}}$ and $\mathbf{Y}_{\mathrm{im}}$, each with two channels holding the real and imaginary parts, and construct the training set from pairs of $\mathbf{Y}_{\mathrm{im}}$ and $\mathbf{X}_{\mathrm{im}}$. The dimensions of $\mathbf{X}_{\mathrm{im}}$ and $\mathbf{Y}_{\mathrm{im}}$ are $M \times P \times 2$ and $N \times P \times 2$, respectively. During network training, the CGAN in Figure 1 needs additional conditions as inputs to the generator and the discriminator. The input $\mathbf{Y}_{\mathrm{im}}$ of the PC-GAN generator is already preset with the condition $\mathbf{H}$, so the generator does not need an additional conditional input. In addition, when constructing the dataset, we ensure that $\mathbf{Y}_{\mathrm{im}}$ and $\mathbf{X}_{\mathrm{im}}$ correspond strictly, so that the discriminator does not require an additional conditional input either. The purpose of conditional presetting is to reduce the dependence on the channel matrix in the detection process, so that the detection method can adapt to imperfect CSI while the network input is simplified.
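The dataset construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the channel enters only through the received matrix, and each complex matrix becomes a two-channel real tensor. The names `to_image` and `build_pair`, the placeholder transmitted matrix, and the sizes used are all hypothetical.

```python
import numpy as np

def to_image(Z):
    """Stack real and imaginary parts as two channels: (rows, P) -> (rows, P, 2)."""
    return np.stack([Z.real, Z.imag], axis=-1).astype(np.float32)

def build_pair(X, H, noise_var, rng):
    """Return (received image, transmitted image).

    The channel H is 'preset' in the received image through Y = HX + W;
    it is not returned and is never given to the network separately."""
    N, P = H.shape[0], X.shape[1]
    W = np.sqrt(noise_var / 2) * (rng.standard_normal((N, P)) + 1j * rng.standard_normal((N, P)))
    Y = H @ X + W
    return to_image(Y), to_image(X)

rng = np.random.default_rng(0)
M, N, P = 16, 64, 32                                                  # assumed sizes
X = rng.standard_normal((M, P)) + 1j * rng.standard_normal((M, P))    # placeholder transmitted matrix
H = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
y_img, x_img = build_pair(X, H, noise_var=0.1, rng=rng)
```

A training set is then a list of such `(y_img, x_img)` pairs, kept in strict correspondence so that neither network needs the channel as an extra input.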

3.3. The Working Process of PC-GAN Detection

In our work, we use PC-GAN to build the mapping from the received signal image $\mathbf{Y}_{\mathrm{im}}$ to the transmitted signal image $\mathbf{X}_{\mathrm{im}}$. As shown in Figure 4, we input $\mathbf{Y}_{\mathrm{im}}$ into the generator and $\mathbf{X}_{\mathrm{im}}$ into the discriminator for PC-GAN training. We use the trained generator as the PC-GAN detector, which converts the received signal image $\mathbf{Y}_{\mathrm{im}}$ into the transmitted signal image $\mathbf{X}_{\mathrm{im}}$.

During training, the generator performs feature extraction and feature reconstruction on the received signal image $\mathbf{Y}_{\mathrm{im}}$ and then generates the signal image $G(\mathbf{Y}_{\mathrm{im}})$. We use the real transmitted signal image $\mathbf{X}_{\mathrm{im}}$ and $G(\mathbf{Y}_{\mathrm{im}})$ to train the discriminator to distinguish real samples from generated samples. If the discriminator can successfully distinguish $\mathbf{X}_{\mathrm{im}}$ from $G(\mathbf{Y}_{\mathrm{im}})$, the result is fed back to the generator to obtain the gradients of the two networks. The generator and discriminator continuously play a minimax game, and the signal image generated by the generator gradually approaches the real transmitted signal image $\mathbf{X}_{\mathrm{im}}$, until the discriminator is unable to distinguish the generated samples from the real samples, at which point the training ends. We apply the following loss function to optimize the PC-GAN detection method:

$$\min_{G}\max_{D}\; \mathbb{E}\left[\log D(\mathbf{X}_{\mathrm{im}})\right] + \mathbb{E}\left[\log\left(1 - D(G(\mathbf{Y}_{\mathrm{im}}))\right)\right],$$

where $D(\mathbf{X}_{\mathrm{im}})$ and $D(G(\mathbf{Y}_{\mathrm{im}}))$ are the outputs of the discriminator. In the training process, we record $\mathbf{X}_{\mathrm{im}}$ as label "1" and $G(\mathbf{Y}_{\mathrm{im}})$ as label "0". When the discriminator cannot distinguish the generated samples from the real samples, the network training is complete. We can then use the trained generator to perform MIMO detection on a new received signal image.
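The labeling scheme above (real images as "1", generated images as "0") corresponds to a binary cross-entropy objective, which can be sketched numerically as below. This is an illustration rather than the paper's training code; in particular, the generator loss uses the common non-saturating form (maximizing the log of the discriminator output on generated images) instead of the literal minimax term, and all function names are hypothetical.

```python
import numpy as np

def bce(pred, label, eps=1e-7):
    """Mean binary cross-entropy between discriminator outputs in (0, 1) and a 0/1 label."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(label * np.log(pred) + (1 - label) * np.log(1 - pred)))

def discriminator_loss(d_real, d_fake):
    """Real transmitted images are labeled 1, generated images are labeled 0."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)

def generator_loss(d_fake):
    """Non-saturating generator loss: push the discriminator towards label 1 on generated images."""
    return bce(d_fake, 1.0)

# An undecided discriminator (output 0.5 everywhere) marks the balance point of the game.
d_loss_balanced = discriminator_loss(np.full(4, 0.5), np.full(4, 0.5))
g_loss_balanced = generator_loss(np.full(4, 0.5))
```

At that balance point the discriminator loss equals $2\ln 2$ and the generator loss equals $\ln 2$, which matches the stopping condition described in the text: the discriminator can no longer separate generated from real samples.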

3.4. Generator and Discriminator Structure

The core idea of our work is to transform the received signal image $\mathbf{Y}_{\mathrm{im}}$ into the transmitted signal image $\mathbf{X}_{\mathrm{im}}$. This process requires feature extraction and feature reconstruction for $\mathbf{Y}_{\mathrm{im}}$. We use an improved U-Net in the generator. U-Net is a variant of the fully convolutional network (FCN) [30] that can be trained end to end with few samples. Unlike FCN, whose decoder contains only one deconvolution, the encoder on the left of U-Net is mirrored by the decoder on the right, which enhances U-Net's ability to recover features.

We made some adjustments to U-Net, and the improved structure is shown in Figure 5(a). We consider a 3-layer U-shaped structure. The improved U-Net simplifies the convolution operations to reduce the computational complexity while avoiding model overfitting. A size adjuster containing convolution and deconvolution layers is designed to shape the input $\mathbf{Y}_{\mathrm{im}}$ to the same size as $\mathbf{X}_{\mathrm{im}}$, so that the detection system can be applied to a multi-antenna system. The encoder consists of batch normalization, convolution, and ReLU activation functions, and the decoder consists of batch normalization, deconvolution, and ReLU activation functions. The convolution operation is used for feature extraction, and the deconvolution operation is used for feature reconstruction. Batch normalization continuously adjusts the intermediate outputs of the neural network to make training more stable. The ReLU activation function both avoids gradient vanishing and enhances the nonlinear capability of the network. In addition, we set buffer blocks to avoid the convergence difficulties caused by excessively deep feature extraction.

Our detection work essentially establishes the mapping function from $\mathbf{Y}_{\mathrm{im}}$ to $\mathbf{X}_{\mathrm{im}}$. The mapping from $\mathbf{Y}_{\mathrm{im}}$ to $\mathbf{X}_{\mathrm{im}}$ is more complicated than the mapping from a noisy image to a noiseless image. So, if the role of the decoder in image denoising is regarded as feature recovery, then in MIMO detection it is regarded as feature reconstruction. We increase the number of feature maps in the decoder to enhance the feature reconstruction capability of the network. As shown in Figure 5(a), the convolution step in the first layer of the encoder is and the number of feature maps is 64, whereas the deconvolution step in the first layer of the decoder is and the number of feature maps is 128. This shows that the numbers of encoder and decoder feature maps in the improved U-Net are no longer completely symmetrical. In addition, the size of the convolution and deconvolution kernels used is , and the small convolution kernels can enhance the nonlinear capability of the network. Further, the improved U-Net retains the skip connections to fuse the multiscale features and accelerate convergence.

The discriminator structure is shown in Figure 5(b). We use the patch architecture [31] with three convolution layers and one fully connected layer. The output of the regular discriminator is a single evaluation value, but the patch architecture maps the input to a receptive field, which is averaged over all responses. The final output of the discriminator is an evaluation of the entire image generated by the generator. The patch architecture enhances the ability to identify local details, which is particularly important for the accuracy of MIMO detection. In addition, we added batch normalization and ReLU activation function after the convolution layer to ensure the robust training of the discriminator.

4. Proposed NR-PC-GAN

Based on the PC-GAN detection method, this section proposes the NR-PC-GAN detection method for massive MIMO detection in low-SNR scenarios. NR-PC-GAN multiplexes noise removal and signal detection in one network, reducing hardware costs. Noise interference in the MIMO communication process adversely affects the detection accuracy of the detector, and the detection performance decreases as the noise power increases. Assuming that there is no noise interference in the communication process, we define $\bar{\mathbf{Y}}$ as the noiseless received signal matrix at the BS, and its expression is as follows:

$$\bar{\mathbf{Y}} = \mathbf{H}\mathbf{X},$$

where $\mathbf{H}$ is the channel matrix and $\mathbf{X}$ is the transmitted signal matrix. Equation (4) can be rewritten as follows:

$$\mathbf{Y} = \bar{\mathbf{Y}} + \mathbf{W},$$

where $\mathbf{Y}$ is the received signal matrix and $\mathbf{W}$ is the additive white Gaussian noise.

Before the detection, we remove the additive white Gaussian noise from the noisy signal matrix $\mathbf{Y}$ and obtain the noiseless signal matrix $\bar{\mathbf{Y}}$. We transform $\mathbf{Y}$ and $\bar{\mathbf{Y}}$ into image tensors $\mathbf{Y}_{\mathrm{im}}$ and $\bar{\mathbf{Y}}_{\mathrm{im}}$, each of dimension $N \times P \times 2$, and use the PC-GAN with an image denoising method to obtain the clean received signal. The training of the NR-PC-GAN detection network is divided into two stages: denoise training and detection training.

4.1. Denoise Training

As shown in Figure 6, the noisy received signal image is input to the generator and the noiseless received signal image is input to the discriminator. The generator and the discriminator continuously play a minimax game, and the generator is trained to generate a denoised image that is similar to the noiseless received signal image. When the training is completed, the denoiser is obtained.

4.2. Detection Training

We train the detection network with the obtained denoiser, as shown in Figure 7. The NR-PC-GAN detection network is basically the same as the PC-GAN detection network; the only difference is that the input of the generator is no longer the noisy received signal image but the output of the denoiser.

After two training stages, we can use the two trained generators shown in Figure 7 to perform MIMO detection under low SNR. Compared with the PC-GAN detection, NR-PC-GAN detection has been trained twice, and it is straightforward to find that its complexity is twice that of PC-GAN detection. NR-PC-GAN detection achieves improved detection accuracy at the expense of complexity.

5. Simulation and Numerical Results

In this section, we first present the experimental setup and implementation details. Next, numerical results are provided, comparing the detection performance of the proposed detection method with the other detection methods. Then, the contribution of the improved U-Net structure to the detection accuracy and convergence is investigated. Finally, we analyze the complexity of the PC-GAN detection method.

5.1. Implementation Details

In our simulations, the proposed PC-GAN and NR-PC-GAN are implemented in TensorFlow 2.0 with Python [32]. We consider two antenna configurations, 16 × 64 and 32 × 64, and two modulation methods, QPSK and 16QAM. The experimental process is divided into a training phase and a test phase, and the experiments adopt the online training mode, i.e., the channel matrices H in the training and test sets are identical, and 50 samples of H are generated for each experiment. We consider two channel conditions: spatially correlated channels and imperfect CSI.

5.1.1. Spatial Correlation

The spatially correlated channel described by the Kronecker model [33] is as follows:

$$\mathbf{H} = \mathbf{R}_{r}^{1/2}\,\mathbf{H}_{w}\,\mathbf{R}_{t}^{1/2},$$

where $\mathbf{H}_{w}$ is the i.i.d. Rayleigh fading channel, and $\mathbf{R}_{r}$ and $\mathbf{R}_{t}$ are the correlation matrices for the receive antennas and the single-antenna users, respectively, generated according to the exponential correlation model with correlation coefficient ρ ∈ (0,1) [17]; the closer ρ is to 1, the stronger the correlation.
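A Kronecker-model channel with exponential correlation can be generated as in the sketch below. It assumes a real-valued correlation coefficient, the usual exponential form in which entry $(i, j)$ of a correlation matrix equals $\rho^{|i-j|}$, and the same coefficient on both sides of the link; the function names are illustrative.

```python
import numpy as np

def exp_corr(n, rho):
    """Exponential correlation matrix with entries rho ** |i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def sqrtm_psd(R):
    """Symmetric square root of a positive semidefinite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(R)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.conj().T

def kronecker_channel(N, M, rho, rng):
    """H = R_r^(1/2) H_w R_t^(1/2), with H_w an i.i.d. Rayleigh fading channel."""
    Hw = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
    return sqrtm_psd(exp_corr(N, rho)) @ Hw @ sqrtm_psd(exp_corr(M, rho))

H_corr = kronecker_channel(N=64, M=16, rho=0.7, rng=np.random.default_rng(0))
```

Increasing `rho` towards 1 makes neighbouring rows and columns of `H_corr` more alike, which is what makes the transmitted streams harder to separate.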

5.1.2. Imperfect CSI

An imperfect channel is obtained by estimating the i.i.d. Rayleigh fading channel with the least squares (LS) method. We use the normalized mean square error (NMSE) to characterize the difference between the estimated channel matrix $\hat{\mathbf{H}}$ and the original channel matrix $\mathbf{H}$ as follows:

$$\mathrm{NMSE} = \frac{\|\hat{\mathbf{H}} - \mathbf{H}\|^{2}}{\|\mathbf{H}\|^{2}},$$

where $\|\cdot\|$ denotes the matrix norm, and we compute $10\log_{10}(\mathrm{NMSE})$ to obtain the NMSE value in dB.
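The NMSE in dB can be computed as sketched below, assuming the matrix norm is the Frobenius norm. The helper `perturb_to_nmse` is a hypothetical stand-in for LS channel estimation: it simply adds Gaussian error scaled to hit a target NMSE, which is convenient for reproducing a given operating point but is not the estimation procedure used in the paper.

```python
import numpy as np

def nmse_db(H_hat, H):
    """NMSE between an estimate and the true channel, in dB (Frobenius norm assumed)."""
    err = np.linalg.norm(H_hat - H) ** 2
    ref = np.linalg.norm(H) ** 2
    return 10 * np.log10(err / ref)

def perturb_to_nmse(H, target_db, rng):
    """Add complex Gaussian error scaled so that the NMSE equals target_db (illustrative only)."""
    E = rng.standard_normal(H.shape) + 1j * rng.standard_normal(H.shape)
    scale = np.sqrt(10 ** (target_db / 10)) * np.linalg.norm(H) / np.linalg.norm(E)
    return H + scale * E

rng = np.random.default_rng(0)
H_true = (rng.standard_normal((64, 16)) + 1j * rng.standard_normal((64, 16))) / np.sqrt(2)
H_est = perturb_to_nmse(H_true, target_db=-13.0, rng=rng)
measured = nmse_db(H_est, H_true)
```

For reference, an estimate that is off by 10 % in every entry has an NMSE of 0.01, that is, −20 dB.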

In our work, a total of nine MIMO detectors, PC-GAN, NR-PC-GAN, OAMPNet, MMNet, DetNet, ScNet, LcgNet, DL-based, and MMSE, are simulated. We construct the training set and test set by following a random normal distribution and generating the transmitted signal from the corresponding constellation set (QPSK or 16QAM). The received signal is transmitted over a channel and carries noise. The size of the training set for PC-GAN and NR-PC-GAN is 2,500, and the training process is iterated 20 times with a batch size of 10 for each iteration, while the size of the training set for other networks is 200,000, and the training iterations are 20,000, with a batch size of 500 for each iteration. All networks were evaluated using a test dataset of 20,000 samples. We set the number of network layers to 10 for OAMPNet, MMNet, LcgNet, and DL-based, 30 for DetNet and ScNet, a single-layer structure for PC-GAN, and NR-PC-GAN equivalent to two-layer PC-GAN.
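The symbol sets and the symbol error ratio used to score the detectors can be sketched as below. The unit-energy normalizations and the nearest-point decision rule are standard conventions assumed here, not details stated in the text, and the function names are illustrative.

```python
import numpy as np

# Unit-average-energy constellations (normalization is an assumption).
QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
_levels = np.array([-3, -1, 1, 3])
QAM16 = (_levels[:, None] + 1j * _levels[None, :]).ravel() / np.sqrt(10)

def nearest_symbol(x_hat, constellation):
    """Hard decision: map each soft estimate to the closest constellation point."""
    dist = np.abs(np.asarray(x_hat)[..., None] - constellation)
    return constellation[np.argmin(dist, axis=-1)]

def symbol_error_rate(x_hat, x_true, constellation):
    """Fraction of symbols whose hard decision differs from the transmitted symbol."""
    return float(np.mean(nearest_symbol(x_hat, constellation) != x_true))

rng = np.random.default_rng(0)
x_true = rng.choice(QAM16, size=1000)                       # random 16QAM symbols
x_noisy = x_true + 0.05 * (rng.standard_normal(1000) + 1j * rng.standard_normal(1000))
ser = symbol_error_rate(x_noisy, x_true, QAM16)
```

Any detector output, whether from a linear filter or from a generator network after conversion back to symbols, can be scored with `symbol_error_rate` in the same way.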

5.2. PC-GAN Detection Performance Analysis

In this subsection, we investigate the massive MIMO detection performance of PC-GAN in the case of spatially correlated channels and imperfect CSI. In addition, we analyze the robustness of PC-GAN to the scenarios with the various mismatches, including SNR and channel gain mismatches.

5.2.1. SER (Symbol Error Ratio) Performance under Spatially Correlated Channel

Figure 8 compares the SER performance of PC-GAN, OAMPNet, and MMSE with QPSK modulation, correlation coefficients ρ of 0.6, 0.7, and 0.8, and antenna configuration of . It can be seen that under the same conditions, PC-GAN has a clear advantage in detection performance. Specifically, PC-GAN has a 2.4 dB gain over OAMPNet at SER of under ρ of 0.7. In addition, all three detectors are affected by the channel correlation and show different degrees of degradation in detection accuracy. OAMPNet is relatively sensitive to correlation: as the correlation increases, its detection performance decreases significantly. In contrast, PC-GAN is more resistant to correlation. In the case of SER at , when ρ rises from 0.6 to 0.7, the OAMPNet detection performance decreases by 2.6 dB and the PC-GAN detection performance decreases by 1.4 dB. When ρ rises from 0.7 to 0.8, the PC-GAN detection performance decreases by only 1.8 dB, while OAMPNet is no longer adapted to that condition.

Figure 9 compares the SER performance of the proposed PC-GAN, OAMPNet, MMNet, DetNet, ScNet, LcgNet, DL-based, and MMSE with 16QAM modulation and ρ of 0.5. It can be seen that the proposed PC-GAN maintains its performance advantage under high-order modulation. PC-GAN provides performance gains of 1.2 and 1.6 dB over MMNet and OAMPNet, respectively, at an SER of $10^{-4}$. Next, we analyze the reasons why PC-GAN shows good detection performance under correlated channels. The spatial correlation of the channel causes the multiple transmitted signals of MIMO to interfere with each other during transmission, which changes the received signal at the BS. When the linear connection between the received and transmitted signals is weakened, the nonlinear connection is enhanced. OAMPNet, MMNet, and other detectors use the DNN's nonlinear capabilities to assist traditional linear algorithms. Their limited nonlinear capability leads to a significant decrease in detection accuracy as the channel correlation increases. The PC-GAN detection method inherits from GAN models the property that no inference is required during learning [29], as well as the ability to correct biases introduced by various factors and interactions more easily. Compared with the DNN-based detection algorithms, PC-GAN has a stronger nonlinear ability, which makes it perform well in spatially correlated channels. In addition, the small convolution kernels and the ReLU function used in the generator further enhance the nonlinear capability of the network.

5.2.2. SER Performance under Imperfect CSI

Figure 10 compares the SER performance of PC-GAN, OAMPNet, MMNet, DetNet, ScNet, LcgNet, DL-based, and MMSE at different SNRs for the scenario with QPSK modulation, a channel estimation NMSE of −13 dB, and antenna configuration of . As shown in Figure 10, PC-GAN outperforms the other detectors over the SNR range from 0 to 14 dB. When the detection accuracy SER reaches the level of , PC-GAN has about 1.4 and 2.3 dB performance gain over OAMPNet and MMNet, respectively. Further, we extend the experiments to show the detection performance of the various detectors under different NMSE. Figure 11 compares the SER performance of various detection methods at different NMSE with an SNR of 6 dB. PC-GAN achieves superior performance under imperfect CSI, and the gain of PC-GAN over MMNet and OAMPNet is 6 and 2 dB, respectively, when the SER is . The SER superiority of PC-GAN is sustained as the NMSE changes from −22 to −4 dB. From Figure 11, we can see that the SER advantage of PC-GAN over the other detectors gradually increases as the NMSE changes from −22 to −4 dB, which means that the larger the channel estimation error, the greater the performance advantage of PC-GAN. Next, we analyze the reasons why PC-GAN exhibits good detection performance under imperfect CSI.

The adaptability of PC-GAN to imperfect CSI stems from the fact that we conditionally preset the CSI. In MIMO detection methods, CSI accuracy directly affects the detection performance. OAMPNet, MMNet, DetNet, ScNet, LcgNet, and DL-based are all DNN-based detection algorithms. For the DNN-based detection algorithms and the traditional linear MMSE algorithm, the channel matrix $\mathbf{H}$ is directly involved in the computation as deterministic information during the detection process. In this case, the detection accuracy of these methods is positively correlated with the accuracy of $\mathbf{H}$: the less accurate $\mathbf{H}$ is, the more significant the decrease in detection accuracy. As can be seen in Figures 10 and 11, the detection accuracy of DetNet and ScNet is strongly affected by imperfect CSI. This is because DetNet and ScNet are data-driven detection methods whose detection accuracy depends more heavily on the data, so their detection performance is severely degraded when the channel matrix is no longer accurate. This further validates that the direct involvement of CSI in the computation results in the degradation of the detection performance of existing algorithms under imperfect CSI conditions. Our proposed PC-GAN detection method presets the channel matrix $\mathbf{H}$ as a condition in the received signal image and realizes MIMO signal detection by constructing a mapping function from the received signal to the transmitted signal. The channel matrix is no longer directly involved in the detection process, which effectively resists the impact of imperfect CSI on the detection performance.

5.2.3. Robustness

To verify the robustness of PC-GAN detection, we train PC-GAN under a specific SNR of 12 dB and test its detection performance under the SNR ranging from 0 to 12 dB, to investigate the impact of SNR mismatch on the detection performance. Similarly, to verify the robustness of PC-GAN detection against imperfect CSI, we train PC-GAN under an imperfect CSI with NMSE of −22 dB and test its detection performance under imperfect CSI with NMSE of −24, −23, −21, and −20 dB. Figures 12 and 13 show the detection results of PC-GAN in SNR and NMSE mismatch states with 16QAM modulation, , and antenna configuration.

As shown in Figure 12, when the SNR is near 12 dB, the detection performance of the mismatched network is almost the same as that of the matched network; when the SNR is far below 12 dB, the SER performance curve of the mismatched network gradually deviates from that of the matched network. Generally speaking, SNR mismatch has little effect on the detection capability of PC-GAN. In Figure 12, when SER is at , the mismatched network loses about 0.4 dB compared with the matched network and still has about 0.8 dB gain compared with OAMPNet. It can be seen from Figure 13 that when the mismatched NMSE is −21 and −20 dB, the degradation in detection performance is large, and when the mismatched NMSE is −24 and −23 dB, the degradation is small. This means that when the channel estimation error is smaller during testing than during training, the impact of channel mismatch on SER performance is mitigated to a certain extent. When SER is , the SNR of the network with mismatched NMSE of −23 dB is about 2 dB lower than that of the matched network, but still shows a gain of 1.7 dB compared to the MMSE detector. It can be seen that the impact of channel mismatch on the detection performance is within an acceptable range.
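Gains and losses quoted in dB at a fixed SER, such as those above, are read off by comparing the SNR each curve needs to reach the target SER. A minimal sketch of that reading is given below. It assumes that SER curves are sampled on a common SNR grid and that log10(SER) is interpolated linearly between samples; the helper names `snr_at_ser` and `snr_gap_db` are hypothetical and do not come from this paper.

```python
import numpy as np

def snr_at_ser(snr_db, ser, target_ser):
    """SNR (dB) at which a measured SER curve crosses target_ser.

    Assumption: log10(SER) is interpolated linearly between SNR samples.
    """
    snr_db = np.asarray(snr_db, dtype=float)
    log_ser = np.log10(np.asarray(ser, dtype=float))
    if np.any(np.diff(log_ser) >= 0):
        raise ValueError("SER must decrease strictly with SNR")
    target = np.log10(target_ser)
    if not (log_ser[-1] <= target <= log_ser[0]):
        raise ValueError("target SER lies outside the measured range")
    # np.interp needs increasing x values, so both arrays are reversed.
    return float(np.interp(target, log_ser[::-1], snr_db[::-1]))

def snr_gap_db(snr_db, ser_a, ser_b, target_ser):
    """Extra SNR (dB) that curve b needs over curve a at target_ser.

    A positive value means detector a has a gain over detector b.
    """
    return snr_at_ser(snr_db, ser_b, target_ser) - snr_at_ser(snr_db, ser_a, target_ser)
```

Applied to a matched and a mismatched curve, `snr_gap_db` returns the loss caused by the mismatch; applied to PC-GAN and a baseline, it returns the gain over that baseline.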

5.3. NR-PC-GAN Detection Performance Analysis

In this section, we study the massive MIMO detection performance of NR-PC-GAN under high-noise power and show the results of the denoiser. Figure 14 shows the SER performance of NR-PC-GAN, PC-GAN, and other detectors with the setting of QPSK modulation, antenna configuration of , and SNR ranging from −6 to 8 dB. NR-PC-GAN clearly performs best at low SNR. When the SNR is 2 dB, the detection SER can reach , while PC-GAN and OAMPNet achieve the same SER performance at SNRs of 4.9 and 7.4 dB, respectively. In addition, the advantage of NR-PC-GAN over PC-GAN gradually decreases as the SNR increases, because the benefit of the denoising network becomes less evident. At high SNR, no improvement in detection performance can be achieved even with the addition of denoising. Therefore, NR-PC-GAN is more suitable for high-noise channels.

There are two curves in Figure 15; Y_noise indicates the difference between the noisy signal image and the noiseless signal image . Y_denoise indicates the difference between the image generated by the denoiser and the noiseless signal image, and the NMSE values are obtained as follows.

It can be seen that the difference between the denoised image and the noiseless image is much lower than the difference between the noisy image and the noiseless image. Figure 16 shows the noisy, denoised, and noiseless signal images at an SNR of −6 dB, and it can be seen that the noisy image is quite different from the noiseless image. The denoiser restores the picture texture by removing the noise and generates a denoised image that is very similar to the noiseless image. There is only a tiny difference between the denoised and noiseless images, which shows that the denoiser can effectively remove the noise. In addition, as shown in Figure 15, the gap between the two curves decreases as the SNR increases. The difference between the two curves is 1.4 dB at an SNR of −6 dB, while it is about 1.1 dB at an SNR of 8 dB. This further shows that NR-PC-GAN has better performance at low SNR.
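The two curves in Figure 15 are NMSE values between signal images. A minimal sketch of how such a value could be computed is given below, assuming the common definition in which the squared error is normalized by the energy of the noiseless image and expressed in dB. The function name `nmse_db` and this exact normalization are assumptions and may differ from the expression used in this paper.

```python
import numpy as np

def nmse_db(image, reference):
    """NMSE in dB between an image and the noiseless reference image.

    Assumed definition: 10 * log10(||image - reference||^2 / ||reference||^2).
    """
    image = np.asarray(image, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if image.shape != reference.shape:
        raise ValueError("image and reference must have the same shape")
    ref_energy = np.sum(reference ** 2)
    if ref_energy == 0:
        raise ValueError("reference image has zero energy")
    return float(10 * np.log10(np.sum((image - reference) ** 2) / ref_energy))
```

Under this definition, the Y_noise curve would correspond to `nmse_db(noisy, noiseless)` and the Y_denoise curve to `nmse_db(denoised, noiseless)`, so an effective denoiser yields a lower (more negative) value on the second curve.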

5.4. The Contribution of Improved U-Net

We set the experimental conditions to QPSK modulation, ρ equal to 0.6, and the antenna configuration of . We compare the SER performance of PC-GAN detectors over different numbers of iterations when the generator adopts the U-Net structure, the FCN structure, and the improved U-Net structure, respectively, and analyze the effect of the improved U-Net on the SER performance and convergence of the PC-GAN detection method. The experimental conditions were kept consistent except for the generator structure. A comparison of the main operations of the encoder and decoder of the original U-Net, FCN, and improved U-Net is given in Figure 17. The original U-Net decoder and encoder are strictly symmetric; the improved U-Net decoder and encoder have the same number of layers, but the numbers of feature maps are no longer perfectly symmetric. In contrast, FCN is a completely asymmetric structure. As shown in Figure 18, the U-Net structure converges after 15 iterations, whereas the improved U-Net and FCN need 19 and 25 iterations, respectively, to reach convergence. This is because the decoder of FCN has only one deconvolution, while the decoder of U-Net mirrors the encoder in a structure of step-by-step amplification, and the superposition of multiple deconvolution operations speeds up the convergence of the network. Therefore, the symmetry of the generator decoder and encoder is the main factor determining the convergence speed. However, the U-Net structure is trained with fewer samples, and too many convolution operations cause overfitting, significantly damaging the SER performance of detection. Our improved U-Net halves the number of convolution operations compared to the original U-Net, effectively avoiding overfitting. It can be observed that the SER performance of the improved U-Net reaches below when the network is converged, while that of the original U-Net and FCN is and , respectively. Such a significant performance improvement is also attributed to the enhanced decoder feature reconstruction capability of the improved U-Net.

5.5. Complexity Analysis

In this subsection, the number of operations of the proposed detection network is compared with the other detection networks. As we know, the complexity of detection algorithms mainly comes from multiplication operations. In Table 1, we roughly estimate the number of multiplication operations for DetNet, ScNet, LcgNet, DL-based, MMNet, and the PC-GAN as well as NR-PC-GAN proposed in this paper, using the QPSK modulation method as an example. The OAMPNet algorithm requires matrix inversion operations at each iteration, with a computational complexity of , much higher than the other methods.

In the DNN-based detection algorithm, the computational complexity mainly consists of two parts: initialization and network iteration process. For example, DetNet needs to calculate and during initialization, which require and operations, respectively, while is the number of operations for iteration process, and is the number of iterative layers. The PC-GAN complexity expression differs from the other networks. As shown in Table 1, d is the number of convolution kernels, and are the image length and width, and are the number of the i-th convolution/deconvolution input and output feature maps (the total number of convolution and deconvolution during training and detection is 16 and 13, respectively). In the detection process, PC-GAN convolution brings about 10 times more operations than in DetNet detection. However, in the training process, because PC-GAN uses improved U-Net, only a small number of iterations and a small number of batches per iteration are needed to complete the training. DetNet completes training with 1,000 times more iterations than PC-GAN and 50 times more batches per iteration than PC-GAN. Because we use an online training scheme, the complexity advantage of the PC-GAN method is significant. In addition, NR-PC-GAN is equivalent to performing PC-GAN twice, so the computational complexity is twice that of PC-GAN.
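To make the convolutional part of the operation count concrete, the sketch below tallies multiplications layer by layer. It assumes the standard per-layer estimate (kernel area × output height × output width × input feature maps × output feature maps) and does not reproduce the exact expression in Table 1. The `ConvLayer` type, the function names, and any layer sizes passed to them are hypothetical.

```python
from typing import Iterable, NamedTuple

class ConvLayer(NamedTuple):
    """One convolution or deconvolution layer (all sizes are illustrative)."""
    kernel_size: int   # side length of the square kernel
    out_height: int    # height of the output feature map
    out_width: int     # width of the output feature map
    in_maps: int       # number of input feature maps
    out_maps: int      # number of output feature maps

def conv_multiplications(layer: ConvLayer) -> int:
    """Assumed standard estimate of multiplications for a single layer."""
    return (layer.kernel_size ** 2 * layer.out_height * layer.out_width
            * layer.in_maps * layer.out_maps)

def network_multiplications(layers: Iterable[ConvLayer]) -> int:
    """Total multiplications for one forward pass through all layers."""
    return sum(conv_multiplications(layer) for layer in layers)

def nr_pc_gan_multiplications(layers: Iterable[ConvLayer]) -> int:
    """NR-PC-GAN is described as equivalent to running PC-GAN twice."""
    return 2 * network_multiplications(list(layers))
```

Summing `conv_multiplications` over the 13 layers used during detection would give a per-detection estimate of this kind, and the final helper reflects the statement that NR-PC-GAN costs twice as much as PC-GAN.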

6. Conclusions

In this paper, we propose a PC-GAN for massive MIMO detection. So that the CSI is no longer directly involved in the detection process, PC-GAN presets the CSI as a condition in the received signal and performs MIMO signal detection by learning the probability distribution of the transmitted signal. To improve the nonlinear capability of the network, we use a small convolutional kernel and a ReLU function in the generator. In addition, to adapt the detection to low-SNR scenarios, we propose NR-PC-GAN with a denoising function, which improves detection performance by removing the noise in the received signal. Numerical results show that the detection accuracy of PC-GAN under spatially correlated channels and imperfect CSI can surpass that of OAMPNet and MMNet, which are representative detection methods among existing works. Moreover, with the online training mode, the complexity of PC-GAN can be reduced by a factor of several thousand compared to the DNN-based detection method. In addition, NR-PC-GAN demonstrates superior detection performance in scenarios with high-noise power.

Data Availability

The authors will supply the relevant data in response to reasonable requests.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the Fundamental Research Funds for the Central Universities (grant number: 3072022CF0802).