Abstract

In complex electromagnetic environments, noise is often uncertain and difficult to estimate, which poses a great challenge to spectrum sensing systems. This paper proposes a cooperative spectrum sensing method based on empirical mode decomposition and information geometry. The method comprises two modules: a signal feature extraction module and a spectrum sensing module based on K-medoids. In the signal feature extraction module, the empirical mode decomposition algorithm is first used to denoise the signals collected by the secondary users, reducing the influence of noise on the subsequent spectrum sensing process. The spectrum sensing problem is then treated as a signal detection problem. To analyze it more intuitively and simply, the signal after empirical mode decomposition is mapped onto a statistical manifold using information geometry theory, so that the signal detection problem is transformed into a geometric problem. The corresponding geometric tools are then used to extract statistical features of the signal. In the spectrum sensing module, the K-medoids clustering algorithm is used for training. Successful training yields a classifier, which avoids the complex threshold derivation required by traditional spectrum sensing methods. In the experiments, we verify the proposed method and analyze the results, which show that it improves the spectrum sensing performance.

1. Introduction

Spectrum sensing is a key step in cognitive radio (CR) technology [1, 2]. Its main task is to quickly and accurately detect whether the primary user (PU) is using the spectrum. Traditional spectrum sensing methods include energy detection [3], matched filter detection [4, 5], and cyclostationary feature detection [6]. These methods all have shortcomings. For example, under noise uncertainty, the detection performance of energy detection degrades sharply, and matched filter detection requires prior knowledge of the PU signal and the noise.

In complex electromagnetic environments, noise uncertainty greatly impairs the performance of spectrum sensing [7, 8]. Therefore, it is necessary to reduce the noise and redundant information in the signals collected by the secondary users (SUs). The main challenge is that most of the signals collected by the SUs are nonstationary and nonlinear. To cope with this, traditional noise reduction methods have been proposed, including the short-time Fourier transform, the Wigner-Ville distribution, and the wavelet transform [9, 10]. However, these methods are all based on Fourier transforms and are therefore limited by the uncertainty principle. Boudraa et al. used the empirical mode decomposition (EMD) algorithm, which decomposes a noise-contaminated signal into multiple intrinsic mode function (IMF) components [11]. An EMD denoising method based on the continuous mean square error criterion was proposed in [12]. By analyzing the abrupt change in energy between IMF components, the low-order IMF components before the change point are filtered out to achieve noise reduction.

With the rapid development of information geometry, the concept of statistical manifolds can be applied to signal detection problems. These problems can be transformed into geometric problems on manifolds and then analyzed with geometric tools, which solves the signal detection problems indirectly. Liu et al. used information geometry for radar signal detection and proposed the matrix constant false alarm probability and a geodesic distance detector [13]. Lu et al. applied information geometry theory to spectrum sensing and obtained a closed-form expression of the decision threshold using the matching method, but the algorithm has high complexity [14]. Chen et al. added the manifold geometry measurement and obtained the decision threshold through simulation [15]. However, all these methods need to derive and calculate the decision threshold, which is both complicated and inaccurate.

The application of machine learning in various fields has also attracted great interest from researchers [16, 17]. Thilina et al. studied spectrum sensing using K-means and the Gaussian mixture model (GMM) as unsupervised learning methods, and the neural network (NN) and the support vector machine (SVM) as supervised learning methods [18]. A spectrum sensing method based on signal energy was proposed in [19], which uses K-means to classify the energy features and further analyzes the spectrum sensing performance of the algorithm under different channel models. A spectrum sensing method based on dominant features and the maximum and minimum eigenvalues was proposed in [20], with K-means and GMM selected to form the unsupervised learning framework. Methods combining clustering algorithms with eigenvalue-based features were proposed in [21]. In the signal feature extraction phase, a feature extraction method is used to logically increase the number of cooperative SUs, and K-means and K-medoids are then selected as the learning framework.

Based on the previous studies, this paper proposes a cooperative spectrum sensing method based on empirical mode decomposition and information geometry (EMDIGK) to further improve the spectrum sensing performance in complex electromagnetic environments. In the feature extraction phase, the EMD algorithm is first used to denoise the signals collected by the SUs, so as to obtain the signal characteristics more accurately. We then introduce order decomposition and recombination (ODAR) and interval decomposition and recombination (IDAR) to logically increase the number of SUs. Next, the covariance matrices of the split and recombined signal matrices are calculated separately and mapped onto the manifold using information geometry theory, and the geodesic distances on the manifold are calculated and used as the signal features. Finally, the K-medoids clustering algorithm is used in the training and spectrum sensing phases. The EMDIGK method does not need prior information about the PU signal, so the training data are easy to obtain. In the experimental part, we verify the efficacy of EMDIGK through simulation. The experimental results show that EMDIGK improves the spectrum sensing performance in complex electromagnetic environments.

For ease of reference, the symbols and notations used in this paper are summarized in Table 1.

2. The Basic System Model of Cooperative Spectrum Sensing

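In the cognitive radio network (CRN), from the perspective of a single SU, the presence of the PU signal can be formulated as a binary hypothesis test:

$$x(n) = \begin{cases} w(n), & H_0, \\ s(n) + w(n), & H_1, \end{cases} \qquad n = 1, 2, \ldots, N, \tag{1}$$

where $H_0$ indicates that the PU signal does not exist and $H_1$ indicates that the PU signal exists. $s(n)$ represents the signal transmitted by the PU, $w(n)$ represents the Gaussian noise in the environment, and $N$ denotes the number of sampling points. Based on (1), the detection probability ($P_d$) and the false alarm probability ($P_f$) of the system can be defined as (2) and (3), respectively:

$$P_d = P[\hat{H}_1 \mid H_1], \tag{2}$$

$$P_f = P[\hat{H}_1 \mid H_0], \tag{3}$$

where $\hat{H}_1$ denotes the decision that the PU signal exists.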

For a given algorithm, if $P_f$ remains unchanged, a larger $P_d$ indicates better performance. Similarly, a smaller $P_f$ is desired when $P_d$ is fixed.
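In simulation, the detection and false alarm probabilities are usually estimated as empirical frequencies over labeled trials. The following is a minimal sketch of that estimate; the function name `detection_rates` and the toy labels are illustrative assumptions, not part of the original method.

```python
def detection_rates(decisions, truths):
    """Estimate (P_d, P_f) from binary decisions and ground-truth labels.

    decisions[i] is True when the detector declares the PU present;
    truths[i] is True when the PU was actually present (hypothesis H1).
    """
    if len(decisions) != len(truths):
        raise ValueError("decisions and truths must have the same length")
    h1_total = sum(1 for t in truths if t)
    h0_total = len(truths) - h1_total
    if h1_total == 0 or h0_total == 0:
        raise ValueError("need at least one trial under each hypothesis")
    # Detections: PU declared present when it is present.
    detections = sum(1 for d, t in zip(decisions, truths) if d and t)
    # False alarms: PU declared present when only noise is present.
    false_alarms = sum(1 for d, t in zip(decisions, truths) if d and not t)
    return detections / h1_total, false_alarms / h0_total


# Toy example (hypothetical data): 4 trials under H1, 4 trials under H0.
truths = [True, True, True, True, False, False, False, False]
decisions = [True, True, True, False, True, False, False, False]
p_d, p_f = detection_rates(decisions, truths)
```

Sweeping the decision parameter of a detector and recording one such pair per setting traces out an ROC curve.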

In a CR system, spectrum sensing is usually performed in a complex environment, where sensing by a single SU is often affected by multipath loss, shadowing, and hidden terminals, resulting in performance degradation of the entire system [22]. In a CRN, the fusion center (FC) gathers the information collected by all SUs involved in cooperative sensing and then makes a final decision based on this information [23-26]. The cooperative spectrum sensing system model is shown in Figure 1.

Suppose that $M$ SUs participate in cooperative spectrum sensing in a CRN. The signals collected by the SUs form a signal matrix $X = [x_1, x_2, \ldots, x_M]^T$, where $x_i = [x_i(1), x_i(2), \ldots, x_i(N)]$ represents the signal acquired by the $i$-th SU. Therefore, $X$ is an $M \times N$ matrix.

3. Feature Extraction Based on EMD and Information Geometry

3.1. Feature Extraction Model

In a complex environment, it is first necessary to estimate the noise after EMD denoising in order to establish a representative reference point. The process of feature extraction based on empirical mode decomposition and information geometry is shown in Figure 2. To make the reference point more representative, we first collect enough noise signals and then perform EMD noise reduction, decomposition and recombination (DAR), and covariance conversion (as shown in the dashed box in Figure 2). The Riemann mean of these covariance matrices is then computed and used as the reference point. Similarly, the signal matrix to be sensed is processed by EMD, DAR, and covariance conversion. Finally, the geodesic distances of the resulting covariance matrices from the reference point are calculated and used as statistical features of the signal.

3.2. Empirical Mode Decomposition

Let $x_i(t)$ denote the signal acquired by the $i$-th SU; the subscript $i$ is dropped below for brevity. The aim is to reduce the noise and redundant information in the collected signal, thereby improving the performance of the overall sensing system. First, EMD is used to denoise the signal collected by the SU. EMD adaptively decomposes a complex signal into a series of IMFs:

$$x(t) = \sum_{j=1}^{n} c_j(t) + r_n(t),$$

where $c_j(t)$ is the $j$-th IMF, $n$ is the number of IMFs, and $r_n(t)$ represents the residual. The specific steps of the EMD algorithm are as follows.

Step 1. Find all local maxima and local minima of the signal $x(t)$.

Step 2. Use cubic spline interpolation to fit all local maxima and all local minima, respectively, to construct the upper envelope $e_{\max}(t)$ and the lower envelope $e_{\min}(t)$, and then calculate their average:

$$m_1(t) = \frac{e_{\max}(t) + e_{\min}(t)}{2}.$$

Step 3. Subtract $m_1(t)$ from the original signal $x(t)$ to obtain the first component:

$$h_1(t) = x(t) - m_1(t).$$

Check whether $h_1(t)$ satisfies the conditions of an IMF. If so, continue to Step 4; otherwise, repeat Steps 1 and 2 on $h_1(t)$ to obtain the envelope mean $m_{11}(t)$ and then

$$h_{11}(t) = h_1(t) - m_{11}(t).$$

Repeat until, at the $k$-th iteration, $h_{1k}(t)$ satisfies the conditions of an IMF; then

$$c_1(t) = h_{1k}(t).$$

Step 4. Subtract $c_1(t)$ from $x(t)$ to obtain the first residual:

$$r_1(t) = x(t) - c_1(t).$$

Treat $r_1(t)$ as the original signal and repeat Steps 1 to 4 to obtain $c_2(t)$, and so on, until $r_n(t)$ becomes a monotonic function or a constant.
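The core of Steps 1 to 3 is a single sifting pass: locate the extrema, build the two envelopes, and subtract their mean. The sketch below illustrates one such pass with NumPy. Note the loudly stated assumption: it uses linear interpolation (`numpy.interp`) in place of the cubic spline of Step 2 to stay dependency-light, and the helper names `local_extrema` and `sift_once` are hypothetical.

```python
import numpy as np


def local_extrema(x):
    """Return indices of strict interior local maxima and minima of x."""
    i = np.arange(1, len(x) - 1)
    maxima = i[(x[i] > x[i - 1]) & (x[i] > x[i + 1])]
    minima = i[(x[i] < x[i - 1]) & (x[i] < x[i + 1])]
    return maxima, minima


def sift_once(x):
    """One sifting pass: subtract the mean of the upper and lower envelopes."""
    maxima, minima = local_extrema(x)
    if len(maxima) < 2 or len(minima) < 2:
        raise ValueError("too few extrema to build envelopes")
    n = np.arange(len(x))
    # Assumption: linear interpolation stands in for the cubic spline of Step 2.
    # Outside the first/last extremum, np.interp holds the edge value.
    upper = np.interp(n, maxima, x[maxima])
    lower = np.interp(n, minima, x[minima])
    mean_env = (upper + lower) / 2.0
    return x - mean_env, mean_env


# A pure tone is already IMF-like: its envelope mean is (numerically) zero.
t = np.linspace(0.0, 1.0, 1000, endpoint=False)
tone = np.sin(2 * np.pi * 5 * t)
h, m = sift_once(tone)

# A constant offset is removed by a single pass.
h_offset, m_offset = sift_once(tone + 2.0)
```

A full EMD would repeat `sift_once` until an IMF stopping condition holds, subtract the resulting IMF from the signal, and continue on the residual as in Step 4.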

After the original signal is decomposed by EMD, several IMF components and one residual are obtained. The low-order IMF components mainly carry the high-frequency part of the signal, which contains sharp transients and noise. The high-order IMF components contain less noise and mainly carry the low-frequency part of the signal. Accordingly, the IMF components in the low-frequency band can be selected to reconstruct the signal and thereby reduce the noise. The reconstructed signal is shown in (11):

$$\tilde{x}_k(t) = \sum_{j=k}^{n} c_j(t) + r_n(t), \qquad k = 1, 2, \ldots, n. \tag{11}$$

When selecting the low-frequency IMF components, there must be a critical component $c_{j_s}(t)$ such that the IMF components from $c_{j_s}(t)$ onward carry the main part of the useful signal. The goal of EMD denoising is therefore to find the critical index $j_s$. A continuous mean square error (CMSE) criterion is proposed in [10], shown as follows:

$$\mathrm{CMSE}(\tilde{x}_k, \tilde{x}_{k+1}) = \frac{1}{N} \sum_{i=1}^{N} [\tilde{x}_k(t_i) - \tilde{x}_{k+1}(t_i)]^2 = \frac{1}{N} \sum_{i=1}^{N} [c_k(t_i)]^2, \qquad k = 1, 2, \ldots, n-1,$$

$$j_s = \arg\min_{1 \le k \le n-1} \mathrm{CMSE}(\tilde{x}_k, \tilde{x}_{k+1}).$$

Therefore, the signal after EMD processing can be obtained as

$$\tilde{x}(t) = \sum_{j=j_s}^{n} c_j(t) + r_n(t).$$

After the signals collected by all SUs are denoised by EMD, we obtain a new matrix $\tilde{X} = [\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_M]^T$, where $\tilde{x}_i$ represents the signal acquired by the $i$-th SU after EMD. $\tilde{X}$ is an $M \times N$ matrix.

3.3. Decomposition and Recombination

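After EMD denoising, the $M \times N$ signal matrix $\tilde{X}$, whose $i$-th row $\tilde{x}_i$ is the denoised signal of the $i$-th SU, is processed using ODAR and IDAR, respectively [27], so as to logically increase the number of cooperative SUs and capture more of the characteristic information of the signal. Let $Y_O$ and $Y_I$ denote the signal matrices after ODAR and IDAR, respectively. Both matrices are $qM \times s$. For the $i$-th SU, their rows are defined as follows:

$$y^{O}_{i,k} = [\tilde{x}_i((k-1)s+1), \tilde{x}_i((k-1)s+2), \ldots, \tilde{x}_i(ks)], \qquad k = 1, 2, \ldots, q,$$

$$y^{I}_{i,k} = [\tilde{x}_i(k), \tilde{x}_i(k+q), \ldots, \tilde{x}_i(k+(s-1)q)], \qquad k = 1, 2, \ldots, q,$$

where $q$ is the split parameter and $s = N/q$ is the length of each signal vector after splitting. The covariance matrices $R_O$ and $R_I$ are then calculated using $Y_O$ and $Y_I$:

$$R_O = \frac{1}{s} Y_O Y_O^T, \qquad R_I = \frac{1}{s} Y_I Y_I^T.$$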
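The two recombination schemes can be sketched with array reshapes. This is an illustration under stated assumptions, not the authors' implementation: ODAR is taken to cut each SU signal into `q` consecutive segments, IDAR is taken to collect every `q`-th sample starting at offsets 0 to `q - 1`, and the function names `odar`, `idar`, and `sample_covariance` are hypothetical.

```python
import numpy as np


def odar(X, q):
    """Order DAR (assumed form): split each row into q consecutive segments."""
    M, N = X.shape
    if N % q != 0:
        raise ValueError("N must be divisible by q")
    return X.reshape(M * q, N // q)


def idar(X, q):
    """Interval DAR (assumed form): segment k of a row is x[k::q]."""
    M, N = X.shape
    if N % q != 0:
        raise ValueError("N must be divisible by q")
    s = N // q
    # (M, s, q) -> (M, q, s): within each SU, row k collects samples k, k+q, ...
    return X.reshape(M, s, q).transpose(0, 2, 1).reshape(M * q, s)


def sample_covariance(Y):
    """Sample covariance (1/s) * Y @ Y.T of a zero-mean signal matrix."""
    return Y @ Y.T / Y.shape[1]


X = np.arange(12.0).reshape(2, 6)   # M = 2 SUs, N = 6 samples
Y_o = odar(X, q=2)                  # 4 x 3: consecutive halves of each row
Y_i = idar(X, q=2)                  # 4 x 3: even and odd samples of each row
R_o = sample_covariance(Y_o)
R_i = sample_covariance(Y_i)
```

With `q = 2`, two physical SUs yield four logical SUs, so each covariance matrix is 4 x 4 rather than 2 x 2.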

3.4. Information Geometry Theory

In information geometry theory, consider a set of probability density functions $p(x \mid \theta)$, where $x$ is an $n$-dimensional sample associated with a random variable $X$, $x \in \mathbb{R}^n$, and $\theta$ is an $m$-dimensional parameter vector, $\theta \in \Theta \subseteq \mathbb{R}^m$. The probability distribution space can therefore be described by the parameter set $\Theta$. The family of probability distributions is

$$S = \{ p(x \mid \theta) \mid \theta \in \Theta \subseteq \mathbb{R}^m \}.$$

Let $R$ be the covariance matrix of the signals collected by the SUs:

$$R = \frac{1}{N} X X^T,$$

where $X$ denotes the signal matrix described above, $N$ denotes the number of samples, and $X^T$ denotes the transpose of $X$. We can parameterize the family of probability distributions by $R$ as

$$S = \{ p(x \mid R) \mid R \in \Omega \},$$

where $\Omega$ is an open set of the $M \times M$-dimensional vector space. According to information geometry theory, $S$ forms a differentiable manifold, which is called a statistical manifold, and $R$ is the coordinate on the manifold. Since the parameters of the manifold are covariance matrices, it can also be called a matrix manifold. Under the two hypotheses $H_0$ and $H_1$, we obtain two types of covariance matrices, $R_{H_0}$ and $R_{H_1}$, which correspond to two points on the manifold. As the SNR increases, the difference between the sensed signals under the two hypotheses becomes larger, which corresponds to an increase in the distance between the two points on the matrix manifold [14]. The signal features are extracted on this basis.

3.4.1. Geodesic Distance

In information geometry, the distance between two probability distributions on a statistical manifold can be measured in a variety of ways. On the manifold, the geodesic distance is a widely accepted method for measuring the distance between two probability distributions. Due to the nature of the manifold curvature, the distance between the two points on the manifold depends on the choice of the curve between the two points. A curve that minimizes the distance between two points is defined as a geodesic, and the corresponding distance is called a geodesic distance.

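Consider any two points $\theta_1$ and $\theta_2$ on the manifold, connected by an arbitrary curve $\theta(t)$, $t_1 \le t \le t_2$, with $\theta(t_1) = \theta_1$ and $\theta(t_2) = \theta_2$. The length of the curve between $\theta_1$ and $\theta_2$ is expressed as [28]:

$$D(\theta_1, \theta_2) = \int_{t_1}^{t_2} \sqrt{ \left( \frac{d\theta}{dt} \right)^T G(\theta) \left( \frac{d\theta}{dt} \right) } \, dt, \tag{22}$$

where $G(\theta)$ is the Fisher information matrix, which serves as the metric on the statistical manifold.

Minimizing (22) over all curves connecting $\theta_1$ and $\theta_2$ gives the geodesic distance, and the minimizing curve is the geodesic.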

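In cognitive radio spectrum sensing, the sensed signal is modeled by a family of multivariate Gaussian distributions with the same mean but different covariance matrices [29, 30]. Given two covariance matrices $R_1$ and $R_2$, the geodesic distance [31] between them is

$$D(R_1, R_2) = \sqrt{ \frac{1}{2} \operatorname{tr} \log^2 \left( R_1^{-1/2} R_2 R_1^{-1/2} \right) } = \sqrt{ \frac{1}{2} \sum_{i=1}^{n} \log^2 \eta_i },$$

where $\eta_i$ is the $i$-th eigenvalue of the matrix $R_1^{-1/2} R_2 R_1^{-1/2}$ and $n$ is its dimension.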
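The geodesic distance between two covariance matrices reduces to an eigenvalue computation. The sketch below assumes the commonly used form with a factor of 1/2 inside the square root and symmetric positive definite inputs; the function name `geodesic_distance` is hypothetical. It whitens `R2` with the Cholesky factor of `R1`, which yields a symmetric matrix with the same eigenvalues as `R1^{-1} R2`.

```python
import numpy as np


def geodesic_distance(R1, R2):
    """Geodesic distance between two symmetric positive definite matrices.

    Assumed form: sqrt(0.5 * sum_i log(eta_i)**2), where eta_i are the
    eigenvalues of R1^{-1} R2 (equivalently of the whitened matrix below).
    """
    L = np.linalg.cholesky(R1)                 # R1 = L @ L.T
    B = np.linalg.solve(L, R2)                 # L^{-1} R2
    A = np.linalg.solve(L, B.T).T              # L^{-1} R2 L^{-T}
    A = (A + A.T) / 2.0                        # remove rounding asymmetry
    eta = np.linalg.eigvalsh(A)                # real, positive eigenvalues
    return float(np.sqrt(0.5 * np.sum(np.log(eta) ** 2)))


R_a = np.array([[2.0, 0.5], [0.5, 1.0]])
R_b = np.array([[1.0, 0.2], [0.2, 3.0]])

d_self = geodesic_distance(R_a, R_a)                           # zero up to rounding
d_scaled = geodesic_distance(np.eye(2), np.exp(2.0) * np.eye(2))
d_ab = geodesic_distance(R_a, R_b)
d_ba = geodesic_distance(R_b, R_a)
```

For the scaled identity, both eigenvalues equal e^2, so the distance is sqrt(0.5 * (4 + 4)) = 2, and the distance is symmetric in its two arguments.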

3.4.2. Riemann Mean

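We perform ODAR and IDAR on $P$ collected environmental noise matrices and compute the corresponding covariance matrices, giving one set for ODAR and one set for IDAR, and then find the Riemann mean of each set. For a set of covariance matrices $R_1, R_2, \ldots, R_P$, the objective function is

$$\Phi(R) = \frac{1}{P} \sum_{k=1}^{P} D^2(R_k, R),$$

where $D(\cdot, \cdot)$ is the geodesic distance between two points on the manifold. The matrix that minimizes the objective function $\Phi(R)$ is the Riemann mean [32], which can be expressed as

$$\bar{R} = \arg\min_{R} \Phi(R).$$

When $P = 2$, there are two points $R_1$ and $R_2$ on the manifold, and their Riemann mean is located at the midpoint of the geodesic connecting them. It is calculated using

$$\bar{R} = R_1^{1/2} \left( R_1^{-1/2} R_2 R_1^{-1/2} \right)^{1/2} R_1^{1/2}.$$

When there are $P > 2$ points, the Riemann mean is difficult to calculate. In this case, [31, 33] proposed computing the Riemann mean iteratively with a gradient descent algorithm:

$$\bar{R}_{l+1} = \bar{R}_l^{1/2} \exp\left( -\frac{\tau}{P} \sum_{k=1}^{P} \log\left( \bar{R}_l^{1/2} R_k^{-1} \bar{R}_l^{1/2} \right) \right) \bar{R}_l^{1/2},$$

where $\tau$ is the iteration step size and $l$ is the iteration index. Applying this procedure to the two sets of noise covariance matrices gives the Riemann mean $\bar{R}_O$ for ODAR and $\bar{R}_I$ for IDAR.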
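A gradient-descent Riemann mean can be sketched with eigendecompositions for the matrix square root, logarithm, and exponential. This is an illustrative sketch under assumptions: the inputs are symmetric positive definite, the iteration starts from the arithmetic mean, and the names `_sym_fun` and `riemann_mean`, the default step size, and the stopping tolerance are all hypothetical choices rather than values from the original work.

```python
import numpy as np


def _sym_fun(A, fun):
    """Apply a scalar function to a symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh((A + A.T) / 2.0)
    return (V * fun(w)) @ V.T


def riemann_mean(mats, tau=1.0, tol=1e-10, max_iter=100):
    """Iterative Riemann mean of symmetric positive definite matrices."""
    R = sum(mats) / len(mats)                  # start from the arithmetic mean
    for _ in range(max_iter):
        R_half = _sym_fun(R, np.sqrt)
        R_ihalf = _sym_fun(R, lambda w: 1.0 / np.sqrt(w))
        # Average of the matrix logarithms of all points, whitened by the
        # current estimate; this equals minus the log term of the update rule.
        S = sum(_sym_fun(R_ihalf @ Rk @ R_ihalf, np.log) for Rk in mats) / len(mats)
        R = R_half @ _sym_fun(tau * S, np.exp) @ R_half
        if np.linalg.norm(S) < tol:            # gradient has vanished
            break
    return R


# For commuting matrices the Riemann mean is the elementwise geometric mean.
A = np.diag([1.0, 4.0])
B = np.diag([4.0, 1.0])
mean_ab = riemann_mean([A, B])

# The mean of identical points is the point itself.
C = np.array([[2.0, 0.5], [0.5, 1.0]])
mean_cc = riemann_mean([C, C])
```

In the diagonal example, the mean is 2 times the identity, which differs from the arithmetic mean of 2.5 times the identity and illustrates why the reference point depends on the manifold geometry.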

The Riemann means $\bar{R}_O$ and $\bar{R}_I$ are selected as reference points, and the signal matrix to be sensed is likewise processed by EMD, DAR, and covariance conversion to obtain $R_O$ and $R_I$. Finally, the geodesic distances from $R_O$ to $\bar{R}_O$ and from $R_I$ to $\bar{R}_I$ are calculated separately and taken as the statistical features of the signal.

Using the geodesic distance defined in Section 3.4.1, we calculate

$$d_O = D(R_O, \bar{R}_O), \qquad d_I = D(R_I, \bar{R}_I),$$

where $d_O$ and $d_I$ represent the distances on the manifold between the sensed signal and the reference points $\bar{R}_O$ and $\bar{R}_I$, respectively. If the PU signal does not exist, the distance from the reference points is small. Conversely, if the PU signal exists, the distance is large. This property allows us to analyze intuitively whether the PU is using the licensed spectrum. To train the clustering framework and test the spectrum sensing performance, a feature vector that reflects the signal characteristics is needed. Therefore, the two geodesic distances $d_O$ and $d_I$ constitute a geodesic distance feature vector (GDFV) $T = [d_O, d_I]$.

4. Cooperative Spectrum Sensing Based on K-Medoids Clustering

After extracting the features, we use the K-medoids clustering algorithm to perform spectrum sensing. Traditional spectrum sensing methods need to derive a threshold to determine whether the PU exists, and they often suffer from inaccurate thresholds and difficult calculations. EMDIGK instead uses the K-medoids clustering algorithm as the learning framework. This algorithm is similar to K-means, except that it does not use the mean of the samples in a class as the class centroid. Instead, an actual sample point in the class is taken as the center point, chosen so that its total distance to all sample points in the class is the smallest. Compared with the centroid in K-means, the center point in K-medoids is less affected by extreme values [34]. K-medoids is therefore less sensitive to noise points, and outliers do not cause excessive deviation in the resulting partition. The whole process is divided into two phases: training and spectrum sensing. The cooperative spectrum sensing system model based on K-medoids is shown in Figure 3.

Before training, we need to prepare a training set $\mathcal{T}$:

$$\mathcal{T} = \{T_1, T_2, \ldots, T_L\},$$

where $T_l$ is the feature vector extracted by the method described in Section 3, and $L$ represents the number of feature vectors in the training set $\mathcal{T}$. Let $C_k$ denote the set of training feature vectors belonging to class $k$, where $k = 1, 2, \ldots, K$. Class $k$ has a centroid $\Psi_k$. The training process is as follows.

Step 1. Input the training sample set $\mathcal{T}$ and the number of clusters $K$.

Step 2. Initialize the centroids $\Psi_1, \Psi_2, \ldots, \Psi_K$.

Step 3. Calculate the distance from each sample to each centroid and assign the sample to the nearest class.

Step 4. Update $\Psi_k$ with $\Psi_k = \arg\min_{T \in C_k} \sum_{T' \in C_k} \| T - T' \|$.

Step 5. Calculate $J = \sum_{k=1}^{K} \sum_{T \in C_k} \| T - \Psi_k \|$; if $J$ no longer changes, the algorithm stops; otherwise, return to Step 3.

Step 6. Output $\Psi_1, \Psi_2, \ldots, \Psi_K$.
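The training steps above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: it uses the Euclidean distance, takes the initial medoids as an explicit argument, and stops when the medoids no longer change (which implies the cost no longer changes). The function name `k_medoids` and the toy feature vectors are hypothetical.

```python
import math


def k_medoids(points, medoids, max_iter=100):
    """Cluster points around K medoids, starting from the given initial medoids.

    Returns (medoids, labels); labels[i] indexes the medoid nearest to points[i].
    """
    medoids = list(medoids)
    labels = []
    for _ in range(max_iter):
        # Step 3: assign every sample to its nearest medoid.
        labels = [min(range(len(medoids)), key=lambda k: math.dist(p, medoids[k]))
                  for p in points]
        # Step 4: in each class, pick the member with the smallest total
        # distance to all members of that class.
        new_medoids = []
        for k in range(len(medoids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if not members:                    # keep the medoid of an empty class
                new_medoids.append(medoids[k])
                continue
            new_medoids.append(
                min(members, key=lambda c: sum(math.dist(c, p) for p in members)))
        # Step 5: stop once the medoids no longer change.
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, labels


# Toy two-dimensional feature vectors (hypothetical values): four small-distance
# "noise only" samples followed by four large-distance "PU present" samples.
features = [(0.1, 0.1), (0.2, 0.1), (0.1, 0.3), (0.2, 0.2),
            (1.0, 1.0), (1.2, 1.1), (1.1, 1.3), (1.1, 1.1)]
centers, labels = k_medoids(features, medoids=[features[0], features[4]])
```

Because each medoid is an actual sample, a single extreme feature vector shifts the class center far less than it would shift a K-means centroid.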

After training, the centroid of each class is used to construct a classifier for spectrum sensing, as shown in (35):

$$\frac{\| \hat{T} - \Psi_1 \|}{\min_{k = 2, \ldots, K} \| \hat{T} - \Psi_k \|} > \xi. \tag{35}$$

In (35), $\hat{T}$ represents the feature vector extracted from a channel whose PU state is unknown, and $\Psi_1$ is the centroid of the noise class. If (35) holds, the PU signal exists, and thus the channel is not available. Otherwise, the PU signal does not exist, and the channel can be used. The parameter $\xi$ is used to control the trade-off between the false alarm probability and the missed detection probability in the spectrum sensing process.
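A decision rule of this kind can be sketched as a distance-ratio test against the trained centroids. The sketch assumes the ratio form of the classifier (distance to the noise-class centroid divided by the smallest distance to any other centroid, compared with a parameter `xi`); the function name `pu_present` and the centroid values are hypothetical.

```python
import math


def pu_present(feature, noise_centroid, other_centroids, xi=1.0):
    """Return True when the distance-ratio test declares the PU present.

    The test is written without a division so that a feature lying exactly
    on a PU-class centroid does not cause a division by zero.
    """
    d_noise = math.dist(feature, noise_centroid)
    d_other = min(math.dist(feature, c) for c in other_centroids)
    return d_noise > xi * d_other


noise_centroid = (0.2, 0.2)          # hypothetical trained centroids
signal_centroids = [(1.1, 1.1)]

far = pu_present((1.0, 1.0), noise_centroid, signal_centroids)
near = pu_present((0.3, 0.2), noise_centroid, signal_centroids)
# The same feature can flip when xi is raised, trading detections for
# fewer false alarms.
borderline_loose = pu_present((0.8, 0.8), noise_centroid, signal_centroids, xi=1.0)
borderline_strict = pu_present((0.8, 0.8), noise_centroid, signal_centroids, xi=3.0)
```

Sweeping `xi` and recording the resulting detection and false alarm frequencies is one way to obtain an ROC curve for the trained classifier.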

The overall flow of the EMDIGK algorithm proposed in this paper is as shown in Algorithm 1.

Algorithm 1: EMDIGK
Input: Sensing signal matrix $X$
Step 1: $X$ is denoised by EMD to obtain an $M \times N$ matrix $\tilde{X}$
Step 2: Perform ODAR and IDAR on $\tilde{X}$ to obtain $Y_O$ and $Y_I$, then calculate the corresponding covariance matrices $R_O$ and $R_I$.
Step 3: Extract signal characteristics using information geometry theory to obtain the GDFV $T$.
Step 4: Construct a training vector set and train the K-medoids clustering algorithm to get Eq. (35)
Step 5: Import the test set and get the test result according to Eq. (35)
Step 6: Calculate $P_d$ and $P_f$ by Eq. (2) and Eq. (3)
Output: $P_d$ and $P_f$

5. Simulation Results and Performance Analysis

In this section, we used the multicomponent signal as the experimental simulation signal. To ensure the accuracy of the experiment, we obtained 2,000 signal features, of which 1,000 were used as the training set and the other 1,000 were used as the test set. The test set is used to verify the spectrum sensing performance of the proposed method. We compare the detection performance under different sensing methods, and the experimental results suggest that the proposed EMDIGK-based method can obtain a better sensing performance.

5.1. The Clustering Effect of Clustering Algorithm

Figures 4 and 5 show the GDFVs before and after clustering by K-medoids at a fixed SNR. Under this condition, after K-medoids clustering, the signal features in the two states are well separated. The red markers indicate that the PU exists, while the blue 'x' markers indicate that it does not. The black circle and triangle in the figure represent the centroid of the noise class and the centroid of the class containing the PU signal, respectively.

5.2. The ROC Curve under Different SNR Conditions

Figures 6 and 7 illustrate the ROC curves of different methods at two different SNRs, respectively. IQMSE, IQDMM, IQMME, and IQRMET are the methods proposed in [14], and ED is a method that uses energy as the test statistic.

Tables 2 and 3 show the detection probabilities of the methods under different SNRs when the false alarm probability is held constant. The EMDIGK method has a higher detection probability than the other methods. At the SNR and false alarm probability considered, the detection probability of EMDIGK is 36.6%, 38.5%, 56.4%, 61.6%, and 136.0% higher than those of the other five methods. This is because the EMDIGK method reduces the noise in the communication signal, thereby reducing the impact of noise on the spectrum sensing system. At the same time, the information geometry method makes it possible to analyze the features in the different states more intuitively, so that better detection performance is obtained.

5.3. The ROC Curve with Different Number of Cooperative SUs

Figure 8 shows the ROC curves of the EMDIGK method for different numbers of SUs, with the number of sampling points and the SNR held fixed. The performance improves as the number of cooperative SUs increases. This is because cooperative spectrum sensing reduces the impact of factors such as multipath fading and shadowing in the propagation environment. Therefore, as the number of cooperative SUs increases, the system becomes more robust to interference and performs better.

5.4. The ROC Curve with Different Sampling Points

Figure 9 shows the ROC curves of the EMDIGK method for different numbers of sampling points, with the SNR and the number of cooperative SUs held fixed. As the number of sampling points increases, the information in the sensed signal is more complete, so the extracted features are more representative. As a result, the detection performance of the EMDIGK method improves as the number of sampling points increases.

6. Conclusions

This paper proposes the EMDIGK method to improve the spectrum sensing performance in complex electromagnetic environments. For feature extraction, the EMD algorithm is first used to denoise the signals collected by the SUs, thereby reducing the impact of noise on the subsequent sensing process. To analyze the problem more intuitively and simply, we then adopt information geometry theory. The signal after EMD is mapped onto the statistical manifold, which transforms the signal detection problem into a geometric problem, and the corresponding geometric tools are used to extract statistical features of the signal. Finally, the K-medoids clustering algorithm is used as the learning framework to construct a classifier for spectrum sensing. In the experiments, the performance of the EMDIGK algorithm is verified and analyzed, and the results show its superiority over the compared methods. In future work, we will study the combination of cyclostationary feature detection with clustering algorithms and information geometry, and investigate how to reduce the complexity of the algorithm to achieve spectrum sensing more efficiently.

Data Availability

The data used to support the findings of this study are currently under embargo while the research findings are commercialized. Requests for data, 12 months after publication of this article, will be considered by the corresponding author.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported in part by special funds from the central finance to support the development of local universities (Grant No. 400170044, Grant No. 400180004), the project supported by the State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences (Grant No. 20180106), the degree and graduate education reform project of Guangdong Province (Grant No. 2016JGXM_MS_26), the foundation of key laboratory of machine intelligence and advanced computing of the Ministry of Education (Grant No. MSC-201706A), and the higher education quality projects of Guangdong Province and Guangdong University of Technology.