Abstract

When massive numbers of wireless IoT devices are being deployed, cognitive spectrum management is critical to satisfy the explosive broadband requirements of IoT applications. Heterogeneous the target of the spatial spectrum estimation which is part of array signal processing is to achieve the evaluation of space signal parameters and source location, which result in that the spatial spectrum estimation becomes the most basic content of the array signal processing. It needs a method by which the large dimensional array still has its consistency. Therefore, this paper studies an improved large dimensional array parameterized spatial spectrum estimation method based on Pisarenko method, named G-Pisarenko method. Firstly, an improved estimator about the logarithm of the covariance matrix of a certain bilinear form is analyzed which is based on the theory of large dimension random matrix. We can find out a relatively better method, i.e., MW method. The method will become the primitive method for us to improve. Then, aimed at the relating covariance matrix in MW, we use an improved large dimensional array estimation method which can improve the logarithm of the covariance matrix estimation. Finally, we compare the improved method and the original method by simulation, and it can be seen the clear advantage of G-Pisarenko method when the sample number and observed dimensions are in the same order of magnitude.

1. Introduction

The development of Industrial Internet of Things (IIoT) has been limited due to the shortage of spectrum resources. Also, the traditional ground industrial Internet of Things (IIoT) cannot supply wireless interconnections anywhere due to its small-scale communication coverage, meanwhile the Internet of Things (IoT) is facing the shortage of spectrum resources due to the rapid growth of IoT terminals and big data services. Fifth generation (5G) network owns sufficient spectrum resources and supplies large data volume business, which can help to expand the communication resources of the IoT by combing IoT with 5G network [13].

Also known as spatial signal processing, array signal processing utilizes some properties of signals in space to process signals [4]. The specific implementation process is to put multiple sensors in different directions in space to form an array, then utilize the specific array of the sensors to receive and process signals, and extract interested signals and related parameters. At the same time, noise and other interference information are suppressed and eliminated. At present, this processing method has occupied a very important position in the field of signal processing, and it is also of great significance in the fields of radar and sonar [5]. Array signal processing has many advantages, which are strong anti-interference ability, higher gain, strong spatial superresolution, and flexible beam control. Therefore, array signal processing occupies a very important position in the current research [6, 7].

The earliest spatial spectrum estimation algorithm is called conventional beamforming (CBF) method, which is based on the Fourier spectrum estimation method in time domain. It mainly replaces the data processed in the traditional time domain with the data received by each array element in the spatial domain. It is also called Bartlett beamforming method, but it has a fatal disadvantage. Namely, the resolution of the array angle would be limited by its physical aperture, and the target located in a beam width space could be often indistinguishable. Subsequently, there are many high-resolution estimation methods that extend nonlinear signal processing technology to spatial domain. There are Burg’s maximum entropy method (MEM), Pisarenko’s method, and Capon’s minimum variance method (MVM) [810]. All these methods have a common premise. Namely, they are based on the continuous distribution of signal sources in space, but they are not satisfied in practice. Therefore, there are great limitations. Later, there would be algorithms to mathematically decompose the data received by the array, such as MUSIC algorithm, which divides the received data into two subspaces of mutually orthogonal signal and noise. Hereafter, the spatial spectrum peak is constructed to improve the resolution [11].

However, no matter what method is used for spatial spectrum estimation, multivariate statistical knowledge is needed. Moreover, in multivariate statistical analysis, there are two kinds of limit results which are quite different. One is that the assumed dimension is very small and the sample capacity is relatively much larger than the dimension. At this time, the theory of satisfaction is called classical limit theory. The other is the large-dimensional limit theory. When the dimension is very high, it might even fail according to the situation described by the classical limit theory. Therefore, the large-dimensional limit theory is proposed. At this moment, it is necessary to combine the large-dimensional random matrix with the spatial spectrum estimation to find an improved estimation method, which could satisfy both the dimension and the observed value geometry. With the maturity of spatial spectrum estimation technology, spatial spectrum estimation based on large array would inevitably become the mainstream of development, and the application of random matrix theory in spectral estimation is also a great progress and breakthrough.

Random matrix theory (RMT) refers to the matrix, whose elements are real, complex, or quaternion of random variables, while the analytical and statistical properties of eigenvalues and eigenvectors are mainly researched in random matrix theory. In the recent decades, with the rapid development of computer science and technology, large-scale data analysis is becoming more and more important. There are statistical problems of massive data in the fields of biological microarray data, wireless communication network, financial stock analysis, spatial spectrum estimation, and so on. However, the classical statistical method is based on the case that the dimension is much larger than the sample capacity. This is not satisfied when the dimension sample capacity tends to infinity at the same time, so the large dimension random matrix theory is used.

At the earliest, random matrix theory was researched by Wishart and Hsu. The main work was the joint distributions of matrix elements and eigenvalues in multivariate statistical analysis. Subsequently, physicists introduced the theory into atomic physics and thought that a random Hermitian matrix could replace the Shrodinger operator, and its eigenvalues were used to describe the observed energy level. Nowadays, there are more and more kinds of random matrix, and the common basic research object is the limit spectrum distribution. This is helpful to research the more precise properties of matrix eigenvalues, such as the increase and decrease of eigenvalues and the distribution of maximum and minimum eigenvalues. At present, many scholars have conducted more in-depth research.

Recently, based on the technique of contour integral in random matrix theory, an improved estimator for determining the logarithm of covariance matrix in bilinear form has been proposed [12]. This newly demonstrated estimator is applicable not only to high sample capacity (such as traditional estimator) but also to the case that both observed dimension and sample capacity increase at the same rate. In practical application, sample capacity and observed dimension always have the same order of magnitude. In this case, this feature shows very good performance. Better fitting estimation could be obtained through using this estimator in spatial spectrum estimation.

In conclusion, array signal processing has already become a very important branch of signal processing field. Thereinto, the spatial spectrum estimation technology could be used to estimate some parameters of signal source position or signal spatial domain, so it becomes the most basic content of array signal processing. At present, many spatial spectrum estimation methods have been proposed, but they have a common drawback. It is not considered that the number of samples and the observed dimension (the number of array elements) are in the same order of magnitude, but only the number of samples is much larger than the observed dimension under the assumption. Therefore, it is urgent to research an estimation method that could still be consistent in the case of large arrays. This is also the important research content in this paper.

2. Theoretical Analysis

Through the analysis of traditional Pisarenko algorithm, it is found that these estimates have a common feature, which is only for the case that the number of samples is far greater than the observed dimension. When the number of samples and the number of sensors is at the same level, there are large errors in all estimates, but the actual situation is like this. In this section, an improved estimator about the logarithm of the covariance matrix of a certain bilinear form is analyzed. This new estimator is applicable not only to high sample capacity (such as traditional estimator) but also to observed dimension and sample capacity increasing at the same rate. This estimate value is based on the theory of large dimension random matrix.

2.1. Mathematical Modeling

A set of observations [13] with a definite zero mean of dimensions (,different) are considered. That is and that is to say . It is assumed that these observations are equally distributed with zero mean and form a covariance matrix together. and are used to represent the eigenvalues and related eigenvectors of matrix . is used to represent the logarithm of matrix . Then, that is

The logarithm of the covariance matrix plays an important role in many signal processing applications, such as spectrum estimation, parameter identification, and spectrum representation. For example, in the spectrum estimation, Pisarenko proposed the following form of spectrum estimation: . In this formula, is a definite reversible function with inverse . is a column vector of , and its th value is equal to the value of . In the spectrum estimation, when there is , there would be power estimators:

Namely, the power depends on the quadratic form of the covariance matrix logarithm of the observations.

In practice, the real covariance matrix is unknown and must be estimated from the samples. The common method is to construct a sample covariance matrix, such as , and then, is estimated from . If there is a process from to infinity, the estimated value of is always satisfied. It could be assumed that is infinitely close to , when is large enough. However, in practice, the number of the sample may not be very large compared with the actual dimension of the matrix , so the accuracy is quite low in this case.

To solve this problem, a generalized progressive form would be focused on. Namely, and could be regarded as large but comparable numbers. The consistent estimation of definite bilinear form of would be considered, when and increase to infinity at the same rate. More specifically, we would think that the numerical estimates like are consistent in this new progressive form, if and are two universal column vectors with the same unit standard. If and are, respectively, determined to be the -th column and the -th column in the unit matrix , the value of would be equal to the value of the position in the matrix . In some applications (such as spectrum power estimation and spectrum coefficient determination), is a more meaningful thing. Thereinto, is defined above. In this case, the main target is to estimate the value of .

2.2. Method of Estimation

The number of samples is regarded as a function of the observed dimension. That is . is introduced. It is strictly assumed that the number of samples is higher than the observed dimension. That is . The following assumptions [14] are made: (i)(Hypothesis 1). According to the zero mean Gauss law of covariance matrix, the set of dimension vectors is distributed independently and identically(ii)(Hypothesis 2). The minimum eigenvalue of (or the maximum eigenvalue) is not equal to zero or infinite in (iii)(Hypothesis 3). The sequence is not equal to 0 or 1

The main conclusion under this hypothesis is that the estimator of the universal bilinear form is generally applicable, when and tend to infinity at the same rate. The above definition of sample covariance is considered. and are used to represent its eigenvalues and corresponding eigenvectors, respectively.

Theorem 1. The both unit standards could be considered to be the certain column vector sequences and of . could be satisfied. There is , when all three hypotheses are satisfied, Thereinto: The coefficient is Thereinto, the value of is equal to the value in the following equation:

It is worth pointing out here that the new estimate value could converge to the classical value, if is taken as a fixed value and tends to infinity. Actually, in the observation in equation (5), when tends to infinity, could be obtained in the expression, in which and could apply this progressivity with in formula (3).

2.3. Theorem Proving

The proof of the theorem is based on the contour integral method. The eigenvalue of the covariance matrix is consistent with the estimate value of the eigenvector, when it is proved that and tend to infinity and the value of is a constant. Firstly, the estimated quantity needs to be expressed in the form of complex contour integral [15] of real covariance matrix decomposition function. This is defined as a matrix . Thereinto, the contour of the integral is parameterized by a very convenient function , which would be described in the following content. This parameterization allows us to use a definite algebraic expression of consistent estimation of decomposition function of sample covariance matrix to express the integrand function. Finally, the expression could be obtained through using the classical Cauchy residue theorem.

2.3.1. Formulation of Contour Integral and Parameterization of Contour

From the observed value, Cauchy residue theorem could be used to express as an integral form:

Thereinto, includes the clockwise direction of all eigenvalues’ oriented contour, and is a holomorphic logarithmic branch above . The contour is regarded as a curve in . For arbitrary , could satisfy:

for the rectangle ’s periphery, is a result of the following polynomial equation:

In particular, seen in Figure 1, it is worth noting that is the only root which makes both and true, if there is. If there isto take , that the contour does not intersect the negative real axis could be deduced because of.

The parameterization of the contour is used, and formula (8) is rewritten as follows:

Thereinto, is the complex derivative of . That 0 does not belong to the rectangle R could be seen (because of ), and there is periphery distance between of the distance 0 of and the eigenvalue of . The both holomorphic functions and could be defined as follows: makes,

Thereinto, is the derivative of .

2.3.2. Consistent Estimation of Integrand

Under Hypothesis 1 and Hypothesis 3 [16], and are almost applied to all , when and tend to infinity. Thereinto, and are defined as follows:

According to the convergence theorem, the results in the relevant data could be reused, and the random quantities and in the above integral sum could replace . Moreover, when and tend to infinity, it is certain that there could be:

Thereinto,

Therefore, it is necessary that in (14) could also be expressed as in (4), if the proof of Theorem 1 could be obtained.

2.3.3. Processing the Produced Integral

Unfortunately, the nonisolated singularity could be appeared for the above integral in (due to the appearance of logarithms). Moreover, the integral could not be processed with the conventional complex integration method. To solve this, the following real-valued function is defined:

This function could prove that could be well-defined. In addition, and could be easily solved by Cauchy residue theorem. Actually, the integral of is meromorphic, has single and plural poles at the place , has double and plural poles at the place [17], and has single and plural poles at the place . In , the algebraic property could be utilized to know that is equal to defined in (4).

Next, for any function that could satisfy , this function is not only continuously differentiable but also satisfactory for:

For any arbitrary , this integral could also be calculated in the closed case, in which the conventional integration method is used. Actually, the above integral a meromorphic function, which has double complex poles at , single complex poles at , and single complex poles at and . Thereinto, is the solution that satisfies the following equation . Through calculating the remainder of these poles, it is easy to obtain the closest expression with respect to , which is defined as follows:

The first three terms of the above equation are a simple primitive function for . Thereinto, the appearance of the function makes the last item difficult to be processed. To simplify the last term, the variables in the above formula could be changed to make . Afterwards, the polynomial parameters produced in are expressed as the sum of simple fractions. The derived original function and the value derived above are used to obtain:

Finally, is taken, and some algebraic identities are used to obtain the value as shown in Theorem 1.

It could be seen that this method is indeed a good estimate, and it still has good properties in large-dimensional arrays.

3. G-Pisarenko Algorithm Design

In this section, the aforementioned logarithmic improved estimator is applied to the traditional Pisarenko algorithm to propose a new algorithm, namely, G-Pisarenko algorithm.

3.1. Signal Model

The array used in this paper is a linear array. With the origin as the reference point, the array element position is assumed to be . When the angle of the signal incidence is , there is then. When the array is a uniform linear array, there is only a relative delay difference among the different array elements, which receive signals in the same direction. At this moment, the first array element is regarded as the reference array element, and there is . Therefore, the common array models are combined to be:

Thereinto, is the dimension receiving signal vector of the array, is the dimension signal vector, is the dimension noise vector, is the dimension steering vector array of the array, and is the steering vector of the array.

3.2. G-Pisarenko Algorithm

All algorithms satisfying ( is the inverse function of ) belong to Pisarenko algorithm [18]. Thereinto, the best overall performance is the MW algorithm . Based on this, Theorem 1 in Section 2.2 is combined to obtain the following conclusion:

Since is the determined sequence of column vectors, could be satisfied; there is , when all three assumptions in Section 2 are satisfied. Thereinto,

The coefficient is:

Thereinto, the value of is equal to the value in the following formula:

The two expressions of are compared. In the case of large dimensions, could be used to represent,

According to equation (23), the relevant simulations are performed, and the following simulation results are obtained.

4. Simulation Results and Analysis

4.1. Spectrum Peak Search Results

In order to illustrate the validity of the proposed estimates, an array processing procedure with three signal sources is considered. The array element is composed of a uniform array. The distance between the array elements is half a wavelength. All three signal sources are subject to the standard of Gaussian distribution. The superimposed noise is a random process with a mean value being zero and a variance being one. The arrival direction of the signal source being relative to the periphery of the array is fixed to [-40, 0, 40] degree. The simulations are performed in the following cases:

The first case is and . Namely, the number of samples is much larger than the observation dimension (the number of arrays), and it is run once. At this moment, the spectrum peak search results of traditional Pisarenko and G-Pisarenko methods are shown in Figures 2 and 3.

It could be observed that at this moment, the both have the spectrum peaks at -40 degrees, 0 degree, and 40 degrees. However, there would be more serious pseudo peaks under the traditional method. In comparison, there are obvious peak values in the improved method. If the angular interval is very small, the estimation result would be inaccurate. This is also one problem in the traditional Pisarenko’s MW method. There are obvious peak values in the improved method, when the angular interval is small.

This experiment is repeated 50 times, and its results are shown in Figures 4 and 5.

It could be seen that in the traditional method, there are serious pseudo peaks, and the graphics are more chaotic around 0 degree. At -40 degrees and 40 degrees, its effect is not as clear as that of the improvement method also.

The second case is and . The number of samples and the value of the observation dimension are smaller, but the size is in the same order of magnitude. At the same time, it is a constant, which satisfies and is greater than 0 and less than 1. At this moment, the simulation results are shown in Figures 6 and 7.

It could be observed that in the traditional method, not only the pseudo peaks exist but also the values of the pseudo peaks are relatively larger in this case, which is similar to case 1. Its difference is not as obvious as that of the improved method. Therefore, the improved method still has its advantages in this case.

This experiment is repeated 50 times, and its simulation results are shown in Figures 8 and 9.

The third case is and . The number of samples and the value of the observation dimension are larger, but the size is in the same order of magnitude. At the same time, it is a constant, which satisfies and is greater than 0 and less than 1. Since the amount of calculation is especially large when the values and are larger, 100 and 200 are used to roughly fit the case of large dimension. At this moment, the simulation results are shown in Figures 10 and 11.

It could be observed that the pseudo peak values in Figure 9 are very close, and the distortion is serious. The spectrum peaks in Figure 10 are obvious. Due to the large amount of calculation, the simulation results after 50 repetitions would not be discussed here, but it could be inferred that there is not much difference between 50 times and 1 time. The estimated spectrum peaks are still more obvious in the improved algorithm.

4.2. Comparison for Root Mean Square Error

Next, the root mean square errors generated in the traditional method and the improved method could be compared, when the different values are taken for the signal-to-noise ratio. Here, the arrival direction of the signal source could be fixed to the degree, which is relative to the periphery of the array. The signal-to-noise ratio is between -15 and 20. When running 100 times, the simulation results are as follows:

The first case: when there are and , the simulation results at this moment are shown in Figure 12. It could be found that when the number of array elements and the number of samples are in the same order of magnitude, the root mean square error of the G-Pisarenko method is much lower than that of the traditional Pisarenko method. As the signal-to-noise ratio increases, the mean square error of the improved method gradually approaches zero.

The second case: when there are and , the simulation results at this moment are shown in Figure 13. It could be found that when the number of array elements is relatively small, the root mean square error of the G-Pisarenko method is still much lower than that of the traditional Pisarenko. As the signal-to-noise ratio increases, the root mean square error of the improved method gradually approaches zero. The spectrum estimation effect is very well.

5. Summary

In this paper, an improved Pisarenko parameterized spatial spectrum estimation method, namely, the G-Pisarenko method, is proposed based on random matrix theory. From the simulation, it could be found that when and values are very small, the spectrum peaks of the improved algorithm are more obvious, and there is no redundant pseudo peak. When the and values are larger, the spectral peaks in the traditional algorithm are chaotic, but the performance is still well in the improved algorithm. From the perspective of root mean square error, the error of the improved algorithm is much smaller than that of the traditional algorithm. So, this is a very good improved estimation. According to the principle of the improved estimation, the methods in this paper could be used in other algorithms to achieve an improved feature decomposition algorithm for large-dimensional arrays. The scheme can be applied to the Internet of things and play an effective improvement.

However, at the same time, it is not difficult to find that during the spectrum peak search, a large amount of calculation could be generated in the method proposed in this paper. The complexity of the calculation is so high that it is impossible to discuss the situation where the values of and tend to be infinite. Therefore, this paper also needs to be improved on this basis. A simpler and more effective algorithm could be found. In addition, the situations researched in this paper are based on the ideal situations of many assumptions, which could only be ideally approximated in practical occasions. The effectiveness needs to be improved.

Data Availability

The (data type) data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

This paper has clear and legal intellectual property rights or property rights.

Acknowledgments

This work was supported by the Beijing City Board of education project (No. KM202010858005 and 2020Z006-KXZ).