Abstract
To make up for the limited accuracy of temperature profile observations from existing ground-based microwave radiometers, the application of hyperspectral techniques to the microwave band was attempted. To develop a ground-based hyperspectral microwave radiometer, the detection channels must first be selected. Using the degrees of freedom for signal (DFS), a measure of the information content of atmospheric temperature, the current study selected 200 channels containing 80% of the information from 343 candidate channels in the oxygen absorption bands of 50~70 GHz, 110~130 GHz, and 415~435 GHz. A sensitivity analysis of DFS showed that variation of the background field error had little influence on the channel selection, whereas variation of the observation error significantly affected the information content of the channels. Atmospheric temperature profiles over Kunming in 2015 were then retrieved in simulation with a BP neural network, and the 200 selected channels were compared with the seven temperature detection channels of the currently used microwave radiometer RPG-HATPRO-G3. The inversion results indicate the following: (1) Selecting the channels with 80% of the information content does not reduce the inversion accuracy of the temperature profile. (2) Relative to the 7 channels of RPG-HATPRO-G3, the 200 selected channels improve the inversion accuracy by 0.5 K at heights of 0~8 km and by 0.5~1.2 K in the range of 8~10 km.
1. Introduction
Temperature and relative humidity are basic parameters that describe the state of the atmosphere. Real-time detection of temperature and relative humidity is essential for numerical weather prediction (NWP) and climate research [1, 2]. The ground-based microwave radiometer can obtain atmospheric temperature and relative humidity data in the range of 0 to 10 km. Because of its simple operation and portability, it plays an important role in the atmospheric detection system. However, with the development of NWP, the accuracy of atmospheric temperature detected by existing ground-based microwave radiometers is increasingly unable to meet the demand for accurate NWP [3–5].
Previously, hyperspectral technology was applied to atmospheric sounders in the infrared band and greatly improved the detection accuracy of atmospheric temperature. For example, the Atmospheric Infrared Sounder (AIRS [6]) on the Aqua satellite and the Infrared Atmospheric Sounding Interferometer (IASI [7, 8]) on the MetOp-A satellite have improved the inversion accuracy of the temperature profile to 1 K and the vertical resolution to 1 km in clear sky. However, the infrared band has its own limitations: it cannot penetrate the cloud layer and can only obtain atmospheric information above it. By contrast, the microwave band is less affected by clouds and can achieve all-weather observation. Therefore, applying hyperspectral techniques to the microwave band can be considered as a way to improve the detection accuracy of the ground-based microwave radiometer and the accuracy of NWP [9]. Blackwell et al. [10–12] first published an article on the use of microwave hyperspectral radiometers for the inversion of atmospheric temperature and humidity profiles, demonstrating the feasibility of applying hyperspectral techniques to the microwave band. Xie et al. [13] designed a K-band microwave hyperspectral radiometer that can improve the detection accuracy of atmospheric humidity profiles under high humidity conditions. Aires et al. [14] studied microwave hyperspectral measurements for atmospheric temperature and humidity profiling from satellites.
To develop a ground-based hyperspectral microwave radiometer, we must first select the appropriate detection channels. In the infrared band, the bands are usually divided equally, and several thousand channels with the same frequency interval are obtained [7, 8]. Hyperspectral microwave radiometers cannot achieve a spectral resolution of the same order of magnitude as hyperspectral infrared radiometers, and the number of channels that can be set at the current level of receiver technology is limited, so careful determination of the channel positions by simulation studies is very important [15]. Following the research on information content in atmospheric remote sensing by Rodgers et al. [16–18], the current study used DFS, which can represent information content, as the indicator for channel selection. This article is divided into two parts: (1) With DFS as the indicator, the basis and results of channel selection are introduced, and the sensitivity of DFS to the background field error and the observation error is analyzed. (2) Forward and inverse simulation experiments were carried out with the selected channels and with the channels of the ground-based microwave radiometer RPG-HATPRO-G3, and the results of the two sets of experiments were compared.
2. Channel Selection Basis and Analysis of Results
2.1. Information Theory
According to Shannon's information theory [19], information is gained by eliminating uncertainty: the more uncertainty an observation eliminates, the more information it provides. Information entropy is used to represent this uncertainty, which is expressed by formula (1):
$$S(P) = -\int P(x)\,\log_2 P(x)\,dx \tag{1}$$
where $x$ represents the state vector, $P(x)$ represents the probability density function (p.d.f.) of the state vector, and $S(P)$ represents information entropy. If the state vector is observed and the probability density function of the state vector becomes $P(x \mid y)$, then the amount of information $H$ obtained by this observation can be represented by formula (2):
$$H = S[P(x)] - S[P(x \mid y)] \tag{2}$$
where $S[P(x)]$ represents information entropy before observation and $S[P(x \mid y)]$ represents information entropy after observation.
When information content is discussed, the problem of atmospheric radiation in the microwave band can be regarded as a linear problem [16]. We use formula (3) to represent the relationship between the observation vector and the state vector:
$$y = Kx + \varepsilon \tag{3}$$
where $x$ represents the atmospheric state, a matrix of dimensions $m \times 1$, in which $m$ represents the number of atmospheric layers; $K$ is the Jacobian matrix, an $n \times m$ matrix, with $n$ representing the number of observation channels; $y$ is a matrix of dimensions $n \times 1$ that represents the observed brightness temperature; and $\varepsilon$ represents the observation error.
Information theory will be used in the context of a Bayesian or optimal estimation approach to retrieval, in which information content is conserved. Before observation, $P(x)$ is set as the prior p.d.f. of the atmospheric state. $P(x \mid y)$ is the posterior p.d.f. of the atmospheric state after the observation, and $P(y \mid x)$ is the p.d.f. of the measurement after the state is given. $P(y)$ is the prior p.d.f. of the observation. Their relationship can be represented by formula (4):
$$P(x \mid y) = \frac{P(y \mid x)\,P(x)}{P(y)} \tag{4}$$
When one assumes that the p.d.f.s conform to Gaussian distributions, the above equation can be rewritten as formula (5):
$$-2\ln P(x \mid y) = (y - Kx)^{T} S_{\varepsilon}^{-1} (y - Kx) + (x - x_a)^{T} S_a^{-1} (x - x_a) + c \tag{5}$$
where $x_a$ is the a priori state vector, $S_a$ is the background field error covariance, $S_{\varepsilon}$ denotes the observation error covariance, and $c$ is a constant.
With this model, the posterior estimate of $x$ has a Gaussian p.d.f. with a covariance $\hat{S}$ given by formula (6):
$$\hat{S} = \left(K^{T} S_{\varepsilon}^{-1} K + S_a^{-1}\right)^{-1} \tag{6}$$
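The information content obtained by the observation is calculated from (2) and can be expressed in terms of DFS, which is defined by formula (7):
$$\mathrm{DFS} = \operatorname{tr}\left(I - \hat{S} S_a^{-1}\right) \tag{7}$$
where $I$ is the identity matrix, $\hat{S}$ is the posterior covariance of formula (6), and $S_a$ is the background field error covariance. DFS represents the amount of information about the atmosphere gained through the inversion. The greater the value of DFS, the more atmospheric information the channel obtains.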
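As an illustration of formulas (6) and (7), the following minimal sketch computes the posterior covariance and the DFS for a linear Gaussian problem. The function names and all numerical values are hypothetical and are not taken from the study; the notation ($K$, $S_a$, $S_\varepsilon$) follows the standard optimal estimation formulation.

```python
import numpy as np

def posterior_covariance(K, S_a, S_eps):
    """Posterior covariance of a linear Gaussian retrieval (formula (6))."""
    return np.linalg.inv(K.T @ np.linalg.inv(S_eps) @ K + np.linalg.inv(S_a))

def dfs(K, S_a, S_eps):
    """Degrees of freedom for signal, tr(I - S_hat S_a^-1) (formula (7))."""
    S_hat = posterior_covariance(K, S_a, S_eps)
    identity = np.eye(S_a.shape[0])
    return float(np.trace(identity - S_hat @ np.linalg.inv(S_a)))

# Toy example with made-up numbers: 2 channels observing 3 layers.
K = np.array([[1.0, 0.5, 0.1],
              [0.2, 0.6, 0.9]])
S_a = np.eye(3) * 4.0          # background error variance of 4 K^2 per layer
S_eps = np.eye(2) * 0.2 ** 2   # 0.2 K observation error per channel
print(dfs(K, S_a, S_eps))
```

The DFS of such a system cannot exceed the number of channels or the number of layers, and it decreases as the observation error grows.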
2.2. Selection of Candidate Channels
Oxygen has three main absorption bands, namely, 50~70, 118, and 425 GHz. The satellite-based microwave radiometer has used these three bands to detect atmospheric temperature. The existing ground-based microwave radiometer typically selects the detection channel in the oxygen absorption band of 50~70 GHz. The current study selects the detection channels in all of these three bands. The observation noise of a channel increases with the increase of the frequency and decreases with the increase of the bandwidth. To reduce the measurement noise of channels in 110~130 and 415~435 GHz, the bandwidth can be set as relatively large. For the three bands 50~70, 110~130, and 415~435 GHz, the bands were divided equally into 100, 200, and 500 MHz bandwidths, yielding a total of 343 candidate channels.
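The candidate count can be checked with a short sketch. It assumes that channels are placed on an evenly spaced frequency grid that includes both band edges; this placement is an assumption, but it is consistent with the stated total of 343. The helper name `channel_centres` is hypothetical.

```python
def channel_centres(start_ghz, stop_ghz, step_mhz):
    """Evenly spaced channel frequencies in GHz, both band edges included."""
    n = round((stop_ghz - start_ghz) * 1000 / step_mhz) + 1
    return [start_ghz + i * step_mhz / 1000 for i in range(n)]

# (band start in GHz, band stop in GHz, channel spacing in MHz)
bands = [(50, 70, 100), (110, 130, 200), (415, 435, 500)]
counts = [len(channel_centres(*band)) for band in bands]
candidates = [f for band in bands for f in channel_centres(*band)]
print(counts, len(candidates))  # [201, 101, 41] 343
```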
2.3. Channel Selection Method
Channel selection methods typically fall into two categories. (1) Methods based on weight functions and channel characteristics, such as the singular value decomposition method proposed by Prunet [20] and the principal component analysis method proposed by Mitchell [21]. These methods are based on the radiation characteristics of atmospheric parameters in each channel and do not fully consider the effects of background field errors, observation errors, etc. (2) Methods based on information content and atmospheric inversion, which consider the inversion capability of a channel when selecting it and quantify the contribution of each channel to the inversion parameters.
The current study uses the second method for channel selection. With DFS as an indicator, channels with large information content are selected one at a time. The specific selection process is as follows. In the first round of selection, the channel with the largest DFS value, that is, the channel with the highest information content among all of the candidate channels, was selected. In the second round of selection, the channel selected in the first round was removed from the candidates, the posterior covariance $\hat{S}$ calculated in the first round was set as the background covariance $S_a$ for the second round, and the channel with the highest information content among the remaining channels was selected, and so on. The advantage of this selection method is that it not only selects the channels with higher information content but also ranks the channels by information content, which facilitates the subsequent addition or elimination of channels.
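The round-by-round procedure can be sketched as follows. This is a minimal illustration under the assumption of a diagonal observation error covariance, not the authors' implementation; `select_channels` and the toy inputs are hypothetical. For a single channel with Jacobian row $k$ and noise variance $\sigma^2$, the DFS relative to the current covariance $S$ reduces to $k S k^T / (\sigma^2 + k S k^T)$, and the covariance update is a rank-one correction.

```python
import numpy as np

def select_channels(K, S_a, noise_var, n_select):
    """Greedy channel-by-channel selection ranked by single-channel DFS."""
    S = S_a.copy()
    remaining = list(range(K.shape[0]))
    chosen, gains = [], []
    for _ in range(n_select):
        # Single-channel DFS of every remaining channel w.r.t. the current S.
        signal = np.array([K[j] @ S @ K[j] for j in remaining])
        dfs_each = signal / (noise_var[remaining] + signal)
        best = int(np.argmax(dfs_each))
        j = remaining.pop(best)
        chosen.append(j)
        gains.append(float(dfs_each[best]))
        # The posterior covariance after adding channel j becomes the
        # background covariance of the next round (rank-one update).
        Sk = S @ K[j]
        S = S - np.outer(Sk, Sk) / (noise_var[j] + K[j] @ Sk)
    return chosen, gains

# Toy inputs (made up): 6 channels, 4 layers, 0.2 K noise per channel.
rng = np.random.default_rng(0)
K = rng.normal(size=(6, 4))
S_a = np.eye(4)
noise_var = np.full(6, 0.2 ** 2)
chosen, gains = select_channels(K, S_a, noise_var, n_select=3)
print(chosen, gains)
```

Because the covariance shrinks after every round, the DFS of the selected channel cannot increase from one round to the next, which yields the ranking described above.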
2.4. Calculation of the Background Field Error Covariance Matrix
Following a study by Zhang et al. [22], the experimental background field error covariance matrix was obtained statistically from the deviations of the atmospheric temperature profile data. The filtered clear-sky profiles were averaged, and the average was subtracted from each profile to obtain the deviation matrix. The background field error covariance matrix is obtained by multiplying the deviation matrix by its transpose and then dividing the product by the total number of profiles. It is represented by formula (8):
$$S_a = \frac{1}{N}\,(X - \bar{X})(X - \bar{X})^{T} \tag{8}$$
where $X$ is a matrix of dimensions $m \times N$ consisting of atmospheric temperature profiles, $N$ represents the number of atmospheric profiles, and $\bar{X}$ is the average of the atmospheric profiles, repeated in each of the $N$ columns.
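The statistical estimate described above can be sketched in a few lines. The profiles below are synthetic random numbers used only to exercise the code, and `background_covariance` is a hypothetical helper name.

```python
import numpy as np

def background_covariance(X):
    """Covariance of temperature profiles.

    X has shape (m, N): m atmospheric layers and N profiles.
    Returns the (m, m) background field error covariance matrix.
    """
    mean_profile = X.mean(axis=1, keepdims=True)
    D = X - mean_profile              # deviation matrix
    return D @ D.T / X.shape[1]       # divide by the number of profiles N

rng = np.random.default_rng(1)
X = 280.0 + rng.normal(scale=2.0, size=(5, 300))  # synthetic, not real data
S_a = background_covariance(X)
print(S_a.shape)
```

Dividing by $N$ rather than $N - 1$ follows the description in the text; for 300 profiles the difference is small.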
2.5. Calculation of the Observed Error Covariance Matrix
The observation error typically includes the error caused by the radiation transmission model and the instrument noise of each channel. The error covariance matrix is typically treated as a diagonal matrix. Each element on the diagonal is the square of the root-mean-square error of the corresponding channel. Because no observations from a hyperspectral ground-based microwave radiometer are currently available, the observation error can only be estimated from the per-channel observation errors of existing ground-based microwave radiometers and reference [14]. We assume that the root-mean-square error of the channels is 0.2 K in 50~70 GHz, 0.3 K in 110~130 GHz, and 0.4 K in 415~435 GHz.
2.6. Calculation of the Jacobian Matrix
The Jacobian matrix represents the sensitivity of the observation brightness temperature to temperature profile changes. Therefore, the current study uses the perturbation method to calculate the Jacobian matrix. The perturbation method refers to changing the temperature of the atmospheric layer from the first layer to the top layer by 1 K and observing the variation of the brightness temperature of each channel. It should be noted that when the temperature of the next layer changes, the temperature of the previous layer is restored.
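The perturbation method can be sketched as a finite-difference loop. The forward model below is a stand-in (a fixed weighted average of layer temperatures) chosen only so that the result is easy to verify; in the study the forward model is the radiation transmission model. All names are hypothetical.

```python
import numpy as np

def perturbation_jacobian(forward, T, delta=1.0):
    """Finite-difference Jacobian: rows are channels, columns are layers."""
    base = forward(T)
    K = np.empty((base.size, T.size))
    for i in range(T.size):
        perturbed = T.copy()       # earlier perturbations are not carried over
        perturbed[i] += delta      # warm layer i by `delta` kelvin
        K[:, i] = (forward(perturbed) - base) / delta
    return K

# Stand-in linear forward model with made-up weights (2 channels, 3 layers).
W = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.3, 0.5]])

def forward(T):
    return W @ T

K = perturbation_jacobian(forward, np.array([290.0, 280.0, 270.0]))
print(K)
```

Copying the unperturbed profile at the start of each iteration corresponds to restoring the previous layer before the next one is changed.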
2.7. Channel Selection Results and Analysis
Using DFS as an indicator, the information content of each channel is illustrated in Figure 1. Figure 2 illustrates the oxygen absorption coefficient from 50 to 450 GHz.


As can be seen in Figures 1 and 2, the overall distribution of information content across the channels is similar to that of the oxygen absorption coefficient. Channels with a larger oxygen absorption coefficient also have higher information content. According to Kirchhoff's law, the stronger the absorption of oxygen, the stronger its emission. Therefore, more energy is emitted in a channel with a large absorption coefficient, and the corresponding information content is larger. The 55.7~64 GHz band is an exception: its information content does not follow the oxygen absorption coefficient but instead decreases. This is because the information content depends not only on the atmospheric parameters but also on the background field errors, the measurement errors, and the Jacobian matrix. Because the root-mean-square error of all channels in this band is set to 0.2 K and the background field error is fixed in this experiment, this distribution must be due to the Jacobian matrix in this band. The Jacobian weight functions for 50~70 GHz are illustrated in Figure 3.

As can be seen in Figure 3, the weight functions shown by the blue lines and the red lines are distributed very differently. The red lines gradually decrease from low altitude to high altitude but still have small nonzero values at high altitude, so the channels corresponding to the red lines can detect atmospheric parameters from low altitude to high altitude. By contrast, the weights of the blue lines are close to 0 at high altitude, but at lower levels they are mostly larger than those of the red lines, so the blue-line channels are more advantageous for low altitude detection. Because the blue-line channels lack some of the high altitude information, their information content is lower.
When the 343 candidate channels are arranged by information content from highest to lowest, the results are as illustrated in Figure 4. The 90 channels with the highest information content are distributed in the 50~70 GHz band, followed by the channels in 110~130 GHz, and the channels in 415~435 GHz have the lowest information content. This could be the reason why traditional ground-based microwave radiometers do not select detection channels at 118 and 425 GHz: the information content of channels in these two bands is lower than that of the channels in 50~70 GHz. However, for a hyperspectral ground-based microwave radiometer, the number of channels can be set to several tens to several hundreds. Although the information content in the 118 and 425 GHz bands is relatively small, these two bands are far from the 50~70 GHz band and do not interfere with it. Therefore, channels can be selected in these two oxygen absorption bands. The information content of channels in 50~70 GHz is relatively large. When channels are selected in 50~70 GHz, the frequency resolution of the candidate channels can be further increased to obtain more information. At the same time, it should be noted that an excessively high frequency resolution increases the interference between the channels, which in turn reduces the inversion accuracy. Therefore, choosing the resolution of the candidate channels is an optimization problem.

2.8. Sensitivity Analysis
2.8.1. Sensitivity to Background Field Errors
To examine the sensitivity of the information content of each channel to variation of the background field error, the background field error is assumed to conform to a Gaussian distribution. The experiments set the background field error variance to 0.5 K, 1 K, and 2 K, respectively, and the channel selection results are shown in Figures 5(a), 5(b), and 5(c), respectively. As can be seen in Figure 5, when the error variance of the background field changes, the order of the channels ranked by information content from largest to smallest changes only slightly. The ranks of a very few channels in 110~130 GHz increase slightly, but those of most channels do not change. Therefore, variation of the background field error is considered to have little effect on channel selection.

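[Figure 5: Channel ranking by information content for (a)–(c) background field error variances of 0.5 K, 1 K, and 2 K and (d)–(f) observation errors in the 110~130 GHz band of 0.2, 0.3, and 0.4 K.]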
2.8.2. Sensitivity to Observation Errors
To study the influence of the observation error on the choice of channels, the experiment adopts the control variable method. It is assumed that the observation errors in the 50~70 GHz and 415~435 GHz frequency bands remain unchanged, and the observation errors of the channels in 110~130 GHz are set to 0.2, 0.3, and 0.4 K, respectively. The channel selection results are illustrated in Figures 5(d), 5(e), and 5(f). As can be seen in the figure, as the observation error increases, the information content of the channels in 110~130 GHz generally decreases. When the observation error is 0.2 K, a large number of channels in this band appear from about the 50th position onward. When the error is 0.4 K, channels in this band do not appear until approximately the 125th position. The observation error typically consists of two parts: the error generated by the instrument noise and the error caused by the radiation transmission model. The same radiation transmission model and the same band were used in the experiment, so the error caused by the radiation transmission model was not considered. An increase in the observation error therefore represents an increase in instrument noise: the observed atmospheric information is easily submerged by noise, which decreases the information content of the channel. Therefore, in designing a ground-based hyperspectral microwave radiometer, channel observation errors should be minimized to obtain more information.
3. Data and Results of the Simulation Experiment
3.1. Selection of Simulation Data
The selection of an atmospheric profile database is very important for simulation experiments. The data used in the current study are the atmospheric profile data over Kunming from May to December 2015. To provide the neural network inversion with training data from the same atmospheric environment, the current study filtered 300 clear-sky profiles from the May–December atmospheric profile data for the simulation experiments. To distinguish cloudy from cloudless weather, we analyzed the historical data of Kunming, treated conditions with a relative humidity reaching 85% as cloudy, and removed the cloudy cases to obtain the clear-sky conditions.
Because the lower atmosphere contributes most to the brightness temperature and contains much of the information, the inversion mainly covers the atmosphere from the ground up to a height of 10 km. The atmosphere is finely stratified: the 0 to 10 km region is divided into layers of 100 m, giving a total of 100 layers.
3.2. The Radiation Transmission Model
The research in this paper is based on the classical microwave radiation transmission model pwr. The pwr model contains 49 oxygen absorption lines (at 50~70 GHz and elsewhere) and 15 water vapor absorption lines (at 22 GHz and elsewhere). It uses a line-by-line integration method to calculate the absorption coefficient, which makes the simulation results more accurate. During the research, the elevation angle of the observation was set to 90 degrees (zenith pointing), and scattering effects were not considered.
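The downward radiation brightness temperature received at the ground from the atmosphere and space can be expressed as formula (9):
$$T_B(\nu) = T_c\, e^{-\sec\theta \int_0^{\infty} \alpha(\nu, z)\,dz} + \sec\theta \int_0^{\infty} T(z)\,\alpha(\nu, z)\, e^{-\sec\theta \int_0^{z} \alpha(\nu, z')\,dz'}\,dz \tag{9}$$
where $T_B(\nu)$ is the downward brightness temperature, $T_c$ is the brightness temperature of the cosmic background, about 2.7 K, $\theta$ is the zenith angle, $\nu$ is the frequency of the channel, $T(z)$ is the atmospheric temperature at height $z$, and $\alpha(\nu, z)$ is the oxygen absorption coefficient, which depends on height and frequency.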
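The downward brightness temperature of a non-scattering, plane-parallel atmosphere can be evaluated numerically by summing layer contributions. The sketch below assumes homogeneous layers with a constant temperature and absorption coefficient in each layer; it is not the pwr model, and the function name, units, and test values are hypothetical.

```python
import math

def downwelling_tb(T, alpha, dz, zenith_deg=0.0, t_cosmic=2.7):
    """Brightness temperature at the ground for homogeneous layers.

    T: layer temperatures (K), ordered from the ground upward.
    alpha: absorption coefficients of the layers (1/km).
    dz: layer thickness (km).
    """
    sec = 1.0 / math.cos(math.radians(zenith_deg))
    tb = 0.0
    trans_below = 1.0                    # transmittance from ground to layer base
    for T_i, a_i in zip(T, alpha):
        t_layer = math.exp(-a_i * dz * sec)
        tb += trans_below * (1.0 - t_layer) * T_i   # layer emission reaching the ground
        trans_below *= t_layer
    return tb + trans_below * t_cosmic   # attenuated cosmic background

# 100 layers of 100 m each, isothermal at 250 K (made-up values).
print(downwelling_tb([250.0] * 100, [0.1] * 100, dz=0.1))
```

In the two limiting cases the result behaves as expected: a transparent atmosphere returns the cosmic background, and an opaque isothermal atmosphere returns its own temperature.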
3.3. The Inversion Method
For hyperspectral radiometers, the use of BP neural networks to invert atmospheric temperatures has been shown to be feasible [23]. The input layer corresponds to the channel brightness temperatures simulated by the radiation transmission model and the pressure, temperature, and humidity near the ground, and the output layer corresponds to the temperature of the atmosphere. The number of input layer nodes is the number of channels, the number of output layer nodes is the number of atmospheric layers, and the number of hidden layer nodes is related to the numbers of nodes in the other two layers. Because the number of channels under hyperspectral conditions is large, the number of input layer nodes is also large, and it takes longer to train the parameters of the BP neural network. Therefore, the atmosphere is divided into 5 parts from bottom to top. Each part has 20 layers, and the thickness of each layer is 100 m. Five BP networks were used, one to invert the temperature of each part of the atmosphere. For each of these five BP neural networks, the number of nodes in the input layer is 200 and the number of nodes in the output layer is 20. With reference to the related literature [24], the number of nodes in the hidden layer was finally set to 80. In the case of the 7 channels of RPG-HATPRO-G3, one BP neural network is used, in which the number of input layer nodes is 7, the number of output layer nodes is 100, and the number of hidden layer nodes is 25.
In addition to the number of nodes in each layer, the performance of the BP neural network is largely related to the transfer function and training method used by the network. In this network, the hidden layer uses a hyperbolic tangent sigmoid transfer function (tansig). The input of the function is arbitrary, and the output is between -1 and 1. The output layer takes a linear transfer function (purelin) whose input and output can be arbitrary values. The training method uses a Bayesian regularization algorithm (trainbr). The algorithm can limit the network weight to eliminate overfitting and has good generalization performance.
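The structure of one of the five networks (200 inputs, 80 hidden nodes, 20 outputs) can be sketched as a forward pass. This sketch covers only the layer sizes and transfer functions; training with Bayesian regularization is not shown, the weights are random placeholders, and the function names are hypothetical.

```python
import numpy as np

def init_network(n_in, n_hidden, n_out, seed=0):
    """Random placeholder weights and zero biases for a one-hidden-layer network."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.1, size=(n_hidden, n_in)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.1, size=(n_out, n_hidden)),
        "b2": np.zeros(n_out),
    }

def forward(net, tb):
    """Map channel brightness temperatures to layer temperatures."""
    hidden = np.tanh(net["W1"] @ tb + net["b1"])   # hyperbolic tangent sigmoid
    return net["W2"] @ hidden + net["b2"]          # linear output layer

net = init_network(n_in=200, n_hidden=80, n_out=20)
out = forward(net, np.zeros(200))
n_params = sum(v.size for v in net.values())
print(out.shape, n_params)
```

With these sizes each network has 17,700 trainable parameters, which gives a sense of why splitting the atmosphere into five parts keeps each training problem manageable.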
Taking 200 channels as an example, the main flow for the inversion of atmospheric temperature using a BP neural network is as follows. First, the simulated 200-channel brightness temperatures from May to November 2015 in Kunming and the corresponding atmospheric temperature information are used as learning samples to train the neural network and determine its parameters. Then, 30 sets of simulated brightness temperature data from December are used as test samples and input into the trained neural network to invert the atmospheric temperature. Finally, the sounding data observed in December are used to evaluate the inversion accuracy.
3.4. The Channels Selected by the Simulation Experiment
Figure 6 illustrates the cumulative information content of the selected channel. When added, the channels are arranged in descending order of information content. We selected the first 200 channels that contained 80% of the cumulative information content of all channels for simulation experiments.
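The cut at 80% of the cumulative information content can be sketched as follows. The per-channel values are made up for illustration, and `channels_for_fraction` is a hypothetical helper; the study applies the same idea to the 343 ranked candidate channels.

```python
def channels_for_fraction(info, fraction=0.8):
    """Indices of the highest-information channels whose sum reaches `fraction`."""
    order = sorted(range(len(info)), key=lambda i: info[i], reverse=True)
    target = fraction * sum(info)
    selected, running = [], 0.0
    for i in order:
        if running >= target:
            break
        selected.append(i)
        running += info[i]
    return selected

info = [1, 7, 2, 1, 10]                 # hypothetical per-channel information content
print(channels_for_fraction(info))      # [4, 1]
```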

The seven temperature detection channels of the RPG-HATPRO-G3 microwave radiometer were selected for comparison experiments. The frequencies were 51.26, 52.28, 53.86, 54.94, 56.66, 57.3, and 58 GHz.
3.5. Simulation Results
3.5.1. Forward Simulation Results
The 300 sets of clear-sky profile data from May to December 2015 in Kunming were averaged, and the resulting mean profile was input into pwr to calculate the downward radiation brightness temperature of the selected 200 channels and of the 7 channels of RPG-HATPRO-G3. The Jacobian weight functions were calculated in both cases. The results are shown in Figures 7 and 8:

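[Figure 7: Downward radiation brightness temperature of the channels, panels (a) and (b).]
[Figure 8: Jacobian weight functions of the channels, panels (a) and (b).]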
As shown in Figure 7(a), the distribution of the downward radiation brightness temperature of the selected 200 channels is similar to that of the oxygen absorption coefficient: channels with a large oxygen absorption coefficient also have a large downward radiation brightness temperature. Among the 200 channels, some have relatively close values of radiation brightness temperature, so ground-based hyperspectral microwave radiometers require a higher radiometric resolution. The Jacobian weight function of each channel is shown in Figure 8. Relative to the 7 channels, the weight functions of the 200 channels can be regarded as many additional weight functions interpolated between those of the 7 channels. Each weight function curve corresponds to different atmospheric information, so the 200 channels sample the atmospheric temperature more densely.
3.5.2. Inverse Simulation Results
In this paper, the sounding data are taken as the truth, and the deviation between the inversion result and the true profile is calculated at each height level. The deviation measure ME is defined by formula (10):
In the formula, $N$ is the total number of samples retrieved at a given height layer (here $N$ is 30), $T_i$ is the sounding data at this height, and $\hat{T}_i$ is the inversion result.
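A per-level deviation of this kind can be computed as in the sketch below. Formula (10) is not reproduced here, so the sketch assumes a mean absolute deviation over the samples at each level; this choice, the function name, and the sample values are assumptions made only for illustration.

```python
import numpy as np

def mean_deviation_per_level(sounding, retrieved):
    """Mean absolute deviation per height level.

    Both inputs have shape (N, levels): N samples, one column per level.
    """
    sounding = np.asarray(sounding, dtype=float)
    retrieved = np.asarray(retrieved, dtype=float)
    return np.abs(retrieved - sounding).mean(axis=0)

# Two made-up samples at two height levels (K).
sounding = np.array([[290.0, 280.0], [288.0, 279.0]])
retrieved = np.array([[291.0, 279.5], [287.0, 279.5]])
print(mean_deviation_per_level(sounding, retrieved))
```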
Figure 9 shows the average deviation of the atmospheric temperature inversion at each altitude. Compared with the 7-channel inversion, the 200-channel inversion improves the accuracy by 0.5 K in the height range of 0 to 8 km and by 0.5 to 1.2 K in the range of 8 to 10 km. The inversion results of the 200 channels were significantly better than those of the 7 channels. A possible reason is that increasing the number of observation channels increases the vertical resolution of the atmospheric temperature, so that the temperature distribution is measured better. A comparison of the inversion results of the 200 channels and of all 343 channels showed that using the 200 channels with 80% of the information content did not significantly reduce the inversion accuracy.

Figure 10 shows a typical inverted temperature profile selected from the inversion results. The time of the sounding data is 23:15 on December 1, 2015. The black line is the true atmospheric profile, the red line is the 200-channel inverted atmospheric profile, and the blue line is the 7-channel inverted atmospheric profile. It can be seen that the 200-channel result agrees better with the true atmospheric profile.

3.5.3. Analysis of Influence of Observation Noise on Inversion Accuracy
Although hyperspectral technology has the advantages of multiple detection channels, multiple channels also bring about the problem of channel interference. When the observation noise is too large, the inversion accuracy is reduced. To study the effect of noise on the inversion accuracy, Gaussian white noise is added after the brightness temperature of each channel is obtained. The noise variance is set to 0 K, 0.1 K, and 0.3 K, respectively, and the 200-channel inversion deviation is shown in Figure 11. When the noise variance increases, the inversion deviation of the temperature profile increases. However, compared with Figure 9, the inversion deviation is still smaller than the 7-channel inversion deviation. This shows the superiority of the hyperspectral channel, and also puts forward higher requirements for the manufacture of hyperspectral instruments. The smaller the noise of each channel, the better.
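The noise experiment can be sketched by perturbing simulated brightness temperatures with zero-mean Gaussian white noise. The sketch treats the stated noise levels (0, 0.1, and 0.3 K) as standard deviations, which is an assumption based on their unit; the function name and the brightness temperature values are hypothetical.

```python
import numpy as np

def add_observation_noise(tb, noise_std, seed=None):
    """Add zero-mean Gaussian white noise (standard deviation in K) to each channel."""
    rng = np.random.default_rng(seed)
    return tb + rng.normal(loc=0.0, scale=noise_std, size=np.shape(tb))

tb = np.linspace(100.0, 290.0, 200)     # made-up 200-channel brightness temperatures
for noise_std in (0.0, 0.1, 0.3):
    noisy = add_observation_noise(tb, noise_std, seed=0)
    print(noise_std, float(np.abs(noisy - tb).max()))
```

Each noisy set of brightness temperatures would then be passed through the trained network to obtain the inversion deviation for that noise level.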

3.5.4. Analysis of Influence of Water Vapor Content on Inversion Accuracy
Water vapor also absorbs radiation at 50~70, 118, and 425 GHz. The more water vapor there is, the more radiation is absorbed. Therefore, it is necessary to analyze the influence of water vapor. Water vapor density (vapden) of 0.1 g/m³ and 0.5 g/m³, respectively, was added to each layer of the atmosphere; the downward radiation brightness temperature was calculated under the two conditions and input into the inversion model to obtain the 200-channel inversion of atmospheric temperature. It can be seen from Figure 12 that when the vapden increases by 0.1 g/m³, the inversion accuracy is almost unchanged; when the vapden increases by 0.5 g/m³, the inversion accuracy is reduced.

4. Conclusion
Based on the information content, the current study set 343 candidate channels for selection. We arranged the channels by information content from highest to lowest, and the first 200 channels, containing 80% of the total information content, were selected for simulation experiments. They were compared with simulation experiments using the 7 channels of the radiometer RPG-HATPRO-G3. It was found that the use of hyperspectral technology can significantly improve the inversion accuracy of the atmospheric temperature profile and that selecting the channels with 80% of the information does not reduce the temperature profile inversion accuracy. The factors affecting the information content of each channel were analyzed: the background field error had little effect on information content, and a larger observation error resulted in lower information content. The factors affecting the accuracy of inversion were also analyzed. Water vapor content affects the temperature inversion accuracy only when it changes substantially. Observation noise reduces the inversion accuracy of temperature, but the inversion results of the 200 channels with high measurement noise are still better than those of the 7 channels with low measurement noise. Therefore, it is meaningful to develop ground-based hyperspectral microwave radiometers. In addition, the current study is based on clear-sky conditions, and future research will study the atmosphere under cloudy conditions.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare no conflicts of interest.
Authors’ Contributions
Rui Wang, Wei Yan, Wen Lu, Shuo Ma, Chengming Gu, and Xianbin Zhao conceived and designed the structure of the paper. Yuxun Wang conducted the literature research, analyzed the data, and wrote the main part of the paper. All authors contributed to the discussion of the results and have read and approved the final manuscript.
Acknowledgments
This research was funded by the National Natural Science Foundation of China (grant numbers 41605016, 41575028, 41705007, and 41475019) and the Natural Science Foundation of Hunan Province of China (grant number 2019JJ50719). We sincerely thank Professor Jie Zhang and Ms. Huiling Xie of Chengdu University of Information Technology for their sounding data from May to December in Kunming Changshui Airport in 2016.