Abstract

In this study, radiosonde observations during the period of 2012-2013 from three stations in the Hunan region, China, were used to establish regional weighted-mean temperature ($T_m$) models (RTMs) that are a fitting function of multiple meteorological factors (surface temperature $T_s$, water vapor pressure $e_s$, and air pressure $P_s$). One-factor, two-factor, and three-factor RTMs were assessed by comparing their $T_m$ against the radiosonde-derived $T_m$ (as the truth) during the period of 2013-2014. Statistical results showed that the bias and RMS of the one-factor RTM, in comparison to the Bevis model (BTM) result, were reduced by 88% and 28%, respectively. The two-factor and three-factor RTMs showed similar accuracy and both outperformed the one-factor RTM, with an improvement of 7% in RMS. The bias and RMS of all four seasonal two-factor RTMs were smaller than those of the yearly two-factor RTM, with improvements of 3%, 10%, 2%, and 3% in RMS. Relative to the BTM, the seasonal two-factor RTM improved the mean bias and RMS of the conversion factor by 92% and 31%, respectively, and the bias and RMS of the resulting PWV by 37% and 12%, respectively. Therefore, the seasonal two-factor RTMs are recommended for the research and applications of GNSS meteorology in the Hunan region, China.

1. Introduction

Atmospheric water vapor is a minor constituent of the atmospheric mass and is distributed in the lower atmosphere, but it is one of the main driving forces of weather changes and atmospheric circulation [1]. A large amount of water vapor may lead to thunderstorms and other weather disasters after a period of accumulation in an area. The dynamical variation trend of water vapor is an important factor for climate prediction and weather forecasting [2]. However, it is not practical to use traditional meteorological sensors (e.g., radiosonde balloons and water vapor radiometers) to observe water vapor at a high spatiotemporal resolution because of their high operational costs and sparse sampling; for example, most radiosonde balloons are launched only twice per day [3]. Moreover, it is even more difficult to trace the dynamical variation trend of water vapor in small- and medium-scale weather systems in a timely and accurate manner [4–7], especially for the detection and prediction of sudden rainstorms.

Nowadays, Global Navigation Satellite Systems (GNSS) have heralded a new era for the retrieval of atmospheric precipitable water vapor (PWV) due to their 24-hour availability, global coverage, high accuracy, high spatiotemporal resolution, and low cost [7–10]. The atmospheric parameter directly estimated from the GNSS measurement is the zenith tropospheric delay (ZTD) of the GNSS signal. The ZTD comprises the zenith hydrostatic delay (ZHD) and the zenith wet delay (ZWD). PWV can be converted from the ZWD together with other atmospheric variables. Although the GNSS-derived ZTD can be directly assimilated into numerical weather prediction (NWP) [4, 5], PWV has the potential to be used for the studies of severe weather [11–14] and climate change [15, 16]. Previous studies [13, 14, 17] have shown that most severe rainfall events occurred in the descending trends of the PWV time series over a station after a long ascending period. It is likely to rain after a steep ascent and sudden descent in PWV. Moreover, Benevides et al. [18] suggested that the reliability and accuracy of severe weather forecasts could be improved by analyzing 2D or 3D variation in PWV fields with the aid of other meteorological data [13, 18–20].

GNSS-derived PWV is obtained by multiplying the ZWD by a conversion factor $\Pi$ [21, 22], which is a function of the weighted-mean temperature $T_m$ over the station [8]. Therefore, the accuracy of $T_m$ affects the accuracy of the resultant PWV [23, 24]. The most accurate method for obtaining $T_m$ is to use both temperature and humidity profiles from radiosonde data [8, 25]. However, at most GNSS stations, there are no colocated radiosonde stations available because of the high operational costs of radiosondes. The commonly used method is to use the surface temperature $T_s$ and the relationship between $T_m$ and $T_s$ [26–29], for example, the empirical Bevis model (BTM), $T_m = 70.2 + 0.72\,T_s$, established in 1992 mainly for real-time applications. The BTM was derived from the profiles of vapor pressure and dew point temperature from 8718 observations over 13 mid-latitude radiosonde stations (27° to 65°N) in North America over a 2-year period. If $T_s$ observed by a meteorological sensor is applied, the BTM can achieve a good accuracy. However, due to the rapid spatiotemporal variation in atmospheric wet profiles, the $T_m$–$T_s$ relationship may vary with location and season. The BTM has been found to perform unevenly across the globe; in China, for example, its systematic bias relative to radiosonde-derived $T_m$ is generally above 4 K, with an extreme value of 8 K in some regions or seasons [30, 31]. Under extreme weather conditions with large amounts of water vapor, the BTM's error can result in an error of several millimeters in the resultant PWV [23].
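As a minimal sketch, the Bevis relation is commonly quoted as $T_m = 70.2 + 0.72\,T_s$ with both temperatures in kelvin. The function name `bevis_tm` below is hypothetical and only illustrates how a surface temperature reading would be mapped to $T_m$.

```python
def bevis_tm(ts_kelvin: float) -> float:
    """Weighted-mean temperature (K) from surface temperature (K).

    Uses the commonly quoted Bevis et al. (1992) coefficients 70.2 and 0.72.
    """
    return 70.2 + 0.72 * ts_kelvin


if __name__ == "__main__":
    # Example: a surface temperature of 300 K.
    print(round(bevis_tm(300.0), 2))
```

A regional model of the same form only replaces the two coefficients with values fitted to local radiosonde data.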

Researchers have studied the numerical $T_m$–$T_s$ relationship using linear or nonlinear regression methods based on local radiosonde observations. Many regional $T_m$ models (RTMs) have been established all over the world [31–35]. Some RTMs have been established in China [36–45]; for example, the RTM established by Liu et al. [37] outperformed the BTM in the Hong Kong region. Li and Mao [36] verified the good accuracy of the $T_m$–$T_s$ relationship using radiosonde data from the Beijing observatory and obtained monthly coefficients of the RTM for its use in eastern China. An RTM was obtained by Lv et al. [38] for the Chengdu region using radiosonde data. Yu and Liu [39] found that the accuracy of $T_m$ derived from the BTM was correlated with altitude. The accuracies of all these RTMs were found to be better than that of the BTM.

Different from the above one-factor (i.e., $T_s$) model, some researchers established multifactor RTMs by adding air pressure ($P_s$) and water vapor pressure ($e_s$) as new variables into the model [40, 44, 45]. Gong [40] analyzed the relationship between $T_m$ and each of the three above-mentioned factors based on 123 radiosonde stations all over China during 2008–2011 and established both one-factor and multifactor linear RTMs for different climate regions and seasons. He found that the multifactor RTM slightly outperformed the one-factor RTM. Nevertheless, Wang et al. [45] reported no significant difference between one-factor and multifactor linear regression results in Hong Kong.

Since 2011, the Hunan Continuously Operating Reference Stations (HNCORS) network has been established by the Meteorological Bureau and National Land Agency of Hunan Province, China. It consists of 93 CORS stations covering the whole Hunan region [46], and the meteorological applications of the HNCORS have been put on the agenda. If local radiosonde data are available, a new RTM may be developed and it may perform better than the BTM. There are three radiosonde stations (Changsha, Huaihua, and Chenzhou) in the Hunan area, and a three-year period (2012–2014) of radiosonde data recorded at these stations was available. Thus, in this study we used the data during the period of 2012-2013 to establish a new RTM, and its accuracy was evaluated using the following two years' (i.e., 2013-2014) radiosonde-derived $T_m$ (as the truth). This may provide the foundation for the applications of ground-based GNSS meteorology in the Hunan region.

The outline of this paper is as follows. The methodology of obtaining $T_m$ from radiosonde data and the relationship between $T_m$ and other meteorological factors are analyzed in the second section. In the third section, several multifactor RTMs are established; then several sets of $T_m$ derived from the BTM, a one-factor RTM, a two-factor RTM, a three-factor RTM, and seasonal two-factor RTMs are compared against radiosonde-derived $T_m$ (as the truth) for their performance assessment. Conclusions are given in the last section.

2. Methods and Materials

2.1. Obtaining $T_m$

The Constant, Bevis formula, numerical integration, and approximate integration methods can be used to obtain $T_m$. The required factors, the difficulty of implementation, and the accuracy level of these four methods are compared in Table 1.

Among these methods, the Constant method is the simplest but has the lowest accuracy; the Bevis formula is the most widely used, especially for real-time applications, but its accuracy may vary with location and season; the approximate integration needs the temperature lapse rate and the vapor pressure decline rate, which cannot be obtained easily, and usually leads to a low accuracy; the numerical integration is the most accurate method and is also easy to implement [37]. Hence the numerical integration is adopted in this study. Its mathematical expression is [47]

$$T_m = \frac{\int (e/T)\,dz}{\int (e/T^2)\,dz} \approx \frac{\sum_{i=1}^{n} (e_i/T_i)\,\Delta h_i}{\sum_{i=1}^{n} (e_i/T_i^2)\,\Delta h_i} \tag{1}$$

where $i$ denotes the $i$th pressure level and $e_i$, $T_i$, and $\Delta h_i$ are the partial pressure of water vapor (in hPa), the atmospheric temperature (in Kelvin), and the thickness (in m) of the layer, respectively, and

In fact, vapor pressure is a measure of the amount of moisture in air. Technically, it is the pressure of water vapor above a water surface. When air reaches the saturated condition, the water vapor in the air condenses, and the dew point temperature is then the same as the air temperature. Therefore, the vapor pressure under the condition of saturation is the saturated vapor pressure ($e_s$), which is a function of the dew point temperature $t_d$ expressed by [15, 40] where $t_d$ is in Celsius. Equation (3) is given by the World Meteorological Organization in 2008 [48].
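The two steps of this subsection, vapor pressure from dew point and layer-wise integration for $T_m$, can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the Magnus-type coefficients (6.112, 17.62, 243.12) are the ones commonly quoted from the WMO guide and are assumed here to correspond to the saturation formula referred to in the text, and the function names and sample profile are hypothetical.

```python
import math


def vapor_pressure_hpa(dew_point_c: float) -> float:
    """Vapor pressure (hPa) from dew point (deg C), Magnus-type formula.

    Assumption: coefficients as commonly quoted from the WMO (2008) guide.
    """
    return 6.112 * math.exp(17.62 * dew_point_c / (243.12 + dew_point_c))


def weighted_mean_temperature(e_hpa, t_kelvin, dh_m):
    """Discrete weighted-mean temperature: sum(e/T*dh) / sum(e/T^2*dh).

    Each list holds one value per layer: vapor pressure (hPa),
    temperature (K), and layer thickness (m).
    """
    numerator = sum(e / t * dh for e, t, dh in zip(e_hpa, t_kelvin, dh_m))
    denominator = sum(e / (t * t) * dh for e, t, dh in zip(e_hpa, t_kelvin, dh_m))
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical three-layer profile (not real radiosonde data).
    dew_points_c = [18.0, 8.0, -5.0]
    temps_k = [295.0, 285.0, 272.0]
    thickness_m = [500.0, 1500.0, 3000.0]
    e = [vapor_pressure_hpa(td) for td in dew_points_c]
    print(weighted_mean_temperature(e, temps_k, thickness_m))
```

Because the weights $e_i/T_i$ are largest in the moist lower layers, the result always lies between the coldest and warmest layer temperatures and is pulled toward the lower atmosphere.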

2.2. Obtaining PWV from GNSS-ZWD

Generally, the ZTD of GNSS signals can be estimated using undifferenced precise point positioning (PPP) or differential strategies. In this study, undifferenced PPP is adopted [14]. The ZTD is usually divided into two parts, the ZWD and the zenith hydrostatic delay (ZHD), and about 90% of the ZTD is induced by dry air in the atmosphere. The ZWD is mainly caused by atmospheric water vapor, which varies rapidly in both the spatial and temporal domains, so if it is obtained from an empirical model, its error is at the level of 10%–20% of the ZWD value.

The ZHD can be calculated using the most commonly used Saastamoinen model as expressed below [49]: where $P_s$, $\varphi$, and $H$ are the surface pressure (hPa), the geographic latitude, and the altitude (km) of the station, respectively.
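A minimal sketch of the hydrostatic-delay step is given below. It assumes the widely quoted form of the Saastamoinen model, ZHD = 0.0022768 P / (1 − 0.00266 cos 2φ − 0.00028 H), with pressure in hPa, height in km, and the result in meters; the leading constant varies slightly between publications, so the exact value used in the paper is not confirmed here. Function names are hypothetical.

```python
import math


def saastamoinen_zhd_m(pressure_hpa: float, latitude_deg: float, height_km: float) -> float:
    """Zenith hydrostatic delay (m) from surface pressure, latitude, and height.

    Assumption: commonly quoted constants 0.0022768, 0.00266, 0.00028.
    """
    gravity_term = (1.0
                    - 0.00266 * math.cos(2.0 * math.radians(latitude_deg))
                    - 0.00028 * height_km)
    return 0.0022768 * pressure_hpa / gravity_term


def zwd_m(ztd_m: float, zhd_m: float) -> float:
    """Zenith wet delay as the remainder of the total delay."""
    return ztd_m - zhd_m


if __name__ == "__main__":
    # Hypothetical station near 28 deg N at 0.1 km with 1005 hPa surface pressure.
    zhd = saastamoinen_zhd_m(1005.0, 28.0, 0.1)
    print(zhd, zwd_m(2.55, zhd))
```

With these constants, a 0.5 hPa pressure error changes the ZHD by roughly 1 mm, which is consistent with the millimeter-level accuracy discussed in the text.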

The accuracy of surface pressure measured by meteorological sensors is generally 0.2–0.5 hPa, so the accuracy of the ZHD calculated from the Saastamoinen model can be at the millimeter level. The accuracy of the GNSS-PPP-derived ZTD can also be at the millimeter level, and the accuracy of the ZWD, calculated by ZWD = ZTD − ZHD, is therefore at the millimeter level as well [50].

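The ZWD can be converted to PWV by the following formulas:

$$\mathrm{PWV} = \Pi \cdot \mathrm{ZWD}, \qquad \Pi = \frac{10^{6}}{\rho_w R_v \left( k_3 / T_m + k_2' \right)}, \qquad k_2' = k_2 - m\,k_1$$

where $\Pi$ is the conversion factor; $\rho_w$ is the density of liquid water (1 g/cm³); $R_v$ is the specific gas constant for water vapor (461 J/K/kg); and $m$ is the ratio of the molar mass of water vapor to that of dry air. The three physical constants $k_1$, $k_2$, and $k_3$ are expressed in K/mb, K/mb, and K²/mb, respectively, and the value of $k_2'$ (in K/mb) set by Bevis et al. [8] is most commonly used.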
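The conversion from ZWD to PWV can be sketched as below. The numerical constants are assumptions, since the values are not legible in the text: $k_2' = 22.1$ K/hPa and $k_3 = 3.739 \times 10^5$ K²/hPa are the values commonly attributed to Bevis et al., and the water density is written in SI units (1000 kg/m³). The division by 100 converts K/hPa to K/Pa so that the units are consistent. Function names are hypothetical.

```python
RHO_W = 1000.0    # density of liquid water, kg/m^3 (1 g/cm^3)
R_V = 461.0       # specific gas constant of water vapor, J/(kg K), as quoted in the text
K2_PRIME = 22.1   # K/hPa, assumed value
K3 = 3.739e5      # K^2/hPa, assumed value


def conversion_factor(tm_kelvin: float) -> float:
    """Dimensionless factor Pi that maps ZWD to PWV (same length unit)."""
    # Convert the refractivity term from K/hPa to K/Pa (1 hPa = 100 Pa).
    refractivity_term = (K3 / tm_kelvin + K2_PRIME) / 100.0
    return 1.0e6 / (RHO_W * R_V * refractivity_term)


def zwd_to_pwv(zwd_mm: float, tm_kelvin: float) -> float:
    """PWV (mm) from ZWD (mm) and weighted-mean temperature (K)."""
    return conversion_factor(tm_kelvin) * zwd_mm


if __name__ == "__main__":
    print(conversion_factor(280.0))
    print(zwd_to_pwv(100.0, 280.0))
```

Under these assumptions the factor is close to 0.16 for typical $T_m$ values, which is in line with the monthly means of 0.153 to 0.163 reported later in the paper.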

2.3. Data Analysis
2.3.1. Data Source

In this study, radiosonde data collected twice per day by balloon-borne instrument platforms with radio transmission during the period 2012–2014 at the aforementioned three radiosonde stations, Changsha, Huaihua, and Chenzhou in the Hunan region (Figure 1), were used to calculate the time series of $T_m$ over the three stations using (1). The time series of $T_m$ (two values per day) and its corresponding meteorological factors ($T_s$, $e_s$, and $P_s$) from the three stations during the period 2012-2013 were used to establish multifactor RTMs, and the $T_m$ resulting from the radiosonde data in the period 2013-2014 were used to assess the performance of the new RTMs.

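The overall performance of the new multifactor RTMs was measured by the bias and RMS of the $T_m$ time series at each station, as defined below:

$$\mathrm{bias} = \frac{1}{N}\sum_{i=1}^{N}\left(T_{m,i}^{\mathrm{RTM}} - T_{m,i}^{\mathrm{RS}}\right), \qquad \mathrm{RMS} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(T_{m,i}^{\mathrm{RTM}} - T_{m,i}^{\mathrm{RS}}\right)^{2}}$$

where $T_{m,i}^{\mathrm{RTM}}$ is the new RTM-derived (or predicted) $T_m$, $T_{m,i}^{\mathrm{RS}}$ is the radiosonde-derived $T_m$, and $N$ is the number of samples.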
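The two statistics can be computed directly from paired series of predicted and radiosonde-derived values. The sketch below uses hypothetical function names and made-up sample values purely for illustration.

```python
import math


def bias(predicted, truth):
    """Mean of (predicted - truth)."""
    diffs = [p - t for p, t in zip(predicted, truth)]
    return sum(diffs) / len(diffs)


def rms(predicted, truth):
    """Root mean square of (predicted - truth)."""
    diffs = [p - t for p, t in zip(predicted, truth)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


if __name__ == "__main__":
    # Hypothetical Tm values in kelvin (not taken from the paper).
    model_tm = [281.0, 283.5, 279.0]
    radiosonde_tm = [280.0, 284.0, 279.5]
    print(bias(model_tm, radiosonde_tm), rms(model_tm, radiosonde_tm))
```

A near-zero bias with a nonzero RMS indicates that the model has no systematic offset but still scatters around the truth, which is the pattern reported for the RTMs.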

2.3.2. Analysis of Multiple Meteorological Factors

In this part, the correlations between $T_m$ and each of the three meteorological factors $T_s$, $e_s$, and $P_s$ are investigated. The scatter plots and their correlation coefficients are shown in Figure 2. The linear regression analysis shows that the three correlation coefficients are 0.90, 0.88, and −0.55, respectively. Therefore, $T_m$ has a very high positive correlation with $T_s$ and $e_s$, while it has a weak negative correlation with $P_s$.

2.3.3. Collinearity Test

Collinearity, also called multicollinearity, is a phenomenon in which two or more factors in a regression model are highly correlated. It refers to the nonindependence of the predictor factors, usually in a regression-type analysis. For a set of factors $X_1, X_2, \ldots, X_k$, full collinearity means that there exist coefficients $c_0, c_1, \ldots, c_k$, with $c_1, \ldots, c_k$ not all zero, that make the following equation hold [51]:

$$c_1 X_1 + c_2 X_2 + \cdots + c_k X_k = c_0 \tag{7}$$

If there exists a factor that can be expressed by a linear combination of the other factors, as expressed in (7), then $X_1, X_2, \ldots, X_k$ are fully collinear. Otherwise, there is no full collinearity among $X_1, X_2, \ldots, X_k$.

Collinearity can be a problem for parameter estimation because it inflates the variance of regression parameters and hence potentially leads to the wrong identification of relevant predictors in a regression model. It is a severe problem when a model is trained on data from one region or time and predicted to another with a different or unknown structure of collinearity. Parameter estimates may be unstable, making standard errors on estimates inflated and consequently inference statistics biased.

In this study, linear regression was used to establish multifactor RTMs, so it must be tested whether pairwise linear correlations exist among the three variables $T_s$, $e_s$, and $P_s$. Hence, the following tolerance value ($Tol$) was used to test the collinearity between any two of them before the modelling of $T_m$ was performed:

$$Tol = 1 - R^2$$

where $R^2$ is the square of the correlation coefficient between the two factors.

As a matter of experience, a threshold value of 0.1 for $Tol$ is often adopted [51]. If the $Tol$ value exceeds 0.1, there is no collinearity problem between the two factors involved.
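The pairwise tolerance check can be sketched as follows, assuming the tolerance is one minus the squared Pearson correlation coefficient between two predictor series. The function names and the 0.1 default threshold argument are illustrative; the sample series are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def tolerance(x, y):
    """Tolerance between two predictors: 1 - r^2."""
    r = pearson_r(x, y)
    return 1.0 - r * r


def has_collinearity_problem(x, y, threshold=0.1):
    """True when the tolerance does not exceed the threshold."""
    return tolerance(x, y) <= threshold


if __name__ == "__main__":
    # Hypothetical surface temperature (K) and pressure (hPa) samples.
    ts = [275.0, 283.0, 291.0, 299.0, 303.0]
    ps = [1022.0, 1015.0, 1013.0, 1004.0, 1001.0]
    print(tolerance(ts, ps), has_collinearity_problem(ts, ps))
```

A tolerance near 1 means the two predictors carry largely independent information, while a value near 0 means one is almost a linear function of the other.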

From the results shown in Figure 2, we can obtain the following tolerance values among $T_s$, $e_s$, and $P_s$:

We can see that all the tolerance values are much larger than 0.1, meaning that there are no collinearity problems among $T_s$, $e_s$, and $P_s$. Thus, all three factors can be used in the modelling of the RTMs.

3. Results and Discussion

3.1. One-Factor RTM
3.1.1. Establishing RTM

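The two-dimensional linear fitting method for the one-factor RTM has the same expression as the BTM; that is, $T_m = a + b\,T_s$. The radiosonde-derived $T_m$ and $T_s$ from all the aforementioned three stations during the period of 2012-2013 were used in the following observation equation matrix:

$$\begin{bmatrix} T_{m,1} \\ T_{m,2} \\ \vdots \\ T_{m,n} \end{bmatrix} = \begin{bmatrix} 1 & T_{s,1} \\ 1 & T_{s,2} \\ \vdots & \vdots \\ 1 & T_{s,n} \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} + V$$

where $V$ is the residual vector and the coefficients $a$ and $b$ were estimated using the least squares estimation method.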
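For a single predictor, the least squares solution has a closed form, which the following sketch implements. The function name `fit_one_factor` and the sample values are hypothetical; the coefficients it returns for real data would depend entirely on the radiosonde series used.

```python
def fit_one_factor(ts, tm):
    """Least squares estimates (a, b) for tm = a + b * ts."""
    n = len(ts)
    mean_ts = sum(ts) / n
    mean_tm = sum(tm) / n
    # Slope: covariance of (ts, tm) divided by the variance of ts.
    sxy = sum((x - mean_ts) * (y - mean_tm) for x, y in zip(ts, tm))
    sxx = sum((x - mean_ts) ** 2 for x in ts)
    b = sxy / sxx
    a = mean_tm - b * mean_ts
    return a, b


if __name__ == "__main__":
    # Hypothetical paired samples in kelvin (not real radiosonde data).
    ts_samples = [278.0, 285.0, 291.0, 297.0, 302.0]
    tm_samples = [270.5, 275.8, 280.1, 284.9, 288.0]
    a, b = fit_one_factor(ts_samples, tm_samples)
    print(a, b)
```

Centering both series before forming the sums keeps the arithmetic well conditioned even though the temperatures are all close to 290 K.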

The obtained one-factor RTM is

3.1.2. Accuracy of the RTM and BTM

The statistical histograms for the differences of the BTM-derived and the RTM-derived $T_m$ from the radiosonde-derived $T_m$ during the period of 2013-2014 at the three stations are shown in Figure 3. The differences of the BTM are mainly in the range of about 0~4 K (Figure 3(a)), while the differences of the RTM are mainly in the range of about −3~4 K (Figure 3(b)). This indicates that the systematic difference, which reflects the accuracy, of the RTM result is much smaller than that of the BTM in the Hunan region.

The left panels in Figure 4 show the three $T_m$ time series (2013-2014) resulting from the radiosonde data (truth), the BTM, and the RTM at each of the three stations; the right panels show the differences of the two models' results from the radiosonde-derived $T_m$.

The statistical results for the bias and RMS of the above time series at each station are listed in Table 2, in which the last row is the mean of the three stations' results. Both the bias and RMS of the RTM results are significantly smaller than those of their BTM counterparts at all three stations. The last row indicates that the overall improvements of the RTM over the BTM are 88% (in bias) and 28% (in RMS).

3.2. Multifactor RTMs

According to the analysis in Section 2.3.2, $T_m$ has a very high positive correlation with $T_s$ and $e_s$, and it also has a weak negative correlation with $P_s$. Therefore, in this section, multifactor regression is used to establish two-factor and three-factor RTMs, and their performances are compared with that of the one-factor RTM.

3.2.1. Establishing RTMs

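The multiple linear fitting method was adopted to model the multifactor RTMs. The radiosonde-derived $T_m$, $T_s$, $e_s$, and $P_s$ from all the aforementioned three stations in 2012-2013 were used in the following observation equation system:

$$T_{m,i} = a + b\,T_{s,i} + c\,e_{s,i} + d\,P_{s,i} + v_i, \qquad i = 1, 2, \ldots, n \tag{10}$$

where $V = (v_1, v_2, \ldots, v_n)^{T}$ is the residual vector and the coefficients $a$, $b$, $c$, and $d$ were estimated using the least squares method.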
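A multifactor fit of this kind can be sketched with the normal equations, solved here by Gaussian elimination so that the example needs only the standard library. This is an illustration, not the authors' implementation; the function names are hypothetical, and for real data with values near 290 K and 1000 hPa it would be advisable to center or scale the predictors first, or to use a dedicated least squares routine.

```python
def solve_linear_system(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def fit_multifactor(factors, tm):
    """Least squares coefficients [a, b, c, ...] for tm = a + b*x1 + c*x2 + ...

    `factors` is a list of tuples, one tuple of predictor values per sample.
    """
    design = [[1.0, *row] for row in factors]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, tm)) for i in range(k)]
    return solve_linear_system(xtx, xty)


if __name__ == "__main__":
    # Synthetic, small-valued predictors generated from known coefficients.
    rows = [(1, 2, 0), (2, 1, 1), (3, 4, 0), (4, 3, 2), (5, 6, 1), (6, 5, 3)]
    targets = [1.0 + 2.0 * x1 - 1.0 * x2 + 0.5 * x3 for x1, x2, x3 in rows]
    print(fit_multifactor(rows, targets))
```

Passing two-element tuples gives a two-factor model and three-element tuples a three-factor model, so the same routine covers both cases discussed in this section.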

If only $T_s$ and $e_s$ are taken into consideration in (10), a two-factor RTM can be obtained using observations from the three radiosonde stations and expressed as follows:

When all three factors are taken into consideration, the resulting three-factor RTM is

3.2.2. Accuracy of Multifactor RTMs

Figure 5 shows the statistical histograms for the differences of the above two-factor and three-factor RTMs from the radiosonde-derived $T_m$ (as the truth). We can see that the differences of the two RTMs are mostly in the range of about −3 K~3 K. As shown in Figure 6 (left panels), the time series of $T_m$ predicted from the two RTMs agree very well with the truth. The time series of the differences in the right panels of Figure 6 show a similar variation trend, and the differences in winter (Dec.–Feb.) are larger than those in summer (Jun.–Aug.).

The statistical results of the above time series are listed in Table 3, in which the bias and RMS of the one-factor, two-factor, and three-factor RTMs are compared with each other. The last row is the mean of the model results of the three stations. We can see that the biases of the three RTMs' results are all near zero; the two-factor and three-factor RTMs show a similar performance in terms of RMS, and both are better than the one-factor RTM, with an improvement of 7% in RMS. In practical applications, the selection of an optimal RTM depends on which meteorological data are available at the stations.

3.3. Seasonal Two-Factor RTMs

As shown in Figure 6 (right panels), the accuracy of both the two-factor and three-factor RTMs shows a correlation with season. Due to the similar performance of the two RTMs, only the two-factor model was adopted for the investigation of the performance of seasonal RTMs in this section. The time series of $T_m$, $T_s$, and $e_s$ from the same three stations and the same period were divided into four seasons for establishing four seasonal two-factor RTMs. The $T_m$ predicted from these RTMs for the period 2013-2014 were compared with those from the yearly two-factor RTM established in Section 3.2.

3.3.1. Establishing RTMs

Similar to Section 3.2, the multiple linear regression method was used to obtain the seasonal two-factor RTMs. The radiosonde-derived $T_m$, surface temperatures $T_s$, and water vapor pressures $e_s$ from all three stations in the period 2012-2013 were used in the following observation equation:

$$T_{m,i} = a + b\,T_{s,i} + c\,e_{s,i} + v_i, \qquad i = 1, 2, \ldots, n$$

where $V = (v_1, v_2, \ldots, v_n)^{T}$ is the residual vector and the coefficients $a$, $b$, and $c$ were estimated using the least squares estimation. The four seasonal two-factor RTMs obtained are listed in Table 4.

3.3.2. Accuracy of Seasonal Two-Factor RTMs

The $T_m$ time series predicted for the period 2013-2014 using the above four seasonal two-factor RTMs and the yearly two-factor RTM, together with the radiosonde-derived $T_m$ time series for the same period (as the truth for validation), are shown in the left panels in Figure 7. The right panels show the comparison of the accuracy between the seasonal and yearly RTMs. It is shown that all the seasonal RTMs outperform the yearly RTM.

The statistical results of the above time series are listed in Table 5. It can be seen that both the biases and the RMSs of all the seasonal RTMs are smaller than those of the yearly RTM, especially the biases, meaning that the seasonal RTMs outperform the yearly RTM. In terms of RMS, the four seasonal two-factor RTMs slightly outperformed the yearly two-factor RTM, with reductions of 3%, 10%, 2%, and 3%.

3.4. Comparison of Conversion Factor and GNSS-PWV Resulting from Two Models
3.4.1. Comparison of Conversion Factor Resulting from Two Models

The conversion factors $\Pi$ derived from the BTM $T_m$ and from the seasonal two-factor RTM $T_m$ were compared against the reference (truth) $\Pi$, which results from the radiosonde-derived $T_m$, over the two-year period of 2013-2014, and the monthly statistical results are listed in Table 6. The statistical result for each month listed in the table is based on the same month in both years. The monthly mean of the BTM-derived $\Pi$ ranges from 0.1536 (Jan.) to 0.1630 (Jul.), and that of the seasonal two-factor RTM ranges from 0.1529 (Jan.) to 0.1613 (Jul.). The last row is the mean of all the monthly means over the two-year period. The mean bias of the BTM-derived $\Pi$ over the two years is 0.0012 and that of the seasonal two-factor RTM is 0.0001; the corresponding mean RMSs are 0.0016 and 0.0011. Relative to the BTM, the seasonal two-factor RTM therefore reduces the mean bias by 92% and the mean RMS by 31% over the two-year period.
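The quoted improvements follow from the relative reduction of each statistic, as the short check below illustrates. The helper name `improvement_percent` is hypothetical; the input values are the mean bias and mean RMS stated in the paragraph above.

```python
def improvement_percent(reference: float, candidate: float) -> float:
    """Relative reduction of an error statistic, in percent of the reference."""
    return (abs(reference) - abs(candidate)) / abs(reference) * 100.0


if __name__ == "__main__":
    # Mean bias and mean RMS of the conversion factor: BTM versus seasonal RTM.
    print(round(improvement_percent(0.0012, 0.0001)))  # bias improvement
    print(round(improvement_percent(0.0016, 0.0011)))  # RMS improvement
```

Absolute values are used so that a bias of either sign is treated as an error magnitude.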

3.4.2. Comparison of GNSS-PWV Resulting from Two-Model-Derived $T_m$

The Chenzhou CORS station (named CZSQ) has a colocated radiosonde station at a horizontal distance of about 10 m (but some data are missing due to instrument or data transmission failures of the CORS network during 2015). Its GNSS-ZTD is calculated using the PPP strategy mentioned in Section 2.2. The ZHD is determined using (4) with pressure values from a pressure sensor mounted at CZSQ. The two sets of PWVs converted from the GNSS-ZTDs using the BTM-derived and the seasonal two-factor RTM-derived $T_m$ are compared against the true PWV (resulting from radiosonde data twice daily) in 2015, as shown in Figure 8.

Table 7 compares the bias and RMS of the two sets of PWVs resulting from the $T_m$ derived from the two models (against their truth). Relative to the BTM, the bias and RMS of the PWV derived with the seasonal two-factor RTM are improved by 37% and 12%, respectively. Compared to the improvements in $T_m$ and in the conversion factor $\Pi$, the PWV improvement is smaller. The most likely reason is the large amount of missing PWV data at the CZSQ station due to instrument or data transmission failures of the Hunan CORS network.

4. Conclusion

In this study, several new RTMs were established using radiosonde data in the period of 2012-2013 from three stations in the Hunan region. Numerical integration and least squares estimation methods were adopted to obtain the $T_m$ time series at the three stations and the coefficients of the regression models, respectively. The RTMs include a yearly one-factor RTM, a yearly two-factor RTM, a yearly three-factor RTM, and four seasonal two-factor RTMs. These RTMs were validated by comparing the $T_m$ time series predicted from the RTMs for the period of 2013-2014 against the same period's radiosonde-derived $T_m$. Results showed that the yearly one-factor RTM outperformed the BTM, with improvements of 88% and 28% in bias and RMS, respectively. The two-factor and three-factor RTMs showed similar accuracy and both were better than the one-factor RTM, with an improvement of 7% in RMS. The four seasonal two-factor RTMs slightly outperformed the yearly two-factor RTM, with improvements of 3%, 10%, 2%, and 3% in the RMS of the four seasons. Relative to the BTM, the seasonal two-factor RTM improved the mean bias and RMS of the conversion factor by 92% and 31%, respectively, and the bias and RMS of the resulting PWV by 37% and 12%, respectively. Therefore, the seasonal two-factor RTMs are recommended for the research and applications of GNSS meteorology in the Hunan region.

Competing Interests

The authors declare no conflict of interests.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (no. 41304029), the Natural Science Foundation of Hunan Province (no. 2016JJ3061), the Key Project of Hunan Provincial Meteorological Bureau (no. XQKJ15A002), the Special Project for Forecaster in Hunan Provincial Meteorological Bureau (no. AQKJ16C019), the subproject (no. 2015BAC03B06) under the National Key Technology Research and Development Program of China (no. 2015BAC03B00), and the Key Project of Hunan Provincial Meteorological Bureau (no. XQKJ16A002). The authors would like to express their sincere gratitude to the Hunan Meteorological Bureau for the provision of GNSS and meteorological observations.