Abstract

The contribution of different surface factors (with their complexity and heterogeneity) within cities to the urban thermal environment has become an issue of increasing concern under a changing climate. In this paper, multiple linear regression analysis was used to quantify the contribution of different surface factors to the urban thermal environment in seven urban built-up areas. At the same time of day, the LST of the same type of ground object in the same city differs by up to ±2.5°C because of differences in the surrounding ground objects. The environmental LST and the mean LST of the surface factors were significantly correlated, and the root mean square error was 3.52. This study first classifies ground features with different attributes, computes LST statistics for each category, and then conducts multiple linear regression analysis, instead of defining fuzzy indexes and imposing thresholds. The purpose is to explore the contribution of ground objects with different reflectivity to the urban thermal environment.

1. Introduction

The urban heat island (UHI) is mainly caused by the modification of land surfaces by urban development, which uses materials that effectively store short-wave radiation [1, 2]. The intensity and spatial pattern of the UHI are largely exacerbated by population dynamics and the development of built-up areas. In urban built-up areas, different urban surface factors differ in their ability to absorb solar radiation and release heat [3–6] because of their physical properties (such as material, specific heat capacity, density, and specific emissivity) [7, 8]. Therefore, the contribution of different surface factors to the thermal environment of the whole city also differs [9]. Over large scenes, remote sensing records the land surface temperature (LST) of each surface factor [10–14], which is very helpful for studying the contribution of various surface factors to the urban thermal environment.

The changing urban landscape can cause irreversible changes to the biophysical environment, including changes in the spatiotemporal pattern of the LST. Many researchers have studied large-scale regional urban thermal environments [15]. Some examined the relationship between the thermal intensity of urban areas and different indexes, such as the normalized difference vegetation index (NDVI), normalized difference soil index (NDSI), normalized difference building index (NDBI), and normalized difference water index (NDWI) [16–20]. Their results showed that LST had a nonlinear relationship with NDVI and NDWI. In contrast, NDBI and NDSI had a positive linear relationship with LST, and LST correlated better with NDSI than with NDBI. Other researchers studied the impact of urban landscape composition and layout on the urban thermal environment [21, 22]. Their studies showed that although landscape composition and arrangement both affect LST, composition is more important than layout; thus, one composition indicator (e.g., impervious surface) together with no more than four landscape layout indicators can predict LST well [23–27]. These results can help landscape ecologists use landscape indicators effectively and encourage landscape planners to make balanced use of land use types (LUTs) in urban planning [28–33]. Land surface characteristics are primarily represented by land cover and land use (LCLU), and the relationship between the LST and LCLU has been the focus of numerous studies on the urban thermal environment [34–37]. However, once an index is established, the same index value can correspond to different types of surface factors.
Although fitting an index against the LST can reveal a trend and indicate a connection, we do not know precisely which types of surface factors respond to the thermal environment or how strongly they are affected. Because of the complexity of thermal remote sensing and the heterogeneity of urban surface thermal properties, it is still a great challenge to establish the actual relationship between urban thermal characteristics and surface factors. To our knowledge, no literature quantifies the contribution of different ground objects to the urban thermal environment. On the one hand, such work is limited by the accuracy with which remote sensing data can classify ground objects in large scenes; on the other hand, different ground objects are interleaved in complex ways, and the spatial resolution of the thermal infrared band is low, so it is not easy to accurately retrieve the LST of a single pixel. Although loosely defined indexes and imposed thresholds can be representative to some extent, they cannot truly reveal the contribution of specific ground objects to the urban thermal environment.

To address this situation, we adopted a data-intensive, data-driven approach. Free and open access to Google Earth Engine, widely shared GIS data, and freely retrievable high spatial resolution images provided powerful support for studying the urban thermal environment in detail over large scales and long time series. The data-intensive method includes surface factor classification and thermal pattern analysis of the different categories of surface factors. Unlike the traditional method, which directly divides the research area into broad categories, this method involves a more detailed and in-depth research procedure.

Researchers have demonstrated that 30 m and 90 m are the optimal resolutions to study the relationship between LST and landscape patterns at the patch level and the landscape level, respectively [38]. Firstly, this study clustered 29 Landsat-8 multispectral remote sensing images covering seven urban built-up areas and then used high spatial resolution optical images and GIS data to interpret their surface factors. Secondly, considering the relationship between surface factors and the urban thermal environment, we further merged the interpreted categories into seven categories, which are the objects of this study, namely, the seven surface factors. Then, given the anisotropy of the LST of each pixel, we employed the LST gradient to calculate the weighted LST of each pixel and its neighboring pixels to realize the thermal pattern analysis. Finally, we performed multiple linear regression of the ambient LST on the mean LST of the various surface factors and tested the overall significance. In addition, we used ridge regression to estimate the contribution coefficient of the mean LST of each surface factor to the thermal environment. The main objective of this paper is to clarify the contribution of various land features to the urban thermal environment through accurate land surface classification.

2. Study Areas and Data Sources

2.1. Study Areas

The study area comprises seven cities in China: Beijing, Jinan, Xi’an, Nanchang, Changsha, Wuhan, and Zhengzhou. Over the past 30 years, these cities have experienced rapid urbanization. Table 1 gives the geographical location, climatic conditions, and built-up area of each city.

2.2. Data Sources

This study uses 29 Landsat-8 images covering seven urban built-up areas from 2013 to 2019, obtained from the United States Geological Survey (USGS, https://www.usgs.gov/). The image information is shown in Table 2. The Landsat-8 satellite was launched by the National Aeronautics and Space Administration (NASA) on February 11, 2013. It carries two sensors, namely, the Operational Land Imager (OLI) and the Thermal Infrared Sensor (TIRS). The satellite has eleven bands, of which seven multispectral bands have a spatial resolution of 30 m. The two thermal infrared bands have a native spatial resolution of 100 m and are resampled to 30 m. The revisit period is 16 days for global coverage. Surface reflectance is a potential factor in the study of surface thermal models and is greatly affected by atmospheric conditions. Therefore, when screening the data, we chose images acquired in clear weather with less than 2% cloud cover. The weather parameters were taken from public daily measurement records. For convenience of recording, each image is named by the 8-digit collection date (year, month, and day).

Data-intensive research usually requires a large amount of computing and storage resources. Fortunately, Google Earth Engine (GEE), a multithreaded high-performance computing service platform, is free to access and meets our needs for large-area image acquisition and substantial computing resources. In addition, from relevant studies on the urban thermal environment, we found that the urban thermal environment is closely related to the main built-up areas of the city. Therefore, based on the GEE platform combined with night light data, NDBI data, NDVI data, and NDWI data, this study adopted the Wake Cobweb algorithm to extract the research area, used the impervious layer dataset released by the China National Earth Observation Data Center to verify that the accuracy is above 80%, and then extracted the outer contour of the built-up area through morphological erosion and dilation operations. Note that the outer contour is extracted so that water bodies inside the study area are included (see Figure 1).
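The morphological step can be illustrated with a minimal sketch. It assumes a Boolean built-up mask and a 3 × 3 square structuring element; the function names (`dilate`, `erode`, `outer_contour`) and the toy mask are hypothetical, and the sketch is not the GEE workflow used in this study.

```python
import numpy as np

def dilate(mask: np.ndarray) -> np.ndarray:
    """Binary dilation with a 3x3 square structuring element."""
    rows, cols = mask.shape
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    out = np.zeros((rows, cols), dtype=bool)
    for dr in range(3):
        for dc in range(3):
            out |= padded[dr:dr + rows, dc:dc + cols]
    return out

def erode(mask: np.ndarray) -> np.ndarray:
    """Binary erosion with a 3x3 square structuring element."""
    rows, cols = mask.shape
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    out = np.ones((rows, cols), dtype=bool)
    for dr in range(3):
        for dc in range(3):
            out &= padded[dr:dr + rows, dc:dc + cols]
    return out

def outer_contour(mask: np.ndarray) -> np.ndarray:
    """Fill small holes (dilate, then erode) and return the one-pixel boundary."""
    closed = erode(dilate(mask))
    return closed & ~erode(closed)

# Toy built-up mask (assumption): a 5x5 block with a one-pixel hole, e.g. a pond.
built_up = np.zeros((7, 7), dtype=bool)
built_up[1:6, 1:6] = True
built_up[3, 3] = False
closed = erode(dilate(built_up))
contour = outer_contour(built_up)
```

Dilating before eroding fills small interior gaps, which is one way an outer contour can end up enclosing water bodies inside the built-up area.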

3. Methodology

The data-intensive method includes surface factor classification and regression of the mean LST of each surface factor. In the surface factor classification module, we adopted the unsupervised K-means method to cluster each image separately on GEE and then applied manual interpretation for identification and post-classification processing. In the LST regression module, we employed Jiménez-Muñoz’s single-channel algorithm [39] to retrieve the LST and then calculated the influence of a single pixel on the surrounding pixels based on the LST gradient. Finally, we performed an overall multiple linear regression analysis of the ambient LST and the mean LST of each surface factor and a ridge regression analysis of the contribution coefficients.

3.1. Classification

When using the unsupervised K-means classification method, we first divided the surface factors automatically into 14 categories, exported the results, and then interpreted these categories manually in the ENVI 5.3 software. The references for the interpretation are Google Earth high spatial resolution optical images and Landsat-8 RGB true-color images. Finally, we further merged the 14 categories into seven categories: water, vegetation, bright building, dark building, bright soil, dark soil, and alloy building. This classification follows the literature [40]. The two clustering criteria of this study are as follows: one is the recognizability of ground objects at 30 m resolution; the other is the correlation between feature type and urban thermal environment change. The interpretation and slice images of these seven categories are shown in Table 3.
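The two-stage procedure (automatic clustering, then merging clusters into interpreted categories) can be sketched as follows. This is a plain NumPy K-means written for illustration, not the GEE clusterer used in the study; the function names, the toy pixels, and the lookup table that stands in for manual interpretation are all assumptions.

```python
import numpy as np

def assign(pixels, centers):
    """Index of the nearest center (squared Euclidean distance) for each pixel."""
    d2 = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(pixels, k, n_iter=100, seed=0):
    """Lloyd's algorithm on an (n_pixels, n_bands) array."""
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = assign(pixels, centers)
        new_centers = centers.copy()
        for c in range(k):
            members = pixels[labels == c]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[c] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(pixels, centers), centers

def merge_clusters(labels, lookup):
    """Map automatic cluster ids to interpreted category ids via a lookup table."""
    return np.asarray(lookup)[labels]

# Toy example: four two-band pixels forming two obvious groups.
pixels = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
labels, centers = kmeans(pixels, k=2)

# Hypothetical interpretation result: clusters 0 and 1 are the same category.
merged = merge_clusters(np.array([0, 1, 2, 1]), lookup=[0, 0, 1])
```

In the study's setting, `k` would be 14 and the lookup table would have 14 entries taking values in the seven final categories; the actual mapping comes from manual interpretation and is not reproduced here.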

3.2. Land Surface Temperature Calculation

On natural surfaces, every surface factor obeys the law of conservation of energy. In other words, the heat contained in each surface factor (apart from its direct solar radiation) is continually exchanged with the surrounding surface factors, flowing from surface factors with high thermal energy to those with low thermal energy. Heat transfer may occur between surface factors with different attributes or between surface factors with the same attributes but different energy. To study the influence of various surface factors on the surrounding surface factors, radiometric calibration and atmospheric correction were first performed on the data, and then the single-channel algorithm of Jiménez-Muñoz [39] was used to retrieve the surface temperature. We selected the LST recorded by weather stations in the 7 cities at 10 am as the measured LST to verify the retrieved LST. The verification proceeds as follows: firstly, we add the meteorological station coordinates of each city to the image and find each station’s corresponding pixel. Then, we calculate the mean LST retrieved from the corresponding pixel and its eight adjacent pixels. Finally, the mean LST is subtracted from the measured LST of the meteorological station for comparison and verification. Except for the Beijing 20150907 data (difference of −3.7°C), the Wuhan 20171030 data (difference of +3.9°C), and the Xi’an 20181029 data (difference of +3.4°C), the differences for all other data were within 3°C, which meets the demand for data accuracy, so we continued to the next step of data processing.
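The station-based check described above amounts to a 3 × 3 window mean followed by a subtraction. A minimal sketch, assuming the retrieved LST is held in a 2-D NumPy array and the station has already been mapped to a row and column (the function names and the toy values are hypothetical):

```python
import numpy as np

def window_mean(lst, row, col):
    """Mean retrieved LST of a pixel and its eight neighbours (clipped at edges)."""
    r0, r1 = max(row - 1, 0), min(row + 2, lst.shape[0])
    c0, c1 = max(col - 1, 0), min(col + 2, lst.shape[1])
    return float(lst[r0:r1, c0:c1].mean())

def validation_difference(lst, row, col, station_lst):
    """Measured station LST minus the 3x3 mean of the retrieved LST."""
    return station_lst - window_mean(lst, row, col)

# Toy retrieved LST grid (values are illustrative only, not study data).
lst = np.arange(25, dtype=float).reshape(5, 5)
diff = validation_difference(lst, row=2, col=2, station_lst=15.0)
within_tolerance = abs(diff) <= 3.0  # the 3 degC tolerance stated in the text
```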

We took each pixel as the object of study and defined the LST gradient of each pixel in a two-dimensional image to represent the influence between each pixel and the surrounding pixels. Based on the classification of surface factors and the retrieved LST, we counted the LST interval of each surface factor and calculated the standard deviation of the LST within each category. This check can verify the stability and correctness of the LST inversion and the accuracy of the surface object classification.

To ensure that the LST statistics accurately reflect the LST range of each type of surface factor and to avoid the uncertainty of extreme values, we selected the middle 90% of the LST data of each type of factor to calculate the average LST of the various surface factors. Firstly, we attached the mean LST of each category as a new attribute to the ground object classification results and abstracted each new dataset with the LST attribute into the “a” matrix in Figure 2. Secondly, the edge of the image was extended outwards by one pixel, with each extended pixel taking the same value as the adjacent original edge pixel. Then, we used the Sobel operator to calculate the LST gradient of each pixel from its adjacent pixels in the x direction and the y direction. Finally, we calculated the modulus of the gradient at each pixel of the two-dimensional image:

$$M = \sqrt{G_x^{2} + G_y^{2}},$$

where $G_x$ is the gradient in the x direction and $G_y$ is the one in the y direction.
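The trimmed mean and the Sobel gradient modulus can be sketched in NumPy as follows. The sketch assumes that "the middle 90%" means discarding values below the 5th and above the 95th percentile, and that the one-pixel extension replicates the edge values; both readings, as well as the function names, are assumptions rather than the authors' code.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def middle_90_mean(values):
    """Mean of the values lying between the 5th and 95th percentiles."""
    values = np.asarray(values, dtype=float)
    lo, hi = np.percentile(values, [5, 95])
    return float(values[(values >= lo) & (values <= hi)].mean())

def gradient_modulus(a):
    """Sobel gradient modulus sqrt(Gx^2 + Gy^2) with edge-replicated padding."""
    rows, cols = a.shape
    padded = np.pad(np.asarray(a, dtype=float), 1, mode="edge")
    gx = np.zeros((rows, cols))
    gy = np.zeros((rows, cols))
    for dr in range(3):
        for dc in range(3):
            window = padded[dr:dr + rows, dc:dc + cols]
            gx += SOBEL_X[dr, dc] * window
            gy += SOBEL_Y[dr, dc] * window
    return np.hypot(gx, gy)

# Toy "a" matrix: LST increasing by 1 per column, constant along rows.
ramp = np.tile(np.arange(5, dtype=float), (5, 1))
b = gradient_modulus(ramp)
trimmed = middle_90_mean(np.arange(1, 101))
```

On the ramp, interior pixels have a modulus of 8 (the Sobel weights 1 + 2 + 1 times a difference of 2), while the replicated edge halves the difference at the first and last columns.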

By this calculation, we obtained a two-dimensional matrix that represents the gradient relationship between the LST of each pixel and the LST of the surrounding pixels. It was abstracted as the “b” matrix in Figure 2. Finally, this study used the normalized “b” matrix as the weight to calculate the LST value of each pixel in the original image as affected by 9 adjacent pixels, and the result was abstracted into the “c” matrix in Figure 2. It can be described as follows:

$$c_{i,j} = \frac{\sum_{(p,q) \in N_{i,j}} b_{p,q}\, a_{p,q}}{\sum_{(p,q) \in N_{i,j}} b_{p,q}},$$

where $b_{p,q}$ is the modulus in the “b” matrix, $a_{p,q}$ is the LST of each pixel in the original image, and $N_{i,j}$ is the set of 9 adjacent pixels around pixel $(i, j)$.
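One possible reading of the weighting step is a gradient-weighted average over each 3 × 3 window. The sketch below follows that reading; the edge-replicated padding, the fallback to the original LST where all weights are zero, and the function name `weighted_lst` are assumptions made for illustration.

```python
import numpy as np

def weighted_lst(a, b, eps=1e-12):
    """Average of LST `a` over each 3x3 window, weighted by gradient modulus `b`.

    Where every weight in the window is zero (a thermally uniform patch),
    the original LST value is kept.
    """
    a = np.asarray(a, dtype=float)
    rows, cols = a.shape
    pa = np.pad(a, 1, mode="edge")
    pb = np.pad(np.asarray(b, dtype=float), 1, mode="edge")
    num = np.zeros((rows, cols))
    den = np.zeros((rows, cols))
    for dr in range(3):
        for dc in range(3):
            wa = pa[dr:dr + rows, dc:dc + cols]
            wb = pb[dr:dr + rows, dc:dc + cols]
            num += wb * wa
            den += wb
    # Dividing by the window sum normalizes the weights locally.
    return np.where(den > eps, num / np.maximum(den, eps), a)

a = np.arange(9, dtype=float).reshape(3, 3)
c_uniform_weights = weighted_lst(a, np.ones((3, 3)))
c_zero_weights = weighted_lst(a, np.zeros((3, 3)))
```

Because the weights are divided by their window sum, any global rescaling of the “b” matrix (for example to the range 0 to 1) leaves the result unchanged.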

3.3. Multiple Linear Factor Regression

The ambient LST is the result of the combined radiation of various surface factors after absorbing solar energy. Different surface factors have different LSTs because of their different properties. Taking temperature as the object, and combining the radiation transfer equation with the specific heat capacity formula, it can be seen that the specific emissivity, density, and specific heat capacity have the closest relationship with the change of temperature. The specific emissivity is the ratio of the radiation emitted by the surface factor to that emitted by a blackbody at the same temperature, and the specific heat capacity reflects the ability of the surface factor to absorb and store heat. To explore the difference in contribution to the environmental LST caused by the different attributes of the surface factors, we assumed that the recorded values of each dataset were normally distributed and that the recorded values of each image met the requirement of n simultaneous independent observations. In addition, the following hypotheses were made about the relationship between the ambient LST and the mean LST of each surface factor:

$H_0$: the ambient LST has nothing to do with the mean LST of each surface factor, $\beta_1 = \beta_2 = \cdots = \beta_7 = 0$.

$H_1$: the ambient LST is related to the mean LST of each surface factor, that is, $\beta_1, \beta_2, \ldots, \beta_7$ are not all zero.

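The multiple linear regression equation is as follows:

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_7 x_{i7} + \varepsilon_i,$$

where $i = 1, 2, \ldots, n$ and $y_i$ is the environmental LST at each imaging time. Landsat-8 passes over the Chinese mainland at about 02:02 (UTC) per cycle and takes satellite images. Therefore, we chose the data recorded at 02:00 (UTC) by each urban surface meteorological station as the ambient LST. $x_{ij}$ denotes the average LST of the $j$th surface factor category after the influence of nearby pixels was calculated, $j = 1, 2, \ldots, 7$. The random error is defined as $\varepsilon_i$. On the assumption that the $\varepsilon_i$ are independent of each other and $\varepsilon_i \sim N(0, \sigma^2)$, the model can be written as $Y = X\beta + \varepsilon$, where $X$ and $Y$ are as follows:

$$Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \qquad X = \begin{pmatrix} 1 & x_{11} & \cdots & x_{17} \\ 1 & x_{21} & \cdots & x_{27} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{n7} \end{pmatrix}.$$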

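We used the least squares estimation of the normal linear model to calculate the surface factor contribution coefficients $\beta$ so as to minimize the sum of squares $Q(\beta)$, which can be described as follows:

$$Q(\beta) = \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{7} \beta_j x_{ij} \Bigr)^{2} = (Y - X\beta)^{T} (Y - X\beta).$$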
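For readers who want to reproduce the least squares step outside SPSS, a minimal NumPy sketch is given below. It assumes an intercept column is prepended to the factor LSTs; the function name `ols_fit` and the synthetic data (two factors instead of seven, noise-free) are illustrative assumptions only.

```python
import numpy as np

def ols_fit(x, y):
    """Least squares coefficients [intercept, beta_1, ..., beta_p]."""
    design = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

# Synthetic check: y is generated exactly as 1 + 2*x1 + 3*x2.
x = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [1.0, 3.0]])
y = 1.0 + 2.0 * x[:, 0] + 3.0 * x[:, 1]
beta = ols_fit(x, y)
```

`np.linalg.lstsq` minimizes the same sum of squares as the normal equations but avoids forming the inverse of the cross-product matrix explicitly, which is numerically safer when the columns are nearly collinear.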

Multicollinearity in the data, or a rank-deficient (singular) coefficient matrix of the independent variables, may lead to zero vectors in the basic solution system, a large variance of the estimated regression coefficients, and unstable estimates. Therefore, the ridge regression method was used to analyze the contribution coefficients of the surface factors. When a small penalty term is added to the coefficient matrix of the independent variables, the matrix becomes nonsingular with full rank. In this way, the basic solution system obtained by calculation is a multidimensional nonzero vector. The extremum solution function of ridge regression is as follows:

$$\hat{\beta}(k) = \arg\min_{\beta} \left\{ (Y - X\beta)^{T} (Y - X\beta) + k\, \beta^{T} \beta \right\},$$

where $k\,\beta^{T}\beta$ with $k > 0$ is the penalty function, which ensures that the value of $\beta$ does not become very large. We used formula (7) to obtain the minimizing estimator of $\beta$:

$$\hat{\beta}(k) = (X^{T} X + kI)^{-1} X^{T} Y. \tag{7}$$
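The ridge estimator and its ridge trace can be sketched as follows. The sketch standardizes the predictors and centers the response before applying the closed-form estimator, which is a common convention for ridge traces but is an assumption here; the function names and the deliberately collinear toy data are also hypothetical.

```python
import numpy as np

def ridge_fit(x, y, k):
    """Ridge coefficients (X'X + kI)^(-1) X'y on standardized predictors."""
    xs = (x - x.mean(axis=0)) / x.std(axis=0)  # assumes no constant column
    yc = y - y.mean()
    p = xs.shape[1]
    return np.linalg.solve(xs.T @ xs + k * np.eye(p), xs.T @ yc)

def ridge_trace(x, y, ks):
    """Coefficient vectors for a grid of ridge parameters (one row per k)."""
    return np.array([ridge_fit(x, y, k) for k in ks])

# Two perfectly collinear predictors: X'X is singular, so OLS is not unique,
# but any k > 0 makes the system solvable.
col = np.arange(6.0)
x = np.column_stack([col, col])
y = 2.0 * col + 1.0
beta = ridge_fit(x, y, k=0.1)
trace = ridge_trace(x, y, ks=[0.1, 1.0, 10.0])
norms = np.linalg.norm(trace, axis=1)
```

Plotting each column of `trace` against `ks` gives a ridge trace; the parameter is usually chosen where the curves flatten out.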

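Then, we calculated the sum of squares decomposition and checked the significance of the regression equation and regression coefficients:

$$\sum_{i=1}^{n} (y_i - \bar{y})^{2} = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^{2} + \sum_{i=1}^{n} (y_i - \hat{y}_i)^{2}, \quad \text{that is,} \quad SST = SSR + SSE,$$

$$F = \frac{SSR / p}{SSE / (n - p - 1)},$$

where $\bar{y}$ is the mean ambient LST, $\hat{y}_i$ is the fitted ambient LST, and $p = 7$ is the number of surface factors.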
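The decomposition and the F statistic can be checked numerically with a short sketch. It uses a single predictor and five toy observations (assumptions for illustration, not study data); the function name `anova_f` is hypothetical.

```python
import numpy as np

def anova_f(y, y_hat, p):
    """Return (SST, SSR, SSE, F) for a fitted linear model with p predictors."""
    n = len(y)
    y_bar = y.mean()
    sst = float(((y - y_bar) ** 2).sum())
    ssr = float(((y_hat - y_bar) ** 2).sum())
    sse = float(((y - y_hat) ** 2).sum())
    f_stat = (ssr / p) / (sse / (n - p - 1))
    return sst, ssr, sse, f_stat

# Toy data with one predictor; the least squares fit is y_hat = 0.8 + 1.0 * x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.0, 2.0, 4.0, 5.0])
design = np.column_stack([np.ones(len(x)), x])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
sst, ssr, sse, f_stat = anova_f(y, design @ beta, p=1)
```

The identity SST = SSR + SSE holds for a least squares fit that includes an intercept; the F statistic is then compared with the critical value of the F distribution with p and n − p − 1 degrees of freedom.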

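Finally, we analyzed the residuals and calculated the sum of squared residuals $SSE$, described as follows:

$$SSE = \sum_{i=1}^{n} e_i^{2} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^{2},$$

where $e_i$ is the residual and $\hat{y}_i$ is the fitted ambient LST of the $i$th observation.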

The overall technical roadmap for the current study is given in Figure 3.

4. Results

4.1. Analysis of Classification Results

In the classification results, we randomly selected 5,000 sample points for each category on each image. Computed with the ROI tool, the sample separability was within the range (1.86, 2) (a value greater than 1.7 is regarded as indicating an excellent classification), indicating that our classification results were reliable. We selected the data of Beijing on October 1, 2018, for display and examined the reflectance of each type of sample point in the main bands after image radiometric calibration. The results are given in Figure 4.
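The separability measure is identified in the Discussion as the Jeffries–Matusita pairwise distance, which ranges from 0 to 2. A minimal sketch of that measure under a Gaussian assumption for each class is given below; the function name and the randomly generated samples are illustrative assumptions, and the sketch does not claim to reproduce ENVI's exact implementation.

```python
import numpy as np

def jeffries_matusita(a, b):
    """JM distance between two sample sets of shape (n_samples, n_bands >= 2)."""
    m1, m2 = a.mean(axis=0), b.mean(axis=0)
    s1 = np.cov(a, rowvar=False)
    s2 = np.cov(b, rowvar=False)
    s = (s1 + s2) / 2.0
    diff = m1 - m2
    # Bhattacharyya distance between two Gaussian class models.
    term_mean = diff @ np.linalg.solve(s, diff) / 8.0
    _, logdet_s = np.linalg.slogdet(s)
    _, logdet_1 = np.linalg.slogdet(s1)
    _, logdet_2 = np.linalg.slogdet(s2)
    term_cov = 0.5 * (logdet_s - 0.5 * (logdet_1 + logdet_2))
    return float(2.0 * (1.0 - np.exp(-(term_mean + term_cov))))

rng = np.random.default_rng(0)
class_a = rng.normal(0.0, 1.0, size=(200, 3))  # synthetic three-band samples
jm_same = jeffries_matusita(class_a, class_a)
jm_far = jeffries_matusita(class_a, class_a + 100.0)
```

Identical sample sets give a distance of 0, and well-separated classes approach the upper bound of 2, which is why values between 1.86 and 2 indicate strongly separable categories.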

The reflectance of the sample points in the visible and near-infrared bands after image radiometric calibration was summarized for each category.

According to Figure 4, the statistical reflectance ranges of the six categories of water, vegetation, bright building, dark building, bright soil, and dark soil in the blue, green, and red bands have similar trends. The mean reflectivity of alloy building is the largest in each waveband; the reflectivity of water in bands SWIR 1 and SWIR 2 is smaller than that of the other categories; vegetation has a significantly greater reflectivity in the near-infrared band than the other categories, while the mean reflectivity of bright building in the blue and green bands is the second largest, next only to that of alloy building; the mean value of dark building in bands SWIR 1 and SWIR 2 is the largest; and the reflectivity of bright soil and dark soil in the infrared band is contrary to that in the R, G, and B bands. Therefore, our classification results are well separated spectrally.

4.2. Analysis of LST

The single-channel algorithm was used to retrieve the LST for the 29 images. The results are given in Figure 5.

As shown in Figure 5, during the period from August to October, although the climate type and geographical location of each urban built-up area differ, they all show the “urban heat island effect” in terms of LST. Specifically, the LST is higher in the parts of the built-up areas where facilities are well developed and lower where facilities are underdeveloped. In terms of shape, the urban built-up areas of Beijing, Wuhan, Xi’an, and Nanchang are approximately round, and those of Jinan, Zhengzhou, and Changsha are elongated. The heat island effect is stronger in the approximately round built-up areas than in the elongated ones.

We counted the LST interval of each category and calculated the standard deviation of the LST in each category, that is, the square root of the mean squared deviation between the LST of each pixel and the mean LST of its category. The standard deviation is the most commonly used measure of the dispersion of a group of data and an important indicator of precision. The standard deviation results of each category of LST are given in Table 4.
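The per-category statistic can be computed directly from the LST raster and the classification raster. A minimal sketch, assuming the population form of the standard deviation (division by the number of pixels); the function name and the toy arrays are hypothetical.

```python
import numpy as np

def category_std(lst, labels):
    """Standard deviation of LST within each category (population form)."""
    return {int(c): float(lst[labels == c].std()) for c in np.unique(labels)}

# Toy rasters: category 0 spans 20 to 22 degC, category 1 is uniform at 30 degC.
lst = np.array([[20.0, 22.0], [30.0, 30.0]])
labels = np.array([[0, 0], [1, 1]])
stds = category_std(lst, labels)
```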

As can be seen from Table 4, the standard deviation of each category in each image was very small, which indicated that the LST of each category was stable within a narrow interval. It also showed that the LST of each feature was mainly determined by its properties, including the specific heat capacity, specific emissivity, and density of its material. In addition, the surrounding environment also had an impact on the LST of the feature, which fluctuated within a small range. For similar ground features in the same city, their LSTs vary within ±2.5°C at the same time of the day because of differences in the surrounding ground features. They vary significantly at different time points because of differences in weather conditions and environmental LST. However, the relative LST order of the different features remains constant: the LST of water is the lowest, followed by that of vegetation, and the LST of alloy building is the highest.

4.3. Regression Analysis of Multiple Surface Factors

This study applied SPSS software for the regression analysis and tested the regression equation and regression coefficients to quantify the contribution of various surface factors to the urban thermal environment. First, we performed a normal distribution hypothesis test on the regression data and obtained the residual histogram results (see Figure 6).

As shown in Figure 6, the data were distributed on or along the diagonal, indicating that the regression model met the normality assumption. Then, we used the F test to check the overall significance of the multiple linear regression equation. The calculated results are given in Table 5.

As can be seen from Table 5, the linear relationship between the explained variable and all the explanatory variables in the population regression model was significant on the whole. In other words, the multiple linear regression equation between the ambient LST and the mean LST of each surface factor was highly significant. Therefore, there was a significant correlation between the ambient LST and the mean LST of each surface factor, so we rejected the null hypothesis and accepted the alternative hypothesis that the regression coefficients were not all equal to 0.

This study examined the contribution coefficient of each type of surface factor and the ridge trace map obtained by ridge regression (see Figure 7). Once the ridge parameter is large enough for the ridge traces to stabilize, the coefficients converge, and these converged values are the ones adopted.

When , , indicating that there was a strong correlation between the 7 regression coefficients, so the regression equation can be expressed as follows:

The greater the absolute value of a regression coefficient, the greater the influence of the corresponding x on the rate of change of y. At a confidence level above 95%, the regression equation shows that the category exerting the greatest impact on y, the variation rate of the environmental LST, is vegetation, whose contribution coefficient is about 0.17. The second is dark soil, whose contribution coefficient is 0.16. The smallest contribution coefficients to the environmental LST come from water and alloy building, at 0.1 in both cases. The contribution coefficients of bright building, dark building, and bright soil to ambient LST are very close: 0.13, 0.13, and 0.12, respectively.

To sum up, in terms of the contribution of surface factors to the urban thermal environment, the vegetation in the built-up area, such as forests, grasslands, and green belts, contributes the most. It is followed by the dark soil with low reflectivity in built-up areas, such as dark soil with high water content, areas with sparse shrub growth or grassland, and water border areas. Although the LST of water is the lowest of all categories, its contribution coefficient to the overall environmental LST is not as large as we expected. Similarly, although the LST of alloy building is the highest of all categories, its contribution coefficient to the overall environmental LST is not as large as we expected.

5. Discussion

In the classification of surface factors, this study went through three data processing steps: clustering, interpretation, and classification. The separability of the spectral signatures of the seven selected surface factors [40] was tested with the Jeffries–Matusita pairwise separability measure in ENVI 5.3. All pairs of surface factors were found to be separable, with values ranging from 1.86 to 2.0. Driven by dense data, this paper adopts the unsupervised K-means method to cluster surface factors and then interprets and classifies them with reference to high-resolution images, thus opening up a new path for research on the impact of land surface factors on the urban thermal environment. However, obtaining high-precision classification results takes much time and a wealth of experience. More intelligent and efficient models for data processing and classification are needed to improve work efficiency.

The standard deviation of each category in each image was very small, which indicated that the LST of each category was stable within a narrow interval. It also showed that the LST of each feature was mainly determined by its properties, including the specific heat capacity, specific emissivity, and density of its material [2]. In addition, the surrounding environment also had an impact on the LST of the feature, which fluctuated within a small range. For similar ground features in the same city, their LSTs vary within ±2.5°C at the same time of the day because of differences in the surrounding ground features. This result is not found in the previous research literature because no researchers have used classified data for gradient processing and LST statistics. Because of differences in weather conditions and ambient LST, the LST of the same ground object in the same city differs significantly at different points in time. This result is widely recognized. However, the relative LST order of the different ground objects remains constant. The category with the lowest LST is water, followed by vegetation. The category with the highest LST is high-reflectivity buildings. This result is contrary to common expectation. It is generally believed that the LST of all ground objects varies with the environment in different ways because of their specific heat capacity, specific emissivity, and other attributes. However, the statistical results show that the relative LST order of the different ground objects remains unchanged no matter how the environment changes.

The results on the contribution of surface factors to the urban thermal environment show that the vegetation in the built-up area, such as forests, grasslands, and green belts, contributes the most. This is consistent with previous studies, which demonstrated correlations between LST and the abundance of green space measured by the normalized difference vegetation index [17, 19, 36]. Trees and other plants help cool the environment, making green space a simple and effective way to mitigate urban heat island effects. The multiple linear regression equations of the various surface factor LSTs and ambient LSTs are highly significant. This is also consistent with some previous studies and is very similar to results based on the proportional composition of LCLU [34, 35] and the proportional composition of landscape (PLAND) [11, 21, 24]. Unlike past work, in which the thermal environment was explored by establishing various indexes [16–23, 41], the present study directly used the thermal characteristics of the surface factor objects. Once an index is established, the same index value can correspond to different types of surface factors. Although fitting an index against the LST can point to a trend and indicate a connection [24–30], we do not know precisely which types of surface factors respond to the thermal environment [4–7] or how strongly they are affected [8–10]. The regression statistics indicate, at a confidence level above 95%, that vegetation has the greatest influence on the rate of change of environmental temperature, with a contribution coefficient of about 0.17, followed by dark soil with a contribution coefficient of 0.16, while water bodies and buildings with high reflectivity have the smallest contribution coefficients to the thermal environment. The contribution coefficients of bright buildings, dark buildings, and bright soil to ambient LST are very similar: 0.13, 0.13, and 0.12, respectively.
The contribution coefficient of water to the overall ambient LST is not as large as expected. The LST of buildings made of high-reflectivity materials is the highest of all the categories. However, the contribution coefficient of these buildings to the overall ambient LST is also not as large as expected.

Therefore, when considering the impact of urban development and construction on the urban thermal environment, the contribution of each surface factor can be taken into account. Our research has strategic significance for improving the urban thermal environment [23–27] because it directly provides the contribution of each type of ground feature to the thermal environment. During new urban expansion and the reconstruction of old urban areas, planners can consider growing plants on bright buildings or choosing dark building materials, speeding up development on bright soil, maintaining the proportion of dark soil and vegetation, and building alloy buildings judiciously to alleviate urban heat [30–33, 39].

In addition, we used ridge regression to estimate the contribution coefficient of each type of land surface factor. The result is the degree to which each type of land surface factor contributes to the environmental LST. However, it ignores whether each factor's effect on the environment is positive or negative. Therefore, in future studies, more sophisticated regression models can be used for in-depth exploration.

6. Conclusion

Based on 29 Landsat-8 images from 2013 to 2019, this paper studies the contribution of surface factors to the urban thermal environment in seven urban built-up areas by using multiple linear regression analysis. The results show that for similar surface factors in the same city, their LSTs vary within ±2.5°C at the same time of day because of differences in the surrounding surface factors. In contrast, the LSTs of surface factors vary significantly at different time points because of differences in weather conditions and environmental LST. However, the relative LST order of the different surface factors remains constant: the LST of water is the lowest, followed by that of vegetation, and the LST of alloy building is the highest. The multiple linear regression equation of ambient LST and the various surface factors was highly significant, which indicated that the LST of the surface factors and the ambient LST are significantly correlated. The contribution coefficients of the surface factors to ambient LST were obtained by the ridge regression method, taking the values at which the ridge traces converge. At a confidence level above 95%, the category with the greatest impact on the variation rate of the environmental LST was vegetation, whose contribution coefficient was about 0.17. The second was dark soil, whose contribution coefficient was 0.16. The contribution coefficients of water and alloy building were the smallest. This study opens up a new research idea for the study of urban thermal environments. It also provides a useful reference on the impact of different surface factors on the thermal environment during urban expansion and construction. The goal is to provide guiding suggestions for sustainable urban planning and development under future climate change.

Data Availability

All data, models, and codes that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the National Key Research and Development Program of China (no. 2018YFC0407905).