Abstract

The soft computing models used for predicting land surface temperature (LST) changes are very useful to evaluate and forecast the rapidly changing climate of the world. In this study, four soft computing techniques, namely, multivariate adaptive regression splines (MARS), wavelet neural network (WNN), adaptive neurofuzzy inference system (ANFIS), and dynamic evolving neurofuzzy inference system (DENFIS), are applied and compared to find the best model that can be used to predict the LST changes of Beijing area. The topographic change is considered in this study to accurately predict LST; furthermore, Landsat 4/5 TM and Landsat 8OLI_TIRS images for four years (1995, 2004, 2010, and 2015) are used to study the LST changes of the research area. The four models are assessed using statistical analysis, coefficient of determination (R2), root mean square error (RMSE), and mean absolute error (MAE) in the training and testing stages, and MARS is used to estimate the important variables that should be considered in the design models. The results show that the LST for the studied area increases by 0.28°C/year due to the urban changes in the study area. In addition, the topographic changes and previously recorded temperature changes have a significant influence on the LST prediction of the study area. Moreover, the results of the models show that the MARS, ANFIS, and DENFIS models can be used to predict the LST of the study area. The ANFIS model showed the highest performances in the training (R2 = 0.99, RMSE = 0.78°C, MAE = 0.55°C) and testing (R2 = 0.99, RMSE = 0.36°C, MAE = 0.16°C) stages; therefore, the ANFIS model can be used to predict the LST changes in the Beijing area. The predicted LST shows that the change in climate and urban area will affect the LST changes of the Beijing area in the future.

1. Introduction

Climate change is one of the most critical challenges that the world faces. Previous studies have reported that climate change has a significant effect on the land surface temperature and its other parameters [1, 2]. The expansion of urban areas is considered a significant factor for the change in land use and land surface temperature [3, 4]. Previous studies show that, in China, urban growth and climate change including temperature change are serious problems resulting from economic and social development [5, 6].Therefore, it is vital to investigate ways to predict change in land surface temperature (LST) change.

LST is defined as the temperature felt when there is an exchange of long-wave radiation and turbulent heat fluxes within the surface-atmosphere interface [79]. The LST is being increasingly utilized to evaluate the climate changes in urban zones. Satellite images are used as sources of information for surface temperature trends and variability. Li et al. [8] summarized the basics and theoretical background for the models deployed for LST determination from satellite images. Avdan and Jovanovska (2016) presented some methods for calculating LST from Landsat images, and they concluded that the methods used were successful in retrieving the LST. Zhang et al. [10] evaluated the changes in LST of the Ebinur lake between 1998 and 2011 by applying Landsat image processing and found that Landsat image processing is a good tool to estimate the relationship between LST and land cover factors. Vlassova et al. [11] assessed three different techniques for Landsat image processing. They applied radiative transfer equation (RTE) inversion using the atmospheric correction parameters calculator (ACPC) tool and two algorithms based on the approximations of RTE, namely, the single-channel (SC) method and the mono-window (MW) method, and found that the three methods can be utilized for the estimation of LST from Landsat thermal images.

Recently, soft computing techniques are widely used for studying environmental changes. Valipou [12] examined neural network (NN) methods for detecting drought and wet years, and he found that NN methods are suitable for estimating the environmental changes. Tran et al. [13] applied a regression model to estimate the LST patterns in Hanoi city, Vietnam, and found that this method can be used in other cities that possess a similar urban growth. Ahmed et al. [14] used an NN model to simulate land cover changes between 2019 and 2029 in Dhaka, Bangladesh, and their results showed that the change in temperature would be greater than or equal to 30°C. Ranjan et al. [15] utilized NN in conjunction with geoinformatics technology to estimate LST, and they found that this model could precisely predict LST. Lee and Jung (2014) used an NN to approximate the LST in Korea and found that it can be applied to predict LST changes with high accuracy. Kestens et al. [16] utilized a general linear model based on Landsat images to model the LST change at Quebec, Canada, and found that the data estimated from satellite images were useful for detecting the LST changes. Yang et al. [17] applied four interpolation methods to predict the LST changes in England, namely, inverse distance weighting (IDW), spline, kriging, and cokriging, and recommended using kriging method when approximating LST from image processing. Li et al. [18] studied the effectiveness of image clearance on LST estimation and observed that the resolution of the Landsat images is important for accurately estimating the LST changes.

More recently, integrated soft computing models were found to be better to detect different environmental changes [19, 20]. Advanced regression and integrated NN models have shown good performance as prediction models for nonlinear performances changes of LST, water level changes, monthly temperature changes, and evaporation [11, 1921].For instance, the wavelet neural network (WNN) was developed to forecast water level changes, while adaptive neurofuzzy inference system (ANFIS) is used to forecast both water level changes and LST changes [21, 22].The ANFIS provides better performances for the dynamic behaviors [23, 24]. Kisi et al. [19] investigated different cases with the ANFIS model to predict temperature changes in Turkey. They established that the ANFIS with grid partition is the optimal model for change prediction. In addition, Kisi et al. [20] developed a dynamic evolving neurofuzzy inference system (DENFIS), which is a fuzzy integrated model, to predict the evaporation of water surface and found that the performance of DENFIS is superior to that of ANFIS as a prediction model for water evaporation. Wang et al. [25, 26] utilized the soft computing to model the solar radiation in China, and they found that the Multilayer Perceptron (MLP) and radial basis NN models are more accurate in detecting solar radiation at different climatic zones in China; furthermore, the ANFIS and M5 model tree are found to be better in predicting solar radiation at some stations in China. Meanwhile, the multivariate adaptive regression splines (MARS) technique is considered to be an effective tool for prediction and classification of input data for modeling, especially when the input data is limited [27]. MARS technique was found to be useful for land cover classification and prediction of solar radiation [28, 29].

Here, the input variables affect the accuracy of prediction models . The input variables for the prediction of LST can be classified into urban changes and uses and land geometry [14, 15]. The correlation between LST and normalized difference vegetation index (NDVI) has been documented in Weng [30] and Weng et al. [31]. Rinner and Hussain [32] and Chen et al. [33] showed a correlation between LST and the normalized difference built-up index (NDBI). The normalized difference water index (NDWI) and normalized difference bareness index (NDBaI) were analyzed for delineating the water content in vegetation and identifying the bareness of soil, respectively [33, 34]. Feng et al. [35] found a strong correlation between LST and NDBI followed by NDWI and NVDI in Suzhou city. Furthermore, the urban index (UI) was found to have a high correlation with surface temperature in Harare, Zimbabwe [36]. The intensity of LST is directly related to the rate of urbanization, land use patterns, and building density [14]. LST is related to the patterns of land use/cover changes, for example, the composition of the built-up area, vegetation, and water bodies [33]. Xiao et al. [37] reported that the impervious surface is positively correlated with LST in Beijing, China. Therefore, the digital elevation should be considered in this study. Other studies utilized geodetic coordinates for designing the LST prediction models. For instance, Ranjan et al. [15] used real-time LST measurements and geodetic coordinates to estimate the LST.

In this study, WNN, DENFIS, ANFIS, and MARS models are applied to accurately calculate the LST from Landsat images in order to predict the temperature changes in Beijing city. In addition, the input variables (NDVI, NDWI, NDBI, UI, and topographic changes) are evaluated and assessed for LST prediction. Here, it should be mentioned that the WNN, DENFIS, and MARS models have not been used so far to predict the LST changes from satellite data; in addition, the ANFIS model is limited to the prediction of LST changes [22]. Therefore, the main aim of this paper is to investigate LST modeling using novel design of MARS, ANFIS, DENFIS, and WNN models for predicting LST changes from visible Landsat, near-infrared, and TIR imagery obtained from Landsat. The four models are designed and compared to obtain an optimal model for detecting nonlinear LST changes of Beijing area.

2. Study Area and Data Resources

2.1. Study Area

The study area, Beijing, is the capital of China and it is one of the biggest cities in the world, covering 16 districts and two counties with a land area of 16,410 km2. The administrative region of Beijing is divided into three parts: the central city, suburbs, and outer suburbs. The central city includes two districts, Xicheng and Dongcheng. The suburbs are made up of four districts, Haidian, Chaoyang, Fengtai, and Shijingshan, and the outer suburbs comprise eight districts and two counties, namely, Tongzhou, Shunyi, Fangshan, Daxing, Changping, Mentougou, Huairou, and Pinggu districts and Miyun and Yanqing counties, as presented in Figure 1(a). Beijing has been identified as one of the rapidly developing cities in China, both economically and in area. The population distribution depends on the elevation of the land, as presented in Figure 1(b). The increasing population requires additional space, which necessitates new residential areas, thereby encouraging rapid expansion of the regions urbanized. Thus, the rapid population growth and expansion of built-up areas present a daunting challenge while estimating the cultivate land LST.

2.2. Data Resources

In this research, Landsat data is used as the main source of data. Landsat satellite images taken during the dry seasons of 1995, 2004, 2010, and 2015 were obtained from the US Geological Survey (USGS) website (Table 1). The years were considered as an indicator for the time that the images were taken. Path 123 and row 32 cover the entire research area. The map projection of the collected satellite images is Universal Transverse Mercator (UTM), and the images are of type Zone 50 N-Datum World Geodetic System (WGS) 84 with pixel size of 30 m. A total of 32 ground truth points were directly observed using a high-resolution handheld GPS (Garmin GPSMAP 62s) in field survey conducted during 1995–2015 to make the accuracy assessment highly accurate. The ancillary data used in this research include land surface parameters derived from the 90 m grid spacing DEM data of the Shuttle Radar Topographic Mission (SRTM) (Figure 1(b)).

3. Methodology

This section describes the general procedures such as image processing, LST estimation, and prediction of LST, which are described in the flowchart shown in Figure 2.

3.1. Land Use/Land Cover Maps

The acquired satellite images (1995, 2004, and 2015) were classified into seven land use and land cover (LULC) classes within a five-classification scheme: urban (high and low residential areas), vegetation (high-density vegetation and low-density vegetation), bareland, waterbodies, and forestland. A supervised image classification with the support vector machine (SVM) classification method was applied to classify the LULC classes [38]. In this method, prior knowledge of the study region is used to select the training sites and the spectral information contained in individual pixels is utilized to generate the LULC classes. Finally, the summed up image is renamed to create the final form of land cover maps for various years (Figure 3). Satisfactory results were obtained with regard to classification accuracy and Kappa coefficients. The lowest accuracy was obtained for the built-up areas in 1995, which reached 88.1%; this is expected because built-up areas have mixed pixels with trees, home gardens, and actual buildings. The overall accuracies were 95.15% for 1995, 90.93% for 2004, and 94.28% for 2015 (Table 2).

3.2. Retrieval of Land Surface Temperature from Landsat-5TM and Landsat-8
3.2.1. Retrieval of Satellite Brightness Temperature (BT)

(1) Retrieval of Satellite Brightness Temperature (BT) from the Landsat-5TM Images. Based on Chen et al. [39], a two-step process was applied to obtain the brightness temperature from the Landsat-5TM images in this research. In the first step, the digital numbers (DNs) of band 6 were transformed into radiation luminance (RTM6) using the following formula: where V represents the DN of band 6, and

In the second step, the radiation luminance was transformed into at-satellite brightness temperature (BT) in Celsius (°C), using the following equation:where K1 = 1260.56 K and K2 = 607.66 (mW ∗ cm−2 ∗ sr−1m−1), which are prelaunch calibration constants under an assumption of unity emissivity; and b represents the effective spectral range when the sensor’s response is considerably higher than 50%, b = 1.239 (m) [29].

(2) Retrieval of Satellite Brightness Temperature (BT) from the Landsat-8 Images. The first step is to transform the DN of the thermal infrared band into spectral radiance (Lλ) by using the following equation obtained from the Landsat user’s handbook [40]: where Lλ is the TOA spectral radiation in watts/(m2 ∗ ster ∗ m), ML is the band specific multiplicative rescaling factor from the metadata, AL is the band specific additive rescaling factor, and Qcal represents the symbolized values of quantized and calibrated standard product pixels (D).

The second step is to convert the band radiance into BT in Celsius using the following conversion formula:where BT is the satellite brightness temperature in Celsius, and K1 and K2 represent thermal conversion from the metadata.

3.2.2. Surface Emissivity () Retrieval

The land surface emissivity () values are obtained using equation (7) [41].

In this study, we used m = 0.004 and n = 0.986. PV is the proportion of vegetation extracted from equation (9).where NDVI is the normalized difference vegetation index. NDVImin and NDVImax are the minimum and maximum values of the NDVI, respectively. The emissivity-corrected LST values were then retrieved using equation (10).

Brightness temperatures assume that the Earth is a blackbody, which it is not, and this can lead to errors in surface temperature. In order to minimize these errors, emissivity correction is important and is done to retrieve the LST from Tb using equation (9) [42].where LST is Celsius, BT is the at-sensor brightness temperature in Celsius, λ (11.5 m) is the wavelength of the emitted radiance:  mK, σ is the Stefan–Boltzmann constant, h is Planck’s constant, c is the velocity of light, and ε is the land surface emissivity (LSE). The calculation of LST for the month of April is presented in Figure 4 and Table 3 for the years 1995, 2004, 2010, and 2015.

3.3. Design and Evaluation of Prediction Models

In this section, four methods are applied and compared for predicting temperature changes. The regression and integrated NN models are used to detect the LST. The MARS, WNN, ANFIS, and DENFIS models are applied and investigated. The summary of the four methods is presented below. In this study, the LST for 2010 and 2015 are utilized to design and evaluate the models. To predict the LST for 2010 and 2015, the data for the past two years and topographic changes for the estimated image nodes are used.

3.3.1. Multivariate Adaptive Regression Splines (MARS) Algorithm

The MARS model is a nonlinear and nonparametric regression model that can be used for predicting nonlinear LST measurements [28, 29]. A spline is a utility defined piecewise by polynomials, and it is applied to a class of functions used in input-output data interpolation [28]. A common spline function is the cubic spline or knote [43]. The knote numbers and placement are fixed for regression splines. The forward and backward processes are applied to establish the knote. Basic functions (BFs) are used to search for knotes. A linear combination of BFs for the MARS model is developed as a set of functions representing the relationship between the prediction () and target variables for the LST as follows:where, m = 0, 1, …, M, are unknown coefficients that can be estimated by the least square method and are m-th basis functions that denote reflected pairs [28]. These are linear BFs of the forms (x − t)+ and (t − x), with t being the knote [28].

Therefore, the MARS process is performed in three steps. First, the forward algorithm chooses all possible fundamental functions and their related knots. Second, the backward algorithm gets rid of all basic functions in order to produce the best combinations of existing knots, and, finally, the smoothing operation is performed to obtain continuous partition borders [43].

Quirós et al. [29] summarized the classification methods by MARS. The first one relates to pairwise classification, in which the output can be coded as 0 or 1, thereby treating the classification as a regression. The second possibility involves more than two classes, with the classification serving as a hybrid of MARS called Poly MARS [29]. This study adopts the first technique.

3.3.2. Wavelet Neural Network (WNN) Algorithm

The WNN was introduced by Zhang and Benveniste [44]. The algorithm of WNN connects the NN and wavelet decomposition with a highly nonlinear wavelet function [21]. The LST prediction () based on WNN can be obtained as follows:where are coefficient variables, are dilation variables, are translation variables, andis a wavelet function.

The WNN contains three layers, namely, input wavelet or hidden, as well as output, respectively. In this study, we use five input parameters, 25 hidden layers with Mexican hat wavelet function, and a single output layer for predicting LST values. The choice of the wavelet function is based on the application, and many wavelet functions can be applied in the WNN prediction models. In this study, the Mexican hat wavelet function is used to perform nonlinear LST measurements.

3.3.3. Adaptive Neurofuzzy Interference System (ANFIS) Algorithm

In the ANFIS, the FIS parameters are calculated using NN back-propagation algorithms. The system depends on FIS and has fuzzy IF-THEN rules. The ANFIS model introduced by Jang [45] is capable of approximating any real continuous function on a compact set to any level of precision. In other words, ANFIS is defined as a system that uses fuzzy rules and the associated fuzzy inference method for learning and rule optimization purposes [46]. In this study, we apply the combined learning rule, which uses conventional back-propagation and least square methods to estimate the parameters of the model accurately [47]. The model is made up of five layers. The process of each layer is presented in [20]. Kisis et al. [19] summarized that the input and output neurons can be fuzzified by a fuzzy quantization approach.

3.3.4. Dynamic Evolving Neurofuzzy Inference System (DENFIS) Algorithm

The DENFIS model is also an integrated model like the ANFIS model [19, 48]. In DANFIS, the FIS parameters are calculated using NN back-propagation algorithms by applying evolving fuzzy neural networks (EFuNN) in the model [20].Therefore, the major difference between ANFIS and DENFIS is the application of EFuNN [20]. The EFuNN employs Mamdani-type fuzzy rules [48].Therefore, the input variables are directly assigned to rules through fuzzy membership functions without weights layer [48]. In addition, the evolving connectionist system is a distance-based, fast, one-pass algorithm that performs dynamic approximation of the number of clusters in dataset and allocates the position of cluster centers in the input data space [20]. In this clustering method, the maximum distance between the cluster center and data point must be less than a user-defined threshold value. In this model, the described model architecture is carried out in an evolving approach based on cross-linking to Takagi-Sugeno fuzzy systems [19, 20].

3.3.5. Input Variables and Performance Evaluation of the Models

In this study, the NDVI, NDWI, NDBI, UI, and topographic changes of the selected points are used and assessed. The MARS method is used to classify the input variables for predicting the LST of Beijing city.

The performance analysis of the four applied models is conducted based on three statistical analyses methods, namely, coefficient of determination (R2), root mean square error (RMSE), and mean absolute error (MAE), given by equations (13)–(15).where y and f are the actual and predicted LST values, respectively, is the mean value of the actual values, and N is the number of measurements.

4. Results and Discussions

4.1. Observed LULC and Transitions from 1995 to 2015

The spatial patterns of LULC in the study area for 1995, 2004, and 2015 are illustrated in Figure 3. The forestlands and vegetation lands dominated the entire area; however, they diminished with the continuous increment of urban areas. In 1995, urban region, vegetation, barren land, and forest were the predominant land-use types, with rates of 14.35%, 15.14%, 13.18%, and 56.08%, respectively. It was found that there were dense built-up surfaces in the urban area, which were surrounded by farmlands and mountains. In 2004, high residential area, vegetation, barren land, and forests were the predominant land-use types, with rates of 18.13%, 13.74%, 15.79%, and 51.37%, respectively. In 2015, built-up surface regions were the dominant land-use type with a percentage of 23.02%, followed by barren land and vegetation (14.30% and 10.07%, respectively). The statistical postclassification comparison results are presented in Tables 46. This comparison shows a noteworthy increment in the region of built-up surfaces; there was a significant change between 1995 and 2004, and dense vegetation zones changed into urban regions or bareland. On analyzing the changes in land use and land cover between 1995 and 2015, it was observed that there was an increase in the built-up region by 8.68% and a decrease in vegetation by 5.07%. Here, the fundamental change is that the vegetation zones and barelands changed into developed areas.

4.2. LST in 1995, 2004, 2010, and 2015

The LST, which are presented in Figure 5, were extracted using ArcGIS10.3. According to Figures 1(b) and 4, the geometry of the city affects the LST, and the results cited in Xiao et al. [37] confirm this fact. Mushore et al. [36] evaluated the significance of the predicted changes in temperature over a study area using selection points. Herein, we tested 250 random distribution points, which were chosen to accurately reflect the distribution of the LST of the study area, as presented in Figures 4 and 6. The geodetic coordinates (latitude, longitude, and height) and LST of the research area for each year are extracted and presented in Figures 5 and 6. The existing urban areas are selected from points one to 180; the water surface and areas near water are selected from points 180 to 200; and the forest and farm lands, which are suitable for urban activities, are selected from points 200 to 250.

Table 3 lists the statistical parameters (mean, maximum, minimum, and standard deviation (STD)) of the LST for the four years of study. From Table 3, it is observed that there were LST changes from 1995 to 2015 in the Beijing area. In addition, it is shown that the standard deviations for all cases are close and the difference is small. Moreover, the temperature changes for the years 1995, 2004, 2010, and 2015 were 21.45, 21.35, 30.11, and 19.01°C, respectively. The results show that there was significant temperature change in the research area and that the topography of the study area has a considerable effect on the forecast of temperature changes. In addition, the measurement values show that the LST increased rapidly with a slope of change of 0.28°C/year based on the changes in mean value from 1995 to 2015. Meanwhile, from Figure 1, it can be observed that the topographic change is high. Therefore, the geodetic coordinates (latitude (Lat.), longitude (Long.), and orthometric height (H)) are used as input factors in this study.

4.3. Prediction of Land Surface Temperature

Firstly, the input variables should be evaluated and assessed. The MARS is a best classification algorithm for the prediction models [27, 29]. According to the previous studies, the NDVI, NDWI, NDBI, UI, and topographic changes are the main factors that affect the LST changes. From the literature, it can be seen that the influence of these variables depends on the case under consideration. Xiao et al. [37] showed a high correlation between LST and the change in land geometry of Beijing city. Here, NDVI, NDWI, NDBI, UI, topographic changes, and previous temperatures records are evaluated and assessed using MARS classification. Thus, the NDVI, NDWI, NDBI, and UI of the study area for 2010 are calculated and used to model the LST for 2010. In addition, the LST for 1995 and 2005 and the geodetic coordinates are used for the selection points. The important factor (IF) is calculated from 0 to 100 for each variable. Figure 7 shows the IF of the variables. From Figure 7, it can be seen that the impact of topographic changes is high, and this confirms the conclusion made by Xiao et al. [37] for Beijing area. In addition, the influence of previous temperature records is greater than 70% with regard to the prediction of LST. It means that the previous LST records can be used to estimate the future LST. Table 7 presents the coefficient of determination values of different input variables used to predict the LST for 2010 using the MARS model. From Table 7, it can be seen that model 4 is suitable to use in our case. Secondly, the selected input parameters are used to design the models as presented below.

4.3.1. Training and Design Stage

In this study, the selected data are utilized to study the performance of MARS, WNN, ANFIS, and DENFIS in LST prediction; here, the models are built using Matlab software. The temperature at the areas in the vicinity of Beijing increased during the period from 1995 to 2015, as shown in Figure 4 and Table 3. Moreover, from Figure 4, it can be observed that the urban area increased by approximately 20% during the period from 1995 to 2015. The distribution of LST changes depending on land use and geography. In addition, the temperatures of the urban areas are higher than those of other land-use types. Therefore, the prediction models can be used for detecting future LST changes. The novel models designed in this research are applied to estimate the temperature change. The models are obtained using the geodetic coordinates of several selected points and previous LST changes. The data for 2010 is deployed as the target LST for the training stage, and the LST for 2015 is utilized as the target for the testing stage.

(1) MARS Model. To design the MARS model, five parameters are used as input: latitude, longitude, orthometric height, the LST for the period from 1995 to 2004, and the LST for 2010 (which is used as the target value). The number of BFs can be calculated using generalized cross-validation (GCV) [28]. Figure 8(a) shows the GCV for 199 BFs. From this figure, it is observed that the minimum GCV is observed at 151 BFs. However, the model designed in this instance used a constant parameter 0.7414 and 151 BFs. The performance of the designed model is demonstrated in Figure 8(b). In addition, the statistical analysis for the prediction model is carried out, and R2 for the model is found to be 0.97, while the RMSE and MAE are 0.78 and 0.55°C, respectively. From the statistical analysis, it is found that the correlation between LST measurements and predictions is high, and the model error is small. Therefore, the model can be used for detecting LST changes over the selected points of the study area.

(2) WNN Model. For the WNN design, many wavelet network models have tried to optimize the structure of the wavelet network for the training stage. It was found that the optimum model that can be applied in this study is presented in [21]. This structure enables the use of statistical analysis. R2, RMSE, and MAE for WNN are 0.67, 2.57°C, and 1.86°C, respectively. Therefore, the input year has five values selected for the geodetic coordinates and previous observations of LST; in addition, the hidden layer of 25 wavelet neurons is employed to forecast the future values of the LST measurements. Figure 9 illustrates the measurements and predicted LST for year 2010. From Figure 9 and the statistical analysis, it observed that the performance of the WNN model is lower than that of the MARS model in the training stage.

(3) ANFIS Model. For the ANFIS model, the number and type of membership functions (MFs) for the model are the main factors that determine the accuracy of the model. Eight different MFs can be applied in the ANFIS model. The eight MFs are evaluated with two MFs and ten epochs to reduce the CPU time; the RMSE for eight MFs are also calculated. The RMSE for the Gaussian, trapezoidal, triangular, d sigmoid, p sigmoid, pi-shaped, two Gaussian, and generalized bell are 1.71, 1.64, 2.38, 1.65, 1.52, 1.61, 1.45, and 1.62°C, respectively. However, the two-Gaussian function is deployed to predict the LST. In addition, two and three MFs are assessed to limit CPU and memory usage. The RMSE for two and three MFs are 1.45 and 1.42°C, respectively. In this work, three MFs are selected to predict the LST. Finally, the ANFIS model based on three MFs and two Gaussian functions with 50 epochs was applied in this study. The designed model is shown in Figure 10(a). The performance of the ANFIS model is evaluated using equations (13) to (15). R2, RMSE, and MAE for the designed model are 0.99, 0.46°C, and 0.26°C, respectively. Figure 10(b) shows the measurements and prediction values for the training data of LST. From Figure 10(b), it is seen that there is no loses of information during the LST measurements. Also, the correlation between the measurements and the predicted LST is high, and minimum RMSE and MAE are observed. From these results, it can be concluded that the ANFIS model is better than MARS and WNN for predicting the LST from training data and can therefore be used to predict the Beijing LST.

(4) DENFIS Model. In the training stage, the major difference between ANFIS and DENFIS is the application of EFuNN, as mentioned previously. Moreover, the maximum distance between the cluster center and data point must be less than a user-defined threshold value [20]. In this study, the distance threshold is 0.1. Therefore, the EFuNN is applied to estimate the next epochs. Figure 11(a) illustrates the mean square error (MSE) for the training epochs of EFuNN. From this figure, it can be observed that 16 epochs are required to obtain the best performance for the DENFIS model. Otherwise, the same parameters that were used for the MFs of the ANFIS model are used in DENFIS model, that is, three MFS and a Gaussian function. The performance of the designed DENFIS design model is shown in Figure 11(b). The statistical evaluation of the model error was conducted, and the values of R2, RMSE, and MAE were found to be 0.91, 1.42°C, and 0.90°C, respectively. Furthermore, Akaike’s information criterion (AIC) [49] of the models was calculated and found to be −11.72 and 4.23 for ANFIS and WNN models, respectively. The results for the training stage show that the ANFIS model is the optimal model for predicting the LST values, while the WNN is the worst model.

4.3.2. Testing and Comparison Stage

The performance of the four models was also evaluated in the testing stage. Figure 12 and Table 8 show a scatter plot for the observations and prediction values of LST and the statistical analysis of the model errors. The linear fitting is also shown in the figure. From Figure 12 and Table 8, it is can be seen that the correlation between the observation and prediction is high for the ANFIS and DENFIS models. Moreover, the worst model during the testing stage is the WNN model. The minimum model error is observed with the ANFIS model, and its RMSE and MAE are 0.36 and 0.16°C, respectively. The slope of linear fitting is found to be 0.98, 0.93, 0.87, and 0.70 for the ANFIS, DENFIS, MARS, and WNN models, respectively. It means that the ANFIS model optimally detects the LST in the testing stage.

Based on the performances in the training and testing stages, it is safe to argue that the ANFIS model is the optimal model to predict the LST for the Beijing area. The ANFIS model is applied to detect the LST for the years 2025 and 2035, and the results are presented in Figure 13(a). In addition, the linear fitting is calculated and presented for the mean values of the observations and prediction years, as demonstrated in Figure 13(b). The prediction values show that the LST will increase rapidly, and the slope of change is 0.33°C/year. It is also observed that R2 for the linear fitting is 0.94, and the prediction trend is similar to the LST measurements for the years 2010 and 2015. In addition, it is seen that the prediction values for the LST of water, forest, and farm areas do not show any significant changes compared to the urban areas. It means that the changes in climate and urban area will affect the LST changes of the Beijing area in the future.

5. Conclusions

Although the land surface temperature (LST) is widely evaluated in global temperature variation as a result of climate change, the application of integrated soft computing models is still limited to use for detecting LST. This study aimed to design an integrated soft computing model to detect the LST of Beijing area. Four models were developed and compared, namely, MARS, WNN, ANFIS, and DENFIS. In addition, the MARS method was used to classify the input variables. Landsat images and different software were utilized to mine the LST changes from April 1995 to 2015. In addition, a statistical analysis was conducted to assess the performance of the developed models. From the results, the following can be concluded.

First, the comparison of temperature values for the studied periods through image’s analyses shows that there has been a significant change in temperature in Beijing city; the temperature is increased by 0.28°C/year. In addition, the vegetation cover decreased due to the population explosion and economic activities. In addition, the classification of NDVI, NDWI, NDBI, UI, and geographic changes of Beijing and previous temperature records show that the surface and previous LST changes have a significant impact on the LST prediction of the study area.

Second, the evaluation of the prediction models shows that the integrated fuzzy models, ANFIS and DENFIS, are the optimum models for detecting the LST changes in the Beijing area. In addition, it is shown that the MARS model exhibits good performance with limited input parameters. The WNN model showed the weakest performance in predicting the LST changes. The performance of the ANFIS model for the training and testing stages is satisfactory because it has low minimum errors; the RMSE for the training and testing stages were 0.46 and 0.36°C, respectively.

Finally, the predicted LST values show that the changes in climate and urban area will affect the LST changes of the Beijing area in the future. Moreover, this approach could be used to evaluate and predict the LST of other areas from satellite images.

Data Availability

Landsat data were download from EarthExplorer: https://earthexplorer.usgs.gov/. The ancillary data included land surface parameters derived from the Shuttle Radar Topographic Mission (SRTM) 90 m grid spacing DEM data.

Conflicts of Interest

The authors declare no conflicts of interest.

Authors’ Contributions

All the authors have contributed equally to this work.

Acknowledgments

This work was supported by the National Key Research and Development Program of China under Grant no. 2016YFC0803105 and the National Natural Science Foundation of China under Grant no. 41771451.