Abstract

Despite the rapid growth of machine learning (ML) and its far-reaching applications in fields such as healthcare, finance, and urban heat management, some challenges in climate science remain unresolved. Reliable subseasonal forecasts of summer temperatures would be of great benefit to society. Although numerical weather prediction (NWP) models capture relevant sources of predictability, such as temperature, land, and sea surface conditions, their subseasonal potential is not yet fully exploited. One such challenge is accurate subseasonal temperature forecasting using cutting-edge ML technology. This study aims to assess and predict changes in subseasonal temperature during the summer season (March to June) in Senegal on 2-week time scales. Six ML techniques are used: linear regression (LR), decision tree (DT), support vector machine (SVM), artificial neural network (ANN), long short-term memory (LSTM), and gated recurrent unit (GRU). The experiments follow a multivariate approach incorporating variables from the ERA-5 dataset from 1981 to 2022. The performances of all methods are compared to assess their overall effectiveness in forecasting air temperature (t2m) values over 2 weeks. Our analysis demonstrates that the GRU model outperforms the other ML models, achieving a Nash–Sutcliffe efficiency (NSE) score of 74.68% and a mean absolute percentage error (MAPE) of 2.51%. The GRU model effectively captures long-term dependencies and exhibits superior performance in temperature forecasting. Furthermore, a comparison between the observed and predicted values confirms that the GRU model closely tracks actual temperature trends.
Overall, this study contributes an impactful deep learning model to the field of subseasonal temperature forecasting in West Africa (Senegal), which offers local authorities the capability to anticipate climatic events and enact preventive measures accordingly.

1. Introduction

Monitoring and forecasting subseasonal temperature play a crucial role in investigating future climate patterns. The impacts of climate change and global warming have been identified as the greatest global health threat of the twenty-first century: they expose billions of people to risk [1, 2], increase the risk of water-related disasters (e.g., floods and severe droughts) [3, 4], affect water quality in many regions of the world [5], and may also affect the urban climate, environment, and sustainable development [6–8].

Consequently, it requires a reassessment of how we address the protection of vulnerable populations [9, 10]. The World Health Organization (WHO) acknowledges the negative health impacts associated with climate change [11]. Moreover, regions with weaker healthcare systems face greater challenges in adapting, preparing, and responding to the escalating health risks arising from a changing climate [12, 13].

One of the manifestations of climate change is the increase in average temperatures and the occurrence of extreme temperature events, commonly referred to as heat waves [14]. This rise in temperature has been observed around the world and is expected to become more intense and frequent due to human activities [15, 16]. However, in less-developed countries like Senegal, there is a lack of comprehensive studies on extreme temperatures and their specific health impacts [17]. Senegal, a country in West Africa, is facing the effects of climate change, which have a significant impact on the health of its population [18, 19]. Sweltering temperatures, which are becoming more frequent and intense, represent one of the most evident impacts of climate change [20]. They can cause health problems such as heat strokes, dehydration, cardiovascular and respiratory diseases, and even death, especially among the most vulnerable individuals, such as the elderly and children [21]. According to the literature, an intensification and exacerbation of prolonged periods of extreme temperatures could be expected in the coming years, making the study and understanding of these phenomena crucial for preventing harmful effects on human health [22].

Accurate forecasting of summer temperatures at subseasonal lead times has become increasingly important for various applications, including agriculture, energy demand, and, most notably, public health [23]. Subseasonal prediction can provide valuable insights for planning and decision-making processes, especially for protecting vulnerable populations from heat-related risks [24]. Perfectly accurate forecasts of such temperatures might not be possible, but the prediction error and computational cost can be minimized [25].

Machine learning (ML) methods have proven to be among the most promising tools for capturing the complex, nonlinear patterns of natural phenomena such as temperature, precipitation, drought, flood, and soil temperature, and for predicting them accurately [24, 26–28]. ML-based neural networks such as WNN have also proven their ability to model photovoltaic power plants [29]. Subseasonal temperature forecasts allow for estimating temperatures for the upcoming weeks, offering the opportunity to proactively plan prevention and adaptation measures.

Linear regression (LR) is one of the oldest and most common ML methods used for regression tasks in different areas. LR is applicable to single- or multiple-variable problems [30–32]. Decision tree (DT) regression is another common ML method used in the literature to solve regression problems [33, 34]. Support vector machine (SVM) regression is also a well-known and typical ML method for determining the link between features and targets [32, 35–37]. The artificial neural network (ANN), one of the most popular ML methods, has been used to solve many ML problems in different areas [38–41]. It is capable of identifying nonlinear patterns in the functional connection between features and targets. Deep neural network methods named long short-term memory (LSTM) and gated recurrent unit (GRU), also used in this study, have proven effective in time series prediction by capturing patterns from sequential data and retaining them in internal state variables [42–44]. The ability of these methods to retain crucial information over time enables them to effectively classify, process, and predict complex dynamic sequences.

The aim of this study is to demonstrate the capability and effectiveness of ML methods for modeling subseasonal summer temperatures in the Sahel (Senegal). Although many studies have been conducted using traditional complex approaches and single machine learning methods for temperature prediction, few have discussed the impact of changes in summer air temperatures on a subseasonal scale. Furthermore, these studies discussed changes in summer temperature using a univariate approach, the numerical weather prediction (NWP) approach, simple ML methods (e.g., RF and LR), or artificial neural networks (e.g., ANN) alone to predict air temperature. Among these studies, only a few researchers have combined multiple ML methods, compared their performance, or used deep learning methods with multivariate approaches.

Forecasting changes in summer temperature on a subseasonal scale using different (advanced) ML models is therefore a new and crucial task. This study builds on previous work to investigate this topic further. It primarily improves the detection and forecasting of subseasonal temperatures during the summer season from March to June (MAMJ) in the Senegal region by comparing the performance of six machine learning and deep learning models, LR, DT, SVM, ANN, LSTM, and GRU, based on multivariate approaches.

The remainder of this paper is organized as follows: Section 2 reviews related work. Section 3 describes the study area, the data, the methods, and the evaluation metrics used. Section 4 presents the experiments and results. Section 5 discusses the results, and Section 6 draws conclusions and suggestions for future work.

2. Related Work

Machine learning, especially deep neural networks, has emerged as a powerful tool for subseasonal forecasting, as it can effectively capture complex and nonlinear relationships between climatic variables [45]. Recent studies such as Guigma et al. [46] and Domeisen et al. [47] have examined the ability to forecast heat waves or extreme temperatures in the Sahel at subseasonal lead times and the importance of atmospheric tropical variability modes using reanalysis data from the European Center for Medium-Range Weather Forecasts (ECMWF). They found that temperatures in the Sahel can be predicted up to 3–4 weeks in advance with good accuracy and that the predictive capacity can be improved by incorporating atmospheric tropical variability modes such as the Madden–Julian Oscillation (MJO) and the African Easterly Jet (AEJ) [46, 48, 49].

Van Straaten et al. [50] and Benet et al. [51] used machine learning-based explainable forecasting (random forest) to discover subseasonal drivers of high summer temperatures in western and central Europe. They found that atmospheric circulation patterns, soil moisture, and sea surface temperature anomalies are important predictors of high summer temperatures [52]. Zhang et al. [44] demonstrated the effectiveness of artificial neural networks (ANN) in predicting land use/land cover (LULC) and land surface temperature (LST).

Benet et al. [51] investigated the subseasonal forecasting of summer heat waves in Central Europe using linear and random forest (RF) machine learning models. They found that the RF model outperformed the linear model and that anomalies in sea surface temperature were important predictors of central European summer heat waves. Sharaff and Roy [53] and Anjali et al. [54] analyzed the performance of multilinear regression (MLR), support vector machine (SVM), ANN, and regression tree methods to predict daily temperature values using data collected from Mumbai Chhatrapati Shivaji Airport (2001–2016) and Central Kerala (2007–2015). They compared the results based on mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and correlation coefficient metrics. ANN and MLR revealed the best performances on the Mumbai (MSE = 3.46) and Central Kerala datasets, respectively.

The linear regression (LR), random forest (RF), gradient boosting machine (GBM), decision tree (DT), ANN, and SVM regression methods have also shown their capability in estimating daily air temperature [55]. Among these ML models, GBM showed the best performance (coefficient of determination $R^2 = 0.68$, MAE = 1.60, and RMSE = 2.03). Zhang et al. [56] developed a seasonal forecast of the frequency of heat waves or extreme temperatures in the Eurasian summer using convolutional neural networks (CNN). They showed that the deep learning model can predict the frequency of heat waves with high accuracy up to 3 months in advance and that the North Atlantic Oscillation and the Western Pacific Subtropical High are important predictors of the frequency of heat waves in the Eurasian summer [57]. Lu et al. [58] developed a data-driven global subseasonal forecast model (GSFM) for intraseasonal oscillation components using deep learning. They found that GSFM outperformed the traditional statistical model and that the MJO is a key driver of the intraseasonal oscillation [48].

Weyn et al. [59] investigated subseasonal forecasting with a large ensemble of deep learning weather prediction models. The results of their study showed that the ensemble model outperformed the individual models and that sea surface temperature anomalies were important predictors of subseasonal weather. In summary, the literature suggests that deep learning models can effectively predict subseasonal summer temperature and that the inclusion of additional predictors such as atmospheric circulation patterns, soil moisture, sea surface temperature anomalies, and intraseasonal oscillations can improve prediction accuracy.

More research is needed to investigate the optimal combination of predictors and deep learning (DL) models for subseasonal summer temperature forecasting in specific regions, such as West Africa (Senegal). Such a study has not yet been conducted in this particular area using advanced machine learning techniques. Therefore, the main contribution of this work is to compare six machine learning techniques, LR, DT, SVR, ANN, LSTM [60], and GRU [61, 62], based on the multivariate approach to forecast summer temperature in the region of West Africa (Senegal).

In the following sections, further explanation about the data and methods, experiments, and results is provided, followed by a discussion and conclusion with future work.

3. Study Area, Data, and Methods

3.1. Study Area

Senegal, located on the west coast of Africa [63], serves as a compelling study area for our research on the prediction of subseasonal summer temperatures with machine learning (ML) techniques. With a diverse landscape ranging from the arid regions of the Sahel to the verdant Casamance in the south, Senegal has unique climatic conditions that make accurate temperature prediction difficult, so it is important to be able to anticipate these climate events and take appropriate measures. The study seeks to leverage advanced ML techniques to enhance our understanding of subseasonal variations in summer temperatures, exploring patterns, trends, and influencing factors. The implications of successful prediction models extend beyond meteorological understanding, encompassing applications in agriculture, water resource management, and climate change adaptation. As Senegal grapples with the impacts of a changing climate, this research aims to contribute valuable information on the dynamics of subseasonal temperature, ultimately aiding in the development of more resilient and adaptive strategies for the region. Additionally, health authorities constantly require information on heat conditions, preferably with a lead time of at least 7 days to one month, in order to implement necessary measures to protect vulnerable populations [64]. Figure 1 highlights the location of the study area.

3.2. Data

The data used for this study are from the ERA-5 dataset provided by the European Center for Medium-Range Weather Forecasts (ECMWF, [65]). The parameters used, which are given in Table 1, were collected from the ERA-5 database from 1981 to 2022. The data with the highest correlation are selected using the Pearson correlation coefficient [66] and entered into the prediction model. Figure 2(a) shows the development of temperatures from 1981 to 2022 in the form of stripes, which show a pronounced rise in temperature over West Africa (Senegal). Figure 2(b), on the other hand, shows the distribution of temperatures according to monthly averages. This distribution clearly shows that April, May, and June are the hottest months of the year, which is why the months of March to June were chosen for the predictions. Figure 3 shows the temperature anomaly, the trends, and the climatological reference in Senegal from 1981 to 2022, which is interpreted as follows:
(i) The blue curve represents the temperature anomaly compared to the climatology (1991–2020). It shows how temperatures vary relative to the climatic average during this period. Positive values indicate temperatures above the average, while negative values indicate temperatures below the average.
(ii) The horizontal dashed line at zero represents the climatological reference. Points above this line indicate temperatures higher than the climatic average, while points below indicate temperatures lower than the average.

In summary, the figure illustrates how temperatures in Senegal have evolved over the years, highlighting variations compared to the climatic average and indicating the general trend using linear regression.

3.3. Machine Learning Methods

Machine learning (ML) can improve subseasonal forecasts by incorporating large amounts of historical data and learning patterns that can be used to make more accurate predictions. ML techniques [67] can capture predictability on subseasonal time scales and can outperform climatological baselines [51]. Six machine learning models, namely, linear regression (LR), decision tree (DT), support vector regression (SVR), artificial neural network (ANN), long short-term memory (LSTM), and gated recurrent unit (GRU), are used for prediction with a dataset of 2,749,026 records, split chronologically into training (1981–2010), validation (2010–2015), and test (2015–2022) sets. For these datasets, we use the shift function to transform the time series data into a supervised learning problem. Figure 4 shows an overview and a simple representation of the ML models (LR, DT, SVR, and ANN).

3.3.1. Linear Regression (LR)

Linear regression (LR) is the most common predictive machine learning (ML) model for finding a linear relationship between one or more predictors and a target. The LR, described by equation (1), can be a single (simple) regression or a multilinear regression [30, 31]. The single regression minimizes a single objective function based on a single variable:

$$y = \beta_0 + \beta_1 x + \varepsilon \quad (1)$$

where $y$ is the dependent variable, which can be either a continuous or categorical value, and $x$ is the independent variable.

In this case, multilinear regression (MLR), also known as multiple regression, is used. MLR is the most common form of linear regression analysis. In contrast to simple linear regression, MLR establishes the relationship between the response variable (target) and several predictors. With multiple-variable regression, the method does not use a single objective function alone but combines the individual objective functions of each variable using a weighting scheme [68]. The parameters L1 and L2 are used to measure the elasticity in the weighting scheme. When configured, the approach minimizes the difference between the fitted line and the best-fit criterion, yielding the best-fit line for prediction. Multiple regression is represented by the following equation:

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik} + \varepsilon_i, \quad i = 1, \dots, n$$

where $n$ is the number of observations, $k$ is the number of inputs (explanatory variables), $\beta_0$ is the constant or intercept coefficient, $\beta_1, \dots, \beta_k$ are the slope coefficients of the explanatory variables, $x_{i1}, \dots, x_{ik}$ represent the $i$th observation of each variable, and $\varepsilon_i$ is the remaining unexplained noise in the data, i.e., a random error term representing the remaining effects on $y$ of variables not explicitly included in the model. The coefficients $\beta_0, \dots, \beta_k$ are unknown and are therefore estimated from the data by employing the least squares criterion, i.e., by minimizing the sum of squared error terms (MSS). The fitted values $\hat{y}_i$ must then satisfy the following equation:

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \dots + \hat{\beta}_k x_{ik}$$

Different hyperparameter values are tested during the training of the LR model, and the best-performing values are selected for the final model.
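To illustrate the weighting scheme with L1 and L2 elasticity described above, the following sketch fits an elastic-net linear regression (which blends the L1 and L2 penalties via `l1_ratio`) on synthetic data; the predictors, coefficients, and hyperparameter values are invented for demonstration and are not taken from the study's dataset.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                 # three synthetic predictors
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.5       # linear target with known coefficients

# ElasticNet blends the L1 and L2 penalties; l1_ratio=0.5 weighs them equally
model = ElasticNet(alpha=0.01, l1_ratio=0.5).fit(X, y)
pred = model.predict(X)
```

With a small penalty (`alpha=0.01`), the fitted coefficients stay close to the generating values of 2.0 and -1.5.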

3.3.2. Decision Tree Regression (DTR)

The decision tree regression (DTR) model [69] is a machine learning algorithm used for predicting continuous numerical outcomes. DTR works by recursively partitioning the dataset into subsets based on the values of input features, creating a tree-like structure [70]. At each internal node of the tree, a decision is made based on a specific feature, and the dataset is split into two or more branches [71]. This process is repeated until a stopping criterion, such as maximum depth or a minimum number of samples per leaf, is reached. The leaf nodes of the tree represent the final predicted values. The prediction for a new instance is then the average of the target values in the leaf node to which it belongs.
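The recursive partitioning and the stopping criteria mentioned above (maximum depth, minimum samples per leaf) can be sketched with scikit-learn's `DecisionTreeRegressor`; the sine-shaped data and the hyperparameter values are purely illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(300, 1))
y = np.sin(X[:, 0])

# max_depth and min_samples_leaf are the stopping criteria described above;
# each leaf predicts the mean target of the training samples it contains
tree = DecisionTreeRegressor(max_depth=5, min_samples_leaf=5).fit(X, y)
pred = tree.predict(X)
```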

3.3.3. Support Vector Machine

Support vector machine (SVM) regression is a machine learning technique that finds the hyperplane achieving the optimal trade-off between fitting the data closely and maintaining a margin. The vector lengths and the variance between the features and the plane are minimized. Additionally, SVM regression can incorporate kernel functions to map the input features into a higher-dimensional space, allowing it to capture nonlinear relationships. The kernels include Euclidean, Gaussian, Exponential, and Dirichlet kernels [72]. The following equation describes the general form of SVR. The SVR method is trained with four different kernels: linear, radial basis function (rbf), polynomial (poly), and sigmoid; the rbf kernel is then selected for the final model.

$$f(x) = \sum_{i=1}^{n} w_i x_i + b \quad (5)$$

where $b$ is the base or bias, $w_i$ are the weights, $n$ is the number of inputs, and $f(x)$ is the target.

Equation (5) is also called the equation of a hyperplane. The SVR model is trained by solving the optimization problem stated in equation (6) subject to the constraints described in equations (7)–(9):

$$\min_{w, b, \xi, \xi^*} \ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \left(\xi_i + \xi_i^*\right) \quad (6)$$

subject to

$$y_i - w^\top \phi(x_i) - b \le \epsilon + \xi_i \quad (7)$$
$$w^\top \phi(x_i) + b - y_i \le \epsilon + \xi_i^* \quad (8)$$
$$\xi_i, \xi_i^* \ge 0, \quad i = 1, \dots, n \quad (9)$$

where $C$ controls the smoothness of the model, $\xi_i$ and $\xi_i^*$ are slack variables, $\phi$ is the function projecting the input space to the feature space, $b$ is the bias parameter, and $y_i$ is the target value to be estimated.
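As a minimal sketch of SVR with the rbf kernel retained above, the snippet below fits scikit-learn's `SVR` on synthetic nonlinear data; the values of `C` and `epsilon` are illustrative placeholders, not the study's tuned hyperparameters.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0])                       # nonlinear target

# The rbf kernel maps inputs to a higher-dimensional feature space,
# letting the linear hyperplane of equation (5) capture a nonlinear relation
svr = SVR(kernel="rbf", C=10.0, epsilon=0.01).fit(X, y)
```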

3.3.4. Artificial Neural Network (ANN)

The artificial neural network (ANN) is a machine learning technique commonly used in temperature prediction [38, 44, 54]. The error backpropagation learning algorithm, commonly known as the least-mean-square (LMS) algorithm [30, 73], is used by the ANN to capture the nonlinear pattern between the input and target series. The widely used multilayer feed-forward ANN, called a multilayer perceptron (MLP), which consists of one input layer, one or more hidden layers of computational nodes, and one output layer, is used in this study due to its popularity [74–76]. The ANN trained with one hidden layer (equation (10)), the ReLU ($\max(0, x)$) activation function, and the widely used ADAM optimizer [77] is selected as the final model configuration.

$$\hat{y}_k(t) = g\left(\sum_{j=1}^{m} w_{jk}\, f\left(\sum_{i=1}^{N} w_{ij}\, x_i(t) + b_j\right) + b_k\right) \quad (10)$$

where $\hat{y}_k(t)$ is the predicted $k$th output (t2m) at time step $t$; $w_{ij}$ is the weight that connects the $i$th neuron in the input layer and the $j$th neuron in the hidden layer; $g$ is the activation function for the output neuron; $f$ is the activation function of the hidden neuron; $w_{jk}$ is the weight that connects the $j$th neuron in the hidden layer and the $k$th neuron in the output layer; $b_j$ is the bias for the $j$th hidden neuron; $b_k$ is the bias for the $k$th output neuron; $m$ is the number of hidden neurons; $N$ is the number of inputs; and $x_i(t)$ is the $i$th input variable at the time step used.
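The MLP configuration described above (one hidden layer, ReLU activation, Adam optimizer) can be sketched with scikit-learn's `MLPRegressor`; the hidden-layer size of 50 and the synthetic data are placeholders, since the study's own layer sizes are not listed here.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(4)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0])

# One hidden layer with ReLU activation and the Adam optimizer,
# mirroring the final ANN configuration described in the text
mlp = MLPRegressor(hidden_layer_sizes=(50,), activation="relu",
                   solver="adam", max_iter=2000, random_state=0).fit(X, y)
```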

3.3.5. Recurrent Neural Networks (RNNs)

RNNs [78, 79], one of the deep learning (DL) method families, are derived from deep feed-forward neural networks (DFNN) but add recurrent connections that carry information across time steps. Experience has shown [80] that RNNs can encounter significant challenges when training on sequences with long-term dependencies, where the temporal information between input and output sequences spans a long period of time. This can lead to difficulties with backpropagated errors as the time span of the long-term dependencies increases. As a result of the vanishing gradient problem in RNN training, updates to the weights become insignificant when the error is backpropagated [81].

Unlike the canonical RNN, the LSTM (Section 3.3.6) and GRU (Section 3.3.7) models do not have these difficulties because they have extended memory components, such as memory cells and gates, that store long sequences of information over long periods of time. For time series in general, a simple encoder-decoder for multistep forecasting with RNNs such as LSTM and GRU is used. The setting consists of a multivariate time series $\{(x_t, y_t)\}_{t=1,\dots,T}$ with a feature vector $x_t$ and a target variable $y_t$. At time $t$, the values at the next time points are predicted from the previous $T$ values in the time series. Let us denote $D = \{(x_t, y_t)\}$ as the training dataset. The input feature matrix contains all past and current features $x_{t-T+1}, \dots, x_t$, as well as all past targets $y_{t-T+1}, \dots, y_{t-1}$. Thus, the function to approximate is given by the following equation:

$$\hat{y}_t, \dots, \hat{y}_{t+h-1} = F\left(x_{t-T+1}, \dots, x_t;\ y_{t-T+1}, \dots, y_{t-1}\right) \quad (11)$$

where $y_t$ is the t2m at point $t$ and $x_t$ contains the additional regressors at point $t$. For this case study, roughly the past three weeks of values are used and the forecast horizon $h$ is two weeks. The architecture of our deep neural network models (LSTM, Section 3.3.6, and GRU, Section 3.3.7) derived from the general structure of the RNN is given in Figure 5.

3.3.6. Long Short-Term Memory (LSTM)

The LSTM unit [60, 82, 83] has three different gates, namely, the input gate, the forget gate, and the output gate. The forget gate determines what relevant information is required from the previous steps. The input gate determines what relevant information can be added from the current step, and the output gate computes the next hidden state. The internal structure of an LSTM unit cell according to [84] is shown in Figure 6. The LSTM cell model is defined by the following mathematical functions:

$$f_t = \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right)$$
$$i_t = \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right)$$
$$o_t = \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right)$$
$$\tilde{c}_t = \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$
$$h_t = o_t \odot \tanh(c_t)$$

where $t$ is the time step; $x_t$ is the input; $h_t$ is the hidden state vector (output vector of the LSTM unit); $h_{t-1}$ is the previous hidden state; $b$ is the bias vector at $t$; $f_t$ is the forget gate at $t$; $i_t$ is the input gate at $t$; $o_t$ is the output gate activation vector at $t$; $\tilde{c}_t$ is the cell activation vector or candidate value generated by $\tanh$; $c_t$ is the cell state information; $W$ and $U$ contain the weights of the input and recurrent connections, where the subscript refers to the input gate $i$, the output gate $o$, the forget gate $f$, or the memory cell unit $c$. In addition, $W \in \mathbb{R}^{h \times d}$, $U \in \mathbb{R}^{h \times h}$, and $b \in \mathbb{R}^{h}$ are weighting matrices and bias vector parameters learned during the training phase, where $d$ and $h$ refer to the number of input features and the number of hidden units, respectively. $\odot$ denotes element-wise multiplication. $\sigma$ and $\tanh$ are the logistic sigmoid and hyperbolic tangent functions, with the following mathematical formulas:

$$\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$
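The gate computations described above can be written as a minimal NumPy sketch of a single LSTM step (not the Keras implementation used in the study); the dictionary-based weight layout is an illustrative assumption.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step; W, U, b are dicts keyed by gate: 'f', 'i', 'o', 'c'."""
    f = sigmoid(W["f"] @ x + U["f"] @ h_prev + b["f"])        # forget gate
    i = sigmoid(W["i"] @ x + U["i"] @ h_prev + b["i"])        # input gate
    o = sigmoid(W["o"] @ x + U["o"] @ h_prev + b["o"])        # output gate
    c_tilde = np.tanh(W["c"] @ x + U["c"] @ h_prev + b["c"])  # candidate cell
    c = f * c_prev + i * c_tilde                              # new cell state
    h = o * np.tanh(c)                                        # new hidden state
    return h, c
```

With all weights and biases at zero, every gate evaluates to 0.5, so the cell state is simply halved at each step.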

3.3.7. Gated Recurrent Unit (GRU)

The gated recurrent unit (GRU), introduced by [61, 62], is a special type of LSTM-based recurrent neural network that has fewer parameters than the LSTM [85]. Its internal unit is similar to that of the LSTM [62, 86], except that it does not have a separate cell state $c_t$; it has only one hidden state $h_t$ and is faster to train due to its simpler architecture. Therefore, other new systems based on GRU models have been developed, such as the multi-GRU prediction system of the authors in [87]. The GRU [82, 83] helps avoid the vanishing gradient problem that usually occurs in standard RNNs and requires less computation to update its states. The update gate determines how much of the previous state's information is retained in the current state, while the reset gate determines how to combine the current input with the previous information. The structure of the GRU unit cell [88] is shown in Figure 7. The mathematical functions used to characterize the GRU unit are as follows:

$$z_t = \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right)$$
$$r_t = \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right)$$
$$\tilde{h}_t = \tanh\left(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h\right)$$
$$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$$

where $\sigma$ is the sigmoid activation function; $z_t$ is the update gate; $r_t$ is the reset gate; $\tilde{h}_t$ denotes the candidate hidden state; $W_z$, $W_r$, and $W_h$ are the weighting matrices for the corresponding connections of the input vector; $U_z$, $U_r$, and $U_h$ are the weighting matrices of the previous time step; and $b_z$, $b_r$, and $b_h$ are biases. The following paragraph describes the approach to model development.
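Analogously to the LSTM, a single GRU step with its update and reset gates can be sketched in NumPy (again a minimal illustration, not the Keras layer used in the study; the weight layout is an assumption).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h_prev, W, U, b):
    """One GRU step; W, U, b are dicts keyed by 'z' (update), 'r' (reset), 'h' (candidate)."""
    z = sigmoid(W["z"] @ x + U["z"] @ h_prev + b["z"])              # update gate
    r = sigmoid(W["r"] @ x + U["r"] @ h_prev + b["r"])              # reset gate
    h_tilde = np.tanh(W["h"] @ x + U["h"] @ (r * h_prev) + b["h"])  # candidate state
    return (1.0 - z) * h_prev + z * h_tilde                         # blended hidden state
```

Note that there is no separate cell state: the single hidden state is a convex blend of its previous value and the candidate, controlled by the update gate.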

3.3.8. Multivariate Approach

The machine learning framework with linear regression (LR), decision tree (DT) regression, support vector machine (SVM) regression, artificial neural networks (ANN), and recurrent neural networks based on long short-term memory (LSTM) and gated recurrent units (GRU) offers multistep predictions generated for two time steps (horizon = 2 weeks). According to [89], forecasting is the prediction of time series from past data into future data on a longer time scale. Thus, in our multivariate approach, data from the entire West African (Senegal) study region are used to examine the relationships between the different features. Pearson correlation feature selection [66] is applied to characterize the degree of correlation of each feature with the target feature, air temperature (t2m). Based on the result of the Pearson correlation matrix, six features are selected. Then, using the past information, the next two weeks are predicted.
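The shift-based transformation of the multivariate series into a supervised learning problem, mentioned in Section 3.3, can be sketched with pandas as below; the function name, column-naming scheme, and the choice of three past steps and two future steps are illustrative assumptions echoing the study's setup.

```python
import numpy as np
import pandas as pd

def to_supervised(df, target="t2m", n_in=3, n_out=2):
    """Turn a multivariate series into lagged input features and future targets."""
    cols, names = [], []
    for lag in range(n_in, 0, -1):              # past observations become inputs
        cols.append(df.shift(lag))
        names += [f"{c}(t-{lag})" for c in df.columns]
    for step in range(n_out):                   # future target values to predict
        cols.append(df[[target]].shift(-step))
        names.append(f"{target}(t+{step})")
    out = pd.concat(cols, axis=1)
    out.columns = names
    return out.dropna()                         # drop rows with incomplete windows

frame = to_supervised(
    pd.DataFrame({"t2m": np.arange(10.0), "td": np.arange(10.0) * 2}),
    target="t2m", n_in=3, n_out=2)
```

Each remaining row pairs three past steps of every feature with the next two target values, which is the shape required for multistep supervised training.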

3.3.9. Model Training

The LR, DT regression, SVM regression (SVR), and ANN are implemented using the Sklearn framework. These models are trained and tested several times using the parameters and value ranges provided in Table 2. The LSTM and GRU frameworks are written in Python and implemented using the Keras API of TensorFlow [90, 91]. The dataset is divided into a training set (1981–2010) to train the model, a validation set (2010–2015) to improve model performance by fine-tuning the model after each epoch, and a test set (2015–2022) to determine the final result after training (Figure 8). The architecture of the deep learning models consists of one input layer, two bidirectional LSTM or GRU layers with 50 and 100 neurons in the hidden layers, respectively, and an output layer. All deep learning model configurations are trained several times with different activation functions (e.g., linear, tanh), numbers of epochs (e.g., 20, 50, 100), and neurons (e.g., 50, 100, 150). Additionally, the adaptive moment optimizer (Adam) is used for the following reasons [77]:
(i) Faster convergence
(ii) Computational efficiency
(iii) Low memory requirements
(iv) Invariance to diagonal rescaling of gradients
(v) Suitability for problems with large amounts of data or parameters

The linear activation function is used for the dense layers in the deep learning methods. To evaluate model performance, the mean squared error (MSE) is used as a loss function that quantifies the squared difference between the observed and predicted values. However, overfitting may occur during the testing phase because of the large starting weights of the network. To overcome this issue, the model is optimized using the early stopping approach, which automatically stops training at the point when performance on the validation set starts to degrade. Each experiment with the LSTM and GRU models starts with a unique set of initial weights and biases representing the model parameters, enabling the models to deal with nonstationarity in the data. Tables 2 and 3 show the hyperparameters used for each method and their value ranges.
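The early stopping rule described above can be illustrated with a small, self-contained sketch of its logic; the function name and patience value are hypothetical, and in Keras this corresponds to the `EarlyStopping` callback with `monitor="val_loss"` and a `patience` setting.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the epoch at which training would stop, or None if it never triggers.

    Training stops once the validation loss has not improved for `patience`
    consecutive epochs, mirroring the early stopping approach described above.
    """
    best_loss, best_epoch = float("inf"), -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch     # new best epoch found
        elif epoch - best_epoch >= patience:
            return epoch                            # patience exhausted: stop here
    return None                                     # loss kept improving
```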

3.4. Model Evaluation

The training set and the independent test set are evaluated to assess the performance of the applied machine learning models. The following evaluation metrics are most commonly used: (1) coefficient of determination ($R^2$), (2) mean absolute error (MAE), (3) mean squared error (MSE), (4) root mean square error (RMSE), (5) mean absolute percentage error (MAPE), (6) Pearson correlation, and (7) Nash–Sutcliffe efficiency (NSE). The $R^2$ measures the degree of correlation between the observed and predicted values; the MAE provides the absolute error information; the RMSE (square root of the MSE) measures the average magnitude of the errors between the predicted and actual values and how the residuals are distributed; the MAPE quantifies the absolute difference between the observed and predicted values divided by the observed values; and the NSE measures the magnitude of the residual variance with respect to the variance of the observed t2m values. The models with the highest $R^2$ and NSE and the lowest MAE, RMSE, and MAPE show the best performance. These statistical performances are defined by the following equations:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}$$
$$\mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$
$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}$$

where $n$ is the number of samples used in the test set, $y_i$ are the observed t2m values, $\hat{y}_i$ are the predicted t2m values, and $\bar{y}$ is the mean value of the observed t2m values.
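The two headline metrics of the study, NSE and MAPE, follow directly from their definitions and can be computed in a few lines of NumPy; the function names are illustrative.

```python
import numpy as np

def nse(obs, pred):
    """Nash–Sutcliffe efficiency: 1 minus residual variance over observed variance."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

def mape(obs, pred):
    """Mean absolute percentage error, expressed in percent."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    return 100.0 * np.mean(np.abs((obs - pred) / obs))
```

A perfect forecast gives NSE = 1 and MAPE = 0; a forecast that always predicts the observed mean gives NSE = 0.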

Table 4 outlines the criteria for the previously mentioned statistical methods used in the search for a better prediction model.

4. Experiments and Results

This section describes the experiments carried out to confirm the learning ability of the machine learning (ML) models, especially the neural network models. The results for each method are investigated in more detail.

4.1. Feature Correlations

Before the learning algorithm can be applied to the dataset, specific characteristics of the data must be confirmed. Because the current investigation is based on information from the samples, the correlations that exist among the variables are presented first. Therefore, the characteristic variables are visualized and analyzed on the basis of the Pearson correlation coefficient ($r$), which lies in the range between $-1$ and $1$ [66]. The information describing the relationships between features is then stored in the Pearson correlation matrix (Figure 9(a)), where $r > 0$ means positive correlation, $r = 0$ no correlation, and $r < 0$ negative correlation.

Figure 9(a) reveals that the features are correlated with each other and with the target. Some variables have a high correlation, whereas others have a low correlation. The figure shows a high positive correlation between the features rh1000, td, and z500 and the target. The t500 and t850 are also correlated with the target and remain within the accepted range. However, the other variables have a lower correlation with the target. The five most correlated variables (rh1000, td, t500, t850, and z500) are selected for multivariate model development in the entire study region.
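The feature-selection step described above can be sketched with pandas. The helper below is hypothetical (the study reports only that the five variables with the strongest Pearson correlation with t2m were retained):

```python
import pandas as pd

def top_correlated_features(df: pd.DataFrame, target: str, k: int = 5):
    """Rank predictor columns by |Pearson r| with the target and keep the top k."""
    r = df.corr(method="pearson")[target].drop(target)
    return r.abs().sort_values(ascending=False).head(k).index.tolist()
```

Applied to a frame holding the ERA-5 fields and t2m, this would return the rh1000, td, t500, t850, and z500 columns identified in Figure 9(a).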

4.2. Assessment of Machine Learning Models

In this part of the study, models are developed to predict subseasonal temperatures over West Africa (Senegal) for the period March to June (MAMJ) using ERA-5 data and applied machine learning (ML) methods to gain a better understanding of the predicted results. Six different ML models are used: linear regression (LR), decision tree (DT), support vector regression (SVR), artificial neural networks (ANN), long short-term memory (LSTM), and gated recurrent unit (GRU). The developed ML and deep neural network models are used to predict the air temperature (t2m) pattern in Senegal using the provided dataset. The dataset contains 2,749,026 records of daily t2m, dew point temperature, relative humidity, air temperature (at 500 and 850 hPa), and geopotential from 1981 to 2022. The training set is split into two parts: 80% for training and 20% for validation. The data from 01/06/2015 to 29/06/2022 are used for testing to decide on the best model. Approximately the past three weeks (25 days) are used to predict the next two weeks. Figure 9(b) shows the distribution of the data. From this histogram, it appears that the data follow a Gaussian distribution.
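The sliding-window construction implied above (the past 25 days of all predictors mapped to the next 14 days of t2m) can be sketched as follows; the array layout, the position of t2m in column 0, and the function name are our assumptions:

```python
import numpy as np

def make_windows(series: np.ndarray, n_shift: int = 25, horizon: int = 14):
    """Turn a (time, features) array into supervised samples: each input is the
    past n_shift days of all features; each target is the next `horizon` days
    of the target variable, assumed here to be column 0 (t2m)."""
    X, y = [], []
    for t in range(n_shift, len(series) - horizon + 1):
        X.append(series[t - n_shift:t])        # past 25 days, all features
        y.append(series[t:t + horizon, 0])     # next 14 days of t2m
    return np.asarray(X), np.asarray(y)
```

With this layout, each sample's input has shape (25, n_features) and its target has shape (14,), matching the two-week forecast horizon.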

The Python packages Sklearn and Keras were used to train the ML and neural network models, respectively. First, the data are split into training, validation, and test sets. Second, the training and validation data are fed into the models, and the training process starts. Different model configurations were tested during training using the validation error in order to obtain the best training configuration (best model). Finally, after obtaining the best training result, the test set is presented to the trained models, and the results are compared to the observed values.

The following statistical indicators were used to evaluate the relationship between the observed and predicted values: the coefficient of determination (R²), the root mean squared error (RMSE), the mean absolute percentage error (MAPE), the mean absolute error (MAE), Pearson's correlation (Pearson's r), and the Nash–Sutcliffe efficiency (NSE). The detailed results for each model are given in Table 5, which shows the implementation and effectiveness of the developed models in the summer air temperature (t2m) estimation for West Africa (Senegal) with n_shift equal to 25 and two time steps ahead, i.e., the next 2 weeks. The optimal hyperparameter values for the ML algorithms are obtained through hyperparameter tuning with GridSearchCV.
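Hyperparameter tuning with Scikit-learn's GridSearchCV, as mentioned above, can be sketched as follows. The grid values are illustrative (the study reports alpha = 7 and solver = "lsqr" as the ridge optima), and the synthetic data stand in for the ERA-5 features:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the five ERA-5 predictors and the t2m target.
X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)

# Cross-validated search over an illustrative ridge-regression grid.
grid = GridSearchCV(
    Ridge(),
    param_grid={"alpha": [1, 7, 10], "solver": ["auto", "lsqr"]},
    scoring="neg_mean_squared_error",
    cv=5,
)
grid.fit(X, y)
best = grid.best_estimator_  # refit on the full data with the best parameters
```

The same pattern applies to the DT, SVR, and ANN estimators; only the estimator object and the parameter grid change.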

The first ML approach used to predict the t2m values is LR with L2 (ridge) regularization. The optimal value for alpha is 7, and the solver used is lsqr. Figure 10(a) shows the correlation (regression line) between the observations and the predictions of the LR method. The plot shows a reasonably good fit, but the accuracy of the t2m prediction is lower (Table 5).

The second ML method applied is DT. The following optimal values are used for this study: max_features = "auto", max_leaf_nodes = 10, and splitter = "best". Figure 10(b) shows the regression line between the observed values and the test results of the DT method. Table 5 and Figure 10(b) indicate that the DT model provided worse accuracy than the LR model. As can be seen from the figure, DT has problems predicting the t2m pattern.

Another ML method used is SVR. The radial basis function (RBF) kernel, the kernel coefficient (gamma = "auto"), the optimization tolerance, and the regularization parameter C were the optimal hyperparameters selected for this study. Figure 10(c) indicates the correlation between the observed values and the predicted results of the SVR model on the test set.

The ANN method used in this study is a multilayer perceptron (MLP) with backpropagation, which is commonly used for temperature prediction [38, 54]. The ReLU activation function, the Adam optimizer, an alpha value of 0.1, and a hidden layer size of 50 were used to obtain the best test results. Table 5 and Figure 10(d) show the correlation between the observed values and the predicted values of the ANN model on the test set.

The deep learning models used in this study are LSTM and GRU. Different model configurations are tested: the number of epochs, the number of neurons, and different activation functions (Figure 5) were varied through these configurations to obtain the best result on the test set. All LSTM and GRU model configurations were trained with the Adam optimizer. The best test results were obtained for the configuration with 50 epochs, 50 neurons, the tanh activation function in the hidden layers, and the ReLU activation function in the dense layer. Table 5 and Figures 10(e) and 10(f) show the correlation between the observed and predicted t2m values of the LSTM and GRU models on the test data. The plots show a better fit, and the results are highly accurate. These two models performed approximately equally well (Table 5).
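A minimal Keras sketch of the best-performing configuration described above (50 units, tanh in the recurrent layer, ReLU in the dense output layer, Adam optimizer). The input and output dimensions are our assumptions, based on the 25-day window of five predictors and the 2-week (14-day) target:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_shift, n_features, horizon = 25, 5, 14  # assumed: past 25 days, 5 predictors, 14-day target

# Best-performing configuration reported in the text: 50 GRU units with tanh
# activation, a ReLU dense output layer, and the Adam optimizer.
model = keras.Sequential([
    layers.Input(shape=(n_shift, n_features)),
    layers.GRU(50, activation="tanh"),
    layers.Dense(horizon, activation="relu"),
])
model.compile(optimizer="adam", loss="mse")
```

Swapping `layers.GRU` for `layers.LSTM` yields the corresponding LSTM configuration; training would then run with `model.fit(..., epochs=50)` on the windowed samples.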

However, the GRU model (ReLU activation function, 50 epochs, and 50 neurons) shows the best predictions of the t2m values (MAPE = 2.51%, NSE = 74.68%, and Pearson's r = 0.8707) compared to the LSTM model (ReLU activation function, 50 epochs, and 50 neurons) with Pearson's r = 0.8626. Overall, the GRU model, with its low variance, is the best performing deep learning algorithm for n_shift = 25 days (about 3.5 weeks) in this study of summer temperature prediction (next 2 weeks) over West Africa, especially Senegal.

4.3. Visualization of Predictions: Comparison of Observed with Predicted Values

As can be seen from Table 5 and Figure 10, deep learning methods such as long short-term memory (LSTM) and gated recurrent units (GRU) are well trained and show acceptable test performance compared to the other ML methods. The ANN method performs relatively better than LR, DT, and SVR, but it still underperforms LSTM and GRU. Among all the methods, GRU (MAPE = 2.51%, NSE = 74.68%, and Pearson's r = 0.8707) and DT (Pearson's r = 0.6492) show the best and the worst performance, respectively. The GRU model is capable of predicting temperatures over 20 degrees Celsius, in contrast to the DT model.

Figure 11 compares the predicted t2m values with the observed values for the GRU model. The figure reveals how well the GRU model performs in estimating summer air temperatures. Although the performance of most LSTM models was quite good and accurate, the GRU model (with ReLU activation, 50 epochs, and 50 units) provides the best temperature predictions (MAPE = 2.51%, NSE = 74.68%): it is able to capture long-term dependencies and provides predicted t2m values that are more consistent with the observed values than those of the other models. Table 6 shows the ranking of the models based on the performance of the prediction steps. Overall, the GRU model ranks first among all models, followed by the LSTM model.

Thus, as mentioned in the previous section, the model with the GRU algorithm provides the best prediction of the t2m values compared to the observed values, using the past 25 days (about 3.5 weeks) to predict the next 1-2 weeks, with a low RMSE, a low MAPE, and a high NSE according to the performance indices shown in Table 5. Overall, the GRU model showed a better ability to predict t2m than the other models and can therefore be considered a useful method for modeling subseasonal summer temperatures and for keeping the population safe from the risks associated with natural phenomena caused by high temperatures.

The results of this study show that, compared to other modern machine learning methods, deep learning algorithms are among the best, most efficient, and most powerful tools that can be used for subseasonal summer temperature prediction on large datasets [51, 54]. Moreover, these analyses show that combining different variables, such as relative humidity, dew point temperature, air temperature, and geopotential data, improves the ability to predict t2m values with high accuracy.

5. Discussion

Based on state-of-the-art studies, it is indeed possible to validate the effectiveness of deep networks for predicting time series data. Thanks to the development of new models and advances in computing power, the field of forecasting with deep neural networks has grown rapidly. Machine learning (ML) approaches such as linear regression (LR), decision tree (DT), support vector machine (SVM), and artificial neural network (ANN) are commonly used for climate modeling, especially temperature modeling. Deep neural network models (LSTM and GRU), two of the best subseasonal forecasting approaches, are growing in popularity, although most of the top models have more traditional architectures.

The GRU network, a modified and improved version of LSTM, supports a higher resampling rate and can be trained on both smaller and larger datasets. Our experiments, carried out on more than 1.5 million training samples, have indicated that
(i) the GRU neural network, which is easier and faster to train, is the best model and provides good results;
(ii) a deep GRU neural network with encoder and decoder layers achieves high performance and is able to forecast subseasonal temperatures (next 1-2 weeks) with 2.51% MAPE precision.

Based on recent work in the literature, GRU showed promising results and was considered a good predictor [92–94]. However, we recall that these studies used a univariate rather than a multivariate approach. Furthermore, Park et al. [95] and Du et al. [96] proposed an LSTM model with encoder and decoder layers but did not compare it to a GRU model. Benet et al. [51], Van Straaten et al. [50], Sharaff and Roy [53], and dos Santos [55] also proposed machine learning models (LR and RF) for subseasonal time series forecasting, but not deep neural networks. Therefore, the present investigation compares the LR, DT, SVR, ANN, LSTM, and GRU methods for subseasonal forecasting based on a multivariate approach to predict summer temperatures for the next 2 weeks. This multivariate approach to multistep subseasonal forecasting has not yet been investigated in recent work for West African regions.

Our experiments, performed with the Adam optimizer, linear and tanh activation functions in the encoder and decoder layers, 50 epochs, and 50 units, have shown the GRU neural network to be the best predictor, with smaller prediction errors (MAPE = 2.51%, MAE = 0.0653, RMSE = 0.0959) than those reported for other ML methods (MAE = 1.60 and RMSE = 2.03) [53, 55]. Because the GRU neural network is more resistant to exploding or vanishing gradient problems, it can learn in a larger variety of configurations and capture long-range dependencies. The DT and SVR models demonstrated the most unsatisfactory results and were therefore considered the worst models.

6. Conclusion

Nowadays, in many fields such as agriculture, meteorology, hydrology, and water resource management, the forecasting of subseasonal temperatures is of crucial importance [67]. Therefore, this study investigated the ability of state-of-the-art machine learning methods to estimate the subseasonal temperatures (1 to 2 weeks) for the summer (MAMJ) season.

The major contribution of this study is to compare the ability of different machine learning (ML) techniques to forecast subseasonal temperatures in West Africa, especially in the Senegal region. In addition, this study adds to the existing body of knowledge by providing insights into model performance, comparing different methods, considering region-specific factors, assessing subseasonal forecasting skills, and potentially introducing methodological advancements [97].

In this paper, six ML methods are used, including LR, DT, SVR, ANN, long short-term memory (LSTM), and gated recurrent unit (GRU) [62]. This study used data from 02/03/1981 to 29/06/2022. Data from 02/03/1981 to 31/05/2015 were selected for training, and 20% of the training set was used for validation. The remaining data, from 01/06/2015 to 29/06/2022, were selected to test the ML models. The test results show that the deep neural network (LSTM and GRU) models were better trained and more accurate than the other ML models (LR, DT, SVR, and ANN). The evaluation metrics, including RMSE, MAE, MAPE, R², Pearson's r, and NSE, help select the best model. These performance indicators reveal that the deep neural network (LSTM and GRU) methods are the most suitable. Overall, the experimental results show that the GRU model is one of the most powerful deep learning methods for subseasonal time series forecasting in West Africa, with a precision of approximately 2.51% MAPE, 0.0959 RMSE, 87.07% Pearson's r, and a 74.68% NSE score.

In summary, ML methods, especially deep neural networks, are shown to help extend the prediction lead time of summer temperature to the subseasonal range and represent a promising direction for future research in subseasonal forecasting. These contributions therefore enhance our understanding of subseasonal climate forecasting, especially in the context of predicting summer temperatures in West Africa (Senegal).

Although deep neural networks have shown great promise, more studies remain to be done in this area. Due to limited resources, the authors of this article were unable to investigate other regions of West Africa. For further research, one might plan to
(i) investigate the present model over longer time scales or for long-horizon forecasting (up to 6 weeks lead time);
(ii) build a global model and expand the study to other areas in West and Central Africa, especially the Sahel regions;
(iii) improve the accuracy of subseasonal temperature predictions during the rainy seasons;
(iv) investigate attention-based models and compare their performance with the present model.

However, these methods have limitations when it comes to subseasonal temperature forecasting, especially in the context of climate change. Because the climate is constantly changing, the models may not have the capacity to capture the full extent of climate change, which can lead to extreme phenomena occurring under previously stable conditions [98].

Data Availability

The data supporting the current study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

All authors contributed to the preparation of this manuscript. Annine Duclaire Kenne proposed the methodology, analyzed data, designed the models, performed the experiments, and wrote, reviewed, and edited the manuscript. Mory Toure designed the study, visualized and analyzed climate patterns, supervised the study, and wrote, reviewed, and edited the manuscript. Lema Logamou Seknewna performed statistical analysis and investigated, reviewed, and supervised the study. Herve Landry Ketsemen performed the LR method and investigated, reviewed, and edited the manuscript.

Acknowledgments

The authors would like to express their sincere gratitude to the African Institute for Mathematical Sciences (AIMS) and African Master of Machine Intelligence (AMMI) programs, teaching staff, tutors, and advisors, especially Mr. Mory Toure. Their valuable suggestions and meticulous review have undoubtedly strengthened the quality and clarity of the content. The authors are truly appreciative of their commitment to the peer review process, which has significantly enriched the overall depth and coherence of the manuscript.