Abstract

The overall service quality of Emergency Departments (EDs) can be improved by accurate forecasting of patient visits. Accordingly, this study evaluates three metaheuristic approaches integrated with an Artificial Neural Network (ANN) for forecasting daily ED visits: Bayesian ANN, Genetic Algorithm-based ANN (GA-ANN), and Particle Swarm Optimization algorithm-based ANN (PSO-ANN). Five performance measures are used to evaluate the accuracy of the proposed approaches. The results show that the PSO-ANN model performs best in both the training and testing processes. On the training dataset, it obtains the lowest error, with a Mean Absolute Percentage Error (MAPE) of 6.3%, Mean Absolute Error (MAE) of 42.797, Mean Squared Error (MSE) of 2499.340, Root Mean Square Error (RMSE) of 49.933, and R-squared (R2) of 0.824. On the testing dataset, it also obtains the lowest error, with an MAPE of 6.0%, MAE of 40.888, MSE of 2839.998, RMSE of 53.292, and R2 of 0.791.

1. Introduction

Emergency Departments (EDs) perform crucial duties within the hospital service system and provide uninterrupted service. They are also the units where patient traffic and transfers are highest and where overcrowding is felt most severely [1]. It is therefore vital to improve the service quality level through newly adopted methodologies. Improving service quality means decreased waiting time, decreased length of stay, and increased ED throughput. These key performance metrics are directly related to the daily patient volume of the ED. Patient visits to emergency departments account for 40%–70% of all hospital care [2]. Knowing the density of patient visits to the ED on an hourly, daily, weekly, monthly, or yearly basis assists in arranging and allocating human and material resources (number of doctors, nurses, receptionists, medical devices, ED beds, and medicines). Therefore, accurate forecasting of patient visits is of great importance to ED decision makers.

The main aim of forecasting ED patient visits is to inform about the pattern of changes in the density of visits in the future [3]. The forecasting studies regarding ED visits have several dimensions such as time frame, forecasting methodology, the independent variables used in the modeling, and measurement of the model’s accuracy [4, 5].

The remainder of the study is organized as follows: Section 2 presents an overview of contemporary real-life case studies of ANN and of ED patient visit forecasting in the light of four dimensions. Section 3 describes the ANN-based solution approaches with their pseudocodes. Section 4 presents the case study. Section 5 provides the analysis results and a detailed discussion. The final section presents the conclusions, limitations, and recommendations for future work.

2. Literature Review

2.1. Overview on ED Patient Visit Forecast in the Light of Four Dimensions

ED patient visit forecasting, also called ED patient volume forecasting or ED patient admission forecasting, is the problem of forecasting the future patient arrivals of an ED. For that purpose, historical data in the form of a time series are gathered at a regular time frame on an hourly [6, 7], daily [8–13], weekly [14], monthly [15], or yearly [16] basis. According to the literature, the vast majority of studies focus on the daily time frame [5, 17]. The time frame also affects the accuracy of the forecasting model: the smaller the interval of the time frame, the lower the relative accuracy of the model compared with models with a larger interval (for example, annual ED patient visit forecasting) [7, 17, 18]. Furthermore, daily patient visits are the topic most often addressed by researchers, since the forecasts play an important role in scheduling ED medical personnel, which is one of the most considerable problems faced by hospital management.

Another dimension concerns forecasting methodology. Nas and Koyuncu [19] state that studies modeling ED patient visits generally apply two types of methods. The first analyzes the correlations between patient visits and several regression variables, such as calendar or climatic variables, whereas the second predicts future values from past values on the assumption that patient visits follow a time series [19]. These two groups yield regression-based and time series-based models, respectively. Apart from these, machine learning-based models such as Artificial Neural Network (ANN), Support Vector Machine (SVM), and Long Short-Term Memory Network (LSTM) are applied to ED patient visit forecasting. A similar grouping is given by Yousefi et al. [20], who divide the methods used in ED patient visit forecasting into two groups, namely, linear and nonlinear methods. Linear methods include Holt–Winters, Multiple Linear Regression (MLR), Exponential Smoothing (ES), Autoregressive Integrated Moving Average (ARIMA), Seasonal Autoregressive Integrated Moving Average (SARIMA), and some other regression-based methods [18, 21–23]. Nonlinear methods include the Adaptive Neuro-Fuzzy Inference System (ANFIS), ANN, SVM, and LSTM [20, 24–26]. In addition to these two groups, some hybrid approaches have been developed for this problem to combine the advantages of the individual methods, improve accuracy, and decrease modeling errors [12, 14, 27]. For example, regression-based models are combined with machine learning-based models (e.g., MLR-ANN) and with time series-based models (e.g., MLR-ARIMA).

One of the most important dimensions of ED patient visit forecasting studies concerns the independent variables used in the model. In the literature, these include time-related (temporal or calendar), demographic, and climatic variables. Scholars mostly agree that time-related variables have more impact than weather variables in forecasting ED patient visits [27–29]. Most authors deal with temporal variables such as the day of the week, the month of the year, holidays (school or public), the day after a holiday, the day before a holiday, and soccer match days [19]. Climatic variables such as air temperature, humidity, and wind speed are considered secondarily [28–30]. Some other variables related to demography, transportation, epidemics, and hospital reputation are also studied [31].

The measurement of the accuracy of the model is also an important dimension in ED patient visit forecasting studies. Different measures, such as Mean Absolute Percentage Error (MAPE), Mean Squared Error (MSE), Mean Absolute Error (MAE), Receiver Operating Characteristic (ROC) curve value, Root Mean Square Error (RMSE), and R-squared, are used in the studies to test the accuracy of the models. The accuracy of the developed models varies with the chosen variables and with some specific parameters (e.g., the number of hidden layers, learning rate, and momentum in ANN modeling). By tuning these, the developed models can perform better. Most studies on ED patient visit forecasting prefer MAPE to measure accuracy. In the review paper of Wargon et al. [17], the investigated studies that focus on daily ED visits report MAPE values between 4.2% and 14.4%. Accordingly, an MAPE value lower than or around 10% indicates good statistical predictability.
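The error measures named above can be computed as in the following minimal sketch. The function name `forecast_metrics` and the sample counts are hypothetical and not taken from the study, and R-squared is computed here as the coefficient of determination (1 minus the ratio of residual to total sum of squares), which is an assumption about the definition used.

```python
import math

def forecast_metrics(actual, predicted):
    """Return MAPE (%), MAE, MSE, RMSE and R-squared for two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE is undefined when an actual value is zero; daily visit counts are assumed positive.
    mape = 100.0 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAPE": mape, "MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}

# Hypothetical daily visit counts, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(forecast_metrics(actual, predicted))
```

Because MAPE is scale-free, it allows comparison across EDs with different patient volumes, whereas MAE, MSE, and RMSE are expressed in visits (or squared visits) per day.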

2.2. Research Gaps and Contributions of the Study

This study focuses on the second dimension discussed above, forecasting methodology. We deal with the applicability of metaheuristics integrated with ANN in ED patient visit forecasting. In this context, three hybridized ANN-based approaches are applied to daily ED visit data for the first time in the literature. These approaches are Bayesian ANN, Genetic Algorithm-based ANN (GA-ANN), and Particle Swarm Optimization algorithm-based ANN (PSO-ANN). Although plenty of hybridized approaches are proposed in the literature, metaheuristic algorithms have not yet been incorporated with the regression-, time series-, and machine learning-based methods that are appropriate for the nature of this problem. Therefore, this study addresses this gap in the literature and contributes in the following aspects:
(i) Approaches utilizing metaheuristic algorithms merged with ANN are applied to the daily ED patient visit forecasting problem (a novelty from the methodological viewpoint)
(ii) A comparative outline is produced by benchmarking the three approaches in terms of a common forecast accuracy measure, “MAPE” (a novelty from the methodological viewpoint)
(iii) A case study in a public hospital in Istanbul (Turkey) is carried out to demonstrate the applicability of the approaches (a novelty from the application viewpoint)

3. ANN-Based Solution Approaches

3.1. Bayesian ANN

ANN provides solutions to problems in many different areas, from natural science and engineering [32–34] to social science [35, 36] and health science [37]. It has been developed to improve targeted systems by mimicking the biological nervous system of the human brain. An ANN consists of interconnected processing nodes, each with a simple logic. Each node collects its input signals, and the aggregated signal is converted to a different value by the specified transfer function. Thus, a converted output signal is generated. Although each neuron carries out its function very slowly, a network can effectively carry out an enormous number of tasks [38, 39]. Initial weights are determined randomly at the first iteration, and the output and error of each hidden neuron are calculated. The weight change is calculated according to a specific function, and the change is used to update the weights. The next iteration is then performed with the updated weights.

Let $\mathbf{w}$ be an optimal weight vector, that is, the one most likely to reproduce the set of observed target data $\mathbf{y}$, given the inputs $\mathbf{X}$, in the Bayesian training approach. $\mathbf{w}$ is a vector of connection and bias weights that characterizes the data-generating relationship. Bayesian training aims to obtain the posterior probability distribution of the weights given the observed data, $P(\mathbf{w} \mid \mathbf{y}, \mathbf{X})$. This is done by updating any knowledge of the weight values held before obtaining the data with the information contained in the data, using Bayes' theorem [40]:

$$P(\mathbf{w} \mid \mathbf{y}, \mathbf{X}) = \frac{P(\mathbf{y} \mid \mathbf{w}, \mathbf{X})\, P(\mathbf{w})}{P(\mathbf{y} \mid \mathbf{X})} \qquad (1)$$

The prior weight distribution and the likelihood function are represented by $P(\mathbf{w})$ and $P(\mathbf{y} \mid \mathbf{w}, \mathbf{X})$, respectively. $P(\mathbf{y} \mid \mathbf{X})$ is the prior probability of the training data. The flowchart of Bayesian ANN is shown in Figure 1.
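To make the link between Bayes' theorem and network training concrete, the following worked step may help. It is a sketch under assumptions that the text does not state: Gaussian observation noise with precision $\beta$, a zero-mean Gaussian prior on the weights with precision $\alpha$, and the symbols $\mathbf{w}$ (weights), $\mathbf{y}$ (targets), $\mathbf{X}$ (inputs), and $f$ (network output), which are chosen here for illustration.

```latex
% Negative log posterior under the Gaussian assumptions stated in the lead-in
-\ln P(\mathbf{w} \mid \mathbf{y}, \mathbf{X})
  = \beta\, E_D(\mathbf{w}) + \alpha\, E_W(\mathbf{w}) + \text{const},
\qquad
E_D(\mathbf{w}) = \tfrac{1}{2} \sum_{i=1}^{N} \bigl( y_i - f(\mathbf{x}_i; \mathbf{w}) \bigr)^2,
\qquad
E_W(\mathbf{w}) = \tfrac{1}{2} \sum_{j} w_j^2 .
```

Under these assumptions, the most probable weights minimize a sum-of-squares error plus a weight-decay penalty, which is why Bayesian training is often described as a regularized form of error minimization.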

3.2. Genetic Algorithm

GAs have been developed to mimic some of the processes observed in natural evolution [41]. GAs aim to create a competitive set of solutions, which evolve through a natural selection process in which ineffective solutions are eliminated and more efficient solutions continue to be reproduced. This process is repeated until the optimal solution set is obtained.

3.2.1. GA-ANN

A hybrid GA-ANN is a backpropagation network, with the only exception that the weight matrix is obtained by performing genetic operations under optimal convergence conditions [42–44]. The flowchart of GA-ANN is shown in Figure 2. The initial weights are set at random in the first iteration, and the output and error of each hidden neuron are calculated. Updated weights are computed by the GA through selection, reproduction, and mutation. The next iteration is then performed with the updated weights. The crossover and mutation operators of the GA are applied to the selected weights and produce new offspring weights that offer a better fitness value. While the weights are optimized in this way, the population size and the crossover and mutation rates are tuned by trial and error.
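The idea of evolving network weights with selection, crossover, and mutation can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: it uses the GA alone (no backpropagation step), a single hidden layer with a logistic-sigmoid hidden activation and a linear output, size-2 tournament selection, uniform crossover, Gaussian mutation, and elitism. All function names, default parameter values, and the toy data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def logsig(z):
    """Logistic sigmoid used in the hidden layer."""
    return 1.0 / (1.0 + np.exp(-z))

def predict(weights, X, n_hidden):
    """Decode a flat weight vector into one hidden layer plus a linear output neuron."""
    n_in = X.shape[1]
    split = n_in * n_hidden
    W1 = weights[:split].reshape(n_in, n_hidden)
    b1 = weights[split:split + n_hidden]
    W2 = weights[split + n_hidden:split + 2 * n_hidden]
    b2 = weights[-1]
    return logsig(X @ W1 + b1) @ W2 + b2

def mse(weights, X, y, n_hidden):
    """Fitness function: mean squared error of the decoded network."""
    return float(np.mean((predict(weights, X, n_hidden) - y) ** 2))

def ga_train(X, y, n_hidden=3, pop_size=50, crossover_rate=0.8,
             mutation_rate=0.1, generations=100):
    n_weights = X.shape[1] * n_hidden + 2 * n_hidden + 1
    population = rng.normal(0.0, 1.0, size=(pop_size, n_weights))
    history = []  # best MSE at the start of each generation
    for _ in range(generations):
        fitness = np.array([mse(ind, X, y, n_hidden) for ind in population])
        order = np.argsort(fitness)
        population = population[order]      # sorted: index 0 is the fittest
        history.append(float(fitness[order[0]]))
        children = [population[0].copy()]   # elitism: carry the best forward unchanged
        while len(children) < pop_size:
            # Size-2 tournaments; in a sorted population the lower index wins.
            i, j = rng.integers(0, pop_size, size=2)
            k, l = rng.integers(0, pop_size, size=2)
            p1, p2 = population[min(i, j)], population[min(k, l)]
            child = p1.copy()
            if rng.random() < crossover_rate:
                mask = rng.random(n_weights) < 0.5      # uniform crossover
                child[mask] = p2[mask]
            mutate = rng.random(n_weights) < mutation_rate
            child[mutate] += rng.normal(0.0, 0.3, size=int(mutate.sum()))
            children.append(child)
        population = np.array(children)
    final = np.array([mse(ind, X, y, n_hidden) for ind in population])
    return population[int(np.argmin(final))], history

# Toy regression data, for illustration only.
X = rng.uniform(-1.0, 1.0, size=(40, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.0
best, history = ga_train(X, y)
print(history[0], history[-1])
```

Because the fittest individual is copied unchanged into each new generation, the best fitness value in this sketch never gets worse from one generation to the next.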

3.3. Particle Swarm Optimization

The PSO algorithm starts by creating a random population of particles, whose optimal values are to be determined. Each particle is a vector in the problem-solving space that represents a candidate setting of the decision variables. In the PSO algorithm, the movement of any particle affects the movement of the entire group, and ultimately, each member of the group can benefit from the discoveries and skills of other members [45, 46].

3.3.1. PSO-ANN

The PSO algorithm starts by generating the initial particles and assigning initial velocities to them. At each iteration, each particle is updated based on two best values, which are determined by comparing the fitness value of the current iteration with the fitness values obtained so far. The first is the best fitness value obtained so far by the particle itself, and this value is kept as its best solution. The other is the best value reached by any particle in the population. $p_{best}$ is the particle's best-known position, and $g_{best}$ is the best position known to the swarm. The rand variable generates random values between 0 and 1. $C_1$ and $C_2$ are initially set to 1 and 2, respectively. The position values of the particles are obtained by means of the position equation, which depends on the velocity. Position values represent the weight values of the network. As the fitness function is optimized, the position values, i.e., the weight values of the network, are also optimized. Different combinations of the C1 and C2 parameters are tested to obtain better weight values. The flowchart of PSO-ANN is shown in Figure 3.

Equation (2) shows the calculation of updating the velocity of the particle.

Equation (3) indicates the calculation of updating the position of the particle for the weights of networks.
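The velocity and position updates can be illustrated with the commonly used form of PSO, in which the new velocity is the old velocity plus a pull towards the particle's own best position (weighted by C1 and a random number) and a pull towards the swarm's best position (weighted by C2 and a random number), and the new position is the old position plus the new velocity. The sketch below assumes this form without an inertia weight, adds velocity clamping (`v_max`) purely to keep the toy example stable, and uses C1 = 1.0 and C2 = 2.0 as values taken from the ranges searched in the study. The function names and the toy objective are hypothetical.

```python
import numpy as np

def pso_minimize(objective, dim, n_particles=50, c1=1.0, c2=2.0,
                 iterations=200, v_max=1.0, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=(n_particles, dim))  # positions = candidate weights
    v = np.zeros((n_particles, dim))                      # velocities
    pbest = x.copy()                                      # best position of each particle
    pbest_val = np.array([objective(p) for p in x])
    g = int(np.argmin(pbest_val))
    gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])  # best position of the swarm
    history = [gbest_val]
    for _ in range(iterations):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity update: cognitive pull (pbest) plus social pull (gbest).
        v = v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        v = np.clip(v, -v_max, v_max)  # clamping is an addition of this sketch
        # Position update: move each particle along its velocity.
        x = x + v
        vals = np.array([objective(p) for p in x])
        improved = vals < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = vals[improved]
        g = int(np.argmin(pbest_val))
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])
        history.append(gbest_val)
    return gbest, gbest_val, history

# Toy problem: fit the two weights and the bias of a single linear neuron.
X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [2.0, 1.0]])
y = X @ np.array([2.0, -1.0]) + 0.5

def toy_mse(w):
    return float(np.mean((X @ w[:2] + w[2] - y) ** 2))

gbest, gbest_val, history = pso_minimize(toy_mse, dim=3)
print(gbest, gbest_val)
```

In a PSO-ANN setting, `objective` would be the network's training error as a function of its flattened weight vector, so that each particle position encodes one complete set of network weights.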

4. Dataset and Analysis

We used two years of data from a public hospital ED in Istanbul, Turkey. The data cover the years 2011 and 2012 (from January 1, 2011, to December 31, 2012). Temporal variables and a climatic variable, maximum temperature, were used as independent variables to forecast daily ED patient visits. A total of 21 independent variables are used. Binary dummy variables are used for the month of the year, day of the week, and holiday (weekend holiday). A value of 0 means that the related date does not belong to that month, that day, or a weekend holiday, and a value of 1 means that it does. The dummy variables are used in the dataset instead of the original categorical variables. The data for maximum temperature are obtained from a French meteorological association named Infoclimat (http://www.infoclimat.fr). A detailed description of each variable is provided in Table 1. The time series of the dependent variable (daily ED visits) used for modeling is presented in Figure 4. This figure shows that the highest number of visits to the ED occurred on January 14, 2012, with 1013 visits. The plot of daily ED visits also shows the monthly fluctuations in the ED: there were more patients in January and December and slightly fewer patients in August. The trend of average monthly ED visits is given as a box plot in Figure 5.
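The dummy-variable encoding described above can be sketched as follows. The split into 12 month dummies, 7 day-of-week dummies, 1 weekend-holiday dummy, and 1 temperature value is an assumption that happens to add up to the 21 variables reported; Table 1 of the study holds the authoritative definitions. The function name `encode_day`, the treatment of Saturday and Sunday as the weekend holiday, and the temperature value are all hypothetical.

```python
from datetime import date

def encode_day(day, max_temp_c):
    """Encode one calendar day as a flat feature vector of 0/1 dummies plus temperature."""
    month = [1 if day.month == m else 0 for m in range(1, 13)]     # 12 month dummies
    weekday = [1 if day.weekday() == d else 0 for d in range(7)]   # 7 dummies, Monday = 0
    weekend = 1 if day.weekday() >= 5 else 0                       # assumed: Saturday or Sunday
    return month + weekday + [weekend, float(max_temp_c)]

# The busiest day reported in the study; the temperature here is made up.
row = encode_day(date(2012, 1, 14), 7.5)
print(len(row), row)
```

Using one dummy per category, rather than an integer code such as 1 to 12 for the month, avoids implying an ordering or distance between categories that the network might otherwise try to exploit.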

The trend of the maximum temperature variable is shown in Figure 6. According to this figure, the average maximum temperature is 19.34°C, with a standard deviation of 8.98°C.

5. Results and Discussion

This paper applies cross validation to the training dataset to prevent overfitting. The training dataset is divided into ten subsets. In each round, one of the ten subsets is held out, and the model is trained on the remaining subsets with respect to the fitness function. We tried different numbers of neurons (#n) to obtain the best solution in Bayesian ANN. The number of neurons is increased one by one, ranging from 2 to 50. In this process, we used MSE to determine the best number of neurons. Logsig is used as the transfer function and purelin as the activation function. The minimum MSE is obtained with 45 neurons, and all solutions are presented in Table 2. The results of Bayesian ANN are then obtained using this configuration.
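The ten-subset procedure can be sketched as a generator of training and held-out index sets. This is a minimal illustration under assumptions: the function name `k_fold_indices` is hypothetical, the rows are shuffled before splitting (the study does not say whether the folds were random or contiguous in time), and the sample size used in the example is made up.

```python
import numpy as np

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, held_out_idx) pairs for k-fold cross validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        held_out = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, held_out

# Hypothetical training-set size, for illustration only.
splits = list(k_fold_indices(100, k=10))
for train_idx, held_out_idx in splits[:2]:
    print(len(train_idx), len(held_out_idx))
```

A model configuration, such as a given number of hidden neurons, would then be scored by averaging its MSE over the ten held-out subsets.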

After obtaining the best configuration for Bayesian ANN with respect to the MSE, the results of five performance measures are presented in Table 3. The results of the training and testing are also illustrated in Figures 7 and 8, respectively.

The MAPE values obtained by Bayesian ANN for training and testing are 7.3% and 8.8%, respectively. The other performance measures are also given in Table 3.

In the GA-ANN approach, we also tried different combinations of parameters to obtain the best solution. The parameters are the population size, crossover rate, mutation rate, and number of neurons. The population size is set to 50. Six different crossover rates are used: 0.4, 0.5, 0.6, 0.7, 0.8, and 0.9. Four different mutation rates are used: 0.1, 0.2, 0.3, and 0.4. The number of neurons is increased one by one, ranging from 2 to 50, as in Bayesian ANN. Logsig and purelin are used as the transfer and activation functions, respectively, in the ANN process. Therefore, a total of 1176 different solutions (= 4 × 6 × 49) are obtained for GA-ANN. The best twenty combinations with respect to MSE are presented in Table 4.
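The size of the GA-ANN search grid follows directly from the parameter values listed above, as the short sketch below shows. The variable names are illustrative only, and the order of the tuple elements is arbitrary.

```python
from itertools import product

crossover_rates = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9]   # six values
mutation_rates = [0.1, 0.2, 0.3, 0.4]              # four values
hidden_neurons = range(2, 51)                      # 2 to 50 inclusive, 49 values

# Every (mutation rate, crossover rate, neuron count) combination to be evaluated.
grid = list(product(mutation_rates, crossover_rates, hidden_neurons))
print(len(grid))  # 4 * 6 * 49 = 1176
```

Each tuple in `grid` corresponds to one trained and evaluated GA-ANN configuration, with the population size held fixed at 50.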

After obtaining the best configuration for GA-ANN with respect to the MSE, the results of the five performance measures are presented in Table 5. The results of the training and testing are also illustrated in Figures 9 and 10, respectively. The MAPE values obtained by GA-ANN for training and testing are 7.8% and 6.9%, respectively. The other performance measures are also given in Table 5.

In the PSO-ANN approach, we also tried different combinations of parameters to obtain the best solution. The parameters are the population size, the weighting coefficient for the local best solution (C1), the weighting coefficient for the global best solution (C2), and the number of neurons. The population size is set to 50. Three different weighting coefficients for the local best solution are used: 1.0, 1.5, and 2.0. Nine different weighting coefficients for the global best solution are used, ranging from 1 to 5 in steps of 0.5. The number of neurons is increased one by one, ranging from 2 to 50, as in Bayesian ANN and GA-ANN. In the ANN process, tansig is used for both the transfer and activation functions. Therefore, 1182 different solutions are obtained for PSO-ANN. The best twenty combinations with respect to MSE are presented in Table 6.

After obtaining the best configuration for PSO-ANN with respect to the MSE, the results of the five performance measures are presented in Table 7. The results of the training and testing are also illustrated in Figures 11 and 12, respectively. The MAPE values obtained by PSO-ANN for training and testing are 6.3% and 6.0%, respectively. The other performance measures are also given in Table 7.

After the models are implemented for forecasting daily patient visits in emergency departments, their results are compared and evaluated together through the performance measures. A detailed evaluation of the proposed models on both the training and testing datasets is given in this section. Table 8 presents the forecasting results of daily patient visits by the metaheuristic approaches integrated with ANN. The results of the performance measures are also illustrated in Figure 13. Table 8 shows that the PSO-ANN model provided the best performance in both the training and testing processes. It obtained the lowest error, with an MAPE of 6.3%, MAE of 42.797, MSE of 2499.340, RMSE of 49.933, and R2 of 0.824 on the training dataset. It also obtained the lowest error, with an MAPE of 6.0%, MAE of 40.888, MSE of 2839.998, RMSE of 53.292, and R2 of 0.791 on the testing dataset. The Bayesian ANN and GA-ANN algorithms yielded lower performance in the ANN model optimization in the training process. The weakest model in this optimization process is the Bayesian ANN, for both the training and testing processes.

As a further contribution to the literature, a comparison with some previously published papers is performed in terms of data, applied method(s), compared methods, time frame, independent variables, performance measures, and analysis results. A total of twelve studies, including the current study, are investigated under these dimensions. Most of the studies focus on daily ED visits. Some studies tackle the hourly [6], weekly [14, 23], and monthly trends of ED visits [15, 23]. The data-gathering period varies from one study to another. The current study used two years of daily arrival data. In light of the eleven studies from the literature summarized in Table 9, the data collection period of the current study is considered sufficient. In terms of the methodology used, the current study bridges a gap in the literature. Since the literature does not show any attempt to apply metaheuristics incorporated with ANN in forecasting daily ED visits, this study is novel for this application domain.

In the work of Wargon et al. [17], MAPE values ranging from 4.2% to 14.4% are considered acceptable in daily ED visit forecasting studies. In this context, the approaches used in the current study meet this criterion. The MAPE values obtained from the three approaches are between 6% and 8.8%, which is acceptable.

6. Conclusions

In this study, we apply three approaches, namely Bayesian ANN, GA-based ANN, and PSO-based ANN, to the daily ED visit forecasting problem. Two years of daily ED visit data are gathered for use in these models. Temporal, climatic, and holiday variables are used in the models as independent variables. The results of each model are analyzed under five performance measures: MAPE, MAE, MSE, RMSE, and R-squared. The results show that PSO-based ANN is superior according to all five performance measures. GA-based ANN yields more successful results than the Bayesian ANN model. We conclude that the use of metaheuristics integrated with ANN in ED visit forecasting improves the accuracy of the model considerably. The proposed approaches help to overcome the problems of getting stuck in local extrema and of crossing plateaus of the error function in classical ANN. The initial weights of the ANN are computed using GA and PSO instead of a trial and error process. There are some limitations. First, the data come from only one institution and cover only two years. Second, the input variables are all ones that are generally used as model parameters in the ED literature; the number of parameters can be enriched to improve the accuracy of the proposed approaches. Third, the number of analyzed parameters of the proposed approaches (Bayesian ANN, GA-ANN, and PSO-ANN) is limited. We have obtained a total of 2497 solutions with the proposed approaches. A parameter optimization for the proposed approaches can be implemented. For future work, we plan to include more variables in our models and to compare the performance of the current models with some time-series and machine learning algorithms. This study is novel in the literature in applying hybrid metaheuristics-based approaches to this problem for the first time. The findings of the current research will also help ED decision makers in practice to plan and schedule medical staff and thereby achieve efficient resource planning and service quality.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare no conflicts of interest.