Journal of Healthcare Engineering

Volume 2019, Article ID 6123745, 6 pages

https://doi.org/10.1155/2019/6123745

## Comparison of Time Series Methods and Machine Learning Algorithms for Forecasting Taiwan Blood Services Foundation’s Blood Supply

Correspondence should be addressed to Suchithra Rajendran; rajendrans@missouri.edu

Received 17 July 2019; Accepted 27 August 2019; Published 17 September 2019

Academic Editor: Feng-Huei Lin

Copyright © 2019 Han Shih and Suchithra Rajendran. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

*Purpose*. The uncertainty in supply and the short shelf life of blood products have led to substantial outdating of collected donor blood. On the other hand, hospitals and blood centers experience severe blood shortages because of the very limited donor population. Forecasting the blood supply is therefore necessary to minimize both outdating and shortage. This study aims to efficiently forecast the supply of blood components at blood centers. *Methods*. Two different types of forecasting techniques, time series methods and machine learning algorithms, are developed, and the best-performing method for the given case study is determined. Under time series, we consider the Autoregressive (AUTOREG), Autoregressive Moving Average (ARMA), Autoregressive Integrated Moving Average (ARIMA), Seasonal ARIMA, Seasonal Exponential Smoothing Method (ESM), and Holt-Winters models. Artificial neural network (ANN) and multiple regression are considered under the machine learning algorithms. *Results*. We leverage five years' worth of historical blood supply data from the Taiwan Blood Services Foundation (TBSF) to conduct our study. On comparing the different techniques, we found that time series forecasting methods yield better results than machine learning algorithms. More specifically, the lowest error measures are observed for the seasonal ESM and ARIMA models. *Conclusions*. The models developed can act as a decision support system for administrators and pathologists at blood banks, blood donation centers, and hospitals to determine their inventory policy based on the estimated future blood supply. The forecasting models developed in this study can help healthcare managers to control blood inventory more efficiently, thus reducing blood shortage and blood wastage.

#### 1. Introduction

Blood performs several important functions in the human body, such as transporting oxygen, carrying nutrients to cells, and disposing of ammonia, carbon dioxide, and other waste products. Its four most critical components are red blood cells (RBC), white blood cells (WBC), plasma, and platelets [1]. The American Red Cross reported that over 35,000 RBC units, 10,000 plasma units, and 7,000 platelet units are required daily in the US [2]. Due to the short shelf life of blood components, hospitals and blood centers are faced with the challenge of maintaining appropriate inventory levels to avoid outdating and shortage.

Managing blood supply and demand is a core part of the healthcare supply chain system because blood plays a crucial role in saving human lives. Blood supply forecasting is essential for making supply chain decisions, such as donor drive scheduling, vehicle routing policies, and inventory management, at blood centers and hospitals. Accurate forecasts of the timing and amount of future blood requests are considered key inputs to donor recruitment decision making and inventory control. It is important to gather data for several years to forecast monthly demand and to recognize seasonality in demand [3–6]. Lestari et al. [7] indicated that forecasting can capture the observed data trend and predict future demand for blood components.

#### 2. Literature Review

Several studies have leveraged time series forecasting techniques for predicting the blood demand at hospitals and blood centers. For instance, Pereira [8] investigated and evaluated the autoregressive integrated moving average (ARIMA) model and the Holt-Winters exponential smoothing model to predict monthly demand for red blood cell transfusions at a tertiary care hospital. While that study relied on time series forecasting, Bosnes et al. [9] used a statistical regression technique to forecast blood donor arrivals at the blood bank of Oslo and found that the most important factors among 18 explanatory variables were donor age, time from making an appointment to arriving at the drive, contact methods used, number of prior donations, and donor no-show rate. Fortsch and Khapalova [10] introduced numerous practical methods to predict future demand for blood. Several forecasting models, including the naïve, exponential smoothing, moving average, and time series decomposition models, were tested using daily demand data from a blood center covering January 2006 to December 2012. They also compared the performance of these methods with an autoregressive moving average (ARMA) model. The results revealed that the ARMA forecasting model performed better for eight out of nine time series model settings. Similarly, Khaldi et al. [11] explored the capabilities of machine learning algorithms such as the artificial neural network (ANN) model to predict future demand for blood.

#### 3. Materials and Methods

As discussed earlier, the study aims to develop effective forecasting methods to predict the supply of RBCs using two different techniques: time series forecasting methods and machine learning algorithms.

##### 3.1. Time Series Forecasting

This section discusses the seven time series forecasting methods used in this study.

###### 3.1.1. Autoregressive (AUTOREG) Model [12, 13]

The AUTOREG procedure estimates and forecasts linear regression models for time series data when the errors are autocorrelated. The autoregressive model regresses the value of the series at time $t$ on the values during the time periods $t-1, t-2, \ldots, t-p$. The mathematical formula is expressed as follows:

$$y_t = \beta_0 + \beta_1 y_{t-1} + \beta_2 y_{t-2} + \cdots + \beta_p y_{t-p} + \varepsilon_t,$$

where $\beta_0, \beta_1, \ldots, \beta_p$ are the linear regression coefficients, $y_t$ is the forecasted value at time $t$, and $\varepsilon_t$ is the random error variable, which is generally assumed to have a normal distribution with mean 0 and variance $\sigma^2$ (i.e., $\varepsilon_t \sim N(0, \sigma^2)$).
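The autoregression above can be illustrated with a minimal sketch that estimates the coefficients by ordinary least squares. The function names `fit_ar` and `forecast_ar` and the toy series are assumptions made for illustration; they are not the AUTOREG procedure used in the study, which also handles autocorrelated regression errors.

```python
import numpy as np

def fit_ar(y, p):
    """Estimate [b0, b1, ..., bp] in y_t = b0 + b1*y_{t-1} + ... + bp*y_{t-p} + e_t by OLS."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Design-matrix row for time t is [1, y_{t-1}, ..., y_{t-p}].
    lags = [y[p - i:n - i] for i in range(1, p + 1)]
    X = np.column_stack([np.ones(n - p)] + lags)
    coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coef

def forecast_ar(y, coef):
    """One-step-ahead forecast from the last p observations."""
    p = len(coef) - 1
    recent = np.asarray(y, dtype=float)[::-1][:p]  # y_{t-1}, ..., y_{t-p}
    return float(coef[0] + coef[1:] @ recent)

# Hypothetical noise-free toy series generated by y_t = 2 + 0.5 * y_{t-1}.
series = [0.0]
for _ in range(20):
    series.append(2.0 + 0.5 * series[-1])

coef = fit_ar(series, p=1)
next_value = forecast_ar(series, coef)
```

Because the toy series has no noise, the fit recovers the generating coefficients; with real supply data the estimates would carry sampling error.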

###### 3.1.2. Autoregressive Moving Average (ARMA) Models [12–14]

The ARMA model is one of the basic tools in time series modeling. Suppose the time series $\{y_t\}$ is a stationary stochastic process; the expression ARMA (*p*, *q*) then represents the model with autoregressive order *p* and moving-average order *q*. This model is a combination of the AR (*p*) and MA (*q*) models, where AR (*p*) is written as

$$y_t = a + \sum_{i=1}^{p} \phi_i y_{t-i} + \varepsilon_t$$

and MA (*q*) is written as

$$y_t = b + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}.$$

As in the AUTOREG model, $y_t$ is the observation value at time $t$. The ARMA (*p*, *q*) process is generally written as follows:

$$y_t = c + \varepsilon_t + \sum_{i=1}^{p} \phi_i y_{t-i} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j},$$

where *a*, *b*, and *c* are constants; $\varepsilon_t$ is the random error variable, which is generally assumed to have a normal distribution with mean 0 and variance $\sigma^2$; $\phi_1, \ldots, \phi_p$ are the autoregressive coefficients to be estimated; and $\theta_1, \ldots, \theta_q$ are the moving average coefficients to be estimated.
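To make the ARMA recursion concrete, the following sketch simulates a process of the form $y_t = c + \varepsilon_t + \sum_i \phi_i y_{t-i} + \sum_j \theta_j \varepsilon_{t-j}$. The function name `simulate_arma`, the parameter values, and the choice to treat pre-sample values as zero are illustrative assumptions, not part of the study.

```python
import numpy as np

def simulate_arma(c, phi, theta, n, sigma=1.0, seed=0):
    """Simulate y_t = c + eps_t + sum_i phi_i*y_{t-i} + sum_j theta_j*eps_{t-j}.

    Terms that would refer to times before the start of the sample are dropped.
    """
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, size=n)
    y = np.zeros(n)
    for t in range(n):
        ar = sum(phi[i] * y[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0)
        ma = sum(theta[j] * eps[t - 1 - j] for j in range(len(theta)) if t - 1 - j >= 0)
        y[t] = c + ar + eps[t] + ma
    return y, eps

# Hypothetical ARMA(1, 1) with c = 1.0, phi_1 = 0.6, theta_1 = 0.3.
y, eps = simulate_arma(c=1.0, phi=[0.6], theta=[0.3], n=200)
```

Setting `theta=[]` reduces the simulation to a pure AR (*p*) process, and `phi=[]` to a pure MA (*q*) process.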

###### 3.1.3. Autoregressive Integrated Moving Average (ARIMA) Model [12–14]

The ARIMA (autoregressive integrated moving average) approach was popularized by Box and Jenkins [11]. The ARIMA procedure predicts a future response value in a time series as a linear combination of its own past values, past errors, and current and past values of other time series (predictor time series).

When the time series is nonstationary, the above ARMA (*p*, *q*) model can be extended by applying it to the $d$th difference of the series, which is defined as $\nabla^d y_t = (1 - B)^d y_t$, where $t$ is the index of time, $y_t$ is the time series at time $t$, and $B$ is the backward shift operator, which shifts the data back one period (i.e., $B y_t = y_{t-1}$).
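The effect of differencing can be checked numerically. The sketch below applies $(1 - B)^d$ with NumPy to a hypothetical series with a quadratic trend; the helper names and the series are assumptions chosen so the result is easy to verify by hand.

```python
import numpy as np

def difference(y, d=1):
    """Apply (1 - B)^d to y; each pass drops one leading observation."""
    return np.diff(np.asarray(y, dtype=float), n=d)

def undo_first_difference(first_value, diffs):
    """Invert a single (1 - B) pass given the first original observation."""
    return np.concatenate([[first_value], first_value + np.cumsum(diffs)])

# Hypothetical nonstationary series: y_t = 3 + 2t + 0.5t^2.
t = np.arange(8)
y = 3.0 + 2.0 * t + 0.5 * t**2

d1 = difference(y, d=1)   # still trending: 1.5 + t for t = 1..7
d2 = difference(y, d=2)   # constant: every entry equals 1.0
restored = undo_first_difference(y[0], d1)
```

The second difference is constant, so for this toy series $d = 2$ would be enough to remove the trend; cumulative summation reverses the operation, which is the "integrated" part of ARIMA.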

###### 3.1.4. Seasonal ARIMA Model [12, 13, 15, 16]

The seasonal ARIMA model is written with the general expression ARIMA $(p, d, q)(P, D, Q)_s$. The symbol $p$ is the order of the nonseasonal autoregressive component, $d$ is the order of the differencing, $q$ is the order of the nonseasonal moving-average process, $P$ is the order of the seasonal autoregressive part, $D$ is the order of the seasonal differencing, $Q$ is the order of the seasonal moving-average process, and $s$ is the duration of the seasonal cycle.

Let $y_t$ be a dependent time series at time $t$; then the mathematical formula for the seasonal ARIMA model is expressed as follows:

$$(1 - B)^d (1 - B^s)^D y_t = \mu + \frac{\theta(B)\,\Theta(B^s)}{\phi(B)\,\Phi(B^s)}\,\varepsilon_t,$$

where $\mu$ is the constant mean, $B^s$ is the seasonal backward shift operator, $\Phi(B^s)$ is the seasonal autoregressive component, $\Theta(B^s)$ is the seasonal moving-average component, $\phi(B)$ and $\theta(B)$ are the nonseasonal autoregressive and moving-average components, and $\varepsilon_t$ is the random error variable.
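The seasonal differencing step of the seasonal ARIMA model can be sketched as follows. The season length of 4, the repeating pattern, and the function name `seasonal_difference` are hypothetical choices for illustration (monthly data would typically use a season length of 12).

```python
import numpy as np

def seasonal_difference(y, s, D=1):
    """Apply (1 - B^s)^D: subtract the value observed s periods earlier, D times."""
    y = np.asarray(y, dtype=float)
    for _ in range(D):
        y = y[s:] - y[:-s]
    return y

# Hypothetical series: linear trend plus a pattern that repeats every 4 periods.
s = 4
t = np.arange(16)
pattern = np.tile([5.0, -3.0, 1.0, -3.0], 4)
y = 20.0 + 0.5 * t + pattern

w = seasonal_difference(y, s)  # pattern cancels; the trend leaves 0.5 * s = 2.0
w2 = np.diff(w)                # a further (1 - B) pass leaves zeros
```

One seasonal difference removes the repeating pattern, and one ordinary difference then removes what is left of the trend, which corresponds to $D = 1$ and $d = 1$ in the notation above.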

###### 3.1.5. Seasonal Exponential Smoothing Model [12, 13, 15, 16]

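In the seasonal exponential smoothing method (ESM), the forecast value at time $(t + k)$ is given by

$$\hat{y}_{t+k} = L_t + S_{t+k-s}.$$

The smoothing equations are as follows:

$$L_t = \alpha\,(y_t - S_{t-s}) + (1 - \alpha)\,L_{t-1},$$

$$S_t = \gamma\,(y_t - L_t) + (1 - \gamma)\,S_{t-s},$$

where $y_t$ is the given observation at time $t$, $\alpha$ and $\gamma$ are the level and seasonal smoothing parameters, respectively, $L_t$ is the estimated level component at time $t$, $S_t$ is the estimated seasonal component at time $t$, and $s$ is the number of periods after which the seasonal cycle repeats itself.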
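A minimal sketch of seasonal exponential smoothing is given below, assuming the additive recursions $L_t = \alpha (y_t - S_{t-s}) + (1-\alpha) L_{t-1}$ and $S_t = \gamma (y_t - L_t) + (1-\gamma) S_{t-s}$. The initialization from the first season (mean as the level, deviations as seasonal components) and the function names are assumptions; statistical packages may initialize and optimize the parameters differently.

```python
import numpy as np

def seasonal_esm(y, s, alpha, gamma):
    """Run the level and seasonal recursions; return the final level and all seasonal terms."""
    y = np.asarray(y, dtype=float)
    level = y[:s].mean()              # assumed initial level
    season = list(y[:s] - level)      # season[t] holds S_t; assumed initial values
    for t in range(s, len(y)):
        level = alpha * (y[t] - season[t - s]) + (1 - alpha) * level
        season.append(gamma * (y[t] - level) + (1 - gamma) * season[t - s])
    return level, season

def forecast_seasonal_esm(level, season, s, k):
    """Forecast k steps ahead (1 <= k <= s) as L_t + S_{t+k-s}."""
    t = len(season) - 1
    return level + season[t + k - s]

# Hypothetical, perfectly periodic supply series with season length 4.
pattern = np.array([1.0, -2.0, 3.0, -2.0])
y = 10.0 + np.tile(pattern, 3)
level, season = seasonal_esm(y, s=4, alpha=0.3, gamma=0.2)
forecasts = [forecast_seasonal_esm(level, season, 4, k) for k in range(1, 5)]
```

On this noise-free series the forecasts reproduce the seasonal pattern around a level of 10, which is a quick sanity check of the recursions.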

###### 3.1.6. Multiplicative Holt-Winters Model [12, 13, 15, 16]

The Holt-Winters model, also known as triple exponential smoothing, applies three types of exponential smoothing to the time series: value, trend, and seasonality. The Holt-Winters model can be either additive or multiplicative. In this section, we present the multiplicative Holt-Winters model, whereas Section 3.1.7 presents the additive model.

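For a time series with a trend and a multiplicative seasonal component, the Holt-Winters multiplicative technique gives the forecast at time $(t + k)$ by the following equation:

$$\hat{y}_{t+k} = (L_t + k\,T_t)\,S_{t+k-s}.$$

The smoothing equations are given as follows:

$$L_t = \alpha\,\frac{y_t}{S_{t-s}} + (1 - \alpha)\,(L_{t-1} + T_{t-1}),$$

$$T_t = \beta\,(L_t - L_{t-1}) + (1 - \beta)\,T_{t-1},$$

$$S_t = \gamma\,\frac{y_t}{L_t} + (1 - \gamma)\,S_{t-s},$$

where $y_t$ is the given observation at time $t$; $\alpha$, $\beta$, and $\gamma$ are the level, trend, and seasonal smoothing constants, respectively; $L_t$ is the estimated level at time $t$; $T_t$ is the estimated trend at time $t$; $S_t$ is the seasonality index at time $t$; and $s$ is the number of periods after which the seasonal cycle repeats itself.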
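The multiplicative Holt-Winters recursions can be sketched in a few lines, assuming the standard form in which the level is updated from $y_t / S_{t-s}$ and the seasonality index from $y_t / L_t$. The initialization from the first two seasons and the function names are assumptions made for this illustration; the series needs at least two full seasons.

```python
import numpy as np

def holt_winters_multiplicative(y, s, alpha, beta, gamma):
    """Return the final level, final trend, and the list of seasonality indices."""
    y = np.asarray(y, dtype=float)
    level = y[:s].mean()                             # assumed initial level
    trend = (y[s:2 * s].mean() - y[:s].mean()) / s   # assumed initial trend
    season = list(y[:s] / level)                     # season[t] holds S_t
    for t in range(s, len(y)):
        prev_level = level
        level = alpha * y[t] / season[t - s] + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season.append(gamma * y[t] / level + (1 - gamma) * season[t - s])
    return level, trend, season

def forecast_multiplicative(level, trend, season, s, k):
    """Forecast k steps ahead (1 <= k <= s) as (L_t + k*T_t) * S_{t+k-s}."""
    t = len(season) - 1
    return (level + k * trend) * season[t + k - s]

# Hypothetical trend-free series whose seasonal factors average to 1.
factors = np.array([0.8, 1.2, 1.1, 0.9])
y = 100.0 * np.tile(factors, 3)
level, trend, season = holt_winters_multiplicative(y, 4, alpha=0.4, beta=0.1, gamma=0.3)
forecasts = [forecast_multiplicative(level, trend, season, 4, k) for k in range(1, 5)]
```

Because the seasonal effect scales with the level here, the seasonal swings grow in absolute terms when the level rises, which is the practical difference from the additive variant.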

###### 3.1.7. Additive Holt-Winters Model [12, 13, 15, 16]

In this section, we present the additive Holt-Winters Model.

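For the additive model, the forecasted supply estimate for time $(t + k)$ is given by the following equation:

$$\hat{y}_{t+k} = L_t + k\,T_t + S_{t+k-s}.$$

The estimates of the level, trend, and seasonal factors for the additive model are given using the following equations:

$$L_t = \alpha\,(y_t - S_{t-s}) + (1 - \alpha)\,(L_{t-1} + T_{t-1}),$$

$$T_t = \beta\,(L_t - L_{t-1}) + (1 - \beta)\,T_{t-1},$$

$$S_t = \gamma\,(y_t - L_t) + (1 - \gamma)\,S_{t-s},$$

where $y_t$ is the observation at time $t$; $L_t$, $T_t$, and $S_t$ are the estimated level, trend, and seasonal factor at time $t$; $\alpha$, $\beta$, and $\gamma$ are the corresponding smoothing constants; and $s$ is the length of the seasonal cycle.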
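A single update of the additive Holt-Winters recursions can be worked through by hand. The sketch below assumes the standard additive form (level updated from $y_t - S_{t-s}$, seasonal factor from $y_t - L_t$); the function names and all input numbers are hypothetical.

```python
def additive_hw_step(y_t, level_prev, trend_prev, season_prev, alpha, beta, gamma):
    """One update of level, trend, and seasonal factor; season_prev is S_{t-s}."""
    level = alpha * (y_t - season_prev) + (1 - alpha) * (level_prev + trend_prev)
    trend = beta * (level - level_prev) + (1 - beta) * trend_prev
    season = gamma * (y_t - level) + (1 - gamma) * season_prev
    return level, trend, season

def additive_hw_forecast(level, trend, season_index, k):
    """Forecast k steps ahead; season_index is the seasonal factor S_{t+k-s}."""
    return level + k * trend + season_index

# Hypothetical state: L_{t-1} = 100, T_{t-1} = 2, S_{t-s} = 5; new observation y_t = 110.
level, trend, season = additive_hw_step(110.0, 100.0, 2.0, 5.0, alpha=0.5, beta=0.5, gamma=0.5)
print(level, trend, season)  # 103.5 2.75 5.75

# One-step forecast, assuming the relevant seasonal factor S_{t+1-s} is 4.0.
print(additive_hw_forecast(level, trend, 4.0, k=1))  # 110.25
```

The level moves halfway between the deseasonalized observation (105) and the previous projection (102), which shows how $\alpha$ trades off new data against the existing estimate.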

##### 3.2. Machine Learning Algorithms

Machine learning explores algorithms that analyze a set of data, learn from the insights gathered, and make predictions on new data [17]. For the blood supply forecasting, we leverage two widely used machine learning techniques: artificial neural networks and regression.

###### 3.2.1. Artificial Neural Networks (ANN)

ANN is a machine learning method adapted from biological neural networks. The network consists of several nodes that are distributed across numerous layers, and each layer is connected to its previous and subsequent layers within the network [17]. These interconnected elements work closely to process the information that they receive from the nodes of the previous layer and transfer it to the next layer through the sigmoid function. ANNs are particularly useful for modeling complex relationships in high-dimensional data or where the relationship between the input and output variables is not easy to understand [17].
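The layer-to-layer flow described above can be sketched as a forward pass through one hidden layer with sigmoid activations. The layer sizes, the weights, the linear output layer, and the function names are all assumptions for illustration; the text does not specify the network architecture used in the study.

```python
import numpy as np

def sigmoid(z):
    """Logistic function that squashes each input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Pass inputs through one sigmoid hidden layer, then a linear output layer."""
    hidden = sigmoid(W1 @ x + b1)  # activations sent on to the next layer
    return W2 @ hidden + b2        # linear output, as is common for regression

# Hypothetical network: 2 inputs (e.g., two lagged supply values), 3 hidden nodes, 1 output.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)
prediction = forward(np.array([0.4, 0.7]), W1, b1, W2, b2)
```

In practice the weights would be learned from historical data by minimizing a loss such as the squared forecast error; the random weights here only demonstrate the data flow.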

###### 3.2.2. Multiple Regression

Multiple regression addresses another class of machine learning problem: predicting a continuous value of a variable rather than a class label, as in classification [17]. Linear regression with ordinary least squares is one of the classic machine learning algorithms in this domain. The mathematical formula for the regression model is represented as follows:

$$y = \beta_0 + \beta_1 x + \varepsilon,$$

where $y$ is the response variable, $x$ is an independent variable, $\beta_0$ is the intercept, $\beta_1$ is the slope coefficient (both $\beta_0$ and $\beta_1$ are unknown coefficients to be estimated by the model), and $\varepsilon$ is the error variable.
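Ordinary least squares estimation of the regression coefficients can be sketched with NumPy. The function name `fit_ols` and the toy data are assumptions; the same function handles one predictor or several, since each predictor simply adds a column to the design matrix.

```python
import numpy as np

def fit_ols(x, y):
    """Return [intercept, slope_1, ..., slope_m] minimizing the sum of squared errors."""
    x = np.asarray(x, dtype=float)
    X = np.column_stack([np.ones(len(x)), x])  # prepend a column of ones for the intercept
    beta, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return beta

# Hypothetical noise-free data with one predictor: y = 3 + 2x.
x1 = np.array([0.0, 1.0, 2.0, 3.0])
beta_simple = fit_ols(x1, 3.0 + 2.0 * x1)

# Hypothetical noise-free data with two predictors: y = 1 + 2*x_a - x_b.
x2 = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 8.0]])
beta_multiple = fit_ols(x2, 1.0 + 2.0 * x2[:, 0] - x2[:, 1])
```

For forecasting, the predictor columns could be time indices, month indicators, or lagged supply values; which of these the study used is not stated here.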

##### 3.3. Evaluation of the Different Methods

We use four different measures of forecast errors for evaluating the model performance and the accuracy of the methods; they are MAE, MSE, BIAS, and MAPE [12, 15, 18].

Assume $y_1, y_2, \ldots, y_n$ are the actual data and $\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n$ are the forecasted data; the forecast errors, $e_t$, are then given by $e_t = y_t - \hat{y}_t$.

(a) Mean absolute error (MAE): it measures the average magnitude of the forecast errors, where all individual errors have equal weights:

$$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} |e_t|.$$

(b) Mean squared error (MSE): it also measures the magnitude of the forecast errors, and larger errors get penalized more due to squaring:

$$\mathrm{MSE} = \frac{1}{n} \sum_{t=1}^{n} e_t^2.$$

(c) BIAS: this is an indication of whether the forecast is overestimating or underestimating the actual supply over the forecast horizon:

$$\mathrm{BIAS} = \sum_{t=1}^{n} e_t.$$

(d) Mean absolute percentage error (MAPE): it measures the relative magnitude of forecasting errors in percentage terms:

$$\mathrm{MAPE} = \frac{100}{n} \sum_{t=1}^{n} \left| \frac{e_t}{y_t} \right|.$$
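The four error measures can be computed together as in the sketch below. It assumes the error convention "actual minus forecast" and treats BIAS as the sum of the errors over the horizon; both are assumptions, as are the function name and the sample numbers, and other references use the opposite sign or an average.

```python
import numpy as np

def forecast_error_measures(actual, forecast):
    """Return MAE, MSE, BIAS, and MAPE for paired actual and forecasted values."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    e = actual - forecast  # assumed sign convention: positive means under-forecasting
    return {
        "MAE": float(np.mean(np.abs(e))),
        "MSE": float(np.mean(e**2)),
        "BIAS": float(np.sum(e)),                          # assumed to be a sum, not a mean
        "MAPE": float(100.0 * np.mean(np.abs(e / actual))),  # requires nonzero actual values
    }

# Hypothetical monthly supply (units) and forecasts.
measures = forecast_error_measures([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
```

In this toy case the over- and under-forecasts cancel, so BIAS is zero even though MAE and MSE are not, which is why the measures are reported together.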

#### 4. Results

##### 4.1. Data Collection

The historical supply data for five years from 2013 to 2017 are first gathered from the health records. The summary statistics are given in Table 1.