Forecasting the Severity of COVID-19 Pandemic Amidst the Emerging SARS-CoV-2 Variants: Adoption of ARIMA Model
As of February 18, 2021, about 110 million COVID-19 cases and more than 2.43 million related deaths had been reported globally. Viruses continuously change through mutation; hence, different variants of SARS-CoV-2 have been reported globally. The United Kingdom (UK), South Africa, Brazil, and Nigeria are the countries from which these emerging variants were first reported and from which they are now spreading globally. Therefore, these countries were selected as the research sample for the present study. The datasets analyzed in this study span March 1, 2020, to January 31, 2021, and were obtained from the World Health Organization website. The study used the Autoregressive Integrated Moving Average (ARIMA) model to forecast coronavirus incidence in the UK, South Africa, Brazil, and Nigeria. ARIMA models with the minimum Akaike Information Criterion Correction (AICc) and statistically significant parameters were chosen as the best models in this research. Accordingly, for the new confirmed cases, ARIMA (3,1,14), ARIMA (0,1,11), ARIMA (1,0,10), and ARIMA (1,1,14) models were chosen for the UK, South Africa, Brazil, and Nigeria, respectively. For the confirmed death cases, ARIMA (3,0,4), ARIMA (0,1,4), ARIMA (1,0,7), and ARIMA (Brown) models were selected for the UK, South Africa, Brazil, and Nigeria, respectively. The results of the ARIMA model forecasting showed that if the required measures are not taken by the respective governments and health practitioners in the days to come, the magnitude of the coronavirus pandemic is expected to increase in the selected countries.
Recently, there have been reports from around the world of variants of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) that spread even more quickly than the original virus that causes coronavirus disease 2019 (COVID-19). The virus was first detected in Wuhan, China, in December 2019. COVID-19 has since spread throughout China and the rest of the world [2, 3]. Coronaviruses are highly contagious and rapidly transferred from person to person. The disease, which is now a global pandemic, has spread swiftly around the world, causing severe public health issues as well as an economic crisis. When unpredictable infectious diseases arise, an outbreak can lead to an epidemic and eventually result in a pandemic. Researchers attempt to use modelling techniques to explain the observable trends and predict specific patterns in the future so that health practitioners can organize health care programs and plan their responses to mitigate such situations [4, 5]. Wangari et al. noted that epidemiological models are becoming increasingly useful for understanding the complex processes regulating infectious disease transmission.
Phan reported that COVID-19 is characterized by its highly pathogenic nature and is possibly caused by a zoonotic agent that spreads quickly among people, which makes it very dangerous for the world and requires each country to put proper measures in place to curb or control the spread. Scientists are actively conducting empirical studies to inform decisions concerning COVID-19 that may end the pandemic. Anjorin noted that threats to global public health include the 2002 outbreak of severe acute respiratory syndrome (SARS), which caused about 800 deaths among approximately 8000 reported cases. Emerging infectious diseases continue to threaten humanity and cause many deaths worldwide. According to estimates, the H1N1 pandemic of 2009 killed 18500 people, and 800 people out of 2500 cases died from the Middle East Respiratory Syndrome (MERS) in 2012 [8, 9]. The Ebola outbreak killed 11310 people out of 28616 cases in 2014, and the latest coronavirus disease (COVID-19) pandemic has killed more than 2.43 million people out of 110 million reported cases [8, 9]. This research focuses on four countries (United Kingdom, South Africa, Brazil, and Nigeria). These countries were selected because, according to a report by the Centers for Disease Control and Prevention (CDC), they recorded the first cases of the new SARS-CoV-2 variants. In the final months of 2020, many countries around the globe recorded different forms of SARS-CoV-2. According to a report released by the CDC in December 2020, several SARS-CoV-2 variants are circulating worldwide.
In the United Kingdom (UK), a new variant of SARS-CoV-2 (B.1.1.7) was detected in early 2021. Volz et al. noted that the SARS-CoV-2 variant B.1.1.7 spread across the United Kingdom early this year. The Visual and Data Journalism Team observed a recent spike in coronavirus cases, driven by the new variant (B.1.1.7). The B.1.1.7 lineage carries a large number of nonsynonymous substitutions of immunological significance. The World Health Organization reported that variant B.1.1.7 had spread globally to more than 50 countries [13, 14]. Rendana and Idris reported that about 50000 new cases of the variant B.1.1.7 had been recorded in the UK since it started spreading among the populace in the early part of this year.
Volz et al. observed that the mortality rate of COVID-19 would rise due to the new variant. Similarly, Horby et al. found that the variant B.1.1.7 is associated with a higher risk of death than other variants. This variant also spreads more quickly than earlier variants. Among 60-year-olds in the UK, the coronavirus death rate was around 10 per 10,000. With the new strain, however, the UK currently records about 13 or 14 deaths per 10,000 in the same population. Infections with the more recent variant are often dominated by symptoms different from those associated with the original virus.
Brazil (BRA) publicly announced the emergence of the SARS-CoV-2 variant P1, or "gamma," in January this year. Health experts in Brazil suggested that the variant might have contributed to the rise in the number of cases reported in Manaus. The gamma lineage has a mutation that helps it evade a person's antibodies from previous infections, which indicates a high possibility that it can easily reinfect people who have already had coronavirus. Silva et al. reported that the new variant P1 might increase the risk of respiratory infections, increase the death rate, and even lead to the collapse of health care in Brazil. Similarly, Page and Hambly opined that, compared to the original SARS-CoV-2 virus, the gamma variant has several mutations, including the N501Y mutation, which is also present in the alpha (B.1.1.7) variant and the beta (B.1.351) variant. This mutation makes it easier for the virus's spike proteins to bind to human cells, potentially making it more infectious. Madhi et al. also indicated that the UK strain B.1.1.7 has been associated with a 53% increase in transmissibility.
The variant P1 adds to the worries since it seems to have acquired a similar constellation of mutations and has appeared in a location with a high level of immunity. In a cluster of cases in the Amazon region, Sabino et al. found that 42% of the specimens sequenced from late December were associated with the P1 variant. The area has observed an increase in cases since mid-December. The advent of this variant raises questions about a possible rise in transmissibility or in the tendency of individuals to be reinfected with SARS-CoV-2.
South Africa (SA) health authorities reported a new variant of SARS-CoV-2 called B.1.351 (N501Y.V2). South Africa has also reported a new COVID-19 strain that appears to have mutated more than the UK’s variant. B.1.351 was first discovered in early October 2020 and shared specific mutations with B.1.1.7. This new strain may be responsible for driving the country’s current resurgence of the disease, although it is too early to confirm it. As the number of total confirmed cases exceeds one million, South African authorities have imposed stricter restrictions. According to health officials and scientists heading the country’s virus strategy, this version is dominant among newly reported infections in South Africa. It tends to be more infectious than the original virus .
A recent report by Mwenda et al. indicated that variant B.1.351 was first found in the Eastern Cape Province of South Africa. The detection of B.1.351 coincided with a rise in new coronavirus cases in Zambia, which is close to South Africa. The B.1.351 variant may be linked to higher viral loads, and it contains another spike protein mutation that may prevent antibody binding, reduce vaccine efficacy, or blunt naturally developed immunity. The B.1.351 variant is a cause for concern because it has been associated with increased disease transmission and decreased vaccine efficacy. B.1.351 may show some form of improved escape from immune pressure and onward transmission, resulting in a fitness advantage, but the evidence for this is still lacking.
The Nigeria Center for Disease Control (NCDC) confirmed that several variants of SARS-CoV-2 were known to be circulating in Nigeria (NIG) as of February 14, 2021, and were rapidly evolving. The variation among SARS-CoV-2 strains suggests multiple introductions of the virus into Nigeria from various parts of the world, and it adds to the evidence of community transmission in different Nigerian states. The most recent discovery is a new variant of SARS-CoV-2 recorded in Nigeria, which belongs to a different lineage from the other variants. The first B.1.525 case in Nigeria was discovered in a sample taken from a patient in Lagos State. B.1.525 is thus a new strain but not yet a variant of concern, and more research is underway. B.1.525 was first discovered by genome sequencing in Nigeria in mid-December, but cases quickly followed in the United Kingdom, France, and other countries. B.1.525 represented over 20% of Nigerian genomes sequenced after only two months. Haseltine reported that there have been over 200 reported cases of the Nigeria variant B.1.525 around the globe.
Mathematical models have been identified as critical techniques that can help provide a framework for our understanding of infectious diseases . Statistical methods can be used to model and forecast COVID-19 transmission. The obtained forecasts should be used to execute controlling techniques and make necessary decisions to mitigate the impact of COVID-19 . Among the several univariate time series methodologies, Autoregressive Integrated Moving Average (ARIMA) and exponential smoothing techniques are commonly used for modeling. Aljandali  indicated in a recent study that statistical and forecasting models such as ARIMA and Seasonal Autoregressive Integrated Moving Average (SARIMA) for time series predictions for infectious disease patterns have been widely used and give accurate results. One of the most pressing issues in dealing with pandemics like COVID-19 is early detection and a short-term estimate of the pandemic’s eventual magnitude and peak time. Early prediction using mathematical and statistical models combined with existing data will successfully assist governments and public health experts in implementing suitable preventative and control initiatives .
Using the ARIMA model, Ceylan predicted the pattern of coronavirus emergence in the most affected European nations, specifically France, Spain, and Italy. The study results offered insight into the epidemic's patterns and explained these regions' epidemiological levels. Predicting coronavirus incidence trends in Italy, Spain, and France could also help other countries prepare for the outbreak by assisting them in developing policies and precautions. Perone used the ARIMA model to analyze and forecast the epidemiological data of confirmed daily COVID-19 cases in Italy. The ARIMA model was used by Lman et al. to forecast COVID-19 incidence in some African countries. The study's results indicated that the virus's spread would intensify in the coming days.
Furthermore, Ding et al.  studied the forecasts for Italy’s new confirmed and death cases. Their research proposed a time series analysis for the predictions based on the ARIMA model and discovered that the COVID-19 spread surge forecast could be realized with a preprocessed cumulative newly diagnosed scenario based on the ARIMA and FUZZY time series methodologies. Verma et al.  developed some models for predicting COVID-19 infections, mortality, and recovery in India and Maharashtra. The estimated values for Maharashtra and India as a whole were in exceptional alignment with actual values in all six COVID-19 scenarios.
Moreover, Yonar  forecasted the numbers of coronavirus pandemic cases in Turkey and particular G8 countries using ARIMA and other prediction models. This study indicated that certainly, more precise assessments could be made in future studies with more data. However, as this study provides data on the number of cases that may be increased in the absence of action in the current situation, it may direct countries to take the appropriate steps and intervene earlier.
The literature reviewed indicates that the ARIMA model is well suited to predicting pandemics like the coronavirus. The ARIMA model is one of several predictive methods that allow scientists to estimate and forecast the frequency and pattern of an event like COVID-19. The ARIMA and SARIMA models have been employed by a large number of researchers to study the prevalence and outbreaks of various diseases [2, 31, 34–37]. The COVID-19 pandemic has affected the world economy in different phases, leading to partial or total lockdown in some countries and the closure of factories worldwide; physicians, politicians, business people, operational directors, scientists, and civilians alike are all in disarray. Effective prediction of COVID-19 case trends is crucial for preparing the proper health care delivery system for affected individuals and the country in terms of pandemic management and resource planning.
The contribution of this study is that we innovatively used the ARIMA model to estimate whether SARS-CoV-2 will increase amidst the new COVID-19 variant reported in these countries. The study is aimed at helping monitor the pattern of the COVID-19 and the new variant of SARS-CoV-2 in the selected countries. The ARIMA model was chosen for forecasting in this study because it theoretically justifies the forecasting and optimizes the prediction performance of the COVID-19 variant. Likewise, the forecast also provides a formidable data platform to help the respective government or health professionals of the countries selected to make prudent decisions concerning the prevalence or spread of the pandemic.
This paper is organized as follows. Section 2 describes the materials and methods used in this study. Section 3 reports the results. The discussion is presented in Section 4. Conclusions and recommendations are outlined in Section 5.
2. Materials and Methods
2.1. Description of Data
The study period extends from March 1, 2020, to January 31, 2021 (337 days). The study relies on daily new cases and daily death cases of COVID-19 gathered from the World Health Organization dashboard (https://covid19.who.int). We fitted the models to the new cases and death cases because the number of newly recorded COVID-19 cases continues to rise globally and these series are better suited to predicting the trend. Moreover, the authors selected the newly reported cases and death cases to anticipate the future prevalence of COVID-19 in the specified countries. New cases and deaths provide the raw data on the ground that enable policymakers to draw up counter programs and policies to seal the sources of infection. Furthermore, several researchers, for instance [2, 38–41], used the daily new cases and death cases in forecasting COVID-19 around the globe.
The case definition by the World Health Organization for SARS-CoV-2 variant infection is as follows:
(i) First, a patient must meet the criteria for a diagnosis and be a contact related to a suspected or confirmed case
(ii) Second, there is a suspected case with symptoms indicative of COVID-19 illness on chest imaging
(iii) Third, a person has recently developed anosmia (loss of smell) or ageusia (loss of taste) with no other known causes
(iv) Finally, an adult with respiratory distress before death who was a contact of a possible or confirmed case or associated with a COVID-19 cluster died for no apparent reason
The case definition for new COVID-19 cases treats a viral test as the only way to designate a COVID-19 case as "confirmed." A person who tests positive is referred to as a "confirmed COVID-19 case." This indicates that the person is infected with COVID-19 and is hence infectious. If a viral test comes out negative, the person is not currently infected with COVID-19. For monitoring purposes, a COVID-19 death is defined as a death resulting from a clinically compatible illness in a probable or confirmed COVID-19 case, unless there is a clear alternative cause of death that cannot be linked to COVID-19 disease (e.g., trauma). There should be no period of complete recovery between the illness and death.
2.1.1. United Kingdom
The current population of the UK hovers around 68 million as of November 2021 . The first case of coronavirus in the United Kingdom was recorded on January 31, 2020. Currently, the UK has recorded over 9 million COVID-19 cases as of November 2021, with more than 140,000 death cases . Figure 1 indicates a graphical representation of new confirmed cases and death cases between March 1, 2020, and January 31, 2021.
2.1.2. South Africa
South Africa is a historically and culturally prosperous nation positioned at the African continent's southern tip, bordering the Indian and South Atlantic Oceans. With a population of 56.5 million citizens, the country is a one-of-a-kind example of economic development, with several new advances that are more relevant than one might expect. South Africa confirmed the first case of COVID-19 on March 5, 2020. As of November 2021, South Africa has recorded over 2 million COVID-19 cases, with more than 80,000 deaths. Figure 2 shows new confirmed cases and death cases for South Africa between March 1, 2020, and January 31, 2021.
2.1.3. Brazil
Brazil's first record of COVID-19 was on January 3, 2020. Brazil is one of the countries hardest hit by the COVID-19 pandemic, with over 21 million confirmed cases and over 600000 confirmed deaths by November 2021. The beginning of 2021 was marked by the second wave of COVID-19, which differed from the first wave in its simultaneous explosive surges of cases across different regions of the country, adding enormous pressure to a health system already under strain after a year of the pandemic. Figure 3 shows new confirmed cases and death cases for Brazil between March 1, 2020, and January 31, 2021.
2.1.4. Nigeria
On February 27, 2020, the first case was confirmed in Nigeria. The country had recorded more than 200000 cases and over 2000 deaths as of November 2021. COVID-19 testing rates in Nigeria have been significantly lower than in other African countries with comparable population sizes. This has been linked to delays in the supply of testing kits and reagents during border closures. Figure 4 shows new confirmed cases and death cases for Nigeria between March 1, 2020, and January 31, 2021.
2.2. ARIMA Model
The Autoregressive Integrated Moving Average (ARIMA) models, developed by Box and Jenkins in the 1970s, are time series models commonly used today for prediction and for making relevant decisions based on the forecasted values. An ARIMA model is fitted to a given time series dataset. The three terms that make up an ARIMA model are $p$, $d$, and $q$, where $p$ stands for the order of the autoregressive (AR) term, $q$ for the order of the moving average (MA) term, and $d$ for the number of differences required to make the time series stationary. The value of $d$ is the minimum number of differences needed to achieve stationarity.
The ARIMA model involves a complicated process, but it can be summarized in these four steps:
(i) Identification of the ARIMA structure ($p, d, q$)
(ii) Estimation of the coefficients of the formulation
(iii) Diagnostic test or fitting test of the estimated residuals
(iv) Forecasting the future outcomes based on the historical data
The autoregressive (AR) model involves regressing the variable of interest on its own lagged values. The moving average (MA) process involves regressing the time series on the current residual and its lagged residuals. The integrated (I) component denotes the difference between the actual time series data and its lagged values. The differencing can be done once or twice. The ARMA ($p, q$) model is typically expressed mathematically as follows:
$$y_t = c + \sum_{i=1}^{p} \phi_i y_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j},$$
where $\phi_i$ and $\theta_j$ are the AR and MA parameters and $c$ is the constant term in the model. $y_t$ represents the confirmed and death cases at day $t$, and $t = 0$ denotes the date of the first case of COVID-19 detected in a given country. $\varepsilon_t$ are the values of the residual at time $t$ such that $\varepsilon_t \sim N(0, \sigma^2)$. The time series plots of the daily COVID-19 confirmed cases and death cases for the selected countries are presented in Figures 1–4, respectively.
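The differencing step and the ARMA recursion described above can be illustrated with a minimal Python sketch. The function names and the symbols `phi`, `theta`, and `c` (AR coefficients, MA coefficients, and constant) are illustrative assumptions and not the software used in this study; the sketch only shows how a one-step-ahead prediction is assembled once the coefficients are known.

```python
from typing import List, Sequence


def difference(series: Sequence[float], d: int = 1) -> List[float]:
    """Apply first differencing d times: y'_t = y_t - y_{t-1}."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


def arma_one_step(
    history: Sequence[float],
    residuals: Sequence[float],
    phi: Sequence[float],
    theta: Sequence[float],
    c: float = 0.0,
) -> float:
    """One-step-ahead ARMA(p, q) prediction (hypothetical helper).

    history[-1] is y_{t-1} and residuals[-1] is eps_{t-1}. The unknown
    future shock eps_t is replaced by its expected value, zero.
    """
    # AR part: phi_1 * y_{t-1} + ... + phi_p * y_{t-p}
    ar_part = sum(phi[i] * history[-1 - i] for i in range(len(phi)))
    # MA part: theta_1 * eps_{t-1} + ... + theta_q * eps_{t-q}
    ma_part = sum(theta[j] * residuals[-1 - j] for j in range(len(theta)))
    return c + ar_part + ma_part
```

For an ARIMA model with $d = 1$, the prediction would be made on the differenced series and then added back to the last observed value to return to the original scale.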
2.3. Akaike Information Criterion (AIC) and Akaike Information Criterion Correction (AICc)
The main issue with ARIMA modeling is choosing the best ARIMA model. The best ARIMA models can be found using model selection criteria. Model selection criteria typically used for ARIMA models include the Akaike Information Criterion (AIC) and the Akaike Information Criterion Correction (AICc). AIC provides a means of comparing models in terms of expected prediction accuracy. AIC estimates the relative amount of information lost by a given model: the less information a model loses, the higher the model's quality. In estimating the amount of information lost by a model, AICc considers the trade-off between the model's goodness of fit and its simplicity. Sen and Shitan reported that model selection aims to discover a good predictor that describes a system. The AIC and AICc are standard model selection approaches for ARIMA models:
$$\mathrm{AIC} = -2\ln(L) + 2k,$$
where $L$ is the model's likelihood and $k$ denotes the total number of estimated parameters. A good model is one with the lowest AIC among all candidate models.
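As a hedged illustration of the two selection criteria, the sketch below computes AIC from a log-likelihood and then applies the standard small-sample correction, AICc = AIC + (2k^2 + 2k) / (n - k - 1), where n is the number of observations. The function names are assumptions made for this example, and statistical packages differ in which parameters they count in k (for instance, whether the residual variance is included), so values may not match any particular software output.

```python
def aic(log_likelihood: float, k: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * k - 2 * log_likelihood


def aicc(log_likelihood: float, k: int, n: int) -> float:
    """AIC with the small-sample correction term added.

    n is the number of observations used to fit the model
    and k the number of estimated parameters.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(log_likelihood, k) + (2 * k * k + 2 * k) / (n - k - 1)


# Illustrative comparison of two hypothetical candidate models on 337 days.
candidates = {"model_a": (-2400.0, 5), "model_b": (-2398.5, 12)}
best = min(candidates, key=lambda m: aicc(*candidates[m], n=337))
```

In this made-up comparison the larger model gains only 1.5 units of log-likelihood for seven extra parameters, so the smaller model has the lower AICc and is selected.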
2.4. Criteria for the Comparison of Goodness-of-Fit
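The AR and MA terms are combined in an ARIMA model, in which the time series has been differenced, where necessary, to make it stationary. To determine the accuracy of our model, we computed three measures: root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE):
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2}, \qquad \mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n}\left|y_t - \hat{y}_t\right|, \qquad \mathrm{MAPE} = \frac{100}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right|,$$
in which $y_t$ are the actual values, $\hat{y}_t$ are the predicted values of the ARIMA model at time $t$, and $n$ is the number of observations.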
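The three accuracy measures can be computed directly from paired actual and predicted values. The following is a minimal sketch with illustrative function names. One assumption is made explicit here: days with zero actual counts are skipped when computing MAPE, because the percentage error is undefined for them; the source does not state how such days were handled.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Assumption: observations with actual == 0 are skipped.
    """
    terms = [abs((a - p) / a) for a, p in zip(actual, predicted) if a != 0]
    if not terms:
        raise ValueError("MAPE is undefined when all actual values are zero")
    return 100.0 * sum(terms) / len(terms)
```

For example, with actual values 100, 200, 400 and predictions 110, 190, 400, the errors are 10, -10, and 0, giving an MAE of about 6.67 and a MAPE of 5%.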
The data used for the ARIMA model were first examined with the Augmented Dickey-Fuller Test to check stationarity. The Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots are used to identify the orders of the ARIMA model. Both are used during model identification because they have different characteristics. For an AR ($p$) process, the PACF cuts off after lag $p$ while the ACF tails off; for an MA ($q$) process, the ACF cuts off after lag $q$ while the PACF tails off. The ARIMA model predicts values with upper and lower bounds, and a $(1-\alpha)$ confidence interval for the predicted value lies between the upper and lower limits. Any realization that falls within this interval is accepted at the stated confidence level.
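To make the identification step concrete, the sketch below computes the sample ACF and derives the sample PACF from it with the Durbin-Levinson recursion. This is an illustrative stand-alone implementation with assumed function names, not the routine used to produce the figures in this study. In practice, a PACF that drops to near zero after lag p points to an AR(p) term, and an ACF that drops to near zero after lag q points to an MA(q) term.

```python
from typing import List, Sequence


def acf(series: Sequence[float], max_lag: int) -> List[float]:
    """Sample autocorrelations r_0..r_max_lag (r_0 is always 1)."""
    n = len(series)
    mean = sum(series) / n
    dev = [y - mean for y in series]
    denom = sum(v * v for v in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def pacf(series: Sequence[float], max_lag: int) -> List[float]:
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(series, max_lag)
    out = [1.0]
    prev: List[float] = []  # coefficients phi_{k-1,1..k-1} from the last step
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append phi_kk.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
    return out
```

The lag-1 partial autocorrelation always equals the lag-1 autocorrelation; the two functions differ only from lag 2 onward, where the PACF removes the influence of the intermediate lags.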
The Ljung-Box statistic, Q(18), was used to assess the fit of the model and how well it can make predictions. The ARIMA model was used to forecast confirmed and death cases for the selected countries over the next 27 days to see whether the current COVID-19 variants would increase confirmed new cases and death cases. MAPE and root mean square error (RMSE) values were evaluated to check the forecast's accuracy and validity.
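The Ljung-Box statistic is Q = n(n + 2) * sum over k = 1..h of r_k^2 / (n - k), where r_k is the lag-k autocorrelation of the residuals and n is the number of residuals. A minimal sketch follows, with h = 18 as the default to mirror Q(18). The function names are assumptions, and the p value (from a chi-square distribution, with degrees of freedom reduced by the number of fitted AR and MA terms) is deliberately not computed here.

```python
from typing import List, Sequence


def autocorrelations(series: Sequence[float], max_lag: int) -> List[float]:
    """Sample autocorrelations r_1..r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(v * v for v in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(residuals: Sequence[float], h: int = 18) -> float:
    """Ljung-Box Q statistic over the first h lags of the residuals."""
    n = len(residuals)
    if h >= n:
        raise ValueError("h must be smaller than the number of residuals")
    r = autocorrelations(residuals, h)
    return n * (n + 2) * sum(r[k - 1] ** 2 / (n - k) for k in range(1, h + 1))
```

A small Q relative to the chi-square reference distribution indicates that the residuals show no remaining autocorrelation, which is the behavior expected of a well-specified model.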
3. Results
3.1. Descriptive Statistics for the Selected Countries
The descriptive statistics indicate that a daily average of 4314, 16645, 27314, and 389 people contracted the COVID-19 virus in South Africa, the UK, Brazil, and Nigeria, respectively. Also, a daily average of 131, 306, 666, and 5 people succumbed to COVID-19 in South Africa, the UK, Brazil, and Nigeria, respectively. The skewness values for both confirmed new cases and daily confirmed death cases were greater than one, except for Brazil. This means that the daily reported cases and death cases are skewed to the right, as indicated in Table 1.
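Skewness of a daily series can be checked with a short calculation. The sketch below uses the simple moment-based estimator, g1 = m3 / m2^(3/2), where m2 and m3 are the second and third central moments; this is an assumption made for illustration, since statistical packages often report a bias-adjusted variant that differs slightly for small samples.

```python
from typing import Sequence


def sample_skewness(values: Sequence[float]) -> float:
    """Moment-based skewness: third central moment over variance^(3/2)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined for a constant series")
    return m3 / m2 ** 1.5
```

A value above zero indicates a longer right tail, as in a series of mostly low daily counts with occasional large spikes; a symmetric series gives a value of zero.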
3.2. Unit Root Test (Augmented Dickey-Fuller Test)
The Augmented Dickey-Fuller (ADF) Test was used to check whether the time series (daily confirmed new cases and confirmed COVID-19 death cases) were stationary. The test results are shown in Table 2. At level, the daily new cases and death cases were not stationary. However, both confirmed new cases and daily confirmed death cases for each of the selected countries became stationary after first differencing. As a result, the series are ready to be modeled with the Box-Jenkins ARIMA model.
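The idea behind the unit root test can be sketched with the basic (unaugmented) Dickey-Fuller regression, diff(y)_t = a + gamma * y_{t-1} + e_t, where a strongly negative t-statistic for gamma is evidence against a unit root. This is a simplification of the ADF test used in the study: it omits the lagged difference terms, the function name is hypothetical, and the threshold of -2.86 is the approximate asymptotic 5% critical value for the constant-only case.

```python
import math
import random
from typing import Sequence

CRITICAL_5PCT = -2.86  # approximate asymptotic value, constant-only case


def dickey_fuller_stat(series: Sequence[float]) -> float:
    """t-statistic of gamma in: diff(y)_t = a + gamma * y_{t-1} + e_t."""
    x = list(series[:-1])  # lagged levels y_{t-1}
    dy = [series[t] - series[t - 1] for t in range(1, len(series))]
    n = len(dy)
    x_bar = sum(x) / n
    dy_bar = sum(dy) / n
    sxx = sum((v - x_bar) ** 2 for v in x)
    sxy = sum((v - x_bar) * (w - dy_bar) for v, w in zip(x, dy))
    gamma = sxy / sxx
    a = dy_bar - gamma * x_bar
    sse = sum((w - a - gamma * v) ** 2 for v, w in zip(x, dy))
    se_gamma = math.sqrt(sse / (n - 2) / sxx)
    return gamma / se_gamma


# Simulated example: a random walk is nonstationary at level,
# but its first difference is white noise and hence stationary.
random.seed(0)
walk = [0.0]
for _ in range(499):
    walk.append(walk[-1] + random.gauss(0.0, 1.0))
steps = [walk[t] - walk[t - 1] for t in range(1, len(walk))]
stat_diff = dickey_fuller_stat(steps)
diff_is_stationary = stat_diff < CRITICAL_5PCT
```

The simulated example mirrors the pattern reported above: a series that is not stationary at level can become stationary after first differencing, which corresponds to setting d = 1 in the ARIMA specification.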
3.3. Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF)
The data were examined for stationarity and seasonality using the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) graphs. According to He and Tao, the PACF graph depicts the degree of correlation between a variable and a lag of that variable that is not explained by the lower-order lags. A diagnostic check of the fitted model residuals is required, which involves graphical analysis and statistical testing. The fitted model, histogram, ACF, and PACF were plotted to perform a visual investigation of the residuals, as indicated in Figures 5 and 6. The correlogram (ACF) plots demonstrate that there is no autocorrelation in the residuals. We used the ACF and PACF as diagnostic tools to check the independence of the noise terms of the ARIMA model. The sequence plot of the residuals and the sample ACF and PACF show that the residuals from these ARIMA models follow a white noise process. Therefore, the estimated ARIMA models capture the dependence structure of the time series of confirmed new cases and death cases very well.
Figure 5: ACF and PACF plots for confirmed new cases: (a) United Kingdom; (b) Brazil; (c) South Africa; (d) Nigeria.
Figure 6: ACF and PACF plots for confirmed death cases: (a) United Kingdom; (b) Brazil; (c) South Africa; (d) Nigeria.
We then analyzed the parameters of the selected ARIMA models to check the coefficients of the MA and AR components of the model, the standard error (s.e.), the Akaike Information Criterion (AIC), the Akaike Information Criterion Correction (AICc), and the $p$ value. The study used the ARIMA model with the minimum AICc and significant parameters. Accordingly, for the new confirmed cases, ARIMA (3,1,14), ARIMA (0,1,11), ARIMA (1,0,10), and ARIMA (1,1,14) models were chosen for the UK, South Africa, Brazil, and Nigeria, respectively, with minimum AICc values of 4805.208, 6682.536, 7626.332, and 5024.805. The $p$ values were also statistically significant for the models. For the confirmed death cases, ARIMA (3,0,4), ARIMA (0,1,4), ARIMA (1,0,7), and ARIMA (Brown) models were selected for the UK, South Africa, Brazil, and Nigeria, respectively, with AICc values of 2826.250, 4384.203, 5034.412, and 2081.057. The AR and MA coefficients were statistically significant at the 0.05 level, meaning that they are significantly different from zero at the 95 percent confidence level. The parameters also suggest that the selected models are the best for forecasting, as indicated in Table 3.
Table 4 displays the goodness-of-fit criteria of the Box-Jenkins statistics for each selected country for confirmed new cases and daily confirmed death cases. Generally, the models have high goodness-of-fit values for confirmed new cases and daily confirmed death cases. The values for confirmed new cases were 0.971, 0.952, 0.712, and 0.877 for South Africa, the UK, Brazil, and Nigeria, respectively. The values for confirmed death cases were 0.134, 0.205, 0.705, and 0.804 for South Africa, the UK, Brazil, and Nigeria, respectively. The Box-Jenkins statistics for the models are significant and indicate that the selected ARIMA models fit the data well for forecasting.
The mean absolute percentage error (MAPE), which expresses absolute errors as a percentage of the actual values, is a better measure of forecast performance. For the confirmed new cases, the MAPE was lowest in the UK, followed by South Africa, Brazil, and Nigeria, with values of 25.390, 28.867, 32.958, and 36.186, respectively. For the predicted death cases, the MAPEs were 60.955, 46.651, 30.343, and 80.946 for the UK, South Africa, Brazil, and Nigeria, respectively, as presented in Table 4.
Following the estimation of the ARIMA model for each country, we forecasted new cases and death cases for the selected countries over the next 27 days, that is, from March 1, 2021, to March 27, 2021. The Ljung-Box test indicated that the fitted values closely matched the actual values. The fitted and forecast values are shown in Tables 5 and 6. The UK, South Africa, and Brazil showed an unprecedented increase in projected COVID-19 cases. Figure 7 shows the forecasted values for confirmed new cases and death cases for the selected countries.
The forecasts show that, at the 95% confidence level, confirmed cases in the selected countries might lie between 21633 and 49637, 5264 and 16684, 1003 and 1936, and 42736 and 80043 for the UK, South Africa, Nigeria, and Brazil, respectively. For the death cases, the forecasted values might lie between 819 and 1439, 243 and 705, 16 and 26, and 930 and 1782 for the UK, South Africa, Nigeria, and Brazil, respectively. This study supports the recent findings of Horby et al., who reported that the variant B.1.1.7 is associated with a higher risk of death than other variants.
The results indicate that the spread of the new variant of SARS-CoV-2 will increase the number of new cases in the UK. Concerning the death cases, the analysis shows that if health officials and the government do not implement proper measures, the new variant will cause an increase in the death toll in the UK. A recent report by Betsy Klein indicated that the UK variant B.1.1.7 is more contagious than the original strain (COVID-19), and it is also possibly more dangerous and associated with a higher risk of death. In Brazil, the study showed that there would be an increase in the number of confirmed new cases, as indicated in Table 5. Death cases in Brazil resulting from the coronavirus are also expected to rise based on the analysis of this study. This result is consistent with Taylor, who estimated that almost 400000 Brazilians have died from COVID-19, representing 13% of the world’s total COVID-19 deaths, which is even more than died in the country’s entire AIDS epidemic.
In South Africa, the forecasts revealed that new coronavirus cases would rise in the days ahead. Death cases resulting from COVID-19 will be somewhat lower than those in countries like the UK and Brazil. Researchers have indicated that the South Africa variant (B.1.351) may be approximately 50% more contagious, because changes in the virus’s structure appear to make it easier for the virus to infect human cells.
Interestingly, the analysis showed that in Nigeria, the growth in daily coronavirus cases may be slower than in the UK, Brazil, and South Africa. The death cases resulting from coronavirus are expected to reduce based on the results from this study. Ogundokun et al. indicated that the Nigerian government made the right decision in enforcing the traveling restriction, because the results from their analysis revealed that traveling history and contact increase the chance of people being infected with coronavirus by 85% or 88%. The lower death rate may be due to the stricter measures employed by the government in Nigeria. To the best of our knowledge, this is the first study to implement an ARIMA model to predict the incidence of COVID-19 in these countries since the new SARS-CoV-2 variants started spreading. The researchers anticipate an increase in the prevalence of COVID-19 in the days ahead. If the situation is not handled by the government and health professionals, the death cases will escalate, and many people may continue to die from the virus.
5. Conclusion and Recommendation
COVID-19 has caused a great deal of havoc and damage to humanity across the globe. Recently, several researchers have attempted to examine the effects of COVID-19 on the economies of various countries and continents and to forecast the virus’s spread, persistence, and other issues related to this pandemic. In this paper, the researchers looked at time series data for daily confirmed cases and daily confirmed deaths of the novel coronavirus and the SARS-CoV-2 variants that recently spread across the world. The ARIMA model was used to examine daily new cases and death cases in the UK, South Africa, Brazil, and Nigeria. Daily confirmed cases and daily confirmed death cases for these countries from March 1, 2020, to January 31, 2021, were selected for the prediction using the ARIMA model.
The ARIMA model was chosen for this study because of its widespread acceptance in the research field and the ease with which various stakeholders may act on its predictions. The study results show that the death cases in Nigeria are declining, which indicates that the government and experts have managed the situation well to achieve this feat. The other countries are advised to put firm measures in place to curtail the spread of the virus. The researchers conclude that, with the emergence of the novel COVID-19 variants spreading globally, if the necessary steps are not implemented to curb the prevalence rate, new cases and deaths are expected to increase in the selected countries in the coming days.
Some practical recommendations from the study to control the pandemic include the following: (i) the general public should be adequately informed about the coronavirus pandemic and its consequences for public health; (ii) the various health facilities in these countries should be expanded to allow effective and efficient treatment and management of COVID-19 cases in the hospitals of the selected countries; (iii) this paper would support the public health professionals in the respective countries to make prudent decisions concerning the prevalence of the pandemic; and (iv) proper measures should be put in place for testing and tracing of COVID-19 cases.
Finally, the researchers suggest that citizens in these countries must follow all the COVID-19 protocols (physical distancing, face covering with nose masks, hand hygiene, coughing or sneezing hygiene, etc.). The governments of the selected countries should put plans in place to educate citizens on the need to take the coronavirus vaccination to help reduce the prevalence of COVID-19 in these countries and the rest of the world. Adhering to these measures will go a long way to help save humanity and the globe for a better living. Future work will focus on the consequences of coronavirus mutations at the population level to help policymakers implement measures to curb the prevalence of COVID-19 in the world.
We used the World Health Organization COVID-19 data, which are freely available from the https://covid19.who.int database.
This research is part of (1) research on correlation indicators of integrated development of urban ecological services and population quality of life, (2) perception of fairness in self-organized mass Entrepreneurship, and (3) research on high-quality development driven by scientific and technological innovation in Jiangsu.
Conflicts of Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Cai Li is responsible for the conceptualization and acquisition of data. Agyemang Kwasi Sampene is assigned to the methodology, software, and study design. Fredrick Oteng Agyeman did the writing—review. Abraham Lincoln Ayisi did the analysis and interpretation and software. Brenya Robert did the validation.
This research is part of ID no. 5611160004, ID no. 4061160, and ID no. 1721160306.
R. Yagoub and H. Eledum, Modeling of the COVID-19 Cases in Gulf Cooperation Council Countries Using ARIMA and MA-ARIMA Models, medRxiv, 2021.
S. K. Tamang, P. D. Singh, and D. B. Datta, “Forecasting of COVID-19 cases based on prediction using artificial neural network curve fitting technique,” Global Journal of Environmental Science and Management, vol. 6, pp. 53–64, 2020.
I. M. Wangari, S. Sewe, G. Kimathi, M. Wainaina, V. Kitetu, and W. Kaluki, “Mathematical modelling of COVID-19 transmission in Kenya: a model with reinfection transmission mechanism,” Computational and Mathematical Methods in Medicine, vol. 2021, Article ID 5384481, 18 pages, 2021.
CDC, “New variants of the virus that causes COVID-19,” Online source: 1, 2021, February 2021, https://www.cdc.gov/coronavirus/2019-ncov/transmission/variant.html.
I. Gareth, “COVID-19: new UK variant may be linked to increased death rate, early data indicate,” BMJ (Clinical research ed.), vol. 372, 2021.
W. Booth, Boris Johnson Says British Coronavirus Variant May Be More Deadly-the Washington Post, 2021, July 2021, https://www.washingtonpost.com/world/europe/uk-variant-covid-mortality/2021/01/22/86023180-5cd6-11eb-a849-6f9423a75ffd_story.html.
M. L. Page and M. Hambly, “Gamma COVID-19 variant (P.1)|new scientist,” 2021, November 2021, https://www.newscientist.com/definition/brazil-covid-19-variant-p-1/.
J. C. da Silva, V. B. Félix, S. A. B. F. Leão, E. M. Trindade Filho, and F. A. Scorza, “New Brazilian variant of the SARS-CoV-2 (P1/gamma) of COVID-19 in Alagoas State,” The Brazilian Journal of Infectious Diseases, vol. 25, no. 3, pp. 19–21, 2021.
C. B. de Villiers, L. Blackburn, S. C. PHG, and J. Janus, SARS-CoV-2 Variants, PHG Foundation for FIND (the Foundation for Innovative New Diagnostics), 2021.
NCDC, “Statement on variants of Sars-Cov-2 in Nigeria,” pp. 1–3, 2021, https://emea.mitsubishielectric.com/ar/products-solutions/factory-automation/index.html.
W. A. Haseltine, “A new COVID-19 variant from Nigeria raises increased concerns for containment and vaccination,” 2021, November 2021, https://www.forbes.com/sites/williamhaseltine/2021/02/24/new-nigerian-variant-continues-the-trend-of-dangerous-strains-threatening-covid-19-progress/?sh=7634ac5f4140.
A. Aljandali, The Box-Jenkins Methodology, Springer, Cham, 2017.
G. Perone, ARIMA Forecasting of COVID-19 Incidence in Italy, Russia, and the USA, arXiv, 2020.
G. Ding, X. Li, F. Jiao, and S. Yang, Brief Analysis of the ARIMA Model on the COVID-19 in Italy, medRxiv, 2020.
P. Verma, M. Khetan, S. Dwivedi, and S. Dixit, Forecasting the COVID-19 Outbreak: An Application of ARIMA and Fuzzy Time Series Models, Research Square, 2020.
O. D. Ilie, R.-O. Cojocariu, A. Ciobica, S.-I. Timofte, I. Mavroudis, and B. Doroftei, “Forecasting the spreading of COVID-19 across nine countries from Europe, Asia, and the American continents using the ARIMA models,” Microorganisms, vol. 8, no. 8, p. 1158, 2020.
Y. Peng, B. Yu, P. Wang, D. G. Kong, B. H. Chen, and X. B. Yang, “Application of seasonal auto-regressive integrated moving average model in forecasting the incidence of hand-foot-mouth disease in Wuhan, China,” Current Medical Science, vol. 37, no. 6, pp. 842–848, 2017.
V. Sandhir, V. Kumar, and V. Kumar, “Prognosticating the spread of COVID-19 pandemic based on optimal ARIMA estimators,” Endocrine, Metabolic & Immune Disorders-Drug Targets, vol. 21, no. 4, pp. 586–591, 2021.
C. T. B. Prasad, S. Rajamani, H. V. Dharshan et al., Coronavirus (COVID-19) Forecasting in India: Application of ARIMA and Periodic Regression Models, ICAR, 2020.
World Health Organization, “WHO COVID-19 case definition,” Updated in Public Health Surveillance for COVID-19 (December 16), 2020, https://www.who.int/publications/i/item/WHO-2019-nCoV-Surveillance_Case_Definition-2020.2.
Pan American Health Organization, “Case definitions for COVID-19 surveillance – 16 December 2020- PAHO/WHO,” 2020, November 2021, https://www.paho.org/en/case-definitions-covid-19-surveillance-16-december-2020.
Worldometer, “U.K. population (2021) - Worldometer,” 2021, November 2021, https://www.worldometers.info/world-population/uk-population/.
Worldometer, “United Kingdom COVID: 9, 171, 660 cases and 141, 181 deaths-Worldometer,” 2021, November 2021, https://www.worldometers.info/coronavirus/country/uk/.
A. Bittar, “Poverty on the rise in South Africa|The Borgen Project,” 2021, November 2021, https://borgenproject.org/poverty-in-south-africa/.
L. K. Sen and M. Shitan, “The performance of AICC as an order selection criterion in ARMA time series models,” Pertanika Journal of Science and Technology, vol. 10, no. 1, pp. 25–33, 2002.
Z. He and H. Tao, “Epidemiology and ARIMA model of positive-rate of influenza viruses among children in Wuhan, China: a nine-year retrospective study,” International Journal of Infectious Diseases, vol. 74, pp. 61–70, 2018.
C. N. N. Betsy Klein, “UK variant is now the dominant coronavirus strain in the US, Walensky Says-CNN,” 2021, July 2021, https://edition.cnn.com/2021/04/07/us/uk-variant-dominant-coronavirus-strain/index.html.
G. Steinhauser, “South Africa COVID-19 strain: what we know about the new variant-WSJ,” 2021, July 2021, https://www.wsj.com/articles/the-new-covid-19-strain-in-south-africa-what-we-know-11609971229.