Abstract

Governments across the world rely on their Customs Administration to provide functions that include border security, intellectual property rights protection, environmental protection, and revenue mobilisation, amongst others. Analyzing the trends in revenue collected by Customs is necessary to direct government policies and decisions. Models that can capture the trends in nominal (non-real) tax values relative to trade volumes (values) over the period are indispensable. Predominant amongst the existing models are the econometric models (the GDP-based model, the monthly receipts model, and the microsimulation model), which are laborious and sometimes unreliable when studying trends in time series data. In this study, we modelled monthly revenue data obtained from the Ghana Revenue Authority-Customs Division (GRA-CD) for the period January 2010 to December 2019 using two traditional time series models, the ARIMA model and the ARIMA Error Regression Model (ARIMAX), and two machine learning time series models, the Bayesian Structural Time Series (BSTS) model and a Neural Network Autoregression model. The Neural Network Autoregression model of the form NNAR (1, 3) provided the best forecasts, with the least Mean Squared Error (MSE) of 53.87 and a relatively low Mean Absolute Percentage Error (MAPE) of 0.08. Generally, the machine learning models (NNAR (1, 3) and BSTS) outperformed the traditional time series models (ARIMA and ARIMAX). The forecast values from the NNAR (1, 3) indicated a potential decline in revenue, which emphasizes the need for relevant authorities to institute measures to improve revenue generation in the immediate future.

1. Introduction

The mandates of the Ghana Revenue Authority-Customs Division (GRA-CD) are revenue collection, border protection, collection of international trade statistics, and trade facilitation [1].

The revenue of the Government of Ghana is largely generated from direct taxes, taxes on international trade, indirect taxes, and other taxes which are collected by the VAT Service, the Internal Revenue Service, and the Customs Excise and Preventive Service [2].

In the twenty-first (21st) century, the reliance on Customs services for international trade facilitation and a seamless process flow in the supply chain, without compromising on the security of such trade, cannot be overemphasized.

The GRA-CD still has revenue collection as a top priority, as it continues to complement taxes generated from a limited domestic tax base [3]. The Ghana Revenue Authority (GRA) is constantly devising ways to widen the tax net, automate and simplify its processes, minimize revenue leakages, and ensure maximum revenue is mobilised from both its Domestic and Customs Divisions. GRA aims to achieve a tax-to-GDP ratio of 17.5% by the end of 2022. In 2018, Ghana’s tax-to-GDP ratio was 14.1%, which was lower than the regional rate of 16.5% in Africa [4]. However, as Ghana migrates to an upper middle-income economy, its trade (or customs) taxes as a proportion of total taxes have been dwindling gradually, especially in the last 10 years [4]. According to Abrokwah et al. [5], recent years have seen substantial falls in customs revenue collections, with the share of taxes collected at the country’s ports falling from 42% in 2017 to 30% in 2019.

Some of the existing literature which sought to assess or investigate the causes and impact of the reduction in revenue from trade tax focused on issues, such as the impact of trade liberalization on reduction in revenue [6] and the impact of tax revenues on the development of countries [7].

Forecasting trade tax or customs revenue has been proven to be one sure way to mitigate the reduction in revenue. According to Anderson and Johnson [2], reliable and trusted revenue predictions provide the foundation for fiscal discipline and for the adoption of an executable public budget.

Whitfield and Duffy [8] described revenue forecasting as an important topic required to track performance and support related decision-making processes. In the literature, Makridakis et al. [9] described four approaches used for forecasting revenue by both state and local governments, as follows:
(i) Qualitative method: This is usually a combination of expert judgment and formal methods of forecasting.
(ii) Time-series methods: The traditional approaches are the simple trend approach, the autoregressive approach, and the mixed technique approach.
(iii) Econometric modelling: This is usually represented by a single equation, a set of equations solved recursively, or sets of equations solved simultaneously.
(iv) Microsimulation modelling: Revenue for attribute data (e.g., personal income tax or a particular commodity) can be forecast, and the policy impacts on certain groups or commodities predicted, using the microsimulation model.

Anderson and Johnson [2] assessed revenue collection of the Customs Excise and Preventive Service (CEPS) for the period 2008–2012 to devise a reasonably accurate projection for the individual tax components. They indicated that the projection will assist in the design of an appropriate expenditure profile as a means of averting any future fiscal deficit in the country.

Molapo et al. [10] used a Bayesian Vector Autoregression model, an ARIMA model, and a State Space Exponential Smoothing Model to forecast quarterly tax revenue data in South Africa. All three time series models outperformed existing forecasting techniques, with the Bayesian Vector Autoregression Model having the best forecast accuracy amongst the three.

Chen et al. [11] in their work proposed an automatic approach by harnessing the power of machine learning techniques to relieve the burden of customs targeting officers who set revenue targets. They introduced a novel model based on an off-the-shelf embedding encoder to identify the correctness of a Harmonized Commodity Description and Coding System (HS Code) without any human effort.

Currently, the GRA-CD uses Duke University’s GDP-Based model and Monthly Receipts model, as well as an in-house GDP-Based model and the microsimulation model (see Jenkins et al. [12] for more information about the models). These are econometric and microsimulation models and hence can be inadequate for forecasting because they do not account for serial correlation in time series data.

In a quest to study the causes of the decline in revenue from the GRA-CD, this study identifies the important variables for forecasting revenue. The study also assesses the performance of some traditional and machine learning techniques suited to time series data in modelling the revenue collected by the GRA-CD, as the selection of an appropriate forecasting technique is critical to the success of any new initiative or decision-making process.

The rest of the paper is organised as follows: Section 2 (Methods and Materials) discusses the source of data, the mathematical underpinnings of the adopted time series models, and their implementation. In Section 3 (Results and Discussion), we compare the results of the traditional and machine learning time series models used to forecast revenue collected by the GRA-CD. Finally, in Section 4 (Conclusion and Recommendations), we conclude by summarising the overall achievements of the study, with some recommendations and directions for future developments.

2. Methods and Materials

2.1. Data Acquisition

Secondary data obtained from the GRA-CD and from the Bank of Ghana historical exchange rates records [13] were used for the study. The Customs data comes from a database containing financial indices of performance and growth of the Customs Division. The data captures records on the following tax variables: the value of imports (CIF) data, the Zero and Exempt portion of value of imports (ZE), the Taxable or Effective value portion of CIF (ECIF), the Effective Duty Rate (EDR), the import duty revenue (IDREV), Total Customs Revenue (TOTREV), and the average monthly Exchange Rate (XCHR) from 2010 to 2019.

2.2. Definition of Study Variables
Cost, Insurance, and Freight (CIF): This is the value of imports up to the port or location where the goods are entered for clearance and customs purposes. It includes the cost of purchase of the item (FOB), the freight paid or payable to the country of destination (Ghana), and insurance (usually calculated as 0.875% of cost plus freight). In Ghana, the tax base for calculating Import Duty is CIF. Thus, CIF is the value of all goods imported under the customs regime “Home consumption,” that is, the total value of goods imported under the import duty rates 0%, 5%, 10%, 20%, and 35%.

Zero and Exempt (ZE): This denotes the value of imported goods whose import duties are zero-rated (0%) and the value of goods that have been exempted from payment of import duty.

Effective CIF (ECIF): This represents CIF for goods whose tax rates are five percent (5%), ten percent (10%), twenty percent (20%), and thirty-five percent (35%) and which are not exempted from import duty. In other words, this is the CIF on which import duty has been paid.

Effective Duty Rate (EDR): This is the sum, over the import duty rates, of the ratio of the value of imports at each rate (weighted by that rate) to the total value of imports (CIF). Mathematically, EDR can be represented as

$$\mathrm{EDR} = \sum_{r} r\,\frac{\mathrm{CIF}_r}{\mathrm{CIF}},$$

where $r$ ranges over the import duty rates and $\mathrm{CIF}_r$ is the value of imports charged at rate $r$.

Average Monthly Exchange Rate (XCHR): This is the historical monthly Ghanaian cedi to US dollar exchange rate.
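As a worked illustration of these definitions, the following Python sketch computes CIF from a cost (FOB) and freight figure using the 0.875% insurance convention, and then an effective duty rate. It assumes that EDR is the duty-rate-weighted share of CIF across the rate bands; the function names and all monetary figures are hypothetical and are not GRA-CD data.

```python
# Illustrative sketch only: every monetary figure below is made up.
INSURANCE_RATE = 0.00875  # 0.875% of cost plus freight, as described in the text


def cif_value(fob, freight):
    """CIF = cost (FOB) + freight + insurance charged on (cost + freight)."""
    insurance = INSURANCE_RATE * (fob + freight)
    return fob + freight + insurance


def effective_duty_rate(cif_by_rate):
    """Assumed form: sum over rate bands of rate * CIF_at_rate / total CIF."""
    total_cif = sum(cif_by_rate.values())
    weighted = sum(rate * value for rate, value in cif_by_rate.items())
    return weighted / total_cif


# Hypothetical monthly CIF (millions of cedis) split by import duty rate band.
# The 0% band plays the role of the zero-rated part of ZE.
cif_by_rate = {0.00: 400.0, 0.05: 200.0, 0.10: 200.0, 0.20: 150.0, 0.35: 50.0}

print(round(cif_value(fob=1000.0, freight=100.0), 3))  # 1109.625
print(round(effective_duty_rate(cif_by_rate), 4))      # 0.0775
```

Under this assumed form, a larger share of imports in the zero-rated band pulls the effective rate down even when the statutory rates are unchanged.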

Table 1 contains a sample of the study data (revenue in millions of Ghanaian cedis and their respective Ghanaian cedis to US dollar exchange rates) collected by Customs Division in Ghana from January 2010 to December 2019. The obtained data from January 2010 to December 2019 was used to fit the study models (training and testing of the models). In this study, 70% of the study data was used for the model building (training) and the remaining 30% was used for testing the fitted models. The fitted models were then used to forecast the revenue for the year 2020.
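The 70/30 partition can be sketched in a few lines of Python. The sketch assumes the split is chronological (the earliest 70% of months for training, the remainder for testing), which is the usual choice for time series but is not stated explicitly above; the helper name is hypothetical.

```python
def chronological_split(series, train_fraction=0.7):
    """Split a time-ordered sequence into train and test parts without shuffling."""
    cut = round(len(series) * train_fraction)
    return series[:cut], series[cut:]


# Month labels for January 2010 to December 2019 (120 observations).
months = [f"{year}-{month:02d}" for year in range(2010, 2020) for month in range(1, 13)]

train, test = chronological_split(months)
print(len(train), len(test))  # 84 36
print(train[-1], test[0])     # 2016-12 2017-01
```

With 120 monthly observations, this gives 84 training months and 36 test months.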

2.3. Autoregressive Integrated Moving Average (ARIMA) Models

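The ARIMA model is a linear model in which the independent variables are the lags of the dependent variable and/or the lags of the forecast errors. The general form of an ARIMA($p$, $d$, $q$) model using backshift notation is

$$\left(1 - \phi_1 B - \cdots - \phi_p B^p\right)(1 - B)^d\, y_t = \left(1 + \theta_1 B + \cdots + \theta_q B^q\right)\varepsilon_t,$$

where $y_t$ is the univariate time series data, $d$ is the order of differencing, $B$ is the backshift operator ($B y_t = y_{t-1}$), $p$ and $q$ are the number of lags of $y_t$ and the error, respectively, and $\varepsilon_t$ is the error at time $t$.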
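To make the backshift notation concrete, the sketch below applies the differencing operator $(1 - B)^d$ and produces one-step-ahead forecasts for an ARIMA (0, 1, 1) model written as $y_t = y_{t-1} + \varepsilon_t + \theta \varepsilon_{t-1}$. The value of $\theta$ and the toy series are hypothetical, no drift term is included, and the unobserved initial error is assumed to be zero; a real analysis would estimate $\theta$ by maximum likelihood.

```python
def difference(series, d=1):
    """Apply the operator (1 - B)^d to a list of observations."""
    for _ in range(d):
        series = [curr - prev for prev, curr in zip(series, series[1:])]
    return series


def arima011_one_step(series, theta):
    """One-step-ahead forecasts for y_t = y_{t-1} + e_t + theta * e_{t-1}.

    forecasts[i] is the forecast of the observation at index i + 1, so the
    last element is the forecast of the first unobserved period.
    """
    forecasts = []
    prev_error = 0.0  # assumption: the unobserved initial error is zero
    for t in range(1, len(series) + 1):
        forecast = series[t - 1] + theta * prev_error
        forecasts.append(forecast)
        if t < len(series):
            prev_error = series[t] - forecast  # realised one-step error
    return forecasts


y = [100.0, 104.0, 103.0, 108.0]            # toy series, not revenue data
print(difference(y))                        # [4.0, -1.0, 5.0]
print(arima011_one_step(y, theta=-0.5))     # [100.0, 102.0, 102.5, 105.25]
```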

The Box–Jenkins methodology for fitting time series models is then used for model identification through to the estimation of model parameters.

2.4. ARIMA Error Regression Model (ARIMAX)

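The assumption about the errors underpinning ordinary least-squares (OLS) models (i.e., that the errors are independent and identically distributed (iid)) is often violated, as the residuals from time series analysis are usually autocorrelated. The ARIMAX model addresses this by modelling the regression error with an ARIMA process:

$$y_t = \boldsymbol{\beta}^{\top}\mathbf{x}_t + \eta_t, \tag{3}$$

$$\phi(B)(1 - B)^d\, \eta_t = \theta(B)\,\varepsilon_t,$$

where $\boldsymbol{\beta}$ is the vector of coefficients of the covariates $\mathbf{x}_t$, $\eta_t$ is the error of the regression model in (3), which is modelled as an ARIMA process with white-noise error $\varepsilon_t$, and $\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$ and $\theta(B) = 1 + \theta_1 B + \cdots + \theta_q B^q$ are the autoregressive and moving-average polynomials in the backshift operator $B$.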
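The two-part structure (a regression plus an ARIMA model for its error) can be sketched as follows for ARIMA (1, 1, 1) errors, where $\eta_t - \eta_{t-1} = \phi(\eta_{t-1} - \eta_{t-2}) + \varepsilon_t + \theta\varepsilon_{t-1}$. This is a minimal illustration that takes the coefficients as given; the numbers are hypothetical, the initial innovation is assumed to be zero, and in practice $\boldsymbol{\beta}$, $\phi$, and $\theta$ would be estimated jointly.

```python
def regression_errors(y, X, beta):
    """eta_t = y_t - beta' x_t for each time point."""
    return [yt - sum(b * x for b, x in zip(beta, xt)) for yt, xt in zip(y, X)]


def forecast_next(y, X, x_next, beta, phi, theta):
    """One-step forecast from a regression with ARIMA(1, 1, 1) errors."""
    eta = regression_errors(y, X, beta)
    eps_prev = 0.0  # assumption: the initial innovation is zero
    for t in range(2, len(eta)):
        predicted = eta[t - 1] + phi * (eta[t - 1] - eta[t - 2]) + theta * eps_prev
        eps_prev = eta[t] - predicted  # innovation realised at time t
    eta_next = eta[-1] + phi * (eta[-1] - eta[-2]) + theta * eps_prev
    regression_part = sum(b * x for b, x in zip(beta, x_next))
    return regression_part + eta_next


# Toy data: an intercept column plus one covariate (hypothetical values).
y = [10.0, 12.0, 15.0, 17.0]
X = [[1.0, 2.0], [1.0, 3.0], [1.0, 4.0], [1.0, 5.0]]
beta = [2.0, 3.0]

print(regression_errors(y, X, beta))  # [2.0, 1.0, 1.0, 0.0]
print(round(forecast_next(y, X, [1.0, 6.0], beta, phi=0.5, theta=0.4), 2))  # 19.02
```

Setting `phi` and `theta` to zero reduces the error model to a random walk, so the forecast becomes the regression prediction plus the last observed regression error.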

2.5. Bayesian Structural Time Series (BSTS) Model

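The Bayesian Structural Time Series Model can be defined as

$$y_t = \mu_t + \tau_t + \boldsymbol{\beta}^{\top}\mathbf{x}_t + \varepsilon_t,$$

$$\mu_t = \mu_{t-1} + \delta_{t-1} + u_t,$$

$$\delta_t = \delta_{t-1} + v_t,$$

$$\tau_t = -\sum_{s=1}^{S-1}\tau_{t-s} + w_t.$$

Here $\tau_t$ is the seasonal component with $S$ seasons and $\boldsymbol{\beta}^{\top}\mathbf{x}_t$ is the regression component. The trend component $\mu_t$ is the local level model plus a term $\delta_t$ that can be defined as the extra $\mu$ expected as $t \to t+1$, so that it is the slope of the local linear trend. The values $u_t$, $v_t$, and $w_t$ are the errors from the local linear trend and the seasonal component.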
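The state equations of a structural time series with a local linear trend and a seasonal component can be illustrated by simulating from them. The sketch below assumes the usual formulation in which the level is updated by the slope, the slope follows a random walk, and each seasonal effect is minus the sum of the previous $S - 1$ effects plus noise. The regression component is omitted, and all starting values and standard deviations are hypothetical.

```python
import random


def simulate_structural_series(n, season_length=12, sd_obs=1.0, sd_level=0.5,
                               sd_slope=0.1, sd_season=0.2, seed=0):
    """Simulate y_t = level_t + seasonal_t + noise from the state equations."""
    rng = random.Random(seed)
    level, slope = 100.0, 1.0  # hypothetical starting level and slope
    # Initial seasonal effects chosen to sum to zero over one full cycle.
    seasonal = [s - (season_length - 1) / 2 for s in range(season_length)]
    recent = seasonal[-(season_length - 1):]  # the last S - 1 seasonal effects
    ys, levels, seasons = [], [], []
    for _ in range(n):
        tau = -sum(recent) + rng.gauss(0.0, sd_season)  # seasonal equation
        recent = recent[1:] + [tau]
        ys.append(level + tau + rng.gauss(0.0, sd_obs))  # observation equation
        levels.append(level)
        seasons.append(tau)
        level = level + slope + rng.gauss(0.0, sd_level)  # level equation
        slope = slope + rng.gauss(0.0, sd_slope)          # slope equation
    return ys, levels, seasons


ys, levels, seasons = simulate_structural_series(24)
print(len(ys), round(levels[0], 1))  # 24 100.0
```

With every standard deviation set to zero, the level grows by exactly the slope each month and the seasonal effects repeat with period 12 and sum to zero over a year, which is a convenient sanity check on the recursion.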

2.6. Neural Network Autoregression (NNAR) Model

Neural networks are nonlinear models that can be trained on data to learn patterns. They have been applied as successful machine learning models for solving various tasks like pattern recognition. The three main types of neural network architecture are the Feedforward, Recurrent [14], and Convolutional Neural Networks [15].

The architecture is called feedforward because the flow of information takes place in the forward direction. A feedforward network defines a mapping $y = f(\mathbf{x}; \boldsymbol{\theta})$ and learns the value of the parameters $\boldsymbol{\theta}$ that result in the best function approximation [16].

According to Chapman-Wardy et al. [17], a feedforward network can be either single or multiple layered. The single-layer network consists of only one hidden layer. The multiple-layer neural network is also known as the deep learning network. The distinguishing feature of this network is that it has multiple hidden layers for complex processing.

The data is first sent to an input layer, where it is processed and forwarded to each neuron in the next layer until it reaches the output layer. Each neuron receives an input vector and multiplies it by a specific weight, and a bias is added to this product, resulting in an output that is fed to the next layer with the function

$$z = \sum_{i} w_i x_i + b,$$

where $x_i$ is the input feature from the data or a neuron, $z$ is the net input of each neuron, $w_i$ is the weight of the input, and $b$ is the bias. The net input is then passed through an activation function $f$ to produce the output:

$$a = f(z).$$

An activation function is chosen by the researcher according to the purposes of the research. In this study, the sigmoid activation function was chosen because it dampens the effect of extreme values and is hence robust to outliers.
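The forward pass described above can be sketched for a single-hidden-layer network with sigmoid hidden units. The sketch assumes a linear output unit, which is a common choice for regression-type forecasts but is not stated in the text; the weights and inputs are hypothetical, whereas a fitted network would learn them from the training data.

```python
import math


def sigmoid(z):
    """Logistic activation: squashes any net input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass: inputs -> sigmoid hidden layer -> linear output (assumed)."""
    hidden = []
    for weights, bias in zip(hidden_weights, hidden_biases):
        z = sum(w * x for w, x in zip(weights, inputs)) + bias  # net input
        hidden.append(sigmoid(z))                               # activation
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# A 5-input, 3-hidden-neuron, 1-output layout with made-up weights.
inputs = [0.2, -0.1, 0.4, 0.0, 0.3]
hidden_weights = [[0.1] * 5, [-0.2] * 5, [0.05] * 5]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [1.0, 2.0, 3.0]

print(sigmoid(0.0))  # 0.5
print(forward(inputs, hidden_weights, hidden_biases, output_weights, 0.5))
```

Because the sigmoid saturates for large positive or negative net inputs, an extreme input value changes the hidden activations far less than it would under a linear activation.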

2.7. Evaluating Time Series Models

The residuals of the fitted models are important in selecting the appropriate time series model. The more the residuals look like white noise, the better the fit of the model. A large p-value ($p > 0.05$ at the 5% level of significance) for the Ljung–Box portmanteau test applied to the residuals suggests that the residuals behave like white noise.
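The Ljung–Box statistic is $Q = n(n+2)\sum_{k=1}^{h}\hat{\rho}_k^{2}/(n-k)$, where $\hat{\rho}_k$ is the lag-$k$ sample autocorrelation of the residuals, and it is compared with a chi-square distribution. The sketch below computes $Q$ and a p-value using only the standard library. It assumes an even number of degrees of freedom (for which the chi-square tail has a closed form) and takes the degrees of freedom equal to the number of lags $h$, without the usual reduction for estimated ARMA parameters; the residual series is synthetic.

```python
import math


def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_q(residuals, lags):
    """Q = n (n + 2) * sum_{k=1..h} rho_k^2 / (n - k)."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k) for k in range(1, lags + 1)
    )


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; exact closed form for even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


# Synthetic residuals that alternate in sign, so they are strongly autocorrelated.
residuals = [1.0 if t % 2 == 0 else -1.0 for t in range(40)]
q = ljung_box_q(residuals, lags=10)
p_value = chi2_sf_even_df(q, df=10)
print(p_value < 0.05)  # True: white noise is rejected for this series
```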

Another important measure of good fit for a time series model is the Akaike Information Criterion (AIC), which is defined as

$$\mathrm{AIC} = 2k - 2\ln(\hat{L}),$$

where $k$ is the number of parameters in the model and $\hat{L}$ is the maximum value of the likelihood function. The AIC rewards the likelihood or goodness of fit of the model, $\hat{L}$, while penalising the complexity of the model, $k$. Models with lower AIC values are preferred. The Bayesian Information Criterion (BIC), also known as the Schwarz Criterion, is a similar statistic that is usually used hand in hand with the AIC. The BIC, however, penalises additional parameters more heavily than the AIC:

$$\mathrm{BIC} = k\ln(n) - 2\ln(\hat{L}),$$

where $n$ is the sample size of the training set.

A model with relatively small AIC or BIC value is always preferred.
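The two criteria are simple functions of the maximised log-likelihood, as the short sketch below shows. The log-likelihood values, parameter counts, and model labels are hypothetical and are not taken from the study's tables.

```python
import math


def aic(log_likelihood, k):
    """AIC = 2k - 2 ln(L), with k the number of estimated parameters."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """BIC = k ln(n) - 2 ln(L), with n the number of training observations."""
    return k * math.log(n) - 2 * log_likelihood


# Two hypothetical candidate models fitted to n = 84 training observations.
n = 84
candidates = {"model A": (-560.0, 2), "model B": (-558.5, 5)}

for name, (loglik, k) in candidates.items():
    print(name, round(aic(loglik, k), 2), round(bic(loglik, k, n), 2))

best = min(candidates, key=lambda name: bic(*candidates[name], n))
print(best)  # model A
```

Whenever $\ln(n) > 2$, that is, for eight or more observations, each extra parameter costs more under the BIC than under the AIC, so the BIC tends to favour the more parsimonious model.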

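Also, the Mean Squared Error (MSE) and the Mean Absolute Percentage Error (MAPE) are used to assess all the model forecasts and to judge the best performing model:

$$\mathrm{MSE} = \frac{1}{m}\sum_{t=1}^{m}\left(y_t - \hat{y}_t\right)^2, \qquad \mathrm{MAPE} = \frac{1}{m}\sum_{t=1}^{m}\left|\frac{y_t - \hat{y}_t}{y_t}\right|,$$

where $y_t$ is the actual value, $\hat{y}_t$ is the predicted value, and $m$ is the number of forecasts being evaluated.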
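Both error measures are straightforward to compute from paired actual and predicted values. In the sketch below, MAPE is expressed as a fraction rather than a percentage, on the assumption that this matches the scale of the MAPE values reported in this paper; the numbers are made up for illustration.

```python
def mse(actual, predicted):
    """Mean of the squared forecast errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute error relative to the actual value, as a fraction.

    Undefined when any actual value is zero.
    """
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


actual = [100.0, 200.0, 400.0]      # hypothetical test observations
predicted = [110.0, 190.0, 400.0]   # hypothetical forecasts

print(round(mse(actual, predicted), 2))   # 66.67
print(round(mape(actual, predicted), 4))  # 0.05
```

MSE is expressed in squared units of the data and is dominated by large errors, whereas MAPE is scale-free, which is why the two measures can rank models differently.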

3. Results and Discussion

3.1. Time Series Plots of Variables

A time series plot of the variables CIF, ZE, ECIF, IDREV, and TOTREV is shown in Figure 1. It can be seen from Figure 1 that there was a gradual increase in all the variables under study prior to 2018. There was a spike in 2018 and a slow decline in revenue collected after 2019.

3.2. Results of the ARIMA Model

Table 2 contains a summary of the most suitable ARIMA model for forecasting revenue collected by GRA-CD.

From Table 2, the fitted ARIMA (0, 1, 1) model is given as

The AIC and BIC of the model are 1133.50 and 1141.52, respectively. The p-value of the Ljung–Box test of the residuals is 0.01469 (< 0.05). This means the residuals of the fitted model are correlated. As stated earlier, if the residuals are white noise (uncorrelated), then the fitted model fits the data well. By inference, ARIMA (0, 1, 1) is not very suitable for forecasting the revenue collected by GRA-CD.

Figure 2 shows a graph of the revenue forecasted from the ARIMA model.

It can be seen from Figure 2 that the forecast points show an increasing trend for the forecast period. The confidence band around the forecast is wide, which confirms the unreliability of the ARIMA model for forecasting the revenue collected by GRA-CD.

3.3. Results of the ARIMAX Model

The results of the most suitable ARIMA Error regression model (ARIMAX) with statistically significant covariates are presented in Table 3.

From Table 3, the resulting regression with ARIMA (1, 1, 1) error model is given by

From Table 3, the significant variables required for forecasting revenue collected by GRA-CD using the ARIMAX model are the Effective Duty Rate (EDR) and Cost, Insurance, and Freight (CIF). The most suitable ARIMAX model was the regression with ARIMA (1, 1, 1) errors. The p-value of the Ljung–Box test is 0.3019 (> 0.05). This means that the residuals of the regression with ARIMA (1, 1, 1) errors are uncorrelated (white noise). This signifies that the fitted model is somewhat suitable for forecasting the revenue collected by GRA-CD. The AIC and BIC of the regression with ARIMA (1, 1, 1) errors model are 1115.91 and 1131.95, respectively.

Figure 3 shows a forecast plot of revenue using the regression with ARIMA (1, 1, 1) errors model.

It can be seen from Figure 3 that the forecast points of the ARIMAX model also show an increasing trend for the forecast period. The confidence band around the forecast is narrower as compared to that of the ARIMA model. This means the ARIMAX model gives more precise estimates/forecasts than the ARIMA model.

3.4. Results of the BSTS Model

Three BSTS models of different compositions are first considered: the local level model, which considers local linear trend and annual seasonality (BSTS Model 1); the local level model with added autoregression (BSTS Model 2); and a BSTS model with a local linear trend, annual seasonality, and a regression component (BSTS Model 3) (see Scott and Varian [18] for further details). Table 4 contains the summary results of the various BSTS models considered in this study.

From Table 4, the fitted Bayesian Structural Time Series models (BSTS Model 1, BSTS Model 2, and BSTS Model 3) had coefficients of determination of 0.992, 0.993, and 0.997, respectively. The BSTS model with a local linear trend, annual seasonality, and a regression component (BSTS Model 3) had the smallest residual standard deviation of 14.01 and the highest coefficient of determination of 0.997. This makes the BSTS Model 3 the best amongst the Bayesian Structural Time Series models for forecasting the revenue collected by GRA-CD. The model as fitted explains 99.7% of the variability in the response variable (revenue data).

The spike-and-slab technique was used to identify the most important variables for modelling the data using their inclusion probabilities. Figure 4 shows some input variables ranked according to their level of importance for the BSTS model.

From Figure 4, Import Duty Revenue (IDREV) had the highest inclusion probability, followed by the average monthly Exchange Rate (XCHR) and the Taxable or Effective value portion of CIF (ECIF). This makes IDREV, XCHR, and ECIF the top 3 most important variables required to fit the BSTS model. Zero and Exempt (ZE) had the lowest inclusion probability in fitting a BSTS model for forecasting revenue. Figure 5 shows the forecast diagram of the best BSTS model (BSTS Model 3).

It is evident from Figure 5 that the forecast shown in blue has a narrower confidence band (green boundaries) than the ARIMA and ARIMAX models. This means the estimates/forecasts from the BSTS model are more precise than those from the traditional time series models (ARIMA and ARIMAX). Also, there was a gradual increase in revenue generated from 2010 till 2018 and a steady decline after 2018 that was followed by a gradual increase after 2019.

3.5. Results of Neural Network Autoregression (NNAR) Model

The most suitable NNAR model using the study data is an NNAR (1, 3) model. The results of NNAR (1, 3) model are summarised in Table 5.

Table 5 contains the results of a Neural Network Autoregression model with 5 input variables, 3 neurons in a single hidden layer, and 1 output variable (revenue data). The forecast of the NNAR (1, 3) model for the next annual period is illustrated in Figure 6.

The thin band of confidence interval shown in Figure 6 indicates that the NNAR (1, 3) model is reliable in forecasting revenue. It can be seen from Figure 6 that there was an increase in revenue collected over time and a sudden decline at the forecast period (2019 to 2020).

3.6. Evaluation of Forecasts

Comparisons of the forecast ability of the study models, using a cost function of Mean Squared Error (MSE) and Mean Absolute Percentage Error (MAPE), are summarised in Table 6.

Table 6 contains the MSE and MAPE values of the study models. These errors measure the deviation of the various model forecasts from the test data. Models with relatively lower MSE and MAPE are preferred. It is evident from Table 6 that NNAR (1, 3) model produces the smallest MSE (53.87) and an appreciable MAPE (0.0861) whereas the BSTS model gives the smallest MAPE (0.0390) and an appreciable MSE (1319.989).

From the above results, the NNAR (1, 3) is adjudged the most suitable model for forecasting revenue collected by GRA-CD, as it outperformed the other models with the lowest MSE of 53.87 and a relatively better MAPE of 0.0861. It can also be seen from Table 6 that the machine learning time series algorithms (NNAR (1, 3) and the BSTS model with a local linear trend, annual seasonality, and a regression component) outperformed the traditional time series models (ARIMA and ARIMAX).

4. Conclusion and Recommendations

Comparing the Mean Squared Error (MSE) and Mean Absolute Percentage Error (MAPE) of the forecasts, it was evident that the NNAR (1, 3) model was the best fitted model, with the least MSE of 53.87 and a comparatively low MAPE of 0.0861. This makes forecast estimates from the NNAR (1, 3) the most precise as compared to estimates from the other models considered in the study. For this model, the important variables were the Effective Duty Rate (EDR) and the total value of imports (CIF). The forecast values from the NNAR (1, 3) indicated a potential decline in revenue. This result calls for policies geared towards improving revenue generation in the immediate future.

The BSTS model also performed better than the traditional ARIMA and ARIMAX models. Although its MSE of 1319.989 was higher than that of the NNAR (1, 3) model, it was still lower than those of the traditional time series models (ARIMA and ARIMAX). The BSTS model also gave the lowest MAPE of 0.039, making it suitable for forecasting the revenue collected by GRA-CD. One advantage of the BSTS model is its ability to incorporate external data (covariates) to obtain the posterior distributions for the forecasts. Import Duty Revenue (IDREV), Exchange Rate (XCHR), and the Taxable or Effective value portion of CIF (ECIF) were the top 3 most important variables required for modelling using the BSTS model.

Overall, the two machine learning models (BSTS and NNAR) outperformed the traditional time series models (ARIMA and ARIMAX) in forecasting the revenue collected by GRA-CD. This could be attributed to the fact that the ARIMA and ARIMAX models are relatively simpler models that require some further modifications to be suitable for the analysis of the study data. Also, in cases where the data is nonlinear, simple econometric models like the ones currently used by the GRA-CD for forecasting might fail.

The machine learning time series models considered in this study are robust and can give more reliable forecasts or estimates when updated with new data.

The pace of statistical research and data science has led to advancements in machine learning techniques like the ones considered in this study. Exploring these new methods and constantly modifying them to suit real-world big data should be a desired quest for future research.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Authors’ Contributions

Diana Ayorkor Agbenyega contributed to conceptualization, methodology, and writing the original draft. John Andoh carried out data curation and formal analysis. Samuel Iddi and Louis Asiedu performed supervision, validation, and reviewing and editing.