A New Weighting Scheme in Weighted Markov Model for Predicting the Probability of Drought Episodes
Drought is a complex stochastic natural hazard caused by a prolonged shortage of rainfall. Several environmental factors are involved in determining drought classes at a given monitoring station. Therefore, efficient sequence processing techniques are required to explore and predict the periodic information about the various episodes of drought classes. In this study, we propose a new weighting scheme to predict the probability of various drought classes under the Weighted Markov Chain (WMC) model. We provide a standardized scheme of weights for ordinal sequences of drought classifications by normalizing the squared weighted Cohen Kappa. The proposed scheme is illustrated with temporal ordinal data on drought classes determined by the standardized precipitation temperature index (SPTI). Experimental results show that the proposed weighting scheme for the WMC model is sufficiently flexible to address actual changes in drought classifications by restructuring the transient behavior of a Markov chain. In summary, this paper proposes a new weighting scheme to improve the accuracy of the WMC, specifically in the field of hydrology.
Drought, one of the highest-ranked natural hazards, is a main source of severe destructive effects on the planet. In particular, its sustained consequences lead to the sterilization of agricultural land and the initiation of diseases. Factors associated with a higher risk of drought are a long duration of comparatively low rainfall, a high rate of evapotranspiration, low relative humidity, high temperature, and high wind speed. Moreover, many other environmental and ecological factors are also responsible for the recurrent occurrence of drought hazard. However, drought intensity, duration, and severity may vary from region to region. In recent decades, almost all developing countries have faced water shortages due to continued expansion in the agriculture, industrial, and energy sectors. Consequently, a perpetual increase in the difference between water demand and renewable freshwater resources will lead to major social and economic issues.
To avoid and overcome the adverse effects of drought hazard, numerous studies have been conducted in various climatic regions. Several studies have proposed drought monitoring tools and forecasting methods to predict and quantify the risk associated with the recurrent occurrence of drought. Among the available tools, drought indices are the most frequently used around the world because of their simple structure and robustness. A list of drought indices along with their variable requirements is available in the literature. Besides drought monitoring tools, a number of drought forecasting methods have been developed to assess the risk of future drought conditions [5, 6]. The use of stochastic processes to model uncertain phenomena is also rapidly increasing; for example, the Markov chain is a promising approach to model dynamic activities. A Markov chain is a discrete time stochastic process with the property that the future state of the process is independent of the past states, given the present state. Furthermore, Markov chain models can be useful for forecasting future drought classes because they can quantify the uncertainties associated with these hydrometeorological variables [9–11]. However, it is difficult to adjust the transition probability matrix of a Markov chain for the precise forecasting of succeeding events at short time scales.
To handle this structural issue, several studies used weights in Markov chain models to improve model accuracy and precision [12–16]. However, the selection of these weights is subjective and depends on the type of data. Dynamic weighting schemes have been proposed to adjust the Markov chain model [17–22]. Furthermore, a trial-and-error methodology can be used to increase the performance of the weighted Markov chain.
In this study, we aim to develop a new weighting method to address the structural difficulties of the traditional Markov chain when forecasting short-term drought states. We use the standardized precipitation temperature index (SPTI) for drought classification. Because SPTI produces ordered drought severity levels (highest to lowest), the prediction of transitions from the highest severity to the lowest severity, and vice versa, is questionable in the traditional Markov chain setting. The rationale of this study is to use interrater reliability measures of association as weights for accurate and precise forecasting of drought classification states under the Weighted Markov Chain (WMC) setting. We use our proposed weighting scheme for one-month ahead forecasting of drought states at five meteorological stations in Pakistan.
Nowadays, drought indices play a substantial role in determining drought classes. However, the methodological configuration of each index solely depends on the availability of climatic data and its prior historical data. In the current scenario, the uses and application of multiscalar drought indices, being more flexible and having the characteristics to determine drought at various time scales, are very common. In this paper, monthly drought classifications determined by the multiscalar drought indices are assumed to follow first-order Markov chain. To predict the probability of one-month ahead drought classes, a new weighting scheme to determine the weights for WMC model is provided by aligning the role of autocorrelation and Cohen Kappa measure.
The details of the methodological structure of the proposed weighting scheme are explained in the subsequent subsections.
2.1. The Multiscalar Drought Indices
There are several procedures to report drought severity using a multiscalar drought index. McKee et al. developed the SPI drought index, which is based on the long-term precipitation record and quantifies precipitation scarcity for different time scales. One of the major advantages of the SPI is that it can be used to monitor drought at various time scales.
Vicente-Serrano et al. developed a new multiscalar drought index: the standardized precipitation evapotranspiration index (SPEI). In SPEI, a water balance model based on the difference between precipitation and potential evapotranspiration (PET) is used with an estimation procedure similar to that of SPI. One major advantage of SPEI over SPI is the inclusion of the effect of evaporation alongside rainfall data to characterize the regions under study.
Following the same methodology as SPI and SPEI, Ali et al. developed the standardized precipitation temperature index (SPTI) to capture drought characteristics in both cold and hot climate regions. However, the appropriate selection of a drought index also depends on the availability of data and the nature of the climatic zone. In this study, we employed the SPTI methodology to characterize the monthly historical behavior of drought hazard. The stepwise mathematical procedure of the SPTI methodology is discussed in .
2.2. Drought Classes as a Markov Chain
A discrete Markov chain is a random process that describes a sequence of events from a finite set of possible states, where the current event depends only on the preceding event. It has been commonly used to model uncertain events in various fields such as engineering, economics, and physics. In recent decades, Markov chains have been widely used to capture the behavior of drought classification states derived from multiscalar drought indices [31, 32]. For example, Mishra et al. examined the distribution of drought interval time and mean drought interarrival time by their joint probability density functions and a Markov chain approach. Shatanawi et al. found that exact prediction of drought index values is impossible based on the ARIMA model; however, early warning of drought can be detected from monthly Markov transition probabilities. Therefore, in the drought modeling context, time series data on SPTI drought classes for a single station can be considered as a sequence of ordinal drought classes. Consequently, the historical series of drought classification states for a specified station can be represented as a discrete Markov chain process. Here, we assume that any single class of drought in the time series of SPTI depends only on its previous class and then proceed to the construction of the transition probability matrix. This is a statistical assumption that allows us to treat each series of drought classes as a first-order Markov chain. However, if each class depends on its previous two classes or on its previous n classes, it is recommended to use a second-order or higher-order Markov model accordingly.
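As a rough illustration of how a series of drought classes can be turned into a transition probability matrix, the following Python sketch counts step-lagged transitions and normalizes each row. The function name `transition_matrix`, the three-state toy alphabet, and the toy sequence are hypothetical and are not taken from the study, which used an R package for this task.

```python
from collections import Counter


def transition_matrix(classes, states, step=1):
    """Estimate step-lagged transition probabilities from a class sequence.

    Entry [i][j] is the share of occurrences of state i that are followed,
    `step` months later, by state j. A state never seen as a starting
    point gets a row of zeros.
    """
    counts = Counter(zip(classes[:-step], classes[step:]))
    matrix = []
    for i in states:
        row_total = sum(counts[(i, j)] for j in states)
        matrix.append(
            [counts[(i, j)] / row_total if row_total else 0.0 for j in states]
        )
    return matrix


# Hypothetical toy data: D = dry, N = near normal, W = wet.
states = ["D", "N", "W"]
series = list("NNDNNWNNNDDNNWWNNN")
P1 = transition_matrix(series, states, step=1)
for state, row in zip(states, P1):
    print(state, [round(p, 3) for p in row])
```

Passing `step=2`, `step=3`, and so on gives the multi-step matrices estimated directly from the data rather than by raising the one-step matrix to a power; which of the two the study intended is an assumption here.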
2.3. The Proposed Weighting Scheme
In this study, the time series of drought classes Zt determined by the SPTI drought index at the one-month time scale is treated as a series of correlated ordinal random variables. In this situation, instead of using the traditional correlation between the quantitative series of SPTI at various time lags, the interrater reliability measure Kappa is suggested to assess the relationship between the stochastic processes Zt and Zt−1 of SPTI. Moreover, on the same rationale as using autocorrelation as a weight, the interrater reliability among the ordinal classifications of the SPTI index at various time lags can be used to predict the present drought class. In previous studies, Sengupta et al. showed that the WMC method can make accurate predictions when the time series data exhibit a stochastic nature. Hence, in order to predict the occurrence of the next drought class, the interrater reliability coefficients at various lags are suggested as weights to adjust the predictive probabilities. For ordinal classifications, an interrater reliability measure is therefore more rational than correlation.
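To make the idea of a lagged agreement measure concrete, the sketch below computes Cohen's kappa with quadratic (squared) weights between an ordinal series and a copy of itself shifted by a given lag. It is a minimal stand-alone illustration, not the routine used in the study; the function names, the integer coding of classes as 0, 1, 2, and the toy series are assumptions.

```python
def weighted_kappa(a, b, n_classes):
    """Cohen's kappa with quadratic agreement weights.

    `a` and `b` are equal-length sequences of ordinal codes in
    0..n_classes-1. Undefined (division by zero) when the expected
    weighted agreement equals one, e.g. for a constant series.
    """
    n = len(a)
    # Joint proportions p_ij of the paired classifications.
    joint = [[0.0] * n_classes for _ in range(n_classes)]
    for x, y in zip(a, b):
        joint[x][y] += 1.0 / n
    row = [sum(r) for r in joint]
    col = [sum(joint[i][j] for i in range(n_classes)) for j in range(n_classes)]
    observed = 0.0
    expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            # Quadratic weight: 1 on the diagonal, 0 at maximal distance.
            w = 1.0 - (i - j) ** 2 / (n_classes - 1) ** 2
            observed += w * joint[i][j]
            expected += w * row[i] * col[j]
    return (observed - expected) / (1.0 - expected)


def lagged_kappa(codes, lag, n_classes):
    """Weighted kappa between the series and itself shifted by `lag`."""
    return weighted_kappa(codes[lag:], codes[:-lag], n_classes)


# Hypothetical coded series: 0 = dry, 1 = near normal, 2 = wet.
codes = [1, 1, 0, 1, 1, 2, 1, 1, 1, 0, 0, 1, 1, 2, 2, 1, 1, 1]
for lag in (1, 2, 3):
    print(lag, round(lagged_kappa(codes, lag, 3), 4))
```

Like an autocorrelation, the lagged value lies between −1 and 1, but it is computed from class agreement and therefore respects the ordinal coding of the drought classes.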
The basic steps of the proposed weighting scheme in the WMC-based prediction of drought classes are as follows:
(1) Drought classification states and transition probability matrix:
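Let $\{Z_t\}$ be the time series of drought classes, where $Z_t$ may assume the drought classes $S_1$, $S_2$, $S_3$, …, $S_n$ depending on the classification criteria of the SPTI drought index. The transient behavior of each drought class can be represented by the transition probability matrix in the following way:
$$P = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1n} \\ p_{21} & p_{22} & \cdots & p_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ p_{n1} & p_{n2} & \cdots & p_{nn} \end{pmatrix}, \tag{1}$$
where $S_1$, $S_2$, $S_3$, …, $S_n$ represent the drought classes that index the rows and columns of the transition probability matrix.
(2) Construction of transition probability matrix: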
In this step, we classify SPTI drought index with estimated one-month time scale according to the classification criteria (see Table 1).
Furthermore, let $f_{ij}^{(t)}$ be the number of transitions from state $i$ to state $j$ through $t$ steps in the time series of drought classes calculated at the one-month time scale. Here, the transition probabilities for various time steps and various drought classes can be obtained by the following equation:
$$p_{ij}^{(t)} = \frac{f_{ij}^{(t)}}{\sum_{j} f_{ij}^{(t)}}, \tag{2}$$
where $t$ represents the order of the Markov chain and the sum runs over all drought classes. Here, the transition probability matrix for various existing drought classes at the previous time step is represented as
$$P^{(t)} = \left(p_{ij}^{(t)}\right). \tag{3}$$
(3) Interrater reliability measures as weights:
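Let $p_{ij}(k)$ denote the proportion of drought classes $i$ and $j$ determined by the SPTI at lag $k$, where $Z_t$ and $Z_{t-k}$ represent time series on drought classes and $t$ represents the time point. Then, the weighted Kappa at lag $k$, as a measure of association for ordinal categorical sequences, is defined as
$$\kappa_w(k) = \frac{p_o(k) - p_e(k)}{1 - p_e(k)}, \tag{4}$$
where
$$p_o(k) = \sum_{i}\sum_{j} w_{ij}\, p_{ij}(k) \tag{5}$$
and
$$p_e(k) = \sum_{i}\sum_{j} w_{ij}\, p_{i\cdot}(k)\, p_{\cdot j}(k), \tag{6}$$
where $w_{ij} = 1 - (i-j)^2/(n-1)^2$ is the squared weight function in Cohen Kappa [35, 36] for $n$ drought classes, as suggested by Robieson and by King and Chinchilli, and $p_{i\cdot}(k)$ and $p_{\cdot j}(k)$ represent the marginal proportions assigned to the drought classes $i$ and $j$. In this study, the formula of weighted Kappa is modified on the same rationale as autocorrelation. We used the irr R package to compute the values of weighted Kappa at various time lags.
(4) Standardization of weights: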
The weights for the weighted Markov chain model are computed by standardizing Kappa coefficient computed by the following equation:
At this stage, the number of time steps for the weighted Kappa is decided according to the steady-state nature of the transition probability matrix. If the transition probability matrix approaches its steady state at time step s, then we calculate the weighted Kappa for lags 1 to s accordingly.
(5) Prediction
In this step, we take the most recent drought classification states as initial drought classes and combine them with the row vectors of their corresponding transition probability matrices. Here, the state transition probability vectors can be expressed in the following form:
$$P_i^{(t)} = \left(p_{i1}^{(t)}, p_{i2}^{(t)}, \ldots, p_{in}^{(t)}\right), \tag{8}$$
where $i$ represents the drought class, $n$ is the number of drought classes, and $t$ represents the order of the Markov chain.
Furthermore, the transition probability vectors in ascending powers for all the previous candidate drought classes are selected. The following mth order transition probability matrix shows how one can understand the above stated argument:where , and .
Here, and show the previous candidate drought classes and m represents the order of the Markov chain corresponding to each candidate drought class. Finally, by Equation (10), we assign the weights to each vector of candidate drought classes and obtain the predicted probabilities for each drought class. The drought class with the maximum predicted probability is then selected as the predicted drought class. Additionally, by iterating this algorithm, we can forecast drought classes n steps ahead under the WMC framework.
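The prediction step described above can be sketched in Python as a weighted sum of transition probability rows, one row per lag. This is only an illustration under stated assumptions: the weights are obtained here by normalizing the absolute lag-wise Kappa values so that they sum to one, which is the usual form in the weighted Markov chain literature but is not confirmed by the text as it stands; all function names, matrices, and Kappa values below are hypothetical.

```python
def standardize(kappas):
    """Scale lag-wise kappa values into weights that sum to one.

    Assumption: absolute values are normalized; the exact
    standardization formula is not reproduced in the text.
    """
    total = sum(abs(k) for k in kappas)
    return [abs(k) / total for k in kappas]


def wmc_predict(recent, step_matrices, weights):
    """Weighted sum of transition rows, one row per lag.

    recent[k-1]        : index of the class observed k months ago
    step_matrices[k-1] : k-step transition probability matrix
    weights[k-1]       : standardized weight for lag k
    """
    n_states = len(step_matrices[0])
    probs = [0.0] * n_states
    for state, matrix, w in zip(recent, step_matrices, weights):
        for j in range(n_states):
            probs[j] += w * matrix[state][j]
    return probs


# Hypothetical inputs for three classes: 0 = dry, 1 = near normal, 2 = wet.
P1 = [[0.5, 0.4, 0.1], [0.2, 0.6, 0.2], [0.1, 0.5, 0.4]]  # one-step matrix
P2 = [[0.3, 0.5, 0.2], [0.2, 0.6, 0.2], [0.2, 0.5, 0.3]]  # two-step matrix
weights = standardize([0.6, 0.2])   # hypothetical kappas at lags 1 and 2
recent = [1, 0]                     # last month: near normal; before: dry

probs = wmc_predict(recent, [P1, P2], weights)
predicted = max(range(len(probs)), key=probs.__getitem__)
print([round(p, 4) for p in probs], predicted)
```

Because each selected row and the weight vector both sum to one, the combined vector is again a probability distribution, and the class with the largest entry is taken as the forecast. Appending the forecast to `recent` and repeating gives a multi-step forecast.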
The combination of the proposed weighting scheme with SPTI-based historical drought classification is validated on five meteorological stations located in various regions of Pakistan. We used secondary data on monthly total precipitation and mean temperature to illustrate the WMC-based prediction process. These datasets were obtained from the Karachi Data Processing Center through the Pakistan Meteorological Department. A brief description of the study area and the application of the proposed framework for one-step ahead prediction of drought are provided in the following section.
3.1. Study Area
The proposed weighting scheme is illustrated for five meteorological stations of Pakistan: Astor, Chilas, Cherat, Skardu, and Peshawar. Figure 1 shows the location map of the study area. These stations have high variability in rainfall throughout the seasons. In each season, some of the stations continue to bear extremely vulnerable drought conditions. Drought has become a recurrent phenomenon in the country. In the recent decade, the economic system of the country was badly disturbed by severe drought hazard. Most parts of the country are arid to semiarid, with large spatial variability in temperature, except the southern slopes of the Himalayas and the submountain region, where the annual rainfall ranges from 760 mm to 2000 mm. Pakistan has four well-marked seasons: cold, from November to February; premonsoon (hot), from March to mid-June; monsoon, from mid-June to mid-September; and postmonsoon, from mid-September to October. The summer season is extremely hot, and the relative humidity ranges from 25% to 50%. In recent decades, several authors have explored the geographical and hydrological importance of these stations. Awan and Archer and Fowler explored and inferred different climatic variables in terms of regression, spatial correlation, and temporal variation. Ahmad et al. evaluated the significance of these mountainous areas, which have substantial potential for hydropower production and water resources. Ali et al. compared the performance of SPTI with SPI and SPEI using time series data on precipitation and temperature at these stations. We applied our proposed model to long-term monthly time series data on precipitation and on minimum and maximum rainfall recorded from January 1955 to December 2017.
The data were collected from the Karachi Data Processing Center (KDPC) through the Pakistan Meteorological Department (PMD). This dataset fulfills the WMO requirements; error checking, scrutiny, tabulation, and quality control are done by KDPC (http://www.pmd.gov.pk/rmc/RMCK/Services_Climatology.html). The following two steps describe the detailed procedure for the estimation of SPTI.
The first step consists of searching for appropriate probability distributions of DAI, as suggested by Stagge et al. Therefore, the entire computational procedure of SPTI values is based on a different probability model for each station. Consequently, the estimation procedure consists of searching for an appropriate candidate distribution using the Kolmogorov-Smirnov, Chi-Square, and Anderson-Darling tests at the most commonly used level of significance (0.05) with the EasyFit software.
In the second step, several parameter estimation methods (method of moments, method of maximum likelihood estimation, and method of L-moments) are incorporated using the R package lmom. The probability distribution with the minimal value of the Akaike information criterion (AIC) is then standardized to obtain the temporal values of each index accordingly.
Figures 2 and 3 show the fitted probability distributions and their corresponding temporal values of SPTI for the Astor and Cherat stations, respectively. The resulting values of SPTI are classified according to its classification criteria. To explore the behavior of the various drought classes, Figures 4 and 5 show the cumulative frequencies of the drought classes and the transitions from one drought class to another at the Astor observatory. At this station, most months bear near-normal weather conditions; however, compared with the other drought categories, the frequencies of the extremely dry and severely dry classes are quite high.
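The classification of standardized index values into ordinal drought classes can be sketched as a simple threshold rule. The cut points below are the conventional SPI-style thresholds (±1.0, ±1.5, ±2.0) and are an assumption, since the classification table referred to in the text is not reproduced here; the function name `classify` and the sample values are likewise hypothetical.

```python
def classify(index_value):
    """Map a standardized index value to an ordinal drought class.

    Assumed SPI-style cut points; replace them with the criteria of the
    index actually in use.
    """
    if index_value >= 2.0:
        return "extremely wet"
    if index_value >= 1.5:
        return "severely wet"
    if index_value >= 1.0:
        return "moderately wet"
    if index_value > -1.0:
        return "near normal"
    if index_value > -1.5:
        return "moderately dry"
    if index_value > -2.0:
        return "severely dry"
    return "extremely dry"


# Hypothetical monthly index values.
sample = [0.3, -1.2, 1.7, -2.4, 1.0, -1.0]
classes = [classify(v) for v in sample]
print(classes)
```

The resulting list of labels is the kind of ordinal sequence on which the transition probability matrices and the lagged Kappa values are computed.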
To test the proposed framework of WMC-based prediction, we first compute the transition probability matrices at various orders from the historical classifications of drought classes. In the current research, we use the markovchain R package to construct the transition probability matrices for all the stations. Secondly, weights at various lags are computed from the ordinal categorical data on the historical classification of drought classes by using Equations (4) and (7). To illustrate the steps associated with the proposed framework, we provide the stepwise numerical results for Astor station. Table 2 shows the values of Kappa and the associated weights computed from the temporal classification of ordinal drought classes for Astor station. To predict the future drought class, the original data on drought classes from June 2017 to December 2017 are arranged in chronological order. From June 2017 to December 2017, each month bears a near-normal situation. These drought classes are taken from the original classification of the SPTI drought index. To infer the probabilities of the next drought classes, the transition probability vector for each order is organized in matrix form. Table 3 shows the transition probability matrix in varying order with the corresponding weights.
In the next step, according to Equation (10), the weighted sums of the probabilities are computed for each drought class. In this numerical example, the near-normal drought class receives the maximum probability of occurrence () in January 2018. The actual drought category is also near normal; hence, the method makes a correct prediction. The next most likely class is moderately wet, with a probability of 0.1146. Following Zhou et al. and Zhang et al., by taking the predicted drought class as a reference category, the whole process may be iterated to predict drought classes for March 2018 and so forth. Here, the interrater reliability measure Kappa plays a role in adjusting the long-run convergence error in the Markov chain.
In line with Astor station, numerical investigations are carried out for the Skardu, Chilas, Cherat, and Peshawar stations. A visual representation of the one-step transition probability matrix can be seen in Figure 6, where the temporal profiles of drought classes in Peshawar and Cherat are explored. At these stations, each drought class has a high probability of moving to the near-normal drought class. To adjust the temporal behavior of the Markov chain, it is assumed that, by assigning appropriate weights to each drought classification, accurate forecasts one or more months ahead can be obtained under the weighted Markov chain framework. Therefore, the accuracy of the proposed model is assessed by cross-validation of the predicted drought class against the original classification. Consequently, we held out January 2018 for the validation phase of the proposed framework for all the stations.
Table 4 shows the values of squared Kappa at each time lag, their standardized values (weights), and the one-month ahead predicted probabilities of each drought class. Chilas station has a high probability () of the near-normal drought class in January 2018. In the same way, Cherat and Peshawar have the same drought classification, with probabilities 0.6948 and 0.6623, respectively. This is because, at these stations, the observed and historical rainfall quantities are quite high in December and January. Moreover, the original series of SPTI drought classes depicted wet classes for these two months in all stations. In contrast to the near-normal drought class, Skardu observatory will bear the severe drought class with a probability of 0.5003.
This paper provides a new procedure to handle the prediction problem for ordinal categorical series of drought classes determined by the SPTI drought index. To assess the proposed method, we took five meteorological stations located in different climatic zones of Pakistan (see Figure 1). In this research, monthly time series data on rainfall and temperature from 1955 to 2017 were classified according to the classification criterion given in Ali et al. By assuming each temporal series of drought classes to be a first-order Markov chain, the current research employed the markovchain R package to construct the transition probability matrices. Within the WMC setting, the proposed weighting scheme was used to predict the future drought condition of all the stations. We provide the stepwise procedure for Astor station (see Tables 2 and 3), whereas prediction results are given for all the stations (see Table 4).
To assess the consistency of the proposed method, we compared our prediction results with the steady-state probabilities of each drought class. These long-term probabilities can also be used to cross-validate the observed probabilities. In Chilas, Cherat, and Peshawar, the predicted drought classes are consistent with their long-term probabilities. However, in Skardu, a significant difference is observed, where the long-term probability of the near-normal drought class is 0.0563 and the predicted probability is 0.2483. This reflects the appropriateness of the proposed weighting scheme for the ordinal classification of a discrete stochastic process.
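The steady-state (long-term) probabilities used for this comparison can be approximated by repeatedly multiplying a starting distribution by the one-step transition matrix until it stops changing. The Python sketch below shows this power-iteration idea on a hypothetical two-class matrix; the function name and the numbers are illustrative and are not taken from the study.

```python
def steady_state(matrix, tol=1e-12, max_iter=10_000):
    """Approximate the stationary distribution by power iteration.

    Starts from the uniform distribution and applies dist <- dist * P
    until successive iterates differ by less than `tol`. Assumes the
    chain is regular, so that a unique limit exists.
    """
    n = len(matrix)
    dist = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, dist)) < tol:
            return nxt
        dist = nxt
    return dist


# Hypothetical two-class chain: 0 = dry, 1 = near normal.
P = [[0.50, 0.50],
     [0.25, 0.75]]
pi = steady_state(P)
print([round(p, 4) for p in pi])
```

For this toy matrix the limit is one third for the first class and two thirds for the second, which can be checked by hand from the balance equation 0.5·π₀ = 0.25·π₁. The number of iterations needed to converge also gives a rough idea of the time step at which the transition matrix reaches its steady state.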
In summary, the outcomes of this research show that the proposed weighting scheme can be incorporated to adjust the structure of the traditional Markov chain for short-term prediction. Additionally, our detailed analysis supports the suitability of using interrater reliability, instead of autocorrelation, as a weight in the WMC model. Therefore, the decline from high to low accuracy can be controlled by adjusting the structural behavior of the transition probability vectors through the proposed weighting scheme. However, a limitation of the proposed method is that it does not consider the nonstationary behavior of the Markov chain. Moreover, in the computations, the study assumed each Markov chain to be a first-order Markov process.
Prediction and forecasting play a very important role, especially in early warning situations. Consequently, accurate and precise drought forecasting techniques may reduce the severe effects of drought by supporting effective drought mitigation policies. In this article, the SPTI drought index, being a more comprehensive drought monitoring procedure, is used to classify the historical monthly drought profile. By introducing the standardized squared weighted Kappa as a weight, the research suggests a new way to obtain adjusted prediction probabilities under the WMC framework. Furthermore, multistep forecasting can be achieved simply by changing the transition probability vectors and rearranging the vectors of weights. Therefore, among numerous other studies and forecasting frameworks, the uniqueness of this research is the introduction of an ordinal measure of association at various lags in the WMC-based prediction method. Consequently, when SPTI or other multiscalar drought indices such as SPI and SPEI are used to characterize meteorological stations by monthly ordinal drought classifications, it is more reasonable to use an ordinal measure of association instead of correlation.
The data used to support the findings of this study are available from the corresponding author upon request.
The manuscript is prepared in accordance with the ethical standards of the responsible committee on human experimentation and with the latest (2008) version of Helsinki Declaration of 1975.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
The authors are very grateful to the Deanship of Scientific Research at King Khalid University, Kingdom of Saudi Arabia, for their administrative and technical support.
G. F. White, Natural Hazards, Local, National, Global, Oxford University Press, Oxford, UK, 1974.
B. Edwards, M. Gray, and B. Hunter, “A sunburnt country: the economic and financial impact of drought on rural and regional families in Australia in an era of climate change,” Australian Journal of Labour Economics, vol. 12, no. 1, p. 109, 2009.
M. Svoboda and B. Fuchs, Handbook of Drought Indicators and Indices, World Meteorological Organization, Geneva, Switzerland, 2016.
K. Lange, Numerical Analysis for Statisticians, Springer Science and Business Media, Berlin, Germany, 2010.
C. Chatfield, The Analysis of Time Series: an Introduction, CRC Press, Boca Raton, FL, USA, 2016.
X. Zhou, Y. Wang, and X. Zhou, “Precipitation estimation based on weighted Markov chain model,” in Proceedings of 2017 Seventh International Conference on Information Science and Technology (ICIST), pp. 64–68, IEEE, Da Nang, Vietnam, April 2017.
P. Jun and W. Hao-han, “The application of weighted Markov chain in Hunhe flood control planning,” Jilin Water Resources, vol. 8, p. 012, 2015.
J. Chen and Y. Yang, “SPI-based regional drought prediction using weighted Markov chain model,” Research Journal of Applied Sciences, Engineering and Technology, vol. 4, no. 21, pp. 4293–4298, 2012.
P. J. Bhakta, “Markov chains for weighted lattice structures,” Georgia Institute of Technology, Atlanta, GA, USA, 2016, Ph.D. thesis.
J. B. Welsh, L. M. Sapinoso, A. I. Su et al., “Analysis of gene expression identifies candidate markers and pharmacological targets in prostate cancer,” Cancer Research, vol. 61, no. 16, pp. 5974–5978, 2001.
Q. X. Zhou, Y. D. Wu, H. X. Fan, X. Wang, L. H. Sun, and S. Z. Wang, “Weight calculation and application of weighted Markov,” Journal of Harbin University of Commerce, vol. 6, p. 028, 2014.
T. B. McKee, N. J. Doesken, and J. Kleist, “The relationship of drought frequency and duration to time scales,” in Proceedings of the 8th Conference on Applied Climatology, vol. 17, no. 22, pp. 179–183, American Meteorological Society, Boston, MA, USA, January 1993.
Z. Ali, I. Hussain, M. Faisal et al., “Forecasting drought using multilayer perceptron artificial neural network model,” Advances in Meteorology, vol. 2017, Article ID 5681308, 9 pages, 2017.
Y. Gui and J. Shao, “Prediction of precipitation based on weighted Markov chain in Dangshan,” in Proceedings of the International Conference on High Performance Compilation, Computing and Communications, pp. 81–85, ACM, Kuala Lumpur, Malaysia, March 2017.
W. Z. Robieson, “On weighted Kappa and concordance correlation coefficient,” Graduate College, University of Illinois at Chicago, Chicago, IL, USA, 1999, Ph.D. thesis.
M. Gramer, J. Lemon, I. Fellows, and P. Singh, Various Coefficients of Interrater Reliability and Agreement, 2012.
Q. Z. Chaudhry, “Construction of all Pakistan monsoon rainfall series 1901–2008,” Pakistan Journal of Meteorology, vol. 6, no. 12, 2009.
J. A. Khan, The Climate of Pakistan, Rehbar Publishers, Karachi, Pakistan, 1993.
S. A. Awan, “The climate and flood risk potential of northern areas of Pakistan,” Science Vision, vol. 7, no. 3–4, pp. 100–109, 2002.
J. R. Hosking, L-Moments, Wiley StatsRef: Statistics Reference Online, 2009.
G. A. Spedicato and M. Signorelli, The R Package “Markovchain”: Easily Handling Discrete Markov Chains in R, 2014.
C. Zhang, X. Chen, X. Chen, K. Yu, G. Pan, and X. Zhang, “Analysis and prediction of annual precipitation based on weighted Markov chain in typical region of Taihu lake basin,” Bulletin of Soil and Water Conservation, vol. 1, p. 032, 2015.