Research Article | Open Access
Forecasting Rice Productivity and Production of Odisha, India, Using Autoregressive Integrated Moving Average Models
Rice area, production, and productivity of Odisha were forecast from the historical data of 1950-51 to 2008-09 using univariate autoregressive integrated moving average (ARIMA) models, and the forecasts were compared with those for all-India data. The autoregressive (p) and moving average (q) parameters were identified from the significant spikes in the plots of the partial autocorrelation function (PACF) and autocorrelation function (ACF) of the different time series. An ARIMA (2, 1, 0) model was found suitable for all-India rice productivity and production, whereas ARIMA (1, 1, 1) fitted best for rice productivity and production in Odisha. Predictions were made for the immediate next three years, that is, 2007-08, 2008-09, and 2009-10, using the best fitted ARIMA models, selected on the minimum value of the selection criteria, that is, the Akaike information criterion (AIC) and the Schwarz-Bayesian information criterion (SBC). Model performance was validated by the percentage deviation of the forecasts from the actual values and by the mean absolute percent error (MAPE), which was 0.61% and 2.99% for the area under rice in Odisha and India, respectively. For rice production and productivity in Odisha and India, the MAPE was less than 6%.
1. Introduction
Rice is one of the most important cereal crops of India, occupying an area of 41.92 million hectares with an annual production of 89.09 million tonnes and an average productivity of 2.13 t ha−1 (2009-10) (http://www.agricoop.nic.in/). It plays a vital role in national food security and will continue to do so because of its adaptability to diverse ecosystems. Rice contributes 40.8% of total food grain and remains the principal source of livelihood for more than 58% of the population. With the area under rice stabilizing at around 42 million hectares, productivity plateauing or declining, especially in the Northern and Southern zones, and natural resource bases shrinking, the only opportunities for sustaining the current level of sufficiency are seen in the vast underexploited potential of rainfed Eastern India.
A proper trend analysis and forecast of production of such an important crop in the potential Eastern Region is significant on many counts. Critical analysis of production and productivity is a prerequisite for a proper knowledge base on the ecology and for appropriate research and development efforts to realize the maximum possible potential. Trend analysis has been attempted for crops like papaya and garlic by several authors [2–4]. An unexpected decrease in production reduces the marketable surplus and the income of farmers and leads to a price rise. Similarly, an increase in production can lead to a sharp decrease in prices and adversely affect farmers’ incomes. The price of an essential commodity plays a significant role in determining the inflation rate, wages, salaries, and various policies in an economy. A proper forecast would pave the way for appropriate surplus and deficit management to stabilize prices and ensure profits for farmers.
Several techniques, such as simulation modelling and remote sensing, are widely used for forecasting crop yield and acreage. Sometimes, however, a forecast is needed well before the crop harvest or even before planting. This can be achieved only by modeling past data and generating predictions from the model. Autoregressive integrated moving average (ARIMA) models are built from past data and then used to make predictions. ARIMA models have been developed to forecast the cultivable area, production, and productivity of various crops of Tamil Nadu [5, 6] and wheat production in Pakistan and Canada. Univariate forecasting of state-level agricultural production has also been carried out by various authors using ARIMA models [9–12].
Keeping the above requirement in view, the present study was carried out to (i) analyze the trends of production, productivity, and area under rice in Odisha, an Eastern Indian state, and compare them with the all-India scenario and (ii) forecast and validate the rice area, production, and productivity using ARIMA models.
2. Materials and Methods
2.1. Data Collection
The data on cultivable area, production, and productivity of rice in Odisha were collected from the annual report “Orissa agriculture and statistics” published by the Directorate of Agriculture and Food Production, Government of Odisha, Bhubaneswar, India. The same data for India were obtained from the Directorate of Economics and Statistics, Department of Agriculture and Cooperation, India. The data pertaining to the agricultural years 1950-1951 to 2006-2007 were used for model building and forecasting. The data of 2007-08, 2008-09, and 2009-10 were used for validation of the model.
2.2. Trend Analysis
The time series data pertaining to rice area, productivity, and production in Odisha as well as India were analyzed using the Mann-Kendall trend test to assess the trend present in the data. The test was first used by Mann and Kendall, and the distribution of the test statistic was subsequently derived [15, 16]. This hypothesis test is a nonparametric, rank-based method for evaluating the presence of trends in time series data. The data are ranked according to time, and then each data point is successively treated as a reference data point and compared to all data points that follow it in time. Compared with parametric statistical tests, nonparametric tests are thought to be more suitable for nonnormally distributed data. Since the time series data used in the study are mostly nonnormally distributed, as evident from the skewness and kurtosis values given in Table 1, nonparametric tests were used in the study.
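The Mann-Kendall test statistic is given by

$$S = \sum_{k=1}^{n-1} \sum_{j=k+1}^{n} \operatorname{sgn}(x_j - x_k),$$

where $x_j$ and $x_k$ are the sequential data values, $n$ is the data set record length, and

$$\operatorname{sgn}(\theta) = \begin{cases} +1 & \text{if } \theta > 0, \\ 0 & \text{if } \theta = 0, \\ -1 & \text{if } \theta < 0. \end{cases}$$

The Mann-Kendall test has two parameters that are of importance to the trend detection. These parameters are the significance level, which indicates the trend’s strength, and the slope magnitude estimate, which indicates the direction as well as the magnitude of the trend.

For independent, identically distributed random variables with no tied data values, we have

$$E(S) = 0, \qquad \operatorname{Var}(S) = \frac{n(n-1)(2n+5)}{18}.$$

When some data values are tied, the correction to $\operatorname{Var}(S)$ is

$$\operatorname{Var}(S) = \frac{n(n-1)(2n+5) - \sum_{i=1}^{n} t_i \, i(i-1)(2i+5)}{18},$$

where $t_i$ denotes the number of ties of extent $i$. For $n$ larger than 10, the test statistic

$$Z = \begin{cases} \dfrac{S-1}{\sqrt{\operatorname{Var}(S)}} & \text{if } S > 0, \\ 0 & \text{if } S = 0, \\ \dfrac{S+1}{\sqrt{\operatorname{Var}(S)}} & \text{if } S < 0 \end{cases}$$

follows the standard normal distribution. The magnitude of trend slopes can also be calculated (Sen, 1968). Sen’s estimate for slope is associated with the Mann-Kendall test as follows:

$$Q_i = \frac{x_j - x_k}{j - k}, \qquad i = 1, \ldots, N,$$

where $x_j$ and $x_k$ are the data values at times $j$ and $k$ ($j > k$), correspondingly. The median of these $N$ values of $Q_i$ is Sen’s estimator of slope, which is given as

$$\beta = \begin{cases} Q_{(N+1)/2} & \text{if } N \text{ is odd}, \\ \dfrac{1}{2}\left(Q_{N/2} + Q_{(N+2)/2}\right) & \text{if } N \text{ is even}. \end{cases}$$

A positive value of $\beta$ indicates an upward trend, whereas a negative value represents a downward trend.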
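As an illustration of the test described above, the following minimal Python sketch computes the Mann-Kendall statistic, its tie-corrected variance, the normal approximation, and Sen’s slope. The function names `mann_kendall` and `sen_slope` and the sample series are hypothetical and are not taken from the study, which used SAS; equally spaced observations are assumed.

```python
import math
from collections import Counter
from itertools import combinations
from statistics import median


def mann_kendall(x):
    """Return (S, Var(S), Z, two-sided p-value) for the sequence x."""
    n = len(x)
    # S sums the sign of (later value - earlier value) over all pairs.
    s = sum((xj > xk) - (xj < xk) for xk, xj in combinations(x, 2))
    # Tie correction: each group of t equal values contributes t(t-1)(2t+5).
    ties = sum(t * (t - 1) * (2 * t + 5) for t in Counter(x).values() if t > 1)
    var_s = (n * (n - 1) * (2 * n + 5) - ties) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value, standard normal
    return s, var_s, z, p


def sen_slope(x):
    """Median of (x_j - x_k) / (j - k) over all pairs with j > k."""
    slopes = [(x[j] - x[k]) / (j - k) for k, j in combinations(range(len(x)), 2)]
    return median(slopes)


# Made-up yearly values, for illustration only (not data from the study).
demo = [520, 540, 510, 600, 650, 640, 700, 720, 690, 800, 820]
s, var_s, z, p = mann_kendall(demo)
print(f"S={s}, Var(S)={var_s:.2f}, Z={z:.2f}, p={p:.4f}, slope={sen_slope(demo):.2f}")
```

The normal approximation is appropriate here only because the sample series has more than 10 values; for shorter series an exact table for S would be needed.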
2.3. ARIMA Model
The ARIMA model analyzes and forecasts equally spaced univariate time series data. An ARIMA model predicts a value in a response time series as a linear combination of its own past values and past errors. The ARIMA approach was first popularized by Box and Jenkins, and ARIMA models are often referred to as Box-Jenkins models. In this study, the analysis performed by ARIMA is divided into three stages.
2.4. Notation for Pure ARIMA Models
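Consider

$$W_t = \mu + \frac{\theta(B)}{\phi(B)} a_t,$$

where $t$ indexes time, $W_t$ is the response series $Y_t$ or a difference of the response series, $\mu$ is the mean term, $B$ is the backshift operator, that is, $B X_t = X_{t-1}$, $\phi(B)$ is the autoregressive operator, represented as a polynomial in the backshift operator: $\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$, $\theta(B)$ is the moving average operator, represented as a polynomial in the backshift operator: $\theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$, and $a_t$ is the independent disturbance, also called the random error. For simple differencing, $W_t = (1 - B)^d Y_t$, where $d$ is the order of differencing.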
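To make the operator notation concrete, the general form can be written out for an ARIMA (1, 1, 1) model, the order reported in the abstract for rice productivity and production in Odisha. This is a sketch under the usual Box-Jenkins sign convention; the symbols $\phi_1$ and $\theta_1$ for the single autoregressive and moving average coefficients, $Y_t$ for the original series, and $W_t$ for its first difference are assumptions of this illustration.

```latex
\begin{align*}
W_t &= (1 - B)\,Y_t = Y_t - Y_{t-1}, \\
(1 - \phi_1 B)(W_t - \mu) &= (1 - \theta_1 B)\,a_t, \\
W_t &= \mu(1 - \phi_1) + \phi_1 W_{t-1} + a_t - \theta_1 a_{t-1}, \\
Y_t &= Y_{t-1} + \mu(1 - \phi_1) + \phi_1\,(Y_{t-1} - Y_{t-2}) + a_t - \theta_1 a_{t-1}.
\end{align*}
```

A one-step-ahead forecast follows from the last line by setting the future disturbance $a_t$ to its expected value of zero and replacing $a_{t-1}$ with the most recent residual.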
Identification Stage. A stationarity check of the time series data revealed that rice area, production, and productivity for India as well as for Odisha were nonstationary, except for the area under rice in Odisha. The nonstationary time series were made stationary by first order differencing. Best fit ARIMA models were developed using the data from 1951 to 2007 and used to forecast the cultivable area, production, and productivity of rice for Odisha and India for the next three years, that is, 2007-2008, 2008-2009, and 2009-10.
Candidate ARIMA models were identified by finding initial values for the orders of the nonseasonal parameters “p” and “q.” These were obtained by looking for significant spikes in the autocorrelation and partial autocorrelation functions. At the identification stage, one or more models were tentatively chosen that seemed to provide statistically adequate representations of the available data. Precise estimates of the model parameters were then obtained by least squares.
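The identification step can be sketched in plain Python as follows: the series is differenced, the sample ACF and PACF are computed, and lags whose values exceed the approximate 95% bound of ±1.96/√n are flagged as significant spikes. The helper names, the bound, and the sample series are assumptions of this illustration; the study itself used SAS, whose plots and limits may be computed differently.

```python
import math


def difference(series, d=1):
    """Apply d rounds of first-order differencing."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(v * v for v in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]


def pacf(series, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(series, max_lag)
    phi_prev, out = [], []
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = r[0]
        else:
            num = r[k - 1] - sum(phi_prev[j] * r[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * r[j] for j in range(k - 1))
            phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        phi_new = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi_new.append(phi_kk)
        phi_prev = phi_new
        out.append(phi_kk)
    return out


# Made-up series, for illustration only (not data from the study).
series = [4.1, 4.3, 4.2, 4.6, 4.8, 4.7, 5.1, 5.0, 5.4, 5.3, 5.7, 5.9, 5.8, 6.2, 6.4]
w = difference(series, d=1)
bound = 1.96 / math.sqrt(len(w))  # approximate 95% limit for a white noise series
for name, values in (("ACF", acf(w, 4)), ("PACF", pacf(w, 4))):
    spikes = [lag for lag, v in enumerate(values, start=1) if abs(v) > bound]
    print(name, [round(v, 2) for v in values], "significant lags:", spikes)
```

A spike pattern that cuts off in the PACF while the ACF decays suggests an autoregressive order, and the reverse suggests a moving average order; the orders chosen this way are only starting values for the estimation stage.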
Estimation Stage. ARIMA models were fitted, and the accuracy of each model was tested on the basis of diagnostic statistics.
Diagnostic Checking. The best model was selected based on the following diagnostics.
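(i) Low Akaike Information Criterion (AIC). AIC is estimated by $\mathrm{AIC} = -2 \ln L + 2m$, where $m = p + q$ and $L$ is the likelihood function.
Sometimes, SBC is also used and estimated by $\mathrm{SBC} = -2 \ln L + m \ln(n)$, where $n$ is the number of observations used in fitting.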
(ii) Insignificance of Autocorrelations for Residuals. If a model is an adequate representation of a time series, it should capture all the correlation in the series, and the white noise residuals should be independent of each other.
(iii) Significance of the Parameters. Significance tests for parameter estimates indicate whether some terms in the model might be unnecessary.
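The selection rule in (i) can be sketched as follows: compute AIC and SBC for each candidate model and keep the one with the smallest value. The log-likelihood values below are invented purely to show the mechanics and are not results from the study; the parameter count is assumed to be the number of autoregressive plus moving average terms, and `n_obs = 56` assumes 57 yearly observations reduced by one through first differencing.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 ln(L) + 2m."""
    return -2.0 * log_likelihood + 2.0 * n_params


def sbc(log_likelihood, n_params, n_obs):
    """Schwarz-Bayesian criterion: -2 ln(L) + m ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Hypothetical candidates: (label, maximized log-likelihood, number of parameters).
candidates = [
    ("ARIMA(1,1,0)", -310.2, 1),
    ("ARIMA(0,1,1)", -309.8, 1),
    ("ARIMA(1,1,1)", -306.1, 2),
    ("ARIMA(2,1,1)", -305.9, 3),
]
n_obs = 56  # assumed: 57 yearly values minus one lost to differencing

best_by_aic = min(candidates, key=lambda c: aic(c[1], c[2]))
best_by_sbc = min(candidates, key=lambda c: sbc(c[1], c[2], n_obs))
for label, loglik, m in candidates:
    print(f"{label}: AIC={aic(loglik, m):.2f}, SBC={sbc(loglik, m, n_obs):.2f}")
print("Lowest AIC:", best_by_aic[0], "| lowest SBC:", best_by_sbc[0])
```

Because SBC penalizes each extra parameter by ln(n) rather than 2, it tends to prefer the more parsimonious model when the two criteria disagree.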
Forecasting Stage. Future values of the time series are forecasted.
2.5. Model Evaluation
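The mean absolute percent error (MAPE) as defined below was used as a measure of accuracy of the models:

$$\mathrm{MAPE} = \frac{100}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right|,$$

where $F_t$ is the forecasted value, $A_t$ is the actual value, and $n$ is the number of observations.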
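A minimal Python sketch of this accuracy measure is given below. The sign convention for the percentage deviation (actual minus forecast, so that a negative value means the forecast exceeds the actual) follows the statement made in the forecasting results; the function names and the numbers are hypothetical.

```python
def percent_deviation(actual, forecast):
    """Signed deviation of the forecast from the actual value, in percent."""
    return 100.0 * (actual - forecast) / actual


def mape(actuals, forecasts):
    """Mean absolute percent error over paired actual and forecast values."""
    devs = [abs(percent_deviation(a, f)) for a, f in zip(actuals, forecasts)]
    return sum(devs) / len(devs)


# Made-up numbers, for illustration only (not data from the study).
actuals = [100.0, 200.0, 400.0]
forecasts = [110.0, 190.0, 400.0]
print([percent_deviation(a, f) for a, f in zip(actuals, forecasts)])  # [-10.0, 5.0, 0.0]
print(mape(actuals, forecasts))  # 5.0
```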
SAS 9.2 software (SAS Institute, Inc., Cary, NC) was used for time series analysis and developing ARIMA models and forecasting.
3. Results and Discussion
3.1. Trend Analysis
Descriptive statistics for the time series data of rice area, production, and productivity for both Odisha and India are given in Table 1. The time series data are plotted in Figure 1. The time series data for rice area, production, and productivity are nonnormal, as can be assessed from their probability density plots and the values of skewness and kurtosis. Hence, the nonparametric Mann-Kendall test was performed to test the significance of the trend. As evident from the values of the Mann-Kendall statistic and Sen’s slope estimate (β), the time series data for all the parameters selected for analysis showed a significant and positive trend. The Mann-Kendall value as well as the magnitude of the slope indicated that the rate of increase in area, production, and productivity was lower in Odisha than in the all-India scenario.
The trend in the long-term time series (1950-51 to 2006-07) for the area under rice was positive, with values of 0.01 and 0.26 for Odisha and India, respectively. The low values can be explained by the fact that the area under rice has remained more or less constant over the last 10 years owing to competition from urbanization and industrialization. The area under rice in India was 43.45 million hectares in 1997-98 and 43.81 million hectares in 2006-07, while over the same period the area under rice in Odisha fell from 4.50 million hectares to 4.45 million hectares. It is evident that the area under rice has plateaued in the last decade and that the only option available for increasing rice production is vertical expansion.
Trend analysis also showed a considerable increase in the all-India average productivity of rice, from 668 kg ha−1 in 1950-51 to 2131 kg ha−1 in 2006-07; during the same period, rice productivity in Odisha increased from 520 kg ha−1 to 1557 kg ha−1. The rate of increase of productivity in Odisha is lower than the all-India average, as evident from Sen’s slope estimates of 25.32 kg ha−1 year−1 and 16.39 kg ha−1 year−1 for India and Odisha, respectively, indicating an untapped growth potential for rice in Odisha. To tap this potential, the Government of India launched the programme “Bringing Green Revolution in Eastern India” in 2010-11.
3.2. Building ARIMA Models
The autoregressive (p) and moving average (q) parameters were identified based on the significant spikes in the plots of the PACF and ACF of the different time series. While identifying the best fit ARIMA models, appropriate values of p, d, and q were chosen corresponding to the minimum value of the selection criteria, that is, AIC and SBC. The best fit models for rice area, production, and productivity of Odisha and India, along with their AIC and SBC, are given in Table 2. The estimates of the autoregressive and moving average parameters, along with the constant term, are presented in Table 3. It is clear from the significance values reported there that all the parameter estimates were significant, which is an essential criterion for ARIMA models. It is evident from Figure 2(a) that the ACF of area under rice for India has a significant spike at lag 1, while the PACF declines gradually (Figure 3(a)), which indicated a moving average model of first order.
Similarly, significant spikes at lag 2 in the PACF of rice productivity and production of India indicated a second order autoregressive model, ARIMA (2, 1, 0), which was found to be the best fit. A significant spike at lag 2 of the PACF (Figure 3(d)) and a gradually declining ACF (Figure 2(d)) for area under rice for Odisha indicated a pure autoregressive model of order 2, which was found to be the best fit. Significant spikes at lag 1 in both the ACF (Figures 2(e) and 2(f)) and the PACF (Figures 3(e) and 3(f)) indicated a first order autoregressive as well as moving average model for both productivity and production of Odisha. The ACF and PACF of the residuals of the fitted models were plotted and lay within the limits, which showed that the ARIMA models fitted well.
3.3. Forecast Using ARIMA Models
The observed and predicted values for rice area, production, and productivity, along with the percentage deviations, are presented in Table 4. The forecasted values of the cultivable area of rice for Odisha for the years 2007-08, 2008-09, and 2009-10 were 4.46, 4.45, and 4.44 million hectares, with deviations of −0.22, 0, and −1.60%, respectively. A negative % deviation shows that the predicted value was higher than the actual value. Similarly, the forecasted values of the cultivable area for India for the years 2007-08, 2008-09, and 2009-10 were 43.88, 44.12, and 44.35 million hectares, with deviations from the actual values of 0.07%, 3.12%, and −5.80%, respectively.
The forecasted values of rice productivity for Odisha for 2007-08, 2008-09, and 2009-10 were 1503.89, 1544.61, and 1544.45 kg ha−1, with deviations of 12.56, −1.02, and 1.93%, respectively. The higher deviation of 12.56% was due to a jump in rice productivity in Odisha in 2007-08 relative to the average productivity (Table 1). The forecasted values of rice productivity in India for the years 2007-08, 2008-09, and 2009-10 were 2121.24, 2173.59, and 2190.89 kg ha−1, with deviations in prediction of 3.67, 0.20, and −3.10%, respectively (Table 4).
The forecasted production of rice in Odisha was 6.77, 6.95, and 6.99 million tonnes for the years 2007-08, 2008-09, and 2009-10, with prediction deviations of 11.62, −2.06, and −1.01%, respectively. The large deviation for 2007-08 was due to the high average productivity in that year. Similarly, the deviations in predicted production for India were 4.62, 3.95, and −7.67%, respectively. Across the three validation years, the % deviation in prediction for area under rice ranged from −5.80 to 3.12 for India and from −1.60 to 0 for Odisha; for rice productivity, from −3.10 to 3.67 for India and from −1.02 to 12.56 for Odisha; and for rice production, from −7.67 to 4.62 for India and from −2.06 to 11.62 for Odisha. The MAPE was within 6% for all the forecasted parameters for Odisha as well as for India.
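The statement that the MAPE stayed within 6% can be checked directly from the percentage deviations quoted in this section, since the MAPE is the mean of their absolute values. The short Python sketch below does this; the dictionary layout and names are choices of this illustration, while the numbers are the deviations reported in the text for 2007-08, 2008-09, and 2009-10.

```python
# Percentage deviations quoted in the text for 2007-08, 2008-09, and 2009-10.
reported_deviations = {
    ("Odisha", "area"): [-0.22, 0.0, -1.60],
    ("India", "area"): [0.07, 3.12, -5.80],
    ("Odisha", "productivity"): [12.56, -1.02, 1.93],
    ("India", "productivity"): [3.67, 0.20, -3.10],
    ("Odisha", "production"): [11.62, -2.06, -1.01],
    ("India", "production"): [4.62, 3.95, -7.67],
}


def mape_from_deviations(deviations):
    """MAPE as the mean of the absolute percentage deviations."""
    return sum(abs(d) for d in deviations) / len(deviations)


mape_by_series = {key: mape_from_deviations(devs)
                  for key, devs in reported_deviations.items()}
for (region, variable), value in mape_by_series.items():
    print(f"{region:6s} {variable:12s} MAPE = {value:.2f}%")
```

For the area under rice this gives about 0.61% for Odisha and just under 3.00% for India, in line with the values of 0.61 and 2.99% reported in the abstract, and every series stays below 6%.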
4. Conclusions
The trend analysis of the rice data showed an increasing productivity and production trend for both Odisha and India; the rate of increase was lower in Odisha than the all-India average. This may be attributed to underexploitation of the potential of the state owing to low input use in agricultural operations and to other biotic and abiotic factors. To bridge the gap between existing and potential productivity, rice varieties suited to different ecologies can be introduced in farmers’ fields along with appropriate nutrient and agronomic management practices. Based on the forecasting and validation results, it may be concluded that ARIMA models can be used successfully for forecasting rice area, production, and productivity of Odisha as well as India for the immediately subsequent years.
(i) Trend analysis of rice area, production, and productivity of Odisha vis-à-vis India was carried out from the historical data of 1950-51 to 2008-09.
(ii) Forecasting of rice area, production, and productivity of Odisha vis-à-vis India was made from the historical data using ARIMA models.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
The authors thank the Director of the Central Rice Research Institute, Cuttack, Odisha (India), for providing all the help for performing this study.
- E. A. Siddiq, “Bridging the rice yield gap in India,” in Bridging the Rice Yield Gap in the Asia-Pacific Region, M. K. Papademetriou, F. J. Dent, and E. M. Herath, Eds., pp. 84–111, Food and Agriculture Organization of the United Nations Regional Office for Asia and the Pacific, Bangkok, 2000.
- P. K. Sen, “Estimates of the regression coefficient based on Kendall’s tau,” Journal of the American Statistical Association, vol. 63, pp. 1379–1389, 1968.
- S. C. Srivastava, U. C. Sharma, B. K. Singh, and H. S. Yadava, “A profile of garlic production in India: facts, trends and opportunities,” International Journal of Agriculture, Environment and Biotechnology, vol. 5, no. 4, pp. 477–482, 2012.
- M. Mahesh and B. C. Jain, “Compound growth rate (CGR) of area, production and productivity of papaya in Raipur district of Chhattisgarh,” International Journal of Agriculture, Environment and Biotechnology, vol. 6, no. 1, pp. 139–143, 2013.
- D. Balanagammal, C. R. Ranganathan, and R. Sundaresan, “Forecasting of agricultural scenario in Tamil Nadu—a time series analysis,” Journal of the Indian Society of Agricultural Statistics, vol. 53, no. 3, pp. 273–286, 2000.
- P. Balasubramanian and P. Dhanavanthan, “Seasonal modeling and forecasting of crop production,” Statistics and Applications, vol. 4, no. 2, pp. 107–118, 2002.
- N. Saeed, A. Saeed, M. Zakria, and T. M. Bajwa, “Forecasting of wheat production in Pakistan using ARIMA models,” International Journal of Agriculture and Biology, vol. 2, no. 4, pp. 352–353, 2000.
- V. K. Boken, “Forecasting spring wheat yield using time series analysis: a case study for the Canadian prairies,” Agronomy Journal, vol. 92, no. 6, pp. 1047–1053, 2000.
- R. Indira and A. Datta, “Univariate forecasting of state-level agricultural production,” Economic and Political Weekly, vol. 38, no. 18, pp. 1800–1803, 2003.
- K. P. Chandran and Prajneshu, “Nonparametric regression with jump points methodology for describing country's oilseed yield data,” Journal of the Indian Society of Agricultural Statistics, vol. 59, no. 2, pp. 126–130, 2005.
- K. K. Suresh and S. R. K. Priya, “Forecasting sugarcane yield of Tamil Nadu using ARIMA models,” Sugar Tech, vol. 13, no. 1, pp. 23–26, 2011.
- Sarika, M. A. Iquebal, and C. Chattopadhyay, “Modelling and forecasting of pigeonpea (Cajanus cajan) production using autoregressive integrated moving average methodology,” Indian Journal of Agricultural Sciences, vol. 81, no. 6, pp. 520–523, 2011.
- H. B. Mann, “Nonparametric tests against trend,” Econometrica, vol. 13, pp. 245–259, 1945.
- M. G. Kendall, Rank Correlation Methods, Charles Griffin, London, UK, 1975.
- R. M. Hirsch and J. R. Slack, “Nonparametric trend test for seasonal data with serial dependence,” Water Resources Research, vol. 20, no. 6, pp. 727–732, 1984.
- T. Y. Gan, “Hydroclimatic trends and possible climatic warming in the Canadian Prairies,” Water Resources Research, vol. 34, no. 11, pp. 3009–3015, 1998.
- L. Dou, M. Huang, and Y. Hong, “Statistical assessment of the impact of conservation measures on streamflow responses in a watershed of the Loess Plateau, China,” Water Resources Management, vol. 23, no. 10, pp. 1935–1949, 2009.
- G. E. P. Box and G. M. Jenkins, Time Series Analysis: Forecasting and Control, Holden-Day, San Francisco, Calif, USA, 1970.
- G. E. P. Box and G. M. Jenkins, Time Series Analysis: Forecasting and Control, Prentice-Hall, Englewood Cliffs, NJ, USA, 1976.
- H. Akaike, “A new look at the statistical model identification,” IEEE Transactions on Automatic Control, vol. 19, no. 6, pp. 716–723, 1974.
- H. Akaike, “Likelihood and the Bayes procedure,” in Bayesian Statistics, J. M. Bernardo, M. H. DeGroot, D. V. Lindley et al., Eds., pp. 143–166, University Press, Valencia, Spain, 1980.
Copyright © 2014 Rahul Tripathi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.