Descriptive statistics (mean, median, mode and skewness) are reported in Table 1. The area under arecanut increased by 558 per cent from 1960 to 2020, while production increased by 1160 per cent over the same period. Because production grew faster than area, productivity also improved: it doubled during the study period, an increase of more than 100 per cent. The positive skewness of area, production and productivity suggests scope for further improvement in production. The negative kurtosis value for productivity indicates a platykurtic distribution, flatter than the normal curve, which points to relatively stable productivity throughout the study period.
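The summary measures discussed above can be computed as in the following minimal sketch. The function names and the sample values are illustrative assumptions and are not the Table 1 data; the moment-based (population) formulas are one common convention and may differ slightly from the software used in the study.

```python
def percent_change(first, last):
    """Percentage change between the first and the last observation."""
    return (last - first) / first * 100.0


def _central_moment(xs, order):
    """Population central moment of the given order."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** order for x in xs) / len(xs)


def skewness(xs):
    """Moment coefficient of skewness; positive means a longer right tail."""
    return _central_moment(xs, 3) / _central_moment(xs, 2) ** 1.5


def excess_kurtosis(xs):
    """Kurtosis minus 3; negative values mean a flatter-than-normal curve."""
    return _central_moment(xs, 4) / _central_moment(xs, 2) ** 2 - 3.0


# Hypothetical series, for illustration only (not the study data).
sample = [1.0, 1.0, 1.0, 10.0]
print(percent_change(100.0, 658.0))  # growth from a base of 100 to 658
print(skewness(sample), excess_kurtosis(sample))
```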
Having summarised arecanut area, production and productivity through the descriptive statistics in Table 1, the next step is model specification, validation and forecasting of area, production and productivity using the time series data. For projection, ARIMA, Holt’s linear and Holt’s exponential models were estimated and compared. Model selection among these for area, production and productivity was based on goodness-of-fit criteria, namely the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). The candidate p and q values for the ARIMA model were determined from the autocorrelation function (ACF) and partial autocorrelation function (PACF) charts (Figs. 2 and 3).
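As an illustration of how ACF and PACF values of the kind plotted in such charts can be obtained, the sketch below computes sample autocorrelations and derives partial autocorrelations with the Durbin-Levinson recursion. The function names are assumptions made for this example, and the routine used in the study may apply different conventions.

```python
import math


def difference(xs, d=1):
    """Apply d rounds of first differencing."""
    for _ in range(d):
        xs = [b - a for a, b in zip(xs, xs[1:])]
    return xs


def acf(xs, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag."""
    n = len(xs)
    mean = sum(xs) / n
    c0 = sum((x - mean) ** 2 for x in xs)
    return [
        sum((xs[t] - mean) * (xs[t - k] - mean) for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


def pacf(xs, max_lag):
    """Partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(xs, max_lag)
    out, phi_prev = [1.0], []
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        out.append(phi_kk)
    return out


def significance_band(n):
    """Approximate 95 per cent band for a white-noise series of length n."""
    return 1.96 / math.sqrt(n)
```

Lags at which the PACF falls outside the band suggest candidate values of p, and lags at which the ACF does so suggest candidate values of q.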
The combination with the lowest AIC and BIC values was selected for model validation and forecasting. After estimating all possible combinations, the best model for area and production was Holt’s exponential model, which had the lowest AIC and BIC values (Table 2). However, we also considered ARIMA (2, 2, 1) for validating and forecasting area and production, as it had the lowest AIC and BIC values among all ARIMA combinations. For productivity, ARIMA (0, 1, 1) was the best model for forecasting, as it had the lowest AIC and BIC values (Table 2).
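The selection rule described above can be sketched as follows, using the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. The log-likelihoods, parameter counts and sample size in the example are hypothetical placeholders, not the values behind Table 2.

```python
import math


def aic(loglik, k):
    """Akaike information criterion for k estimated parameters."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * loglik


def best_model(candidates, n):
    """Return the candidate names with the lowest AIC and the lowest BIC.

    candidates maps a model name to (log-likelihood, number of parameters).
    """
    by_aic = min(candidates, key=lambda m: aic(*candidates[m]))
    by_bic = min(candidates, key=lambda m: bic(*candidates[m], n))
    return by_aic, by_bic


# Hypothetical fits, for illustration only.
fits = {
    "ARIMA(2,2,1)": (-310.0, 4),
    "ARIMA(1,2,1)": (-314.0, 3),
    "ARIMA(0,2,1)": (-320.0, 2),
}
print(best_model(fits, n=50))
```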
Nevertheless, in all three cases we proceeded with the ARIMA, Holt’s linear and Holt’s exponential models to obtain the best possible forecasts of arecanut area, production and productivity.
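For readers unfamiliar with the two Holt variants, the sketch below shows the usual textbook recursions: an additive trend for the linear model and a multiplicative trend for the exponential model. The smoothing parameters, the simple initialisation from the first two observations and the function names are assumptions for illustration; in practice the parameters are estimated from the data.

```python
def holt_linear(y, alpha, beta, horizon):
    """Holt's linear trend method: forecast = level + h * trend."""
    level, trend = y[0], y[1] - y[0]
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]


def holt_exponential(y, alpha, beta, horizon):
    """Holt's exponential trend method: forecast = level * growth ** h."""
    level, growth = y[0], y[1] / y[0]
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (prev_level * growth)
        growth = beta * (level / prev_level) + (1 - beta) * growth
    return [level * growth ** h for h in range(1, horizon + 1)]


# Hypothetical series, for illustration only.
print(holt_linear([1.0, 2.0, 3.0, 4.0, 5.0], 0.5, 0.5, 2))
print(holt_exponential([1.0, 2.0, 4.0, 8.0], 0.5, 0.5, 2))
```

The linear variant extrapolates a constant absolute increment per year, whereas the exponential variant extrapolates a constant growth rate, which is why the two can diverge noticeably over a five-year horizon.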
We estimated the errors of the forecasts on the testing data using well-established measures, namely root mean square error (RMSE), mean absolute percentage error (MAPE), mean absolute error (MAE) and mean square error (MSE), to identify the best model; the results are presented in Table 3.
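The four accuracy measures follow their standard definitions and can be computed as in this minimal sketch. The function name and the example values are assumptions for illustration and do not reproduce Table 3.

```python
import math


def forecast_errors(actual, predicted):
    """Return MSE, RMSE, MAE and MAPE (in per cent) for paired values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e ** 2 for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined when an actual value is zero.
    mape = sum(abs(e / a) for e, a in zip(errors, actual)) / n * 100.0
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "MAPE": mape}


# Hypothetical test-set values, for illustration only.
print(forecast_errors([100.0, 200.0], [110.0, 180.0]))
```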
For area, all four indicators suggest Holt’s exponential as the best-suited model for forecasting. For production, two indicators favour ARIMA (2, 2, 1) and the other two favour Holt’s exponential. For productivity, both ARIMA (0, 1, 1) and Holt’s linear model were found to fit best. The values predicted by the models are reported in Table 6. To further verify the models, we ran the Ljung-Box Q test on the model residuals. The test results indicate that the residuals are white noise and therefore free of autocorrelation. The detailed parameter estimates of the best-fit models and the Ljung-Box Q test results are presented in Table 4.
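The Ljung-Box statistic used for the residual check is Q = n(n + 2) Σ r_k² / (n - k), summed over lags k = 1 to h. The sketch below computes Q and compares it with a chi-square critical value supplied by the reader; the function names and the example residuals are assumptions for illustration, and the degrees of freedom for ARIMA residuals are usually taken as h minus the number of estimated AR and MA terms.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1 .. max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    c0 = sum((x - mean) ** 2 for x in residuals)
    q = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(
            (residuals[t] - mean) * (residuals[t - k] - mean)
            for t in range(k, n)
        ) / c0
        q += r_k ** 2 / (n - k)
    return n * (n + 2) * q


def looks_like_white_noise(residuals, max_lag, critical_value):
    """True when Q stays below the chi-square critical value."""
    return ljung_box_q(residuals, max_lag) < critical_value


# Hypothetical, strongly alternating residuals (clearly autocorrelated).
demo = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
# 3.841 is the 5 per cent chi-square critical value for 1 degree of freedom.
print(ljung_box_q(demo, 1), looks_like_white_noise(demo, 1, 3.841))
```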
The mathematical model for area prediction using ARIMA (2, 2, 1) is specified as:
$$Y_t = c - 0.032\,Y_{t-1} - 0.2416\,Y_{t-2} - 0.848\,e_{t-1} + e_t$$
Where,
$Y_t$ = Value of the time series at time t, after second-order differencing (d = 2).
c = Constant term or the intercept.
(-0.032) and (-0.2416) = Autoregressive (AR) coefficients for the lagged terms $Y_{t-1}$ and $Y_{t-2}$, respectively.
(-0.848) = Moving average (MA) coefficient for the lagged error term $e_{t-1}$.
$e_t$ = Error term at time t.
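To show how coefficients of this kind turn into a forecast, the sketch below produces a one-step-ahead ARIMA (2, 2, 1) forecast. It assumes that the AR and MA coefficients apply to the twice-differenced series, that the constant is zero unless supplied, and that the first residuals are initialised at zero; the function name and the input series are hypothetical, and fitted software will generally initialise residuals differently.

```python
def arima_221_one_step(y, phi1, phi2, theta1, c=0.0):
    """One-step-ahead forecast of y from an ARIMA(2, 2, 1) model."""
    if len(y) < 4:
        raise ValueError("need at least four observations")
    # Second difference: w_t = y_t - 2*y_{t-1} + y_{t-2}.
    w = [y[t] - 2 * y[t - 1] + y[t - 2] for t in range(2, len(y))]
    # Rebuild residuals recursively, starting from a zero residual.
    e_prev = 0.0
    for t in range(2, len(w)):
        fitted = c + phi1 * w[t - 1] + phi2 * w[t - 2] + theta1 * e_prev
        e_prev = w[t] - fitted
    # Forecast the next second difference, then undo the differencing.
    w_next = c + phi1 * w[-1] + phi2 * w[-2] + theta1 * e_prev
    return w_next + 2 * y[-1] - y[-2]


# Hypothetical series with the area coefficients quoted in the text.
series = [10.0, 12.0, 15.0, 19.0, 24.0, 30.0]
print(arima_221_one_step(series, -0.032, -0.2416, -0.848))
```

Forecasts for later years follow by appending each forecast to the series and setting the future error term to zero.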
The mathematical model for prediction of production using ARIMA (2, 2, 1) is specified as:
$$Y_t = c - 0.341\,Y_{t-1} - 0.332\,Y_{t-2} - 0.718\,e_{t-1} + e_t$$
Where,
$Y_t$ = Value of the time series at time t, after second-order differencing (d = 2).
c = Constant term or the intercept.
(-0.341) and (-0.332) = Autoregressive (AR) coefficients for the lagged terms $Y_{t-1}$ and $Y_{t-2}$, respectively.
(-0.718) = Moving average (MA) coefficient for the lagged error term $e_{t-1}$.
$e_t$ = Error term at time t.
The mathematical model for prediction of productivity using ARIMA (0, 1, 1) is specified as:
$$Y_t = c - 0.373\,e_{t-1} + e_t$$
Where,
$Y_t$ = The differenced value of the time series at time t.
c = Constant term or the intercept.
(-0.373) = Moving average (MA) coefficient for the lagged error term $e_{t-1}$.
$e_t$ = Error term at time t.
The selected models gave smaller forecast errors, and the difference in forecast accuracy among them was tested with the Diebold-Mariano (DM) test (Table 5). The DM test results show a significant difference between the predicted values of ARIMA and Holt’s exponential in the case of arecanut area.
No significant differences were found in the predicted values of production and productivity between ARIMA and Holt’s exponential or between ARIMA and Holt’s linear models, so either model gives forecasts with comparably low errors. The final forecasts for the next five years, from 2021 to 2025, are presented in Table 6.
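The forecast-accuracy comparison reported above can be illustrated with a simplified Diebold-Mariano statistic. The sketch assumes one-step-ahead forecasts, a squared-error loss and a normal approximation for the p-value; the function name and the error series are hypothetical, and the study may have used a different loss function or a small-sample correction.

```python
import math


def diebold_mariano(errors_a, errors_b):
    """DM statistic and two-sided p-value under squared-error loss.

    A positive statistic means model A has the larger average loss.
    """
    n = len(errors_a)
    # Loss differential between the two competing forecasts.
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    d_bar = sum(d) / n
    variance = sum((x - d_bar) ** 2 for x in d) / n
    if variance == 0.0:
        raise ValueError("loss differential has zero variance")
    stat = d_bar / math.sqrt(variance / n)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


# Hypothetical forecast errors from two models, for illustration only.
model_a = [2.0, 3.0, 2.0, 3.0, 2.0, 3.0]
model_b = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(diebold_mariano(model_a, model_b))
```

A p-value above the chosen significance level corresponds to the finding of no significant difference between two models, in which case either can reasonably be used for forecasting.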