Legume Research

Legume Research, volume 45, issue 4 (April 2022): 454-461

​Modelling and Forecasting of Pulses Production in South Asian Countries and its Role in Nutritional Security

Yashpal Singh Raghav1, Pradeep Mishra2, Khder Mohammed Alakkari3, Monika Singh4, Abdullah Mohammad Ghazi Al Khatib5, Ritisha Balloo6
1Department of Mathematics, Faculty of Science, Jazan University, KSA.
2College of Agriculture, Jawaharlal Nehru Krishi Vishwa Vidyalaya, Powarkheda-461 110, Madhya Pradesh, India.
3Department of Statistics and Programming, Faculty of Economics University of Tishreen, Lattakia, Syria.
4Sugarcane Research Station, Jawaharlal Nehru Krishi Vishwa Vidyalaya, Bohani-487 555, Narsinghpur, Madhya Pradesh, India.
5Department of Banking and Insurance, Faculty of Economics, Damascus University, Syria.
6Department of Law and Management, University of Mauritius, Reduit, Mauritius.
  • Submitted 29-07-2021

  • Accepted 04-01-2022

  • First Online 23-02-2022

  • doi 10.18805/LRF-645

Cite article: Raghav, Y.S., Mishra, P., Alakkari, K.M., Singh, M., Al Khatib, A.M.G. and Balloo, R. (2022). Modelling and Forecasting of Pulses Production in South Asian Countries and its Role in Nutritional Security. Legume Research. 45(4): 454-461. doi: 10.18805/LRF-645.
Background: The goal of this study was to forecast pulse production for 2020-2027 in six countries (Afghanistan, Bangladesh, China, India, Nepal and Pakistan) using time series forecasting models.

Methods: The data series were divided into a training set (1961 to 2015) for model building and a testing set (2016 to 2019) for validation. The candidate models were compared and, after the best model was selected, it was used to forecast production from 2020 to 2027. The best-fit model was chosen based on the minimum ME, RMSE, MAE, MPE, MAPE, MASE and ACF1 values on the training data set and the minimum MAPE value on the testing data set.

Result: The best-fitted model for India was NNAR (1,1) and, similarly, the best-fit model for Afghanistan was NNAR (3,2). For Bangladesh, ARIMA (0,1,0) and ETS (A, N, N) performed equally well. The best-fit models for China, Nepal and Pakistan were ARIMA (0,1,1), ARIMA (1,1,0) and ETS (M, N, N), respectively. The best models predict that the production of pulses in Afghanistan, China and India will increase until 2027. India will continue to be the largest producer of pulses among the six countries, with production expected to reach 1088.778 thousand tons in 2027, a growth rate of 15.73 per cent from 2020 to 2027. Afghanistan and China have high growth rates of 25.19% and 11.95%, respectively, while the rest of the countries have relatively stable production volumes. These results may be crucial for developing an effective agriculture production policy, whether by providing forecasted production values or by evaluating such policies.
Pulses are rich in protein and are among the major food crops globally. They are also an important crop in India, where they account for a large part of exports and so contribute substantially to financial gains. The major pulses grown include chickpeas, pigeon peas, moong beans, black beans, lentils, peas and various other kinds of beans. Protein makes up 20 to 25% of pulses by weight, a larger share than in wheat and rice, and pulses are a major and integral source of vegetarian protein in the Indian diet. India produces the largest amount of pulses globally, yet nutritional and food security issues persist because agricultural growth has been imbalanced and concentrated on rice and wheat production. Pulse production has increased in recent decades, but not in proportion to population growth. Furthermore, consumption of pulses has declined, which contributes to malnutrition. This could be overcome through institutional and policy support, the adoption of high-yielding varieties of pulses, low-cost technologies and proper marketing for pulses (Shalendra et al., 2013; Singh et al., 2007; Bera and Nandi, 2011). Research and development on pulses has received little attention from both international and private multinational corporations, in addition to other economic and physical constraints (Reddy, 2010; Joshi and Saxena, 2002; Vani and Mishra, 2019).
There has been strong demand for pulses over the past two decades, which has contributed to a positive trade in pulses. In countries such as the U.S., Canada and Australia, technological advances combined with good prices mean that farmers must achieve better yields for pulses to compete effectively in terms of returns with other important crops. This expansion of pulse production is linked to the increasing importance of the commodity in international trade. However, production fluctuations in importing countries are significant, which creates uncertainty for pulse-exporting countries: farmers in exporting countries are unsure whether to plant pulses because of unstable foreign demand (Belhassen and Rawal, 2018). In a study by Vishwajith et al. (2019) on forecasting mung production, ARIMA (4,1,4) was the best-fitted model, ahead of the ARIMAX and GARCH models. In contrast, Ray and Bhattacharyya (2020) found the ARIMAX (1,1,1) model better suited to pulse production than the ARIMA model. Price prediction is an important tool for forecasting market prices, which is necessary for framing policies for sustained production and remunerative prices (Darekar and Reddy, 2017). Savadatti (2017) studied trends in the area, production and productivity of pulses in India using ARIMA models and observed stagnation in area alongside increasing production and productivity. Many other studies have employed the ARIMA model for forecasting, e.g., cotton crop production and yield in Pakistan (Ali et al., 2015), the area, production and productivity of different crops (Balanagammal et al., 2000), pulses in South Asian countries (Mishra et al., 2021) and sugarcane in India (Mishra et al., 2021). Vishwajith et al. (2018) could not establish the superiority of either GARCH or ARIMA in modelling arhar production in India.
In this research, we aim to predict the production of pulses in six countries (Afghanistan, Bangladesh, China, India, Nepal and Pakistan). These countries were selected from the South Asian region on the basis of their higher production. The study uses annual data for the period 1961-2019 from www.fao.org. To forecast pulses production up to the year 2027, we use three types of models, Autoregressive Integrated Moving Average (ARIMA), nonlinear autoregressive neural network (NNAR) and Exponential Smoothing (ETS), and compare their results. We use data from 1961-2015 for estimation (training) and data from 2016-2019 to validate the models. Before that, our methodology goes through several stages:
Data exploration
To visualize the features of the data (patterns, unusual observations, changes over time), we plot the series and then summarize them through descriptive statistics and a test of normality based on the Jarque-Bera statistic:
$$JB = \frac{n}{6}\left(S^{2} + \frac{(K-3)^{2}}{4}\right)$$
n: number of observations, S: skewness, K: kurtosis.
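The normality check reported later in Table 1 relies on the Jarque-Bera probability. As a minimal sketch, assuming the standard form JB = n/6 · (S² + (K − 3)²/4) and its asymptotic chi-square distribution with 2 degrees of freedom, the statistic can be computed from n, S and K as follows. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math

def moments(values):
    """Return (n, skewness S, kurtosis K) using population (biased) moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return n, m3 / m2 ** 1.5, m4 / m2 ** 2

def jarque_bera(values):
    """Return (JB statistic, asymptotic p-value)."""
    n, s, k = moments(values)
    jb = n / 6.0 * (s ** 2 + (k - 3.0) ** 2 / 4.0)
    # For a chi-square variable with 2 degrees of freedom, P(X > x) = exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

# Hypothetical yearly production values, for illustration only.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
jb_stat, jb_prob = jarque_bera(sample)
print(jb_stat, jb_prob)
```

A probability below 0.05 would lead to rejecting normality at the 5% level. With series as short as 59 annual observations, the asymptotic p-value is only an approximation.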
Data stationarity
A time series with a trend or changing volatility is not stationary: its statistical properties depend on the time at which the series is observed, so it cannot be reliably predicted in the long run. One way to determine whether a time series is stationary is to use a unit root test. In the Augmented Dickey-Fuller (ADF) test, the time series is described by the equation (Dickey and Fuller, 1981):
$$\Delta y_t = c + \alpha t + \delta y_{t-1} + \sum_{i=1}^{p} \phi_i \Delta y_{t-i} + \varepsilon_t$$
c: constant, α: coefficient on a time trend, p: lag order of the autoregressive process, φ_i: coefficients on the lagged differences, ε_t: error term. The ADF test is carried out under the null hypothesis δ = 0 (not stationary) against the alternative δ < 0 (stationary). If the null hypothesis is not rejected, we take the first difference to make the series stationary:
$$\Delta y_t = y_t - y_{t-1}$$
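To make the role of δ and of first differencing concrete, here is a small Python sketch. It is a simplification, not the full ADF test: it estimates δ by ordinary least squares in the regression Δy_t = c + δ·y_{t-1} + e_t, with no time trend and no lagged differences, and it does not compute the test statistic or its critical values. The function names and example series are hypothetical.

```python
def first_difference(series):
    """Return the first-differenced series y_t - y_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]

def dickey_fuller_delta(series):
    """OLS estimate of delta in: diff(y)_t = c + delta * y_{t-1} + e_t."""
    dy = first_difference(series)
    lagged = series[:-1]
    n = len(dy)
    mean_x = sum(lagged) / n
    mean_y = sum(dy) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, dy))
    return sxy / sxx

# Hypothetical series: a steadily trending one and a mean-reverting one.
trending = [1.0, 2.0, 3.0, 4.0, 5.0]
reverting = [0.0, 10.0, 15.0, 17.5, 18.75, 19.375]

print(first_difference(trending))      # differencing removes the linear trend
print(dickey_fuller_delta(trending))   # delta at zero: no pull back to a mean
print(dickey_fuller_delta(reverting))  # negative delta: the series reverts
```

In practice the decision is based on the t-ratio of δ compared with Dickey-Fuller critical values, which a statistics package would supply.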
Estimation of models
To forecast pulses production up to the year 2027, the following three types of models are used:
ARIMA model
ARIMA models are the most widely used statistical models for time series forecasting; they work by describing the autocorrelation in the data (Box et al., 2015). As their name (Autoregressive Integrated Moving Average) indicates, these models have three parts, with orders (p, d, q).
Autoregressive (p) refers to predicting a variable using a linear combination of its preceding values. The model of order p can be written as:
$$y_t = \beta_1 y_{t-1} + \beta_2 y_{t-2} + \dots + \beta_p y_{t-p} + \varepsilon_t$$
β_1, ..., β_p: parameters of the model, p: lag order of the autoregressive process, ε_t: error term. Integrated (d) refers to the order of differencing needed to make the variable stationary, which is determined using the ADF test.
Moving average (q) uses past forecast errors in a regression. The equation takes the form:
$$y_t = \varepsilon_t + \beta_1 \varepsilon_{t-1} + \beta_2 \varepsilon_{t-2} + \dots + \beta_q \varepsilon_{t-q}$$
β_1, ..., β_q: parameters of the model, q: lag order of the moving average, ε_t: error term.
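Whereas (d) is determined by the ADF test, (p) and (q) are determined using the autocorrelation function r(p) and the partial autocorrelation function R(p). The sample autocorrelation at lag p is given by:
$$r(p) = \frac{\sum_{t=p+1}^{n} (y_t - \bar{y})(y_{t-p} - \bar{y})}{\sum_{t=1}^{n} (y_t - \bar{y})^{2}}$$
and R(p) is the correlation between y_t and y_{t-p} after the linear effect of the intermediate lags has been removed.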
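As an illustration of the autocorrelation function used for order selection (and of the ACF1 residual diagnostic mentioned among the selection criteria), the following Python sketch computes sample autocorrelations under the usual definition, in which the lagged cross-products are divided by the total sum of squared deviations. The function name and the example series are hypothetical.

```python
def acf(series, max_lag):
    """Sample autocorrelations for lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((y - mean) ** 2 for y in series)
    result = []
    for lag in range(1, max_lag + 1):
        num = sum((series[t] - mean) * (series[t - lag] - mean)
                  for t in range(lag, n))
        result.append(num / denom)
    return result

# Hypothetical series, for illustration only.
print(acf([1.0, 2.0, 3.0, 4.0, 5.0], 2))
```

Applied to model residuals, the first value returned corresponds to ACF1; values close to zero suggest that little autocorrelation is left unexplained.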

NNAR model
Neural network autoregressive (NNAR) models are statistical models that allow complex nonlinear relationships when predicting a variable from its lagged values, which are used as inputs to a neural network. This method was suggested previously by Hyndman et al. (2012). These models are distinguished from ARIMA models by the presence of a hidden layer, in which a weighted linear combination of the inputs is modified by a nonlinear function before it is output. The inputs to hidden node j are first combined linearly:
$$z_j = b_j + \sum_{i=1}^{p} \omega_{i,j}\, y_{t-i}$$
In the hidden layer, this is modified using a nonlinear function such as the sigmoid:
$$s(z_j) = \frac{1}{1 + e^{-z_j}}$$

The parameters b_j and ω_{i,j} of the model are learned from the data. This model can be written as NNAR(p, k), with p lagged inputs and k nodes in the hidden layer: a neural network that uses the observations (y_{t-1}, ..., y_{t-p}) as inputs for forecasting the output y_t, with k neurons in the hidden layer. The effect of seasonality is neglected because the data are annual. As with (p, q) in the ARIMA model, the optimal number of lags p is chosen using the Akaike information criterion (AIC), which is given as follows:
$$AIC = 2m - 2\ln(L)$$
m: number of estimated parameters, L: maximum value of the likelihood function. Note that this model imposes no stationarity restriction, and therefore the random part is included in the predictions (Lama et al., 2021).
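The hidden-layer computation described above can be sketched in Python as a single forward pass of an NNAR(p, k) network. This is an illustration under stated assumptions rather than the fitted models of the study: the hidden layer uses a sigmoid, the output layer is assumed to be linear, the inputs are assumed to be already scaled, and all weights shown are hypothetical values rather than estimates.

```python
import math

def sigmoid(z):
    """Logistic activation applied in the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))

def nnar_forecast(lags, hidden_weights, hidden_biases, output_weights, output_bias):
    """One-step forecast of an NNAR(p, k) network with given weights.

    lags:           [y_{t-1}, ..., y_{t-p}] (p inputs)
    hidden_weights: k lists of p weights, one list per hidden node
    hidden_biases:  k biases, one per hidden node
    output_weights: k weights of the (assumed linear) output layer
    """
    hidden = []
    for weights, bias in zip(hidden_weights, hidden_biases):
        z = bias + sum(w * x for w, x in zip(weights, lags))
        hidden.append(sigmoid(z))
    return output_bias + sum(w * h for w, h in zip(output_weights, hidden))

# Hypothetical NNAR(1,1): one lagged input and one hidden node.
prediction = nnar_forecast(
    lags=[0.0],
    hidden_weights=[[2.0]],
    hidden_biases=[0.0],
    output_weights=[4.0],
    output_bias=1.0,
)
print(prediction)
```

Multi-step forecasts are obtained by feeding each prediction back in as the newest lag, which is why forecast uncertainty grows with the horizon.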
ETS model
Whereas ARIMA models describe the autocorrelation in the data, exponential smoothing (ETS) models are based on describing the trend in the data, as suggested by Holt (1957) and Winters (1960); see also Mishra et al. (2021) and Devi et al. (2021).
ETS models are a systematic development in which exponential smoothing methods are combined into a nonlinear dynamic model. These models are analysed using state-space-based likelihood calculations, with support for model selection and calculation of forecast standard errors (Hyndman et al., 2002).
The model considers three main components of a time series: trend (T), seasonal (S) and error (E). The trend term reflects the long-term movement of the series and the error term is its unpredictable component. In our case, we ignore the seasonal term because the data are annual. The components we need are combined in various additive and multiplicative combinations to produce y_t: an additive model y_t = T + E or a multiplicative model y_t = T × E, where the options for the individual components are as follows:
E [A, M]
T [N, A, M, AD, MD]
S [N, A, M]
N = None, A = Additive, M = Multiplicative, AD = Additive dampened and MD = Multiplicative dampened (damping uses an additional parameter to reduce the impacts of the trend over time). The models that we are interested in estimating can be written (after selecting S [N]) in the following Table A:

Table A: State space equations for each of the models in the Holt’s nonlinear.
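As a concrete instance of the simplest case, the recursion behind an ETS(A, N, N) model (additive error, no trend, no season), also known as simple exponential smoothing, can be sketched in Python as follows. The smoothing parameter alpha, the initial level and the data are hypothetical; in practice both parameters are estimated by maximizing the likelihood.

```python
def ets_ann(series, alpha, initial_level):
    """Run the ETS(A,N,N) recursion; return (one-step fitted values, final level)."""
    level = initial_level
    fitted = []
    for y in series:
        fitted.append(level)            # forecast made before y is observed
        error = y - level               # additive one-step error
        level = level + alpha * error   # level update
    return fitted, level

def forecast_ann(final_level, horizon):
    """Point forecasts are flat at the last level when there is no trend."""
    return [final_level] * horizon

# Hypothetical production values, for illustration only.
fitted, last_level = ets_ann([10.0, 12.0, 11.0], alpha=0.5, initial_level=10.0)
print(fitted, last_level)
print(forecast_ann(last_level, 3))
```

Because there is no trend component, the point forecasts are constant over the horizon, which is consistent with the relatively stable production volumes forecast for the countries where an (N, N) model was selected. The multiplicative-error variant ETS(M, N, N) yields the same point forecasts for given alpha and initial level, but different prediction intervals.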

Performance indicators
To compare the prediction performance of the three models, we first test the validity of each model by calculating the mean absolute percentage error (MAPE) between the estimated data and the actual data during the testing period (2016-2019):
$$MAPE = \frac{100}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right|$$
Then we evaluate the performance of the model by calculating the root mean square error (RMSE) and MAPE between the estimated data and the actual data during the training period (1961-2015):
$$RMSE = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^{2}}$$

ŷ_t: the forecast value, y_t: the actual value, n: number of fitted observations. The last stage is to predict pulses production for the countries in the study sample until 2027. The model with the lowest RMSE and MAPE values is the best, and the uncertainty of the forecasts is expressed through a 95% prediction interval, given by (Mishra et al., 2021):
$$\hat{y}_{T+h|T} \pm Z_{\alpha/2}\,\sigma_h$$

ŷ_{T+h|T}: forecast of the observation at T + h, Z_{α/2} = 1.96, σ_h: standard deviation of the h-step forecast distribution.
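The performance indicators and the 95% prediction interval can be sketched in Python as follows, assuming the standard definitions of RMSE and MAPE and an interval of the form point forecast ± 1.96 × forecast standard deviation. The function names and the numbers are hypothetical and are not results from the study.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(actual)

def prediction_interval(point_forecast, sigma_h, z=1.96):
    """Symmetric interval around the point forecast; z = 1.96 gives about 95%."""
    return point_forecast - z * sigma_h, point_forecast + z * sigma_h

# Hypothetical actual and predicted values, for illustration only.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
print(rmse(actual, predicted), mape(actual, predicted))
print(prediction_interval(100.0, 10.0))
```

Computing the same two functions separately on the training years and on the testing years gives the in-sample and out-of-sample comparison used to select the best model for each country.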
Empirical study
To show the development and trends in pulses production for Afghanistan, Bangladesh, China, India, Nepal and Pakistan, we present the following Fig 1.

Fig 1: Development of pulses production in six countries during the period 1961-2019.

Fig 1 shows that both China and Nepal achieved significant, almost exponential, growth in pulses production up to 2019. Afghanistan's production grew until 2015 and decreased thereafter. Bangladesh and Pakistan followed a similar path: production rose until 1995 and then declined, resembling a second-order polynomial. For India, production was highly volatile during the studied period and shows a downward trend. Descriptive statistics summarizing the most important changes in the data are given in Table 1.

Table 1: Normal distribution and descriptive statistics for pulses production in six countries during the period 1961-2019.

Table 1 shows that the data for Afghanistan, China, Nepal and Pakistan are not normally distributed, as the Jarque-Bera probability is below the 5% significance level. The mean and standard deviation are therefore unreliable summaries for these countries, because both estimators have a low breakdown point and are sensitive to extreme values. These countries experienced significant changes in pulses production during the studied period, as shown by the maximum and minimum values and by the skewness and kurtosis coefficients. Pakistan shows the greatest change: production reached 414.4 thousand tons in 1989 before it decreased to 80.7 thousand tons in 2014. Table 1 also shows that the data for India and Bangladesh are normally distributed. Production in India, the largest producer among the six countries, changed from 1506 thousand tons in 1991 to 535.2 thousand tons in 1992. To find out the effect of this volatility and these trends on the stationarity of the variables, we use the ADF test; the results are given in Table 2.

Table 2: ADF test result.

Table 2 shows that the series for Afghanistan and India are stationary in levels, around a linear trend for Afghanistan and around a constant for India, as shown in Fig 1. For the rest of the countries, the high volatility during the studied period means that the series become stationary only at the first difference. To forecast pulses production for the six countries up to 2027, we use the ARIMA, NNAR and ETS models; data from 1961-2015 are used to estimate the models (training) and data from 2016-2019 to verify their validity (testing). The following table shows the results of estimating the ARIMA model for the six countries:
Table 3 shows that all selected models have better out-of-sample prediction results, as the value of MAPE (testing) is less than that of MAPE (training) for all models except Afghanistan, where the model failed to predict the lower values after 2015. The best model is ARIMA (1,1,0) for Nepal, which achieved the lowest values of AIC, RMSE and MAPE among the selected models. The table also shows that there is no autocorrelation problem in the residuals of any model. We estimate the NNAR model for all countries and get the following results:

Table 3: Estimation of ARIMA models for six countries.

Table 4 shows that prediction using NNAR models is better than prediction using ARIMA within the sample (training), but this good performance decreases outside the sample. The best of the selected models is NNAR (1,1) for India, which is better than the corresponding ARIMA model (Mishra et al., 2020). We estimate the ETS model for all countries and get the following results:

Table 4: The results of estimating NNAR models for six countries.

Table 5 shows that the best ETS model among the estimated models is (M,N,N) for Nepal, which has the lowest values of RMSE and MAPE both in and out of the sample, similar to the results of the ARIMA model. Among all selected models, we choose the best model for forecasting pulses production to 2027 for each country:

Table 5: The results of estimating ETS models for six countries.

According to these models (Table 6), a forecast of pulses production for 2027 is obtained with the inclusion of uncertainty in the forecast, as shown in the following Fig 2.

Table 6: The best model for forecasting pulses production to 2027 for each country.


Fig 2: Pulses production forecast for six countries to 2027.

Fig 2 shows that pulses production in Afghanistan, China and India is expected to increase until 2027, with relative stability in the volume of production for the rest of the countries. The following table shows the expected production quantities for the six countries until 2027 with 95% prediction intervals (Table 7).

Table 7: The expected production quantities for the six countries until 2027 with 95% prediction intervals.

Table 7 shows that India is the largest producer of pulses among the six countries, with production expected to reach 1088.778 thousand tons in 2027, a growth rate of 15.73% during the period 2020-2027. In addition, Afghanistan and China have high growth rates of 25.19% and 11.95%, respectively.
Pulses have long been considered an accessible source of protein for the poor. Comparing the ARIMA, NNAR and ETS models, this study shows that no single model is suitable for all countries. The best-fitted model was selected based on the minimum values of ME, RMSE, MAE, MPE, MAPE, MASE and ACF1 among the chosen models on the training data set and the minimum value of MAPE on the testing data set. The NNAR (1,1) model was the best-fitted model for India and, similarly, NNAR (3,2) was the best-fit model for Afghanistan. There were two equally good models for Bangladesh: ARIMA (0,1,0) and ETS (A, N, N). The best-fit models for China, Nepal and Pakistan were ARIMA (0,1,1), ARIMA (1,1,0) and ETS (M, N, N), respectively. According to the best models, pulse production in Afghanistan, China and India will increase until 2027. India will remain the largest producer of pulses among the six countries, with production expected to reach 1088.778 thousand tons in 2027, a growth rate of 15.73% between 2020 and 2027. Afghanistan and China have high growth rates of 25.19% and 11.95%, respectively, while the volume of production in the rest of the countries remains relatively stable. These results may be vital for building an effective agriculture production policy, whether by providing awareness of the forecasted production values, by evaluating such policies or by forecasting the food gap for pulse crops. Policy makers, especially in Pakistan, Nepal and Bangladesh, need to make a number of important changes to increase the production of pulses. Further research on the determinants of pulse production in these countries would be worthwhile.

  1. Ali, S., Badar, N., and Fatima, H. (2015). Forecasting production and yield of sugarcane and cotton crops of Pakistan for 2013-2030. Sarhad Journal of Agriculture. 31(1): 1-10.

  2. Balanagammal, D., Ranganathan, C.R., Sundaresan, K. (2000). Forecasting of agricultural scenario in Tamil Nadu: A time series analysis. Journal of Indian Society of Agricultural Statistics. 53(3): 273-286.

  3. Boubaker Ben-Belhassen and Rawal, V. (2018). Global Perspective on Pulse Production-Pulse Pod. Jawaharlal Nehru University, New Delhi. https://pulsepod.globalpulses.com/pod-feed/post/a-global-perspective-on-pulse-production.

  4. Box, G.E.P., Jenkins, G.M., Reinsel, G.C. and Ljung, G.M. (2015). Time Series Analysis: Forecasting and Control (5th ed). Hoboken, New Jersey: John Wiley and Sons.

  5. Bera, B.K. and Nandi, A.K. (2011). Variability in Pulses Production of West Bengal. Economic Affairs. 56(2): 197-202.

  6. Devi, M., Kumar, J., Malik, D.P. and Mishra, P. (2021). Forecasting of wheat production in Haryana using hybrid time series model. Journal of Agriculture and Food Research. 100175.

  7. Devi, M., Rahman, U.H., Weerasinghe, W.P.M.C.N., Mishra, P., Tiwari, S. and Karakaya, K. (2021). Future milk production prospects in India for various animal species using time series models. Indian Journal of Animal Research. DOI:10.18805/IJAR.B-4409.

  8. Dickey, D. and Fuller, W. (1981). Likelihood Ratio Statistics for Autoregressive Time Series with a Unit Root. Econometrica. 49: 1057-1072.

  9. Darekar, A. and Reddy, A.A. (2017). Cotton price forecasting in major producing states. Economic Affairs. 62(3): 373-378.

  10. Holt, C.E. (1957). Forecasting Seasonals and Trends by Exponentially weighted averages (O.N.R. Memorandum No. 52). Carnegie Institute of Technology, Pittsburgh USA.

  11. Hyndman, R.J., Koehler, A.B., Snyder, R.D. and Grose, S. (2002). A state space framework for automatic forecasting using exponential smoothing methods. International Journal of Forecasting. 18(3): 439-454.

  12. Joshi, P. K. and Saxena, R. (2002). A profile of pulses production in India: Facts, trends and opportunities. Indian Journal of Agricultural Economics. 57(3): 326-339.

  13. Lama, A., Singh, K.N., Singh, H., Shekhawat, R., Mishra, P. and Gurung, B. (2021). Forecasting monthly rainfall of Sub-Himalayan region of India using parametric and non-parametric modelling approaches. Modeling Earth Systems and Environment. 1-9.

  14. Mishra, P., Al Khatib, A.M.G., Sardar, I. et al. (2021). Modeling and forecasting of sugarcane production in India. Sugar Tech. 23: 1317-1324. https://doi.org/10.1007/s12355-021-01004.

  15. Mishra, P., Fatih, C., Niranjan, H. K., Tiwari, S., Devi, M. and Dubey, A. (2020). Modelling and forecasting of milk production in Chhattisgarh and India. Indian Journal of Animal Research. 54(7): 912-917.

  16. Mishra, P., Yonar, A., Yonar, H., Kumari, B., Abotaleb, M., Das, S. S. and Patil, S.G. (2021). State of the art in total pulse production in major states of India using ARIMA techniques. Current Research in Food Science. 4: 800-806.

  17. Reddy, A. and Reddy, G.P. (2010). Supply side constraints in production of pulses in India: A case study of Lentil. Agricultural Economics Research Review. 23(1): 129-136.

  18. Ray, S. and Bhattacharyya, B. (2020). Time series modeling and forecasting on pulses production behavior of India. Indian Journal of Ecology. 47(4): 1140-1149.

  19. Savadatti, P.M. (2017). Trend and forecasting analysis of area, production and productivity of total pulses in India. Indian Journal of Economics and Development. 5(12): 1-10.

  20. Shalendra, G., K.C., Sharma, Purushottam, Patil, S.M. (2013). Role of pulses in the food and nutritional security in India. Journal of Food Legumes. 26 (3 and 4): 124-129.

  21. Singh, J., Kishor, R. and Singh, S.P. (2007). Trends of pulses production in planned economy of India. Agricultural Economics Research Review (Conference Issue). 20: 584.

  22. Singh, S.K., Riyajuddeen, Ojha, Shankar, V. and Yadav, S. (2013). Area expansion under improved varieties of lentil through participatory seed production programme in Ballia district of Uttar Pradesh. Journal of Food Legumes. 26(3 and 4): 115-119.

  23. Vani, G.K. and Mishra, P. (2019). Impact of irrigation on pulses production in India: A time-series study. Legume Research- An International Journal. 42(6): 806-811.

  24. Vishwajith, K.P., Sahu, P.K., Mishra, P., Dhekale, B.S. and Singh, R.B. (2018). Modelling and forecasting of arhar production in India. International Journal of Agricultural and Statistical Sciences. 14(1): 73-86.

  25. Vishwajith, K.P., Sahu, P.K., Mishra, P., Devi, M., Dubey, A., Singh, R.B., Dhekale, B.S., Fatih, C. and S. (2019). Modelling and forecasting of mung production in India. Current Journal of Applied Science and Technology. 34(1): 1-19. https://doi.org/10.9734/cjast/2019/v34i130118,

  26. Winters, P.R. (1960). Forecasting sales by exponentially weighted moving averages. Management Science. 6(3): 324-342.
