Chief Editor: T. Mohapatra
Print ISSN 0367-8245
Online ISSN 0976-058X
NAAS Rating 5.20
Rainfall Probability Distribution and Forecasting Monthly Rainfall of Navsari using ARIMA Model
First Online 31-07-2021
Methods: The method of moments was used to determine the parameters of the distributions, and the chi-square test was used as the goodness of fit test to obtain the best fit distribution for monthly rainfall of Navsari, Gujarat, utilizing 36 years of rainfall data. The auto regressive integrated moving average (ARIMA) model, popular owing to its simplicity and ability to simulate various stochastic processes, was used in the study.
Result: The Weibull distribution was the best fit distribution for June and September, whereas Gumbel was the best fit distribution for July. For simulating monthly rainfall, the seasonal ARIMA model (0,0,1) (0,1,1)12 was found to be the appropriate model based on its performance. The model had the least root mean square error and its residuals were found to be uncorrelated.
The Kolmogorov-Smirnov, Anderson-Darling and Chi-square tests were used by Sharma and Singh (2010) to obtain the best fitting probability distribution for monthly, seasonal and annual rainfall of Pantnagar, and the lognormal and gamma distributions were found to be the best fit probability distributions for the annual and monsoon season periods, respectively. The Kolmogorov-Smirnov and Anderson-Darling goodness of fit tests were utilized by Mandal and Choudhury (2015) to obtain the probability distribution of rainfall at Sagar Island, located on the continental shelf of the Bay of Bengal, and normal distributions were found to be appropriate for the annual, post-monsoon and summer seasons. Trend analysis helps in assessing positive or negative trends in rainfall. Brema and Anie (2018) analysed the rainfall trend by the Mann-Kendall test for the Vamanapuram river basin, Kerala, using 30 years (1984-2013) of rainfall data. There was evidence of a rising trend for January, May, June, September, October and November, while a negative trend was observed in February, March, April, July, August and December.
A reliable rainfall forecast could be a boon to farmers, as many important decisions about sowing time and selection of crops are based on rainfall. Auto regressive integrated moving average (ARIMA) is a linear model, popular owing to its simplicity and ability to simulate various stochastic processes (Adhikari and Agrawal, 2013). The ARIMA model was utilized by Swain et al. (2018) for predicting monthly rainfall over Khordha district in Odisha, and they concluded that ARIMA (1, 2, 1) (1, 0, 1)12 was the best fit model.
This study was undertaken with the objectives of obtaining the best fit distribution of monthly rainfall (1984-2019) of Navsari from a set of continuous probability distributions, using chi-square as the goodness of fit test, and developing the best fit ARIMA model for monthly rainfall of Navsari.
MATERIALS AND METHODS
The normal distribution, also known as the Gaussian distribution, is one of the most frequently used distributions for modelling random phenomena. Any linear function of a normal random variable is also a normal random variable. The probability density function of the normal distribution is given by equation (1):
μ and σ are the mean and standard deviation of the distribution, which are also its location and scale parameters. The parameters of the distribution were determined using the method of moments, that is, from the sample mean and standard deviation.
Log normal distribution
Log-normal distribution is a transformed normal distribution where the variable is replaced by its logarithmic value. It has positive skewness which increases with its scale parameter. A random variable x is log-normally distributed if its probability density function is as shown by equation (2).
In which mn and sn are scale and shape parameters of the distribution respectively.
The scale and shape parameters are also the mean and variance of the variable ln x. The two parameters of this distribution can be obtained using method of moments using equations (3) and (4).
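The moment matching described above can be sketched in a few lines. This is a minimal illustration, not the authors' equations (3) and (4), which are not reproduced here: it assumes the common textbook relations that express the mean and standard deviation of ln x in terms of the sample mean and variance of x. The function name and the sample values are hypothetical.

```python
import math
import statistics


def lognormal_params_from_moments(rainfall):
    """Return (mu_n, sigma_n), the mean and standard deviation of ln(x).

    Assumed moment relations (not taken from the source text):
        sigma_n^2 = ln(1 + var / mean^2)
        mu_n      = ln(mean) - sigma_n^2 / 2
    """
    mean = statistics.fmean(rainfall)
    var = statistics.variance(rainfall)  # sample variance, n - 1 denominator
    sigma_n_sq = math.log(1.0 + var / mean ** 2)
    mu_n = math.log(mean) - 0.5 * sigma_n_sq
    return mu_n, math.sqrt(sigma_n_sq)


# Hypothetical monthly rainfall totals in mm, for illustration only
sample = [210.0, 455.5, 320.1, 150.3, 610.8, 275.0]
mu_n, sigma_n = lognormal_params_from_moments(sample)
print(mu_n, sigma_n)
```

An alternative that is sometimes used instead is to take the mean and standard deviation of the log-transformed observations directly; the two approaches generally give slightly different estimates.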
Gamma distribution is a flexible distribution with a wide variety of shapes. A random variable x follows gamma distribution if its probability density function is given by equation (5):
In which a and b are the shape and scale parameters of the distribution, respectively.
The method of moments was used to estimate the parameters of the distribution as given in equation (6).
μ and σ are the mean and standard deviation of the distribution, respectively.
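For the gamma distribution, matching the first two moments has a closed form. The sketch below assumes the standard relations mean = a·b and variance = a·b² for shape a and scale b; equation (6) itself is not reproduced here, so this should be read as the usual textbook form rather than the authors' exact expression. The function name and data are hypothetical.

```python
import statistics


def gamma_params_from_moments(rainfall):
    """Return (a, b): shape and scale of a gamma distribution.

    Assumes mean = a * b and variance = a * b**2, which gives
        a = mean^2 / variance,  b = variance / mean.
    """
    mean = statistics.fmean(rainfall)
    var = statistics.variance(rainfall)  # sample variance, n - 1 denominator
    shape = mean ** 2 / var
    scale = var / mean
    return shape, scale


# Hypothetical monthly rainfall totals in mm, for illustration only
sample = [210.0, 455.5, 320.1, 150.3, 610.8, 275.0]
a, b = gamma_params_from_moments(sample)
print(a, b)
```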
The Gumbel distribution is the extreme value type I distribution, in which the parent distribution is unbounded in the direction of the desired extreme and all the moments of the distribution exist. The probability distribution function for this distribution is given by equation (7).
for -∞< x < ∞, where
The Weibull distribution is the extreme value type III distribution, in which the parent distribution is bounded in the direction of the desired extreme. The probability distribution function for this distribution is given by equation (8).
for 0 ≤ x < ∞, α, β > 0
α is the scale parameter and β is the location parameter of the distribution.
The mean and variance are given by the following equations (9) and (10).
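The two extreme value distributions described above can also be fitted by matching moments. The sketch below is an illustration under stated assumptions, not the authors' procedure: for Gumbel it uses the standard relations scale = √6·σ/π and location = μ − 0.5772·scale; for a two-parameter Weibull with lower bound zero it solves the coefficient-of-variation equation for the shape by bisection, since no closed form exists. The code names the parameters `shape`, `scale` and `location` rather than α and β to avoid implying a particular symbol convention; all function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gumbel_params_from_moments(mean, sd):
    """Return (location, scale) of a Gumbel (EV type I, maxima) distribution."""
    scale = math.sqrt(6.0) * sd / math.pi
    location = mean - EULER_GAMMA * scale
    return location, scale


def _weibull_cv_squared(shape):
    """Squared coefficient of variation of a two-parameter Weibull."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    return g2 / g1 ** 2 - 1.0


def weibull_params_from_moments(mean, sd, lo=0.1, hi=100.0):
    """Return (shape, scale) of a two-parameter Weibull by bisection.

    The squared CV decreases monotonically with the shape, so the root
    of cv^2(shape) = (sd / mean)^2 is bracketed by [lo, hi].
    """
    target = (sd / mean) ** 2
    if not (_weibull_cv_squared(hi) < target < _weibull_cv_squared(lo)):
        raise ValueError("coefficient of variation outside the bracketed range")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if _weibull_cv_squared(mid) > target:
            lo = mid  # CV too large, so the shape must be larger
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = mean / math.gamma(1.0 + 1.0 / shape)
    return shape, scale


# Hypothetical sample moments in mm, for illustration only
print(gumbel_params_from_moments(650.0, 300.0))
print(weibull_params_from_moments(320.0, 180.0))
```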
The chi-square test was used for checking the validity of the assumed probability distribution. If more than one distribution passed the test then the distribution with the least value of chi-square was considered as the best fit distribution (Greenwood and Nikulin, 1996). The chi-square statistic is given by equation (11).
ni = Observed value.
ei = Expected value.
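A minimal sketch of the selection rule described above, assuming the usual Pearson form of the statistic, the sum over classes of (ni − ei)² / ei. The class frequencies and candidate names below are hypothetical and only illustrate how the distribution with the smallest statistic would be picked; the check against the critical value is omitted.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (n_i - e_i)^2 / e_i over classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((n - e) ** 2 / e for n, e in zip(observed, expected))


# Hypothetical observed class frequencies and expected frequencies
# under two candidate distributions (illustration only)
observed = [8, 12, 10, 6]
expected_by_distribution = {
    "candidate A": [9.0, 9.0, 9.0, 9.0],
    "candidate B": [7.0, 13.0, 11.0, 5.0],
}

scores = {
    name: chi_square_statistic(observed, expected)
    for name, expected in expected_by_distribution.items()
}
best_fit = min(scores, key=scores.get)
print(scores, best_fit)
```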
Mann-Kendall test
This test is used for the purpose of statistically assessing if there is a monotonic upward or downward trend of the variable of interest over time (Mann, 1945; Kendall, 1975). According to this test, the null hypothesis H0 assumes that there is no trend (the data is independent and randomly ordered) and this is tested against the alternative hypothesis H1, which assumes that there is a trend.
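The test statistic can be computed directly from its definition. The sketch below assumes the standard formulation: S is the sum of the signs of all pairwise later-minus-earlier differences, its variance includes the usual correction for tied values, and the standardized Z applies a continuity correction of one. The function name and input series are hypothetical.

```python
import math
from collections import Counter


def mann_kendall(series):
    """Return (S, Var(S), Z) for the Mann-Kendall trend test."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of the difference

    # Tie correction: each group of t equal values reduces the variance
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(series).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0

    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, var_s, z


# Hypothetical rainfall series in mm, for illustration only
s, var_s, z = mann_kendall([250.0, 180.5, 300.2, 275.0, 410.9, 390.1, 520.3])
print(s, var_s, z)
```

A positive Z points to an upward trend and a negative Z to a downward one; H0 is rejected when |Z| (or Z, for a one-sided test) exceeds the critical value of the standard normal distribution at the chosen significance level.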
Auto regressive integrated moving average (ARIMA) model
The formulation of ARIMA model required three steps, namely model identification, parameter estimation and diagnostic checking for analysis of residuals (Box and Jenkins, 1976). The ACF and PACF plots were used for identifying the order for the autoregressive and moving average terms.
The seasonal ARIMA model is given as follows:
ΦP (Bs) = Seasonal autoregressive operator of order P.
φp (B) = Regular autoregressive operator of order p.
▽sD = Seasonal differences.
▽d = Regular differences.
ΘQ (Bs) = Seasonal moving average operator of order Q.
θq (B) = Regular moving average operator of order q.
at= White noise process.
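As an illustration of how the operators above combine, consider the orders (0,0,1) (0,1,1)12 reported in this study. Assuming the Box-Jenkins sign convention, (1 − B^12) z_t = (1 − θB)(1 − ΘB^12) a_t, the model expands to z_t = z_(t−12) + a_t − θ·a_(t−1) − Θ·a_(t−12) + θΘ·a_(t−13). The sketch below computes conditional residuals and a one-step-ahead forecast from that recursion. The symbol z_t, the function names, the zero pre-sample shocks and the coefficient values are all assumptions made for illustration; many software packages use the opposite sign for the moving average coefficients.

```python
def sarima_001_011_residuals(z, theta, big_theta, s=12):
    """Conditional residuals a_t for ARIMA(0,0,1)(0,1,1)s.

    Pre-sample shocks (and the first s residuals) are set to zero,
    a common simplification; exact likelihood methods treat them differently.
    """
    a = [0.0] * len(z)
    for t in range(s, len(z)):
        a_lag_s1 = a[t - s - 1] if t - s - 1 >= 0 else 0.0
        predicted = (
            z[t - s]
            - theta * a[t - 1]
            - big_theta * a[t - s]
            + theta * big_theta * a_lag_s1
        )
        a[t] = z[t] - predicted
    return a


def one_step_forecast(z, a, theta, big_theta, s=12):
    """Forecast of the next value; the future shock has expectation zero."""
    t = len(z)  # index of the value being forecast
    return (
        z[t - s]
        - theta * a[t - 1]
        - big_theta * a[t - s]
        + theta * big_theta * a[t - s - 1]
    )


# Hypothetical three-year monthly series with a fixed seasonal pattern (mm)
pattern = [0, 0, 0, 0, 5, 320, 650, 340, 260, 30, 5, 0]
series = [float(v) for v in pattern * 3]
resid = sarima_001_011_residuals(series, theta=0.3, big_theta=0.6)
print(one_step_forecast(series, resid, theta=0.3, big_theta=0.6))
```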
The Ljung-Box test was used for testing the residuals. This statistic measures the significance of the residual autocorrelations as a set and indicates whether they are collectively significant (Paretkar, 2008).
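A minimal sketch of the statistic, assuming its standard definition Q = n(n + 2) Σ r_k² / (n − k) summed over lags k = 1..h, where r_k is the lag-k sample autocorrelation of the residuals. The function name is hypothetical, and the comparison with a chi-square critical value (with degrees of freedom reduced by the number of estimated ARMA parameters) is left out.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic for residual autocorrelations up to max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    denom = sum(d * d for d in dev)

    total = 0.0
    for k in range(1, max_lag + 1):
        # Lag-k sample autocorrelation
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total


# Hypothetical residuals, for illustration only
print(ljung_box_q([12.0, -8.5, 3.2, -1.1, 7.4, -9.9, 0.6, 4.3, -6.0, 2.1], max_lag=3))
```

A small Q relative to the chi-square critical value indicates that the residual autocorrelations are jointly insignificant, which is the behaviour expected of white noise.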
RESULTS AND DISCUSSION
The highest average rainfall (647.3 mm) occurred in July, followed by August (344 mm) and June (317.8 mm). Among the monsoon months, it was lowest (264.04 mm) in September. The percentage contribution to average annual rainfall was 19.6% (June), 39.9% (July), 21.2% (August) and 16.3% (September). Thus, the four monsoon months contributed a total of 97.1% of the average annual rainfall and the remaining months contributed only 2.9%. The Weibull distribution was found to be the best fit distribution for June and September, whereas the Gumbel and log normal distributions were found to be the best fit distributions for July and August, respectively. The design of temporary as well as permanent structures is based on the rainfall at various recurrence intervals. Usually, the recurrence interval of annual rainfall is taken into consideration for the design of structures and the planning of watersheds. The results obtained in this study on rainfall at various recurrence intervals can be used for policy decisions and planning related to soil and water conservation as well as crop planning.
The result of the trend analysis by the Mann-Kendall test for monthly and annual rainfall is given in Table 6. The time series plots of the monthly and annual rainfall, showing increasing or decreasing trends, are shown in Fig 6 and Fig 7 respectively. In this study, the trend was found to be significant for September rainfall, as indicated by the standardized S statistic (Z value) of 2.342, which was greater than the critical value of 1.645 at the 5% significance level. There was insufficient evidence to suggest a significant trend in the case of June, July, August and annual rainfall. The trend for annual rainfall was positive, as suggested by the positive value of the Mann-Kendall statistic (S); however, the trend was insignificant.
Auto regressive integrated moving average (ARIMA) model
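Rainfall at a recurrence interval of T years is commonly taken as the quantile of the fitted distribution whose probability of exceedance is 1/T. The sketch below illustrates this for a Gumbel distribution by inverting its cumulative distribution function. The parameter values are hypothetical and are not the fitted values from this study, and the formula is the standard one rather than one quoted from the source.

```python
import math


def gumbel_cdf(x, location, scale):
    """Gumbel (EV type I, maxima) cumulative distribution function."""
    return math.exp(-math.exp(-(x - location) / scale))


def gumbel_quantile(return_period, location, scale):
    """Rainfall with non-exceedance probability 1 - 1/T under Gumbel."""
    p = 1.0 - 1.0 / return_period
    return location - scale * math.log(-math.log(p))


# Hypothetical location and scale in mm (illustration only)
location, scale = 550.0, 200.0
for years in (2, 5, 10, 25):
    print(years, round(gumbel_quantile(years, location, scale), 1))
```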
The preliminary examination of the autocorrelation function (ACF) plot and the partial autocorrelation function (PACF) plot revealed the presence of periodicities, indicating a non-stationary process, which was subsequently transformed into a stationary process by differencing. The monthly rainfall data series was thus converted into a stationary time series (Fig 8 and Fig 9). The performance of the ARIMA model was assessed using 5 years of monthly data (2015-2019). Root mean square errors of several candidate models were calculated and the model with the least root mean square error was chosen for modelling monthly rainfall. Karthika et al. (2017) utilized the ARIMA model to forecast meteorological drought up to 2 years ahead in the lower Thirumanimuthar sub-basin in Tamil Nadu, and the predicted data showed reasonably good agreement with the actual data.
In the present study, ARIMA (0,0,1) (0,1,1)12 was selected as the appropriate model, as its performance on the testing data was better than that of the other candidate models and the residuals were found to be uncorrelated. Table 7 shows the parameters of the selected ARIMA model, while Table 8 gives the performance of the ARIMA model in terms of root mean square error for the training and testing periods. The observed and predicted rainfall during the training and testing periods by ARIMA (0,0,1) (0,1,1)12 are shown in Fig 10 and Fig 11. The Ljung-Box values and residual plots are shown in Fig 12. The residuals lay within bounded limits, which meant that the residuals were uncorrelated and followed white noise. The selected model was used for predicting monthly rainfall of the year 2020, as shown in Fig 13. The model predicted that in the year 2020, the rainfall in the monsoon months, i.e. June, July, August and September, would be 339 mm, 680 mm, 426 mm and 349 mm respectively, and the total annual rainfall would be 2062 mm, which is 27.2% more than the average annual rainfall of Navsari.
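The two computational steps mentioned above, differencing and error-based model comparison, can be sketched as follows. The sketch assumes seasonal differencing at a lag of 12 months (consistent with the seasonal order D = 1 of the selected model) and the usual definition of root mean square error. The function names, candidate labels and all numbers are hypothetical.

```python
import math


def seasonal_difference(series, period=12):
    """Return series[t] - series[t - period] for t >= period."""
    return [series[t] - series[t - period] for t in range(period, len(series))]


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical held-out observations and predictions from two candidate models
observed = [310.0, 655.0, 350.0, 270.0]
candidates = {
    "model A": [330.0, 600.0, 400.0, 250.0],
    "model B": [300.0, 640.0, 360.0, 280.0],
}
errors = {name: rmse(observed, pred) for name, pred in candidates.items()}
best_model = min(errors, key=errors.get)
print(errors, best_model)
```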
- Adhikari, R. and Agrawal, R.K. (2013). An introductory study on time series modeling and forecasting. Saarbrucken: LAP LAMBERT Academic Publishing.
- Alam, M., Emura, K., Farnham, C. and Yuan, J. (2018). Best-fit probability distributions and return periods for maximum monthly rainfall in Bangladesh. Climate. 6(1): 9.
- Bhakar, S.R., Iqbal, M., Devanda, M., Chhajed, N. and Bansal, A.K. (2008). Probability analysis of rainfall at Kota. Indian Journal of Agricultural Research. 42(3): 201-206.
- Box, G.E. and Jenkins, G.M. (1976). Time Series Analysis: Forecasting and Control, Revised ed. Holden-Day.
- Brema, J. and Anie, J. (2018). Rainfall trend analysis by Mann-Kendall test for Vamanapuram river basin, Kerala. International Journal of Civil Engineering and Technology. 9(13): 1549-1556.
- Çinlar, E. (2011). Probability and Stochastics. New York: Springer.
- Greenwood, P.E. and Nikulin, M.S. (1996). A Guide to Chi-squared Testing (Vol. 280). John Wiley and Sons.
- Kendall, M.G. (1975). Rank Correlation Methods. Griffin, London.
- Karthika, M. and Thirunavukkarasu, V. (2017). Forecasting of meteorological drought using ARIMA model. Indian Journal of Agricultural Research. 51(2): 103-111.
- Mandal, S. and Choudhury, B.U. (2015). Estimation and prediction of maximum daily rainfall at Sagar Island using best fit probability models. Theoretical and Applied Climatology. 121(1-2): 87-97.
- Mann, H.B. (1945). Nonparametric tests against trend. Econometrica: Journal of the Econometric Society. 13(3): 245-259.
- Paretkar, P.S. (2008). Short-Term Forecasting of Power Flows over Major Pacific Northwestern Interties: Using Box and Jenkins ARIMA Methodology (Doctoral dissertation, Virginia Tech).
- Reddy, P.J.R. (1997). Stochastic Hydrology. Laxmi Publications, Ltd.
- Swain, S., Nandi, S. and Patel, P. (2018). Development of an ARIMA model for monthly rainfall forecasting over Khordha district, Odisha, India. In Recent Findings in Intelligent Computing Techniques. Springer, Singapore. (pp. 325-331).
- Yadav, R., Tripathi, S.K., Pranuthi, G. and Dubey, S.K. (2014). Trend analysis by Mann-Kendall test for precipitation and temperature for thirteen districts of Uttarakhand. Journal of Agrometeorology. 16(2): 164.
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.