Rainfall data for 36 years (1984-2019) at Navsari, Gujarat were analysed to identify the best-fit probability distribution for monthly and annual rainfall. The analysis was carried out in 2020. Navsari receives its rainfall during June, July, August and September. Water conservation measures can be planned based on the expected rainfall. Excessive rainfall also causes frequent inundation, so suitable drainage can be planned to protect the crops. The continuous probability distributions used in the study are described below, together with their probability density functions.
Normal distribution
The normal distribution, also known as the Gaussian distribution, is one of the most frequently used distributions for modelling random phenomena. Any linear function of a normal random variable is also a normal random variable. The probability density function of the normal distribution is given by equation (1):
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right] \tag{1}$$
for -∞ < x < ∞, -∞ < µ < ∞ and σ > 0
μ and σ are the mean and standard deviation of the distribution, which are also its location and scale parameters. The parameters were estimated by the method of moments, that is, from the sample mean and standard deviation.
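The moment-based fit described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function names and the rainfall values are hypothetical, and the use of the sample standard deviation (n - 1 divisor) is an assumption.

```python
import math
import statistics


def fit_normal_moments(data):
    """Estimate (mu, sigma) from the sample mean and standard deviation."""
    mu = statistics.mean(data)
    sigma = statistics.stdev(data)  # assumption: n - 1 divisor
    return mu, sigma


def normal_pdf(x, mu, sigma):
    """Normal density of equation (1)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Hypothetical annual rainfall totals in mm (illustrative, not the Navsari record).
rain_mm = [1520.0, 1810.0, 1340.0, 2105.0, 1675.0, 1490.0, 1960.0, 1730.0]
mu, sigma = fit_normal_moments(rain_mm)
print(f"mu = {mu:.1f} mm, sigma = {sigma:.1f} mm")
print(f"density at the mean = {normal_pdf(mu, mu, sigma):.6f}")
```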
Log-normal distribution
The log-normal distribution is a transformed normal distribution in which the variable is replaced by its logarithm. It has positive skewness, which increases with its shape parameter. A random variable x is log-normally distributed if its probability density function is as given by equation (2).
$$f(x) = \frac{1}{x\,\sigma_n\sqrt{2\pi}}\exp\left[-\frac{(\ln x-\mu_n)^2}{2\sigma_n^2}\right] \tag{2}$$
for 0 < x < ∞, -∞ < $\mu_n$ < ∞ and $\sigma_n$ > 0,
in which $\mu_n$ and $\sigma_n$ are the scale and shape parameters of the distribution, respectively.
The scale and shape parameters are also the mean and standard deviation of the variable ln x. The two parameters of this distribution can be obtained by the method of moments using equations (3) and (4).
..........(3)
..........(4)
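As an illustration of moment matching for the log-normal case, one standard set of relations is sketched below. It assumes that μ and σ denote the mean and standard deviation of the untransformed variable x, and that $\mu_n$ and $\sigma_n$ denote the mean and standard deviation of ln x. The form may differ from the one used in equations (3) and (4).

```latex
% Log-normal moments: E(x) = exp(mu_n + sigma_n^2 / 2),
%                     Var(x) = [exp(sigma_n^2) - 1] exp(2 mu_n + sigma_n^2)
% Solving these two relations for the parameters gives:
\sigma_n^2 = \ln\left(1 + \frac{\sigma^2}{\mu^2}\right),
\qquad
\mu_n = \ln\mu - \frac{\sigma_n^2}{2}
```

An equivalent route is to take logarithms of the observations first and compute their sample mean and standard deviation directly.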
Gamma distribution
Gamma distribution is a flexible distribution with a wide variety of shapes. A random variable x follows gamma distribution if its probability density function is given by equation (5):
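$$f(x) = \frac{x^{\alpha-1}\,e^{-x/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)} \tag{5}$$
in which α and β are the shape and scale parameters of the distribution, respectively.
The method of moments was used to estimate the parameters of the distribution, as given in equation (6).
$$\alpha = \frac{\mu^2}{\sigma^2}, \qquad \beta = \frac{\sigma^2}{\mu} \tag{6}$$
where µ and σ are the mean and standard deviation of the distribution, respectively.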
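A short Python sketch of the moment estimates for the gamma distribution follows. It assumes the shape-scale parameterisation, for which the mean is shape × scale and the variance is shape × scale². The function names and the sample values are hypothetical.

```python
import math
import statistics


def fit_gamma_moments(data):
    """Return (shape, scale) matched to the sample mean and variance."""
    mu = statistics.mean(data)
    var = statistics.variance(data)  # assumption: n - 1 divisor
    shape = mu * mu / var
    scale = var / mu
    return shape, scale


def gamma_pdf(x, shape, scale):
    """Gamma density for x > 0, evaluated in log space for stability."""
    if x <= 0:
        return 0.0
    log_f = ((shape - 1.0) * math.log(x) - x / scale
             - shape * math.log(scale) - math.lgamma(shape))
    return math.exp(log_f)


# Hypothetical July rainfall totals in mm (illustrative values only).
july_mm = [410.0, 655.0, 520.0, 880.0, 300.0, 745.0, 590.0, 470.0]
shape, scale = fit_gamma_moments(july_mm)
print(f"shape = {shape:.3f}, scale = {scale:.1f} mm")
```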
Gumbel distribution
It is the extreme value type I distribution, in which the parent distribution is unbounded in the direction of the desired extreme and all moments of the distribution exist. The probability density function of this distribution is given by equation (7).
$$f(x) = \frac{1}{\sigma}\exp\left[-\frac{x-\mu}{\sigma} - \exp\left(-\frac{x-\mu}{\sigma}\right)\right] \tag{7}$$
for -∞ < x < ∞, where σ is the scale parameter and µ is the location parameter of the distribution.
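The text does not state how the Gumbel parameters were estimated. The sketch below assumes the largest-value form of the density and moment-based estimates (scale = s√6/π, location = mean − 0.5772 × scale), in line with the other distributions. The function names and data are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(data):
    """Return (location, scale) from the sample mean and standard deviation."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    scale = sd * math.sqrt(6.0) / math.pi
    location = mean - EULER_GAMMA * scale
    return location, scale


def gumbel_pdf(x, location, scale):
    """Extreme value type I density (largest-value form, assumed here)."""
    z = (x - location) / scale
    return math.exp(-z - math.exp(-z)) / scale


# Hypothetical annual rainfall totals in mm (illustrative values only).
annual_mm = [1520.0, 1810.0, 1340.0, 2105.0, 1675.0, 1490.0, 1960.0, 1730.0]
location, scale = fit_gumbel_moments(annual_mm)
print(f"location = {location:.1f} mm, scale = {scale:.1f} mm")
```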
Weibull distribution
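It is the extreme value type III distribution, in which the parent distribution is bounded in the direction of the desired extreme. The probability density function of this distribution is given by equation (8).
$$f(x) = \frac{\beta}{\alpha}\left(\frac{x}{\alpha}\right)^{\beta-1}\exp\left[-\left(\frac{x}{\alpha}\right)^{\beta}\right] \tag{8}$$
for 0 ≤ x < ∞ and α, β > 0, where α is the scale parameter and β is the shape parameter of the distribution.
The mean and variance are given by equations (9) and (10).
$$E(x) = \alpha\,\Gamma\left(1+\frac{1}{\beta}\right) \tag{9}$$
$$\mathrm{Var}(x) = \alpha^2\left[\Gamma\left(1+\frac{2}{\beta}\right) - \Gamma^2\left(1+\frac{1}{\beta}\right)\right] \tag{10}$$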
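The mean and variance of a two-parameter Weibull distribution can be evaluated with the gamma function, as in the sketch below. It assumes that α is the scale parameter and that β acts as the shape parameter. The function name is hypothetical.

```python
import math


def weibull_mean_variance(scale, shape):
    """Mean and variance of a two-parameter Weibull distribution.

    Assumes mean = scale * Gamma(1 + 1/shape) and
    variance = scale^2 * [Gamma(1 + 2/shape) - Gamma(1 + 1/shape)^2].
    """
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    mean = scale * g1
    variance = scale ** 2 * (g2 - g1 ** 2)
    return mean, variance


# With shape = 1 the Weibull reduces to the exponential distribution,
# whose mean equals the scale and whose variance equals the scale squared.
mean, variance = weibull_mean_variance(scale=2.0, shape=1.0)
print(mean, variance)
```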
Goodness-of-fit test
The chi-square test was used to check the validity of the assumed probability distribution. If more than one distribution passed the test, the distribution with the smallest chi-square value was taken as the best-fit distribution (Greenwood and Nikulin, 1996). The chi-square statistic is given by equation (11).
$$\chi^2 = \sum_{i=1}^{k}\frac{(n_i - e_i)^2}{e_i} \tag{11}$$
where
$n_i$ = Observed value in class i.
$e_i$ = Expected value in class i.
k = Number of classes.
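The selection rule described above (keep the distributions that pass the test, then pick the smallest chi-square) can be sketched as follows. All frequencies are hypothetical, the class layout is an assumption, and the critical value 5.991 is the tabulated 5% value for 2 degrees of freedom (5 classes − 1 − 2 estimated parameters).

```python
def chi_square_statistic(observed, expected):
    """Sum of (n_i - e_i)^2 / e_i over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((n - e) ** 2 / e for n, e in zip(observed, expected))


# Hypothetical observed class frequencies for a 36-year record.
observed = [5, 9, 11, 7, 4]
# Hypothetical expected frequencies under two candidate distributions.
expected_by_dist = {
    "gamma": [4.5, 9.0, 10.5, 8.0, 4.0],
    "normal": [3.0, 10.0, 12.0, 8.0, 3.0],
}
CRITICAL_5PCT_DF2 = 5.991  # tabulated chi-square value, 5% level, 2 d.f.

stats = {name: chi_square_statistic(observed, e)
         for name, e in expected_by_dist.items()}
# Keep only the distributions that pass the test, then take the smallest.
passed = {name: value for name, value in stats.items()
          if value < CRITICAL_5PCT_DF2}
best_fit = min(passed, key=passed.get) if passed else None
print(stats, best_fit)
```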
Mann-Kendall test
This test is used to assess statistically whether there is a monotonic upward or downward trend in the variable of interest over time (Mann, 1945; Kendall, 1975). In this test, the null hypothesis H0 assumes that there is no trend (the data are independent and randomly ordered), and it is tested against the alternative hypothesis H1, which assumes that there is a trend.
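A minimal sketch of the Mann-Kendall test is given below. It assumes the usual normal approximation with a tie correction and a continuity correction, and a two-sided p-value. The function name and the example series are hypothetical.

```python
import math


def mann_kendall(series):
    """Return (S, Z, two-sided p-value) for the Mann-Kendall trend test."""
    n = len(series)
    # S sums the signs of all pairwise differences x_j - x_i for j > i.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S with a correction for tied values.
    tie_counts = {}
    for value in series:
        tie_counts[value] = tie_counts.get(value, 0) + 1
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in tie_counts.values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    # Standard normal test statistic with continuity correction.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p_value


# Hypothetical strictly increasing series: every pair contributes +1 to S.
s, z, p_value = mann_kendall([1.0, 2.0, 3.0, 4.0, 5.0])
print(s, round(z, 3), round(p_value, 4))
```

H0 (no trend) would be rejected at the 5% level when the p-value is below 0.05.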
Autoregressive integrated moving average (ARIMA) model
The formulation of the ARIMA model required three steps: model identification, parameter estimation and diagnostic checking through analysis of the residuals (Box and Jenkins, 1976). The ACF and PACF plots were used to identify the orders of the autoregressive and moving average terms.
The seasonal ARIMA model is given as follows:
$$\Phi_P(B^s)\,\phi_p(B)\,\nabla_s^D\,\nabla^d z_t = \theta_q(B)\,\Theta_Q(B^s)\,a_t \tag{12}$$
$\Phi_P(B^s)$ = Seasonal autoregressive operator of order P.
$\phi_p(B)$ = Regular autoregressive operator of order p.
$\nabla_s^D$ = Seasonal differences.
$\nabla^d$ = Regular differences.
$\Theta_Q(B^s)$ = Seasonal moving average operator of order Q.
$\theta_q(B)$ = Regular moving average operator of order q.
$a_t$ = White noise process.
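The regular and seasonal differences in the model can be illustrated with a small Python sketch. The function names, the seasonal period of 4 and the synthetic series are assumptions chosen for illustration. Monthly rainfall would normally use a period of 12.

```python
def difference(series, lag=1):
    """Return the lagged difference z_t - z_(t-lag)."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def apply_differencing(series, d=0, D=0, s=12):
    """Apply D seasonal differences of period s, then d regular differences."""
    out = list(series)
    for _ in range(D):
        out = difference(out, lag=s)
    for _ in range(d):
        out = difference(out, lag=1)
    return out


# Synthetic series: linear trend plus a repeating pattern of period 4.
pattern = [5, 1, 3, 7]
z = [2 * t + pattern[t % 4] for t in range(16)]
# One seasonal difference removes the pattern and leaves a constant;
# one regular difference then removes that constant.
stationary = apply_differencing(z, d=1, D=1, s=4)
print(stationary)
```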
The Ljung-Box test was used to test the residuals. This statistic measures the significance of the residual autocorrelations as a set and indicates whether they are collectively significant (Paretkar, 2008).
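A sketch of the Ljung-Box statistic is shown below, assuming the usual form Q = n(n + 2) Σ r_k² / (n − k) summed over lags 1 to m, where r_k is the lag-k autocorrelation of the residuals. The function names and the residual series are hypothetical.

```python
def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    total = sum(autocorrelation(residuals, k) ** 2 / (n - k)
                for k in range(1, max_lag + 1))
    return n * (n + 2) * total


# Hypothetical residuals that alternate in sign, so they are strongly
# autocorrelated and far from white noise.
residuals = [1.0, -1.0] * 10
q_stat = ljung_box_q(residuals, max_lag=1)
print(q_stat)
```

Q is compared with a chi-square critical value. A large Q, as in this contrived example, indicates that the residual autocorrelations are collectively significant.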