The Indian dairy industry provides a livelihood to about 70 million households. A key feature of India's dairy sector is the predominance of small producers. The livestock sector supports the growth of both the rural socio-economy and the national economy. This investigation brings out the main features of the results obtained by applying various statistical modelling procedures to data on milk production in India for 1961-62 to 2018-19, collected from www.fao.org.
Auto-regressive integrated moving average (ARIMA) approach
Time series analysis is a branch of statistics whose object is the study of variables over time. Among its main objectives are the determination of trends within such series and of the stability of their values (and their variation) over time. Unlike traditional econometrics, the purpose of time series analysis is not to relate variables to one another, but to focus on the "dynamics" of a single variable. In particular, linear models (mainly AR and MA, for Auto-Regressive and Moving Average; Box and Jenkins, 1976) and conditional heteroscedasticity models, notably ARCH (Auto-Regressive Conditional Heteroscedasticity; Engle, 1982), are used in modelling time series. In this study, we apply the Auto-Regressive Integrated Moving Average (ARIMA) process (the Box-Jenkins approach) to estimate and forecast milk production in India, overall and for the five major milk-producing animal species, namely cow, buffalo, goat, sheep and camel, over the period 1961 to 2018. The data for the analysis were collected from the website www.fao.org.
In practice, it is impossible to know the joint probability distribution of a time series yt, t ≥ 0; primary interest therefore lies in modelling the conditional distribution (assumed constant in time) of yt via its density:

f(yt | yt-1, yt-2, ..., y0)   (1)

conditioned on the history of the process yt-1, yt-2, ..., y0. It is therefore necessary to model yt in terms of its past values.
Auto-regressive model, AR (p)
The conditional approach in Equation (1) provides a decomposition of yt into a predictable component and a prediction error, according to which:

yt = E(yt | yt-1, ..., yt-p) + ϵt   (2)

where E(yt | yt-1, ..., yt-p) is the component of yt that can give rise to a forecast when the history of the process yt-1, yt-2, ..., y0 is known, and ϵt represents the unpredictable information. We suppose that ϵt ~ WN(0, σ²) is a white-noise process. Equation (2) represents an autoregressive model (AR) of order p. As an example, an autoregressive process of order 1, AR(1), is defined as:

yt = a yt-1 + ϵt
The value yt depends only on its predecessor; its properties are functions of a, which acts as a factor of inertia. More generally, autoregressive processes AR(p) assume that each observation yt can be predicted by a weighted sum of the p previous observations yt-1, yt-2, ..., yt-p, plus a random error term:

yt = β1 yt-1 + β2 yt-2 + ... + βp yt-p + ϵt   (3)

The other type of process in the Box-Jenkins approach is the moving-average process, MA(q).
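The behaviour of an AR(1) process can be illustrated with a short simulation. The following is a minimal sketch in pure Python; the function name `simulate_ar1` and its parameters are illustrative, not part of the original study.

```python
import random

def simulate_ar1(a, n, sigma=1.0, seed=42):
    """Simulate an AR(1) process y_t = a * y_{t-1} + e_t with e_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    y = [0.0]  # start the recursion at zero
    for _ in range(n - 1):
        y.append(a * y[-1] + rng.gauss(0.0, sigma))
    return y

# |a| < 1 yields a stationary series; a = 1 yields a random walk (non-stationary).
series = simulate_ar1(a=0.7, n=200)
```

With |a| < 1 the simulated path fluctuates around zero with bounded variance, which is the inertia effect described above.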
Moving-average process MA (q)
The moving-average process assumes that each observation yt is a function of the errors in the preceding observations, ϵt-1, ϵt-2, ..., ϵt-q, plus its own error. A moving-average process is given as (Mishra et al., 2021):

yt = ϵt + Θ1 ϵt-1 + Θ2 ϵt-2 + ... + Θq ϵt-q   (4)
The combination of the two models, AR(p) in Equation (3) and MA(q) in Equation (4), is an ARMA(p, q) process, which is the most popular of the Box-Jenkins models for its flexibility and suitability for various data types. The model is defined as follows:

yt = β1 yt-1 + ... + βp yt-p + ϵt + Θ1 ϵt-1 + ... + Θq ϵt-q   (5)

with ϵt ~ WN(0, σ²).
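An ARMA process with one lag of each type can likewise be simulated in a few lines. This is a sketch in pure Python; the function name `simulate_arma11` and the chosen coefficients are illustrative assumptions.

```python
import random

def simulate_arma11(beta, theta, n, sigma=1.0, seed=0):
    """Simulate y_t = beta*y_{t-1} + e_t + theta*e_{t-1} with e_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    y, e_prev = [0.0], 0.0
    for _ in range(n - 1):
        e = rng.gauss(0.0, sigma)
        y.append(beta * y[-1] + e + theta * e_prev)  # AR part + MA part
        e_prev = e
    return y

y = simulate_arma11(beta=0.5, theta=0.3, n=300)
```

The AR term carries over the previous observation and the MA term carries over the previous shock, matching the two components combined in the ARMA(p, q) model.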
The time series yt must be stationary to be fitted by an ARMA model. We consider weak stationarity, defined as follows.

Definition: A real-valued, discrete-time process y1, y2, ..., yt is stationary in the weak sense (or "second order", or "in covariance") if:

· E(yt) = μ, a constant independent of t;
· Var(yt) = σ² < ∞, a constant independent of t;
· Cov(yt, yt+h) = γ(h), a function of the lag h only, independent of t.
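The weak-stationarity conditions can be checked informally by comparing means and variances across segments of a series. A minimal pure-Python sketch, in which the helper `window_stats` is an illustrative name, not a standard routine:

```python
import random
import statistics

def window_stats(series, k):
    """Split the series into k consecutive windows; return (mean, variance) per window."""
    w = len(series) // k
    return [(statistics.mean(series[i * w:(i + 1) * w]),
             statistics.variance(series[i * w:(i + 1) * w])) for i in range(k)]

rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(400)]   # white noise: weakly stationary
walk = [sum(noise[:i + 1]) for i in range(400)]     # random walk: non-stationary
stats_noise = window_stats(noise, 4)                # window means stay near 0
stats_walk = window_stats(walk, 4)                  # window means drift over time
```

For the white noise the per-window means and variances are roughly constant, while for the random walk they change from window to window, violating the conditions above.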
When one or more of these conditions is not met, the series is said to be non-stationary. This term covers several types of non-stationarity (non-stationarity in trend, stochastic non-stationarity); we focus on the latter. Thus, if yt is stochastically non-stationary, a differencing technique should be applied. A series is difference-stationary if the series obtained by differencing the values of the original series is stationary. Stationarity is generally assessed with the KPSS test (Kwiatkowski et al., 1992; Leybourne and McCabe, 1994).
The difference operator is given by ∆(yt) = yt - yt-1; if the series must be differenced d times to become stationary, we say that it is integrated of order d, I(d). The process is then noted ARIMA(p, d, q), defined by the equation:

(1 - β1 L - ... - βp L^p)(1 - L)^d yt = (1 + Θ1 L + ... + Θq L^q) ϵt   (6)

where L is the lag operator (also written B, for backshift operator), with L yt = yt-1. If the time series Xt = (1 - L)^d yt is stationary, then estimating an ARIMA(p, d, q) process on yt is equivalent to estimating an ARMA(p, q) process on Xt.
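The difference operator is simple to implement directly. The sketch below (the helper name `difference` is illustrative) shows how differencing removes a deterministic trend, the easiest case to verify by hand:

```python
def difference(series, d=1):
    """Apply the difference operator Δ(y_t) = y_t - y_{t-1}, d times."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

trend = [2 * t + 5 for t in range(10)]    # linear trend: non-stationary in mean
assert difference(trend) == [2] * 9       # first difference is a constant
assert difference(trend, d=2) == [0] * 8  # second difference is identically zero
```

Each pass shortens the series by one observation, which is why an ARIMA(p, d, q) fit on yt works with T - d effective observations of Xt.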
Box and Jenkins (1976) proposed a prediction technique for a univariate series based on the notion of the ARIMA process. This technique has three stages: identification, estimation and verification. The first step is to identify the ARIMA(p, d, q) model that could have generated the series. It consists, first of all, in transforming the series in order to make it stationary (the number of differentiations determines the order of integration d), and then in identifying the ARMA(p, q) model of the transformed series with the correlogram and the partial correlogram. The graphs of the autocorrelation coefficients (correlogram) and of the partial autocorrelation coefficients (partial correlogram) give information on the order of the ARMA model. For example, if only the first two autocorrelation coefficients are significant, an MA(2) model is identified.
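The sample autocorrelation coefficients underlying the correlogram can be computed directly. A minimal sketch in pure Python; `sample_acf` is an illustrative name, and the significance band quoted in the comment is the conventional large-sample approximation:

```python
def sample_acf(y, max_lag):
    """Sample autocorrelation coefficients r_1, ..., r_max_lag of a series."""
    n = len(y)
    mean = sum(y) / n
    c0 = sum((v - mean) ** 2 for v in y) / n  # sample variance (lag-0 autocovariance)
    acf = []
    for h in range(1, max_lag + 1):
        ch = sum((y[t] - mean) * (y[t - h] - mean) for t in range(h, n)) / n
        acf.append(ch / c0)
    return acf

# A coefficient is conventionally judged 'significant' if |r_h| > 1.96 / sqrt(n).
y = [1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0, 2.0, 1.0, 2.0, 3.0, 4.0]
r = sample_acf(y, 3)
```

Plotting r_h against h gives the correlogram; the analogous partial autocorrelations (obtained, e.g., via the Durbin-Levinson recursion) give the partial correlogram used to choose p.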
second step is to estimate the ARIMA model using a non-linear method (non-linear least squares or maximum likelihood). These methods are applied using the orders p, d and q found in the identification step.
Generally, we use the maximum-likelihood method, considering that the errors ϵt follow a normal distribution N(0, σ²e). The log-likelihood function of the ARMA(p, q) process is defined as (Lama et al., 2021):

ln L(β, Θ, σ²e) = -(T/2) ln(2π σ²e) - (1/(2σ²e)) Σt ϵt²

where:

· T is the number of observations;
· ψ is a matrix of dimensions (p + q + T, p + q), depending on βi (i = 1, ..., p) and Θi (i = 1, ..., q), through which the residuals ϵt are computed from the observations;
· σ²e is the variance of the error term.
The third step is to check whether the estimated model reproduces the process that generated the data. For this purpose, the residuals obtained from the estimated model are examined to verify that they behave like white noise, using a "portmanteau" test (a global test of the hypothesis that the residuals are independent). Common diagnostics include tests of the residuals for normality and autocorrelation (Durbin and Watson, 1950), tests for homoscedasticity (Breusch, 1978; Breusch and Pagan, 1979), and the ARCH test (Engle, 1982). The last point under this step is the prediction of future values of yt by the selected model.
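A widely used portmanteau diagnostic is the Ljung-Box Q statistic, which aggregates the squared residual autocorrelations. A minimal pure-Python sketch (`ljung_box_q` is an illustrative name; p-values would additionally require the chi-square distribution function):

```python
import random

def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_{h=1..m} r_h^2 / (n-h).
    Under the white-noise hypothesis, Q follows approximately a chi-square
    distribution with m degrees of freedom (less the number of fitted
    ARMA parameters)."""
    n = len(residuals)
    mean = sum(residuals) / n
    c0 = sum((v - mean) ** 2 for v in residuals) / n
    q = 0.0
    for h in range(1, max_lag + 1):
        ch = sum((residuals[t] - mean) * (residuals[t - h] - mean)
                 for t in range(h, n)) / n
        q += (ch / c0) ** 2 / (n - h)  # squared autocorrelation, small-sample weight
    return n * (n + 2) * q

rng = random.Random(3)
white = [rng.gauss(0.0, 1.0) for _ in range(500)]
q = ljung_box_q(white, max_lag=10)
# For genuine white noise, Q should be modest (chi-square(10) has mean 10).
```

A large Q relative to the chi-square critical value signals remaining autocorrelation in the residuals, i.e. the fitted model has not captured all the dynamics.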