The study was conducted on 2100 first-lactation bimonthly test-day milk yield (BTDY) records of 350 Murrah buffaloes that calved between 1993 and 2017 at ICAR-NDRI, Karnal, India. Records with a lactation length of less than 100 days or a lactation yield of less than 900 kg, and records affected by culling in the middle of lactation, abortion, still-birth or any other pathological cause, were considered abnormal and were not included in the prediction of lactation yield. Outliers beyond three standard deviations at either tail of the normal distribution were also excluded from the data. A total of 6 BTDY records were taken from each animal at an interval of 60 days. The first BTDY (i.e. BTDY-1) was recorded on the 6th day, BTDY-2 on the 65th day, BTDY-3 on the 125th day, BTDY-4 on the 185th day, BTDY-5 on the 245th day and the last BTDY (i.e. BTDY-6) on the 305th day.
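The editing rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (dictionaries with hypothetical `length` and `yield` keys) is invented here, and the three-standard-deviation rule is applied to lactation yield, since the text does not say which trait was screened for outliers.

```python
from statistics import mean, stdev


def filter_records(records, min_length=100, min_yield=900.0, n_sd=3.0):
    """Drop short or low-yield lactations, then outliers beyond n_sd SDs.

    `records` is a list of dicts with hypothetical keys:
      "length" = lactation length in days, "yield" = lactation yield in kg.
    """
    # Step 1: thresholds stated in the text (100 days, 900 kg).
    kept = [r for r in records
            if r["length"] >= min_length and r["yield"] >= min_yield]
    if len(kept) < 2:
        return kept  # a standard deviation needs at least two records

    # Step 2: three-SD rule on both tails (assumed to apply to yield).
    m = mean(r["yield"] for r in kept)
    s = stdev(r["yield"] for r in kept)
    return [r for r in kept if abs(r["yield"] - m) <= n_sd * s]
```

Records removed for culling, abortion, still-birth or pathological causes would need their own flag in the data; that step is not shown here.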
The prediction of First Lactation 305-Day Milk Yield (FL305DMY) was performed utilizing the following four conventional methods:
Centering date method (CDM)
For intervening intervals, production credits (Likhi et al., 1995; O'Connor and Lipton, 1960) were calculated by multiplying the yield on each intervening sample day by the fixed sampling interval:
PN = (LI) (Pn)
For the first and last sample-day yields, production credits (PN) were calculated as follows:
PN = (DIM + ½ LI) Pn
Where,
DIM = Days from the first day of lactation to the first sample day (for the first sample-day yield), or days between the last sample day and the terminal day of lactation (for the last sample-day yield)
LI = Sampling interval
Pn = Production on the nth sample day
Production credits of all intervals were summed up to estimate lactation milk yield.
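As a worked illustration of the CDM credits, the following minimal Python sketch sums the credits for one animal. The function name and arguments are hypothetical, and the two DIM values are passed in explicitly because how they are counted depends on the recording scheme.

```python
def cdm_yield(test_yields, interval, dim_first, dim_last):
    """Centering date method: sum of production credits over all sample days.

    test_yields : sample-day yields in chronological order (kg)
    interval    : fixed sampling interval LI (days)
    dim_first   : days from the start of lactation to the first sample day
    dim_last    : days from the last sample day to the end of lactation
    """
    if len(test_yields) < 2:
        raise ValueError("need at least two sample days")
    half = interval / 2
    # First and last sample days: PN = (DIM + 1/2 LI) * Pn
    total = (dim_first + half) * test_yields[0]
    total += (dim_last + half) * test_yields[-1]
    # Intervening sample days: PN = LI * Pn
    total += sum(interval * p for p in test_yields[1:-1])
    return total
```

For example, six sample-day yields of 10 kg each, with LI = 60, a first DIM of 6 and a last DIM of 0, give credits of 360 + 4 × 600 + 300 = 3060 kg.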
Test interval method (TIM)
TIM is also based on calculating a production credit for each test-day yield and then summing these credits to obtain the predicted milk yield (Likhi et al., 1995; Everett and Carter, 1968; Sargent et al., 1968). The basic formulae were the same as for CDM, with the sampling interval defined as follows:
For the nth intervening interval:
$LI = \tfrac{1}{2}\,(DIM_{n+1} - DIM_{n-1})$
Where,
$DIM_{n+1}$ and $DIM_{n-1}$ were the days in milk up to and including the succeeding (n+1)th and preceding (n-1)th sample days, respectively.
For first and last sampling intervals, sampling interval was the length of the first or last test period as the case may be.
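The TIM calculation can be sketched in the same way. This sketch rests on two assumptions that the text does not spell out: the DIM of the first sample day is taken to be its day number, and the "length of the first or last test period" is taken to be the gap to the neighbouring sample day. All names are hypothetical.

```python
def tim_yield(sample_days, test_yields, lactation_end):
    """Test interval method: production credits with variable intervals.

    sample_days   : days in milk of each sample day, ascending
    test_yields   : yields recorded on those days (kg)
    lactation_end : terminal day of lactation (e.g. 305)
    """
    n = len(sample_days)
    if n < 3 or n != len(test_yields):
        raise ValueError("need matching lists with at least three sample days")

    total = 0.0
    # Intervening sample days: LI = 1/2 (DIM[n+1] - DIM[n-1]), PN = LI * Pn
    for k in range(1, n - 1):
        li = 0.5 * (sample_days[k + 1] - sample_days[k - 1])
        total += li * test_yields[k]

    # First and last sample days: PN = (DIM + 1/2 LI) * Pn, with LI assumed
    # to be the length of the first or last test period.
    first_li = sample_days[1] - sample_days[0]
    last_li = sample_days[-1] - sample_days[-2]
    total += (sample_days[0] + 0.5 * first_li) * test_yields[0]
    total += ((lactation_end - sample_days[-1]) + 0.5 * last_li) * test_yields[-1]
    return total
```

When all intervals are equal, this reduces to the CDM calculation; the two methods differ only when the sample days are unevenly spaced.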
Ratio method (RM)
RM predicts the 305-day lactation yield by multiplying each test-day yield by a ratio factor (Dass and Sadana, 2003).
$\hat{Y}_i = R \, X_i$
Where,
$R = \bar{Y} / \bar{X}$ = Ratio of average 305-day milk yield to average test-day milk yield.
$\hat{Y}_i$ = Estimated 305-day milk yield of the ith animal.
$X_i$ = Test-day milk yield of the ith animal.
$\bar{Y}$ = Average 305-day milk yield.
$\bar{X}$ = Average test-day milk yield.
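A small numerical sketch may help fix the idea. The function names and the example figures are hypothetical; the ratio factor would in practice be estimated separately for each test day from the reference data.

```python
from statistics import mean


def ratio_factor(lactation_yields, test_day_yields):
    """R = average 305-day milk yield / average test-day milk yield."""
    return mean(lactation_yields) / mean(test_day_yields)


def predict_ratio(test_day_yield, r):
    """Estimated 305-day yield of one animal from a single test-day yield."""
    return r * test_day_yield


# Illustrative values only, not data from the study.
r = ratio_factor([2000.0, 2200.0], [8.0, 12.0])   # 2100 / 10
estimate = predict_ratio(9.0, r)                  # 210 * 9
```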
Multiple linear regression (MLR)
MLR was used to develop prediction equations by estimating the regression coefficients for the test-day milk yield records in different combinations. The software used for MLR was SAS Enterprise Guide 4.3, 2003. Stepwise backward multiple linear regression analysis was used to estimate 305-day milk yield (Singh et al., 2015; Kokate et al., 2014; Chakraborty et al., 2010).
Where,
$\hat{Y}_i$ = Estimated first lactation 305-day or less milk yield of the ith animal.
$x_i$ = Test-day record of the ith animal.
a = Intercept.
$b_i$ = Regression coefficient of first lactation 305-day or less milk yield on test-day records.
The accuracy of fitting the regression models was calculated by using the following formula:
The conventional methods described above were compared with the Artificial Neural Network (ANN), a more recent computational machine learning method modelled on the neural networks of the human brain.
Artificial neural network
An ANN model is an intelligent data processing system that learns its predictive ability automatically from the data set presented during training. Such a network consists of an input layer, one or more hidden layers and an output layer, each with a specific role in the execution of the network. In the back propagation technique, input vectors and the corresponding target vectors are used to train the network until it can approximate a prediction function (Ruhil et al., 2011; Gandhi et al., 2010).
A multilayer feed-forward neural network with a back propagation of error learning mechanism was developed using Weka software version 3.8.0 to predict the first lactation 305-day or less milk yield (FL305DMY). The network was trained and simulated using 10-fold cross-validation, for up to 2500 epochs or until the algorithm converged. Network parameters such as learning rate (0.3), momentum (0.5) and validation set size (0) were used as the default settings of the algorithm. In most runs the algorithm converged, meaning that the performance/error goal was achieved.
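The 10-fold cross-validation scheme can be illustrated with a short sketch. This is not Weka's implementation, which handles the splitting internally; the function name, the shuffling and the seed are assumptions made for illustration.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # assumed: random assignment
    folds = [indices[i::k] for i in range(k)]   # k folds of near-equal size
    for i in range(k):
        test = folds[i]
        # Training set: every fold except the one held out for testing.
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test
```

With 350 animals and k = 10, each animal is held out exactly once, and each round trains on 315 animals and tests on the remaining 35.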
Estimation of error in prediction
The error in the prediction of first lactation 305-day milk yield was estimated as the deviation of the predicted milk yield from the actual milk yield:
$E_i = \hat{Y}_i - Y_i$
Where,
$E_i$ = Error in prediction.
$\hat{Y}_i$ = Predicted 305-day milk yield of the ith animal.
$Y_i$ = Actual 305-day milk yield of the ith animal.
Absolute error
Absolute error was estimated without considering the positive or negative sign, as follows:
Absolute error = $|E_i|$
Percentage absolute error
= (Absolute error / Mean of actual 305-day milk yield) × 100
Average error
= Sum of error / No. of observations
Percentage average error
= (Average error / Mean of actual 305-day milk yield) × 100
Root mean square error (RMSE)
$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{Y}_i - Y_i)^2}$
Where,
$\hat{Y}_i$ = Predicted 305-day milk yield of the ith animal.
$Y_i$ = Actual 305-day milk yield of the ith animal.
n = Number of observations.
Percentage RMSE
= (RMSE / Mean of actual 305-day milk yield) × 100
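The error measures defined in this section can be computed together, as in the minimal sketch below. It assumes that the error is taken as predicted minus actual yield and that RMSE is the square root of the mean squared error; the function name and the dictionary keys are hypothetical.

```python
from math import sqrt


def error_metrics(predicted, actual):
    """Error measures for predicted vs. actual 305-day milk yields (kg)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    n = len(actual)
    mean_actual = sum(actual) / n

    # Assumed sign convention: error = predicted - actual.
    errors = [p - a for p, a in zip(predicted, actual)]
    absolute = [abs(e) for e in errors]
    average_error = sum(errors) / n
    rmse = sqrt(sum(e * e for e in errors) / n)

    return {
        "errors": errors,
        "absolute_errors": absolute,
        "pct_absolute_errors": [a / mean_actual * 100 for a in absolute],
        "average_error": average_error,
        "pct_average_error": average_error / mean_actual * 100,
        "rmse": rmse,
        "pct_rmse": rmse / mean_actual * 100,
    }
```

Because positive and negative errors cancel in the average error, a method can show an average error near zero while still having a large RMSE; the two measures are therefore best read together.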