Indian Journal of Animal Research

  • Chief Editor K.M.L. Pathak

  • Print ISSN 0367-6722

  • Online ISSN 0976-0555

  • NAAS Rating 6.50

  • SJR 0.263

  • Impact Factor 0.4 (2024)

Frequency :
Monthly (January, February, March, April, May, June, July, August, September, October, November and December)
Indexing Services :
Science Citation Index Expanded, BIOSIS Preview, ISI Citation Index, Biological Abstracts, Scopus, AGRICOLA, Google Scholar, CrossRef, CAB Abstracting Journals, Chemical Abstracts, Indian Science Abstracts, EBSCO Indexing Services, Index Copernicus

Evaluating Peak Milk Yield as a Predictor for First Lactation 305-Day Milk Yield in Murrah Buffaloes

Ekta Rana1,*, Ashok Kumar Gupta1, Anand Prakash Ruhil2, Shweta Mall1, Manokaran Ashokan1
1Division of Animal Genetics and Breeding, ICAR-National Dairy Research Institute, Karnal-132 001, Haryana, India.
2Division of Dairy Economics, Statistics and Management, ICAR-National Dairy Research Institute, Karnal-132 001, Haryana, India.
Background: Peak milk yield is an important early observable trait that holds immense practical significance in dairy farming. It is often utilized to assess the production potential of dairy animals under field conditions. Therefore, the present study aimed to predict the first lactation 305-day milk yield (FL305DMY) based on peak milk yield (PY) and subsequently, compare its efficiency with monthly test-day milk yield (TD) records alone or in combinations.

Methods: Data on 350 PY and 3,850 TD records pertaining to the first lactation of 350 Murrah buffaloes that calved between 1993 and 2017 at ICAR-National Dairy Research Institute, Karnal, India were utilized for the investigation. A total of 11 TD records were taken from each animal at an interval of 30 days, from the 6th day until the 305th day of lactation. The prediction of FL305DMY was performed by univariate and stepwise backward multiple linear regression analysis. The efficiency of the prediction equations was evaluated by the coefficient of determination (R2), Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and Root Mean Square Error (RMSE).

Result: It was observed that peak milk yield alone could predict the FL305DMY with 64.15% accuracy, which was the highest in the univariate linear regression analysis. Furthermore, the regression coefficients and R2 values indicated that mid-lactation monthly test-day milk yield records up to TD-7 could be utilized in conjunction with peak milk yield for early prediction of FL305DMY. The stepwise backward multiple linear regression analysis revealed that the most optimal prediction equation including peak milk yield as one of the variables comprised three variables (PY, TD-4 and TD-7) and showed an R2 of 84.79%. The results could be utilized for early selection of genetically superior animals in breeding strategies.
First lactation 305-day milk yield (FL305DMY) is a fundamental parameter for assessing the overall productivity and potential of dairy animals. Breeding strategies are predominantly structured and evaluated based on FL305DMY of daughters of genetically superior sires. However, the record on complete 305-day lactational yield requires data recording on a day-to-day basis, which is time-consuming and expensive under field conditions. Studies conducted in recent years revealed that peak milk yield (PY) and test-day milk yield records could act as potential alternatives to daily data recording due to their high genetic association with complete 305-day milk yield in dairy animals (Torshizi and Mashhadi, 2015; Miranda et al., 2019; Rana et al., 2021a). Peak milk yield, in particular, is an important early observable trait that is attributed to the lactation curve. It denotes the highest daily milk yield obtained at any point during the lactation. Under field conditions, the farmers usually remember the peak milk yield obtained during the lactation and use it to negotiate the price of the dairy animal for sale or purchase purposes (Chaudhry et al., 2000; Singh and Kumar, 2007; Taher, 2012; Marumo et al., 2022).
       
Precise and early prediction of FL305DMY based on early traits such as peak milk yield and test-day milk yield has significant advantages, viz. (a) early selection of genetically superior animals, (b) better farm management by sorting out non-productive animals, (c) reduced time and expenditure in dairy farming and (d) increased annual genetic response (Patond et al., 2013; Bhosale and Singh, 2015). Over the years, regression analysis has been extensively utilized by researchers for the prediction of lactational yield in dairy animals based on test-day milk yield records (Grzesiak et al., 2003; Singh and Rana, 2008; Chakraborty et al., 2010; Singh and Tailor, 2013; Chandrakar et al., 2019; Rana et al., 2021b). However, the literature on the prediction of FL305DMY based on peak milk yield remains scant to date, especially in buffaloes. Therefore, the present study was carried out to predict the FL305DMY based on peak milk yield in Murrah buffaloes. The prediction accuracy was then compared with that of predictions made using monthly test-day milk yield (TD) records alone and in combination, to understand the significance of peak milk yield as a predictor for FL305DMY in Murrah buffaloes.
The data on 350 peak milk yield (PY) records and 3,850 monthly test-day milk yield (TD) records pertaining to the first lactation of 350 Murrah buffaloes that calved between 1993 and 2017 at ICAR-National Dairy Research Institute, Karnal, India were considered for the present study. The data were standardized by excluding records with (a) lactation length less than 100 days, (b) 305-day milk yield lower than 900 kg, (c) interruption in the middle of lactation due to culling or death of the animal and (d) abnormal pathological cases such as abortion, still-birth, etc. The data were normalized by deleting outliers lying beyond three standard deviations at either tail of the normally distributed data set. A total of 11 TD records were taken from each animal at an interval of 30 days. The first TD (TD-1) was recorded on the 6th day, TD-2 on the 35th day, TD-3 on the 65th day, TD-4 on the 95th day, TD-5 on the 125th day, TD-6 on the 155th day, TD-7 on the 185th day, TD-8 on the 215th day, TD-9 on the 245th day, TD-10 on the 275th day and the last TD (TD-11) on the 305th day.
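The data-editing rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout and field names (`lactation_length`, `my305`, `interrupted`, `abnormal`) are hypothetical, and the three-standard-deviation screen is applied to the 305-day milk yield, which the text does not specify.

```python
from statistics import mean, stdev

def edit_records(records, min_length=100, min_yield=900.0, n_sd=3.0):
    """Apply the four exclusion rules, then drop outliers beyond n_sd standard deviations.

    Each record is a dict with hypothetical keys (not from the study):
    lactation_length (days), my305 (kg), interrupted (bool), abnormal (bool).
    """
    kept = [
        r for r in records
        if r["lactation_length"] >= min_length   # rule (a)
        and r["my305"] >= min_yield              # rule (b)
        and not r["interrupted"]                 # rule (c): culling or death in mid-lactation
        and not r["abnormal"]                    # rule (d): abortion, still-birth, etc.
    ]
    if len(kept) < 2:
        return kept
    # Assumption: the outlier screen is applied to the 305-day milk yield.
    mu = mean(r["my305"] for r in kept)
    sd = stdev(r["my305"] for r in kept)
    return [r for r in kept if abs(r["my305"] - mu) <= n_sd * sd]

# Test-day schedule as stated in the text: day 6 for TD-1, then day 35
# and every 30 days thereafter up to day 305.
TD_DAYS = [6] + [35 + 30 * i for i in range(10)]
```

With 11 test days per animal, 350 animals give the 3,850 TD records mentioned above.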
 
Prediction based on regression analysis
 
The prediction of first lactation 305-day milk yield (FL305DMY) was performed by univariate and stepwise backward multiple linear regression analysis. Statistical Analysis System (SAS) Enterprise Guide version 4.3, 2003 software was utilized to conduct the regression analysis. Stepwise backward multiple linear regression analysis was performed by estimating the regression coefficients of the respective peak milk yield and monthly test-day milk yield records in different combinations. The prediction equations were derived using peak milk yield and/or one or more TDs as independent variables based on the following formula:
 
Ŷ = a + Σ bi Xi
 
Where,
Ŷ = Estimated first lactation 305-day milk yield of ith animal.
a = Intercept value.
bi = Regression coefficient of first lactation 305-day milk yield on monthly test-day milk yield or peak milk yield record.
Xi = Monthly test-day milk yield or peak milk yield of ith animal.
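
A minimal sketch of how an equation of this form can be fitted by ordinary least squares is given below. It assumes NumPy is available; the function names are illustrative and the study itself used SAS, so this is not the authors' implementation.

```python
import numpy as np

def fit_prediction_equation(X, y):
    """Least-squares fit of y = a + sum_j b_j * X[:, j].

    X: one row per animal, one column per predictor (PY and/or TDs).
       A 1-D array is treated as a single predictor.
    y: observed 305-day milk yield per animal.
    Returns (a, b): the intercept and the array of regression coefficients.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    design = np.column_stack([np.ones(len(y)), X])  # leading column of ones for the intercept
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1:]

def predict(a, b, X):
    """Evaluate the fitted equation for new predictor values."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    return a + X @ b
```

For a univariate equation, pass a single column (for example PY only); for the multiple regression equations, pass one column per predictor.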
 
Criteria for evaluating the prediction equations
 
The efficiency of the prediction equations was evaluated based on the coefficient of determination (R2), Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and Root Mean Square Error (RMSE) as per the formulas mentioned below (Akaike, 1974; Schwarz, 1978; Yang, 2019; Rana et al., 2021b). The coefficient of determination, expressed as a percentage, is also referred to as the accuracy of fit of the regression model. Higher values of R2 and lower values of AIC, BIC and RMSE denote more optimal prediction equations.

AIC = 2k + n ln(RSS / n)

BIC = k ln(n) + n ln(RSS / n)
 
Where,
k = Number of parameters in the model.
n = Number of observations.
ln = Natural logarithm.
RSS = Residual sum of squares from the fitted regression model.
Ŷi = Predicted 305-day milk yield of ith animal.
Yi = Actual 305-day milk yield of ith animal.
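
The four criteria can be computed from actual and predicted yields as in the sketch below. AIC and BIC follow the formulas given above. The R2 and RMSE formulas are not reproduced in this text, so the common textbook forms (R2 = 1 - RSS/TSS, RMSE = sqrt(RSS/n)) are assumed here and may differ from the ones the authors used.

```python
import math

def fit_statistics(actual, predicted, k):
    """Return R2, RMSE, AIC and BIC for one prediction equation.

    actual, predicted: sequences of observed and predicted 305-day yields.
    k: number of parameters in the model.
    Note: a perfect fit (RSS = 0) makes ln(RSS / n) undefined.
    """
    n = len(actual)
    rss = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    y_bar = sum(actual) / n
    tss = sum((y - y_bar) ** 2 for y in actual)   # total sum of squares
    return {
        "R2": 1 - rss / tss,                      # assumed form
        "RMSE": math.sqrt(rss / n),               # assumed form (divisor n)
        "AIC": 2 * k + n * math.log(rss / n),
        "BIC": k * math.log(n) + n * math.log(rss / n),
    }
```

Since AIC and BIC share the term n ln(RSS / n), they differ only in the penalty on k, and BIC penalizes additional predictors more heavily than AIC once ln(n) exceeds 2.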
Prediction based on univariate linear regression analysis
 
The univariate linear regression analysis revealed that the highest prediction accuracy was exhibited by peak milk yield records, with an R2 of 64.15% and an RMSE of 256.0293. TD-5 showed the highest accuracy amongst the monthly test-day milk yield records with an R2 of 62.53%, followed by TD-7 with an R2 of 62.30%. It was observed that the records at the extremes of lactation (TD-1, TD-11, TD-10, TD-9, TD-2) showed lower prediction accuracy, whereas the mid-lactation monthly test-day records (TD-5, TD-7, TD-3, TD-4, TD-6) showed higher prediction accuracy. Furthermore, the regression coefficient (bi) was found to be highest for TD-7. It could be inferred from the results that mid-lactation monthly test-day milk yield records up to TD-7 hold the potential to predict FL305DMY in conjunction with peak milk yield records; hence, they were utilized further for multiple linear regression analysis. Restricting the predictors to records up to TD-7 would also facilitate early prediction of FL305DMY. The results of univariate linear regression analysis for the prediction of FL305DMY based on peak milk yield and monthly test-day milk yield records are presented in Table 1.
 

Table 1: Univariate linear regression equations and their respective accuracy.


       
Sharma et al. (2019) performed univariate linear regression analysis for the prediction of FL305DMY in 384 crossbred cattle and reported that peak milk yield alone demonstrated an R2 of 47.10%, which was lower than the estimate observed in the present study. They also reported that higher prediction accuracy was exhibited by mid-lactation monthly test-day records, with TD-7 demonstrating the highest R2 value of 75.15%. Similarly, Sah et al. (2013), based on a study of 300 Kankrej cows, reported that peak milk yield alone exhibited 49.9% prediction accuracy and that the highest R2 (67.10%) was exhibited by TD-8. Singh and Rana (2008) and Elmaghraby (2009) reported that the highest R2 was exhibited by TD-6 in Murrah and Egyptian buffaloes, respectively.
 
Prediction based on multiple linear regression analysis
 
The regression analysis involving two independent variables (bivariate linear regression), including peak milk yield as one of the variables, generated a total of seven prediction equations (Table 2). The prediction accuracy based on bivariate linear regression analysis ranged from 65.84% to 79.80%. A perusal of the results revealed that the most optimal prediction equation featured PY and TD-7, with an R2 of 79.80% and an RMSE of 192.4454. PY alone had yielded a prediction accuracy of 64.15% in the univariate analysis (Table 1); the addition of one more variable raised the accuracy to 79.80%, an increase of 15.65 percentage points. Furthermore, the equation with PY and TD-5 yielded a prediction accuracy of 73.55% and stood in third position, even though TD-5 alone was the most accurate predictor amongst the monthly test-day milk yields in the univariate linear regression analysis (Table 1). It could be inferred from the results that the variable demonstrating the highest prediction accuracy in univariate regression analysis does not necessarily perform best when considered in combination with others. Among the top five prediction equations generated from bivariate linear regression analysis, the variations in accuracy were evident; in contrast, no such marked variations in accuracy were found among the top five prediction equations generated from regression analyses with three, four or more independent variables.
 

Table 2: Bivariate linear regression equations and their respective accuracy.


       
Sharma et al. (2019), based on a study of 384 crossbred cattle, reported that the most optimal bivariate linear regression equation consisted of PY and TD-7, which was in agreement with the present study. They reported R2 values ranging between 50.80% (PY and TD-1) and 85.20% (PY and TD-7). Interestingly, a very high increase in the R2 estimate (38.10 percentage points) was observed on the introduction of an additional TD to the univariate peak milk yield equation, which was more than double the increase observed in the present study on Murrah buffaloes. This could be attributed to the genetic variations present between the species. Similarly, Sah et al. (2013) also reported that the most optimal bivariate linear regression equation consisted of PY and TD-7, with 75.10% prediction accuracy in Kankrej cows. Singh and Kumar (2007) studied peak milk yield along with pre-peak period in 284 Karan Fries cows and reported that both together demonstrated 61.85% accuracy in the prediction of FL305DMY. Furthermore, based on the magnitude of the regression coefficients, they also reported that the contribution of the pre-peak period to the observed prediction accuracy was negligible and that PY emerged as a key predictor for FL305DMY.
 
In-depth analysis of prediction equations
 
The stepwise backward multiple linear regression (MLR) analysis of monthly test-day milk yield records with and without peak milk yield was performed and the most optimal equations for the prediction of FL305DMY are presented in Table 3 and Table 4, respectively. The most optimal prediction equation with three independent variables including peak milk yield (PY, TD-4 and TD-7) showed an accuracy of 84.79%, whereas TD-4 and TD-7 (without peak milk yield) showed 81.24% accuracy. With four independent variables including PY, the most optimal prediction equation featured PY, TD-2, TD-4 and TD-7 and yielded an R2 of 87.24%, whereas the prediction equation incorporating only TD-2, TD-4 and TD-7 showed an R2 of 86.28%. The most optimal prediction equation with five independent variables including PY (PY, TD-2, TD-4, TD-6 and TD-7) demonstrated an accuracy of 89.30%, whereas the equation with only TD-2, TD-4, TD-6 and TD-7 (without PY) showed an accuracy of 88.87%. Similarly, for the analyses with six and seven independent variables including PY, the R2 was 89.74% and 90.13%, respectively. In contrast, the same equations without PY yielded an accuracy of 89.54% and 90.04%, respectively. With PY and all seven TDs, the prediction accuracy was found to be 90.45%, whereas the seven TDs without PY yielded an accuracy of 90.37%.
 

Table 3: Most optimal prediction equations with peak milk yield and their respective accuracy.


 

Table 4: Most optimal prediction equations without peak milk yield and their respective accuracy.


       
The graphical presentation for comparative analysis of the observed results is depicted in Fig 1. A perusal of the illustration revealed an increasing trend in prediction accuracy (R2) with each successive introduction of an additional TD into the regression equation. Interestingly, although R2 increased with each successive addition of a TD, the size of the increase declined with each successive step. The largest increase in R2 (15.65 percentage points) was observed when the investigation switched from univariate to bivariate regression analysis. The R2 increased markedly up to the five-variable regression analysis; however, in-depth observation revealed that the contribution of PY to the prediction accuracy of the five-variable equation was relatively limited. The comparison between the most optimal prediction equations with PY and the same equations without PY revealed that the contribution of PY to prediction accuracy ranged between 0.08 and 17.50 percentage points. The largest contribution of PY (17.50 percentage points) was observed when the investigation switched from univariate to bivariate regression analysis. The contribution of PY to prediction accuracy was found to be notable up to the three-variable regression analysis; thereafter, no major contribution was observed. Therefore, it could be inferred from the results that, for the prediction of FL305DMY, the most optimal prediction equation including peak milk yield was the MLR equation consisting of three variables (PY, TD-4 and TD-7) with an R2 of 84.79%. Beyond this three-variable equation, including peak milk yield in the prediction of FL305DMY would add little accuracy and would unnecessarily increase the cost incurred on data recording.
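
The arithmetic behind these comparisons can be retraced from the R2 values reported in the text. The sketch below uses only those reported figures (the TD-7-only value of 62.30% comes from the univariate results); the variable names are illustrative.

```python
# R2 (%) of the most optimal equation including PY, keyed by the number of predictors.
r2_with_py = {1: 64.15, 2: 79.80, 3: 84.79, 4: 87.24,
              5: 89.30, 6: 89.74, 7: 90.13, 8: 90.45}

# R2 (%) of the same equations after dropping PY (one predictor fewer).
r2_without_py = {2: 62.30, 3: 81.24, 4: 86.28,
                 5: 88.87, 6: 89.54, 7: 90.04, 8: 90.37}

# Gain in R2 (percentage points) from each additional predictor.
step_gain = {k: round(r2_with_py[k] - r2_with_py[k - 1], 2) for k in range(2, 9)}

# Contribution of PY (percentage points) at each equation size.
py_contribution = {k: round(r2_with_py[k] - r2_without_py[k], 2) for k in r2_without_py}

for k in range(2, 9):
    print(f"{k} predictors: gain {step_gain[k]:.2f}, PY contribution {py_contribution[k]:.2f}")
```

Both series shrink as predictors are added, and the contribution of PY falls below one percentage point from the four-variable equation onward, which is consistent with the choice of the three-variable equation.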
 

Fig 1: Comparative analysis of prediction accuracy (R2) across various regression equations.


       
Singh et al. (2022) reported that the optimal equation for the prediction of FL305DMY consisted of age at first calving (AFC), PY, TD-1, TD-2, TD-3 and TD-4 and showed 80.53% accuracy. However, the regression coefficient in this equation indicated that the role of AFC was negligible as a predictor for FL305DMY in Murrah buffalo. A higher R2 estimate of 83.91% for the prediction equation consisting of TD-4 and TD-7 only (without PY) was reported by Sharma et al. (2019) in crossbred cattle. In addition, they reported that the best prediction equation incorporating peak milk yield consisted of PY, TD-6 and TD-7 and showed 86.30% accuracy, whereas early prediction could be achieved by PY, TD-5 and TD-6 with 80.30% accuracy. Sah et al. (2013) suggested that the most optimal prediction equation including PY consisted of three variables (PY, TD-6 and TD-7) and exhibited 78.10% accuracy.
Based on the study, it is concluded that the prediction of FL305DMY could be achieved by peak milk yield alone with 64.15% accuracy in Murrah buffaloes. The results of univariate linear regression analysis indicated that mid-lactation monthly test-day milk yield records up to TD-7 could be utilized in conjunction with peak milk yield records for early prediction of FL305DMY. The stepwise backward multiple linear regression analysis of monthly test-day milk yield records with and without peak milk yield revealed that the most optimal prediction equation including peak milk yield consisted of three variables (PY, TD-4 and TD-7) and showed 84.79% accuracy. This equation would facilitate early and cost-effective prediction of FL305DMY, leading to early selection of genetically superior animals. Furthermore, it would reduce the generation interval and increase the annual genetic response in breeding strategies.
The authors are immensely thankful to the Director of ICAR-National Dairy Research Institute (NDRI), Karnal for providing the facilities required to conduct the study. The authors express their gratitude to the Head of Animal Genetics and Breeding division at ICAR-NDRI, Karnal for the valuable guidance and input during planning and execution of the study. The authors are also grateful to the Livestock Record Unit of ICAR-NDRI, Karnal, for providing the data required for the analysis.
The authors declare that they have no conflict of interest.

  1. Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control. 19(6): 716-723.

  2. Bhosale, M.D. and Singh, T.P. (2015). Comparative study of feed-forward neuro-computing with multiple linear regression model for milk yield prediction in dairy cattle. Current Science. 108(12): 2257-2261.

  3. Chakraborty, D., Dhaka, S.S., Pander, B.L., Yadav, A.S., Singh, S. and Malik, P.K. (2010). Prediction of lactation milk yield from test day records in Murrah buffaloes. Indian Journal of Animal Sciences. 80(3): 244-245. 

  4. Chandrakar, J., Kumar, A., Naskar, S., Gaur, G.K. and Dutt, T. (2019). Performance evaluation and prediction of first lactation milk yield in Vrindavani cattle. Indian Journal of Animal Sciences. 89(2): 187-192.

  5. Chaudhry, H.Z., Khan, M.S., Mohiuddin, G. and Mustafa, M.I. (2000). Peak milk yield and days to attain peak in Nili-Ravi buffaloes. International Journal of Agriculture and Biology. 2(4): 356-358.

  6. Elmaghraby, M.M.A. (2009). Lactation persistency and prediction of total milk yield from monthly yields in Egyptian buffaloes. Lucrări Științifice. 53(15): 242-249.

  7. Grzesiak, W., Wojcik, J. and Binerowska, B. (2003). Prediction of 305-day first lactation milk yield in cows with selected regression models. Archives Animal Breeding. 46(3): 213-224.

  8. Marumo, J.L., Lusseau, D., Speakman, J.R., Mackie, M. and Hambly, C. (2022). Influence of environmental factors and parity on milk yield dynamics in barn-housed dairy cattle. Journal of Dairy Science. 105(2): 1225-1241.

  9. Miranda, J.C., León, J.M., Pieramati, C., Gómez, M.M., Valdés, J. and Barba, C. (2019). Estimation of genetic parameters for peak yield, yield and persistency traits in Murciano-Granadina goats using multi-traits models. Animals. 9(7): 411.

  10. Patond, M.N., Khutal, B.B., Pachpute, S.T. and Ramod, S.S. (2013). Studies on peak yield and days to attain peak yield in Jersey cattle. Research Journal of Animal Husbandry and Dairy Science. 4(1): 4-6. 

  11. Rana, E., Gupta, A.K., Singh, A., Chakravarty, A.K., Yousuf, S. and Karuthadurai, T. (2021a). Genetic analysis of first lactation monthly test day milk yields, peak yield and 305 days milk yield in Murrah buffaloes. Indian Journal of Animal Research. 55(2): 134-138. doi: 10.18805/IJAR.B-3679.

  12. Rana, E., Gupta, A.K., Singh, A., Ruhil, A.P., Malhotra, R., Yousuf, S. and Ete, G. (2021b). Prediction of first lactation 305-day milk yield based on bimonthly test day milk yield records in Murrah buffaloes. Indian Journal of Animal Research. 55(4): 486-490. doi: 10.18805/ijar.B-3963.

  13. Sah, R.K., Shah, R.R. and Pandey, D.P. (2013). Prediction of lactation yield from test day milk yield and peak yield in Kankrej cows. Indian Journal of Animal Sciences. 83(2): 170-172. 

  14. Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics. 6(2): 461-464.

  15. Sharma, N., Narang, R., Ratwan, P., Kashyap, N., Kumari, S., Kaur, S. and Raina, V. (2019). Prediction of first lactation 305-days lactation milk yield from peak yield and test day milk yields in crossbred cattle. Indian Journal of Animal Sciences. 89(2): 200-203.

  16. Singh, A. and Kumar, A. (2007). Prediction of 305-day milk yield based on peak yield and pre-peak period in Karan Fries cows. Indian Journal of Animal Research. 41(4): 299-301. 

  17. Singh, A. and Rana, J.S. (2008). Prediction of 305-day milk yield based on test-day values in Murrah buffaloes. Indian Journal of Animal Sciences. 78(10): 1131-1133.

  18. Singh, N.P., Dutt, T., Usman, S.M., Baqir, M., Tiwari, R. and Kumar, A. (2022). Prediction of first lactation 305 days milk yield using artificial neural network in Murrah buffalo. Indian Journal of Animal Sciences. 92(9): 1116-1120.

  19. Singh, S. and Tailor, S.P. (2013). Prediction of 305 days first lactation milk yield from fortnightly test and part yields. Indian Journal of Animal Sciences. 83(2): 166-169. 

  20. Taher, K.N. (2012). Some factors influencing peak yield and days attain to peak yield in Friesian cattle in the central region of Iraq. AL-Qadisiya Journal of Veterinary Medicine Science. 11(2): 117-120. 

  21. Torshizi, M.E. and Mashhadi, M.H. (2015). Evaluation of various approaches in prediction of daily and lactation yields of milk and fat using statistical models in Iranian primiparous Holstein dairy cows. Iranian Journal of Applied Animal Science. 5(1): 81-87.

  22. Yang, X.S. (2019). Introduction to algorithms for data mining and machine learning. Academic Press, pp 174.
