Faba bean yield forecast under shading conditions
Stepwise regression was implemented to assess the yield prediction potential of Faba bean under the three shading conditions (Table 1). The predicted yield was based mainly on the NP, NS and WSS parameters, which showed significant trends at all shading levels (Table 1). The data points (crosses) follow a diagonal trend across the plot, indicating close alignment between predicted and actual yields. The model performed well across the studied shading levels (R² = 0.84-0.88; R² overall = 0.86; P < 0.001) (Table 1). Predicted yields nevertheless deviated from actual yields, with most predicted values falling below the observed ones (RMSE overall = 214.8). The average prediction error was below 7% (MAPE overall = 6.88%), reflecting the model’s precision. Notably, the differences in the slopes of the S1, S2 and S3 levels suggest that the model captures the varying effects of shading conditions on yield, with S3 generally yielding higher predictions.
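The three goodness-of-fit statistics quoted above can be computed as in the following minimal sketch. It assumes the conventional definitions of R² (one minus the ratio of residual to total sum of squares), RMSE and MAPE; the study does not state the exact formulas used, and the function name and yield values are hypothetical illustrations, not data from the experiment.

```python
import math


def regression_metrics(actual, predicted):
    """Return (R^2, RMSE, MAPE in %) for paired actual and predicted yields."""
    n = len(actual)
    mean_actual = sum(actual) / n
    # Residual and total sums of squares
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    # Mean absolute percentage error; assumes no actual value is zero
    mape = 100.0 / n * sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return r2, rmse, mape


# Hypothetical yields, for illustration only (not data from this study)
actual = [3000.0, 3200.0, 2800.0, 3500.0]
predicted = [2900.0, 3100.0, 2900.0, 3300.0]
r2, rmse, mape = regression_metrics(actual, predicted)
print(f"R2 = {r2:.3f}, RMSE = {rmse:.1f}, MAPE = {mape:.2f}%")
```

RMSE is expressed in the units of yield, whereas MAPE is relative, which is why the two are reported together when comparing models across shading levels.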
Lasso regression was then applied to optimize yield prediction for the studied shading levels (Table 2). As with the previous model, yield predictions were attributed to the NP, NS and WSS parameters, with statistical significance observed for all treatments (P < 0.0001) (Table 2). The data points exhibit a strong diagonal alignment, signifying excellent agreement between predicted and actual yields. The model showed better accuracy and reliability across the shading levels, with R² values ranging from 0.93 to 0.96 and an R² overall of 0.95 (P < 0.001) (Table 2). The predicted yields deviated only minimally from actual values, as reflected in the much lower RMSE overall value (102.6) compared to the stepwise model (Table 3). The lasso model’s prediction error was below 3.5% (MAPE overall = 3.28%), highlighting its enhanced precision. The distinct slopes observed across shading levels (S1, S2, S3) suggest that the model captures the nuanced impact of shading on yield, with S3 again yielding higher predictions overall.
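The variable-selection behaviour of Lasso can be illustrated with a small coordinate-descent sketch. It assumes the common objective (1/(2n))·||y − b0 − Xb||² + alpha·||b||₁; the software, penalty value and tuning procedure used in the study are not stated, so `alpha`, the function names and the toy predictors below are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/(2n))*||y - b0 - Xb||^2 + alpha*||b||_1 by cyclic updates."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Centre predictors and response so the intercept can be recovered afterwards
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    residual = [v - y_mean for v in y]
    z = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if z[j] == 0.0:
                continue  # constant column carries no information
            # Covariance of column j with the residual that excludes its own effect
            rho = sum(Xc[i][j] * (residual[i] + Xc[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, alpha) / z[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= Xc[i][j] * delta
                beta[j] = new_beta
    intercept = y_mean - sum(b * m for b, m in zip(beta, col_means))
    return intercept, beta


# Hypothetical standardised predictors: two informative traits and one weak one
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [10 + 2 * a + 3 * b + 0.05 * c for a, b, c in X]
intercept, beta = lasso_coordinate_descent(X, y, alpha=0.1)
print(intercept, beta)
```

In this toy design the weak third predictor receives a coefficient of exactly zero while the two informative ones are only slightly shrunk. This is the mechanism by which Lasso drops traits that contribute little to yield.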
The findings of this study offer insights into the impact of artificial shading on bean yield and highlight key factors influencing it. Stepwise and Lasso regression analyses demonstrated that NP, NS and WSS are crucial predictors of V. faba yield across all shading treatments. The exclusion of variables such as plant height, NTS and NP from the final models in specific treatments suggests that their influence may be limited under certain shading conditions. This could be attributed to the diminished relevance of these traits in altered light environments, as highlighted by recent research showing a significant reduction in plant height and stem characteristics of tropical tree species under high shading (Cifuentes and Moreno, 2022). In addition, our results align with those reported by Chan Fu Wei and Molin (2020), who found that models based on seed number effectively predict soybean yield with high accuracy (R² = 0.70). Our findings are also in line with other studies emphasizing the importance of seed number and seed weight in yield prediction (Wu et al., 2017; Zhang et al., 2023). Liu et al. (2021) similarly observed that under high shading conditions, both seed number and seed weight are pivotal for accurate yield predictions, reinforcing the reliability of these predictors across varying shading scenarios.
The high R² values achieved in the stepwise analysis demonstrate the model’s robustness in explaining the variability in Faba bean yield. The decrease in R² from S2 to S3 reflects the increasing difficulty of predicting yield under higher shading intensities. This decrease is plausibly explained by the greater variability and stress that plants experience at higher shading levels. This interpretation is supported by a recent study reporting lower predictive accuracy for specific variables as shading intensifies (Pratzer et al., 2023). Unaccounted yield variability ranged from 11.2% to 16.2% with stepwise regression and from 4.2% to 6.2% with Lasso. This indicates that factors other than the modeled variables also influence yield. Further research should be directed towards identifying and quantifying the environmental, genetic and management factors responsible for this variability.
The Faba bean yield prediction models were validated to evaluate the performance of both the stepwise and Lasso regression approaches during Y1 and Y2. The prediction error with the Lasso approach was 2.38 and 2.09, respectively, lower than the errors obtained with the stepwise approach (Table 4), which implies a higher prediction accuracy for the Lasso model. In this study, the field-scale Faba bean yield prediction model developed with the Lasso method therefore showed better performance and accuracy than the one developed with the stepwise method. In a study by Sharif et al. (2017) on modeling winter oilseed rape yield, Lasso was among the most accurate of several regression techniques. This may be attributed to its ability to exclude variables that do not contribute significantly to the final yield. In such scenarios, where the true values of the corresponding coefficients are likely near zero, regression techniques with variable selection capabilities often outperform traditional methods (Hastie et al., 2009). By contrast, stepwise regression performed less well in this study, possibly because the contributions of the explanatory variables were estimated inaccurately. Stepwise regression is primarily designed for low-dimensional problems. When the number of observations is not substantially larger than the number of explanatory variables, high variance and overfitting can compromise the predictive power of classical approaches. These results are in line with previous studies on regional yield prediction (Cai et al., 2019; Cao et al., 2021; Liu et al., 2022). Kumar et al. (2021) compared the performance of stepwise and Lasso regression models in wheat (Triticum aestivum L.) yield prediction using meteorological data. Their results demonstrated that Lasso regression was superior to stepwise regression in reducing data dimensionality.
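For comparison with Lasso, the greedy logic of stepwise selection can be sketched as follows. This is a forward-selection variant that adds the predictor giving the largest gain in R² and stops when the gain falls below a threshold. That entry rule is an assumption swapped in for illustration, since the entry and removal criteria applied in this study are not stated; the function names, the `min_r2_gain` parameter and the toy data are hypothetical.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def ols_rss(X, y, columns):
    """Residual sum of squares of an OLS fit (with intercept) on the chosen columns."""
    rows = [[1.0] + [row[j] for j in columns] for row in X]
    k = len(rows[0])
    # Normal equations: (X'X) coef = X'y
    XtX = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    Xty = [sum(r[a] * v for r, v in zip(rows, y)) for a in range(k)]
    coef = solve_linear_system(XtX, Xty)
    return sum((v - sum(c * x for c, x in zip(coef, r))) ** 2 for r, v in zip(rows, y))


def forward_stepwise(X, y, min_r2_gain=0.01):
    """Add predictors one at a time while each addition raises R^2 by min_r2_gain."""
    remaining = list(range(len(X[0])))
    selected = []
    tss = ols_rss(X, y, [])  # intercept-only model gives the total sum of squares
    if tss == 0.0:
        return selected
    current_rss = tss
    while remaining:
        best_rss, best_col = min((ols_rss(X, y, selected + [j]), j) for j in remaining)
        if (current_rss - best_rss) / tss < min_r2_gain:
            break
        selected.append(best_col)
        remaining.remove(best_col)
        current_rss = best_rss
    return selected


# Hypothetical standardised predictors: two informative traits and one weak one
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [10 + 2 * a + 3 * b + 0.05 * c for a, b, c in X]
selected = forward_stepwise(X, y)
print(selected)
```

Each step refits an ordinary least-squares model, so a predictor is either fully in or fully out. Lasso instead shrinks all coefficients continuously, which tends to give lower variance when observations are few relative to candidate variables.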
The superior performance of the Lasso regression method highlights its potential as a valuable tool in precision agriculture, where high-dimensional datasets are common (Miller et al., 2022; Pukrongta et al., 2024). Accurate yield prediction can guide resource allocation, inform decision-making and support agricultural policies aimed at ensuring food security (Raihan, 2024). For instance, the ability to identify critical predictors from environmental, agronomic and management factors enhances the development of targeted interventions to optimize yield under variable conditions (Kumar et al., 2015). Moreover, the implications extend to sustainable agriculture practices. Farmers and stakeholders can plan more efficiently, minimize input wastage and mitigate risks associated with adverse environmental conditions by accurately predicting their crop yield.
Relationships of Faba bean yield prediction parameters under shading conditions
To examine the predictive yield dynamics in more detail, Pearson correlation analyses were conducted for the best shading conditions (S2 and S3) (Fig 1). Under these conditions in Y1, Faba bean yield (YP) was positively and significantly correlated with the NP, NS and WSS parameters (r = 0.35-0.72; P < 0.001), indicating that higher values of these traits were associated with higher Faba bean yield. In contrast, significant negative correlations were found between plant height and YP (r = -0.39 to -0.37; P < 0.001) (Fig 1A). Moreover, the transition from the S2 to the S3 level amplified these correlations.
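The correlation coefficients reported above follow the standard Pearson formula, sketched below together with the usual t statistic for testing r against zero. How the P values in this study were obtained is not stated, so the t-test is shown as an assumption; the trait values are hypothetical and chosen only to give one positive and one negative correlation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def t_statistic(r, n):
    """t statistic for H0: r = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical plot-level values, for illustration only (not data from this study)
pods_per_plant = [8, 10, 12, 14, 16]
plant_height = [110, 100, 90, 80, 70]
yield_per_plot = [2.0, 4.0, 5.0, 4.0, 5.0]

r_np = pearson_r(pods_per_plant, yield_per_plot)
r_ph = pearson_r(plant_height, yield_per_plot)
t_np = t_statistic(r_np, len(yield_per_plot))
print(f"r(NP, YP) = {r_np:.3f}, r(PH, YP) = {r_ph:.3f}, t = {t_np:.3f}")
```

The sign of r gives the direction of the association and the t statistic scales with sample size, so the same r can be non-significant in a small sample and highly significant in a large one.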
In the second experimental year (Y2), NP, NS and WSS still showed significant positive correlations with YP at the S2 level, whereas the correlation between plant height and YP shifted to positive, unlike in Y1 (Fig 1B). The significant positive correlations between YP and the NP, NS and WSS parameters in both Y1 and Y2 highlight the importance of these traits in determining yield. Increases in grain yield have been attributed to a higher number of pods and seeds resulting from greater branching and more pods on the branches (Alharbi and Adhikari, 2020). The negative correlation between plant height and YP in Y1, particularly under S2 and S3 conditions, suggests that taller Faba bean plants are not necessarily more productive. While taller plants often have a larger leaf area, excessive height can lead to increased lodging and reduced light interception at the lower canopy levels. Our results are similar to those of Wadan and Abd el Shafi (2014), who showed a negative correlation between plant height and grain yield in wheat. However, for barley (Hordeum vulgare L.), grain yield was positively correlated with plant height under all shading treatments (25% and 30%) (Arenas Corraliza et al., 2020). The shift in the correlation between plant height and YP from negative in Y1 to positive in Y2 under S2 conditions suggests that the relationship between plant height and yield can be influenced by environmental factors and specific growing conditions. In Y2, taller plants may have been able to use the available resources more efficiently, leading to increased yield. However, further research is needed to elucidate the mechanisms driving this change. The number of branches is highly variable in legumes but is an essential determinant of grain yield (Alharbi and Adhikari, 2020); however, no correlation was observed between YG and NTS under the two shading treatments. Furthermore, our study indicates that the impact of these traits on yield becomes more pronounced under higher levels of shading throughout the seasons. This could be attributed to the increased competition for resources, such as light and nutrients, under more shaded conditions.
The contribution of the most important Faba bean parameters in each shading condition was further explored using a redundancy analysis (RDA). Most of the variance across shading levels was explained by the first two axes, with 43.89% explained by RD1 and 41.56% by RD2 (85.45% combined) (Fig 2). The third shading level (S3) had four major contributors (NP, NS, WSS and YP), with the latter two being the most significant overall. Three contributors were detected for S2 (with WSS being significant), while only the NP parameter was significantly relevant to the S1 level among the two detected contributors (Fig 2). The RDA results improve our understanding of how the Faba bean responds to varying shading conditions. Key parameters (NP, NS, WSS and YP) emerged as major contributors, with WSS and YP having greater significance. These results are consistent with evidence that yield-related traits, such as seed weight and total yield, are highly sensitive to reductions in light, with plants adjusting their resource allocation to maximize reproductive success under suboptimal conditions. Wang et al. (2021) demonstrated that high shading often induces changes in photosynthate partitioning, favoring reproductive organs over vegetative growth. This could explain the prominence of yield-related traits in the S3 shading condition. The differentiation of parameter importance across shading levels points to an adaptive response by Faba bean to varying light conditions, a concept supported by recent studies on crop plasticity (Mínguez and Rubiales, 2021). The findings reinforce the importance of NP, NS and WSS in determining yield, particularly under higher levels of shading.