The practice of growing cereal-legume intercrops among subsistence farmers in semi-arid parts of South India is well documented. Its persistence cannot be attributed to culture alone. In the arid areas of Tamil Nadu, where average annual rainfall is below 350 mm and rainfall within the monsoon season is irregular, intercropping sorghum with lentil offers advantages that neither crop achieves when grown alone: light interception by two canopy layers, nutrient uptake from different soil depths and two harvest times. When one component struggles in a dry year, the other frequently compensates, and that yield-stability property is worth considerably more to a resource-constrained household than any marginal gain from sole-crop intensification.
Repeated over several years, this pattern pushes chemical inputs upward, weakens root nodule activity, reduces the soil's native population of arbuscular mycorrhizal fungi (AMF) and, in calcareous semi-arid profiles, gradually shifts pH in a direction that further penalises microbial communities. Integrated nutrient management (INM), shorthand for the deliberate combination of chemical fertilisers at reduced rates with organic manures and microbial inoculants, has attracted attention as a way out of this cycle, but published evidence for the lentil-sorghum intercrop specifically, tracked across multiple seasons with soil biological measurements taken alongside yield, remains thin. Field studies on legume-cereal rotations indicate that combining organic fertilisers with biofertilisers within INM improves yield and soil quality parameters compared with any single nutrient source. Sodavadiya et al. (2023) reported that 50% of the recommended dose of fertiliser (RDF) applied with vermicompost, Rhizobium and phosphate-solubilising bacteria (PSB) gave the highest seed yield and dry root biomass in a chickpea-forage sorghum sequence (Gaba et al., 2015).
A second gap sits at the intersection of agronomy and data science. The interactions among nutrient input, soil organic matter carry-over, microbial activity, crop growth and seasonal weather are not linear, and ordinary regression cannot capture them faithfully. Threshold responses, lagged effects of organic matter decomposition and feedback loops between soil biology and crop nitrogen uptake all require modelling approaches that can handle non-linearity and temporal sequence simultaneously (Bedoussac et al., 2021). Used together, Random Forest (RF) and Long Short-Term Memory (LSTM) models form a hybrid architecture suited to exactly the kind of multi-dimensional, season-length prediction problem that INM optimisation in dryland intercrops represents (Fig. 1).
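The temporal-sequence requirement noted above can be made concrete with a minimal sketch. A sequence model such as an LSTM is typically trained on fixed-length windows of past observations, each paired with the next target value. The function name `make_windows`, the feature columns and every number below are illustrative assumptions, not the data or pipeline of this study.

```python
from typing import List, Sequence, Tuple


def make_windows(
    features: Sequence[Sequence[float]],
    targets: Sequence[float],
    window: int,
) -> List[Tuple[List[Sequence[float]], float]]:
    """Pair each run of `window` consecutive feature rows with the target that follows it."""
    if len(features) != len(targets):
        raise ValueError("features and targets must have the same length")
    if not 1 <= window < len(features):
        raise ValueError("window must be at least 1 and shorter than the series")
    samples = []
    for end in range(window, len(features)):
        # Rows end-window .. end-1 are the model input; row `end` supplies the target.
        samples.append((list(features[end - window:end]), targets[end]))
    return samples


# Hypothetical weekly observations: [rainfall_mm, soil_moisture_pct]
weekly_features = [[12.0, 18.5], [0.0, 16.2], [30.5, 22.1], [4.0, 20.3], [0.0, 17.8]]
# Hypothetical soil organic carbon values (%), one per week
weekly_soc = [0.41, 0.41, 0.42, 0.43, 0.43]

samples = make_windows(weekly_features, weekly_soc, window=3)
print(len(samples))
```

Each sample carries three weeks of history, which is the structure that lets a recurrent model represent lagged effects that a per-row regression would not see.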
The relevance of the proposed machine learning approach to the problem domain is not in doubt. Random Forest models have classified crop disease patterns in leguminous systems with high accuracy (Cho, 2024), whereas work on sorghum-based intercropping in rainfed vertisols shows that yield advantage, measured by the Land Equivalent Ratio (LER), depends on treatment-level variation across seasons that linear models cannot capture (Sujathamma and Nedunchezhiyan, 2024). In other words, combining machine learning algorithms to process multi-feature, time-series agricultural data is not merely computationally convenient; it is necessary, because the output variable (here, lentil equivalent yield) is jointly influenced by nutrients, soil biology and seasonality.
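For readers unfamiliar with the Land Equivalent Ratio mentioned above, it is conventionally computed as the sum, over the component crops, of intercrop yield divided by sole-crop yield; a value above 1 indicates a land-use advantage for the intercrop. The sketch below illustrates that convention only. The function name and the yield figures are hypothetical and are not taken from the cited studies.

```python
from typing import Dict, Tuple


def land_equivalent_ratio(
    intercrop_yields: Dict[str, float],
    sole_yields: Dict[str, float],
) -> Tuple[float, Dict[str, float]]:
    """Return the total LER and the partial LER of each component crop."""
    if intercrop_yields.keys() != sole_yields.keys():
        raise ValueError("both mappings must list the same crops")
    partial = {}
    for crop, y_inter in intercrop_yields.items():
        y_sole = sole_yields[crop]
        if y_sole <= 0:
            raise ValueError(f"sole-crop yield for {crop} must be positive")
        # Partial LER: share of the sole-crop yield obtained in the intercrop.
        partial[crop] = y_inter / y_sole
    return sum(partial.values()), partial


# Hypothetical yields in kg/ha, for illustration only
ler, partial_ler = land_equivalent_ratio(
    intercrop_yields={"lentil": 600.0, "sorghum": 1500.0},
    sole_yields={"lentil": 1000.0, "sorghum": 2000.0},
)
print(round(ler, 2), partial_ler)
```

With these illustrative numbers the partial ratios are 0.60 for lentil and 0.75 for sorghum, so the total is about 1.35.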
Despite these developments, a major gap remains: no published study has applied an RF-LSTM ensemble to predict INM treatment outcomes in a lentil-sorghum intercropping system. Machine learning models proposed for similar agronomic settings either rely on monocrop datasets, and so fail to capture competition between the species (Prodhan et al., 2022; Corrales et al., 2022), or omit soil biological factors, including dehydrogenase activity, microbial biomass carbon and AMF colonisation, which are known to be sensitive to organic fertilisers (Sharma et al., 2021). Nor has any machine learning model accounted for the season-dependent effect of organic fertiliser application on soil organic carbon (SOC) and subsequent crop yields, an effect that LSTM models are well suited to represent. This study addresses those gaps by applying a validated RF-LSTM ensemble to agronomic, soil biological and meteorological data from two kharif seasons.
The present investigation was accordingly structured around three objectives: first, to characterise the agronomic response of lentil-sorghum intercropping to seven INM treatments across two kharif seasons under field conditions at Pattukottai; second, to build and validate a hybrid RF-LSTM model capable of predicting lentil equivalent yield and soil organic carbon dynamics from field-collected soil, crop and weather data; and third, to extract from that model a ranked account of the agronomic and environmental variables most responsible for yield variability, so that the findings carry practical relevance beyond the single experimental site.
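This section does not state how lentil equivalent yield is calculated. A common convention for crop equivalent yield converts the companion crop's yield into the base crop's terms using the ratio of market prices, and the sketch below assumes that convention. The function name, yields and prices are hypothetical.

```python
def lentil_equivalent_yield(
    lentil_yield: float,
    sorghum_yield: float,
    lentil_price: float,
    sorghum_price: float,
) -> float:
    """Express combined intercrop output in lentil terms (same unit as the yields).

    Assumed convention: LEY = lentil yield + sorghum yield * (sorghum price / lentil price).
    """
    if lentil_price <= 0:
        raise ValueError("lentil price must be positive")
    return lentil_yield + sorghum_yield * sorghum_price / lentil_price


# Hypothetical values: yields in kg/ha, prices in currency units per kg
ley = lentil_equivalent_yield(
    lentil_yield=600.0,
    sorghum_yield=1500.0,
    lentil_price=60.0,
    sorghum_price=20.0,
)
print(ley)
```

Under these illustrative prices, 1500 kg/ha of sorghum is worth the same as 500 kg/ha of lentil, giving a lentil equivalent yield of 1100 kg/ha. Because the result depends on the price ratio, the prices used should be reported alongside any equivalent-yield figure.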
Literature review and research gap analysis
Summary of representative studies (2020-2025)
Table 1 synthesises twelve peer-reviewed studies published between 2020 and 2025 that addressed, individually or in combination, INM in legume-cereal systems, machine learning applications in nutrient and yield prediction and soil biological response to organic inputs.
Identified research gaps
The systematic review of the literature summarised in Table 1 reveals four major gaps that the current study intends to address. Gap 1: To date, no published study has used a hybrid RF-LSTM approach to predict INM treatment outcomes in a lentil-sorghum intercrop. Machine learning methods applied to agroecological problems have either analysed monoculture datasets (Prodhan et al., 2022; Corrales et al., 2022) or excluded organic predictors (Droutsas et al., 2022). Gap 2: Although soil biological parameters (dehydrogenase activity, microbial biomass carbon, AMF colonisation) respond strongly to organic fertilisers (Sharma et al., 2021), none of the intercrop models described in the literature has included them as independent or dependent variables. Gap 3: The effect of organic manure applied in one season on SOC and yield in subsequent seasons has not been captured by any machine learning model; it has been addressed only with traditional regression-based approaches, which retain no memory of consecutive soil-state changes. Gap 4: Decision-support outputs (feature rankings that lead to management recommendations) have not appeared in the literature on agronomic INM models.