Soybeans (Glycine max L. Merrill) are among the most important seed legumes in the world today. They are an important source of edible oil, accounting for almost 25% of global edible oil production and acting as a major reservoir (Agarwal et al., 2013). Their nutritional composition helps lower the risk of diabetes and heart disease. The world population is projected to exceed 9.1 billion by 2050, and food demand is expected to increase by 60% over the same period (Food Security Statistics, accessed 27 December 2009). It is therefore imperative to intensify efforts to raise crop yield and improve its quality. According to recent scientific evaluations, infectious (biotic) and abiotic diseases reduce yield potential, typically by 40%. The impact falls disproportionately on farmers in developing nations, where yield losses have in some instances reached 100% (Karlekar and Seal, 2020). Continuous crop monitoring, combined with prompt and accurate disease diagnosis, is essential. Effective disease management and control strategies are needed to reduce output losses and preserve agricultural sustainability.
Food production must expand significantly to keep pace with the growing global population (FAO) (Food Security Statistics, accessed 27 December 2009). High-yield food production must be combined with sustainable agricultural methods to protect natural ecosystems, and global food security must go hand in hand with high nutritional content (Carvalho, 2006). Producing healthy, high-yield crops requires cutting-edge scientific techniques for crop management and leaf disease detection. Comprehensive ecosystem monitoring is one area in which state-of-the-art AI-based technologies are being applied. Soybeans are used worldwide as a feed crop and can be processed into a variety of foods (Jianing et al., 2022). Disease is an important factor limiting the quality and yield of the soybean plant (Cen et al., 2020; Meng et al., 2022). Ensuring the health of soybean plants is critical for optimizing yields and maintaining sustainable agricultural practices. One pivotal aspect of soybean plant health is the early detection of leaf-related issues, including diseases, pests and nutrient deficiencies. Traditional visual inspection of soybean fields for leaf-related problems is often time-consuming and may lack the precision needed for timely intervention. However, recent advances in technology, particularly in computer vision, offer solutions to these challenges. Automated leaf detection applies image processing algorithms and machine learning to digital images of soybean plants. This technology enables the rapid and accurate identification of leaves, providing valuable insights into plant health and facilitating early intervention when issues arise.
Agricultural research has used a range of machine learning techniques, such as support vector machines (SVMs), artificial neural networks (ANNs), decision trees, K-means, k-nearest neighbors and convolutional neural networks (CNNs) (Panigrahi et al., 2020; Min et al., 2024). Traditionally, image classification problems have been solved with hand-engineered features, including SURF (Untari and Satria, 2022; Pranata et al., 2019), HoG (Dalal and Triggs, 2005) and SIFT (Lowe, 2004), together with a learning technique applied in these feature spaces. However, recent advances in machine learning indicate that learned representations are more effective. The main advantage of representation learning is its capacity to search large image datasets automatically and identify the features that minimize classification error (Tang et al., 2023; Wihardjo et al., 2024; Sankaran et al., 2010).
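To make the idea of a hand-engineered feature concrete, the following is a minimal sketch of a gradient orientation histogram in the spirit of HoG, paired with a 1-nearest-neighbor rule as the learning technique. It is a simplification and not the descriptor of Dalal and Triggs (2005): there are no cells, blocks or block normalization. The function names `orientation_histogram` and `nearest_neighbor_label` are illustrative assumptions, and only the Python standard library is used.

```python
import math


def orientation_histogram(gray, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    `gray` is a 2-D list of intensities. Border pixels are skipped so that
    central differences are always defined.
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, len(gray) - 1):
        for x in range(1, len(gray[0]) - 1):
            gx = gray[y][x + 1] - gray[y][x - 1]  # horizontal central difference
            gy = gray[y + 1][x] - gray[y - 1][x]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue  # flat region: no orientation to vote for
            # Unsigned orientation in degrees, folded into [0, 180).
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[min(int(angle / bin_width), bins - 1)] += magnitude
    total = sum(hist)
    return [v / total for v in hist] if total else hist


def nearest_neighbor_label(query, labeled_descriptors):
    """Return the label of the stored descriptor closest to `query` (1-NN)."""
    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(labeled_descriptors,
               key=lambda item: squared_distance(query, item[0]))[1]
```

The descriptor is fixed by the designer rather than learned from data, which is the limitation that representation learning is meant to remove.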
CNNs are a class of deep learning algorithms specifically designed for image analysis tasks, which makes them well suited to the complex visual patterns associated with plant diseases. These networks can learn intricate features and patterns from large datasets, enabling them to differentiate between healthy and diseased soybean leaves with high accuracy. CNNs offer several advantages for the detection of soybean leaf disease. Trained on large databases of annotated images, these networks can identify subtle signs associated with specific diseases. Their automated operation allows fast and reliable analysis of vast agricultural fields, providing farmers with a useful tool for early disease identification.
In this paper, soybean leaf diseases are identified using Convolutional Neural Networks (CNNs). The CNN model is trained on a variety of images depicting different conditions. The Mendeley dataset comprises three main categories: Diabrotica Speciosa, Caterpillar and Healthy soybean leaves. Image preprocessing consists of labelling, scaling and grayscale conversion. The model is trained on 80% of the dataset and its accuracy is assessed on the remaining 20%, which is used for validation. Model performance is evaluated with a confusion matrix, a classification report and the ROC curve (AUC). The accuracy and loss of the model are obtained for unseen data. This work can be used as a reference for identifying diseases in the leaf and other parts of the plant.
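The preprocessing, split and evaluation steps listed above can be sketched as follows. This is an illustration under stated assumptions and not the authors' implementation: "scaling" is taken to mean rescaling pixel values to [0, 1] (it could equally mean resizing), the grayscale weights are the common ITU-R BT.601 luma coefficients, and all function names are hypothetical. Only the Python standard library is used; a real pipeline would more likely rely on an image library and a deep learning framework.

```python
import random

# Class names follow the three dataset categories; the index order is an assumption.
CLASSES = ["Caterpillar", "Diabrotica speciosa", "Healthy"]


def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) pixels in 0-255 to grayscale values in [0, 1]."""
    return [
        [(0.299 * r + 0.587 * g + 0.114 * b) / 255.0 for (r, g, b) in row]
        for row in rgb_image
    ]


def train_val_split(samples, train_fraction=0.8, seed=0):
    """Shuffle the samples reproducibly and split them into train and validation."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Rows index the true class, columns index the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


def precision_recall(matrix, cls):
    """Per-class precision and recall, as reported in a classification report."""
    true_positive = matrix[cls][cls]
    predicted = sum(row[cls] for row in matrix)  # column sum
    actual = sum(matrix[cls])                    # row sum
    precision = true_positive / predicted if predicted else 0.0
    recall = true_positive / actual if actual else 0.0
    return precision, recall
```

With integer labels indexing `CLASSES`, the confusion matrix is built from the validation predictions, and `accuracy` and `precision_recall` summarize it per model and per class respectively.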
Related work
The most recent advancements in CNN architectures and deep learning for agricultural applications are covered in this section. Before deep learning, plant diseases were classified using machine learning and image processing techniques (Cho, 2024; Barbedo, 2013; Pydipati et al., 2005; Camargo and Smith, 2009a; 2009b). In these approaches, the original digital images are captured with a digital camera. The images are then prepared for the subsequent stages using image processing techniques such as segmentation, color space conversion, filtering and image enhancement. Salient features of the image are then extracted and fed into a machine learning model (Al-Hiary et al., 2011). The overall classification accuracy therefore depends on the image processing and feature extraction techniques chosen. Conversely, more recent studies suggest that state-of-the-art performance can be achieved by networks trained on generic data. CNNs are supervised multi-layer networks that can automatically learn features from datasets. Over the past few years, CNNs have proven to be at the cutting edge of performance in almost every notable classification task.
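The classical pipeline described above (color space conversion, segmentation, filtering, then feature extraction) can be sketched in a few functions. The choices here are assumptions made for illustration and are not taken from the cited studies: the conversion is RGB to HSV, segmentation is a fixed saturation threshold, filtering is a 3x3 median filter on the mask, and the features are the foreground fraction and mean saturation and value. All function names and the threshold value are hypothetical.

```python
import colorsys


def rgb_to_hsv_image(rgb_image):
    """Color space conversion: (r, g, b) in 0-255 to (h, s, v) in [0, 1]."""
    return [
        [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0) for (r, g, b) in row]
        for row in rgb_image
    ]


def segment_by_saturation(hsv_image, min_saturation=0.25):
    """Segmentation: mark saturated (colored) pixels as foreground (1)."""
    return [[1 if s >= min_saturation else 0 for (_h, s, _v) in row]
            for row in hsv_image]


def median_filter3(mask):
    """Filtering: 3x3 median on interior pixels; border pixels are copied."""
    out = [row[:] for row in mask]
    for y in range(1, len(mask) - 1):
        for x in range(1, len(mask[0]) - 1):
            window = sorted(mask[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = window[4]  # middle of the 9 sorted values
    return out


def extract_features(hsv_image, mask):
    """Feature extraction: (foreground fraction, mean saturation, mean value)."""
    pixels = [hsv_image[y][x]
              for y in range(len(mask)) for x in range(len(mask[0]))
              if mask[y][x]]
    if not pixels:
        return (0.0, 0.0, 0.0)
    total = len(mask) * len(mask[0])
    mean_s = sum(p[1] for p in pixels) / len(pixels)
    mean_v = sum(p[2] for p in pixels) / len(pixels)
    return (len(pixels) / total, mean_s, mean_v)
```

The resulting feature tuple would be passed to a conventional classifier. Every stage embodies a manual design decision, which is why accuracy in this setting depends on the processing and feature extraction choices.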
Atabay (2016) notes that a CNN can perform both feature extraction and classification within a single architecture, and CNNs have accordingly been used to classify plant diseases.
Cortes (2017) used an openly available dataset of 86,147 images of healthy and diseased plants. A neural network employing deep learning and semi-supervised methods was developed to identify crop varieties and diseases across 57 distinct classes. The successful experiment with unlabeled data resulted in Russ-net, which became operational in fewer than five epochs. It achieved a detection rate in the vicinity of 1e-5 and an initial training score of 80%.
Recently, object recognition and image classification have been accomplished with convolutional neural networks (CNNs) (Atabay, 2016b; Hanson et al., 2017; Mohanty et al., 2016). Inspired by the human visual system, CNNs are a type of deep neural network (DNN) designed to analyze images. Several CNN architectures have been proposed for object recognition. Among them, AlexNet (Krizhevsky et al., 2012) and LeNet (Le Cun et al., 1998) have been regarded as baselines for a variety of tasks (Atabay, 2016).
A CNN is a special class of neural network that has been used extensively to address a range of pattern recognition problems in computer vision, speech recognition and other areas. CNNs combine three architectural ideas (local receptive fields, shared weights and spatial or temporal subsampling) to provide a degree of shift, scale and distortion invariance (Le Cun et al., 1998). Many CNN architectures, such as LeNet, AlexNet and GoogLeNet, have been used for object recognition.
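The three architectural ideas can be illustrated with a dependency-free sketch of one convolution followed by max pooling. This is an illustration only, not any of the cited architectures: it has a single kernel, no bias, no nonlinearity and no learning, and the function names are assumptions. In practice these operations come from a deep learning framework.

```python
def conv2d_valid(image, kernel):
    """Slide one kernel over the image (valid cross-correlation, stride 1).

    Shared weights: the same `kernel` values are reused at every position.
    Local receptive field: each output reads only a kernel-sized patch.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output


def max_pool2d(feature_map, size=2):
    """Spatial subsampling: non-overlapping max pooling with a square window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            row.append(max(feature_map[i + di][j + dj]
                           for di in range(size) for dj in range(size)))
        pooled.append(row)
    return pooled
```

Because pooling keeps only the maximum within each window, a feature that moves by one pixel inside the same window yields the same pooled output, which is one source of the small-shift tolerance mentioned above.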
A review of CNN-based plant disease classification was presented by Lu et al. (2021). They assessed the major problems and solutions associated with CNNs for plant disease classification, as well as the deep learning (DL) criteria. They concluded that more research on more complex datasets is needed to obtain more satisfactory results.
Golhani et al. (2018) outlined the advantages and disadvantages of using hyperspectral data for the diagnosis of plant leaf diseases. Within a brief period of time, they also introduced CNN techniques for SDI development. They found that SDIs need to be tested on a variety of hyperspectral imaging devices at the plant leaf scale, provided they are relevant to appropriate crop protection. Bangari et al. (2022) reviewed CNN-based disease detection with an emphasis on potato leaf disease. After surveying several studies, they concluded that convolutional neural networks are more effective at identifying disease. They also found that CNNs significantly improved disease identification accuracy.
Korea is known for its wealth of agricultural output and is especially strong in soybean cultivation. Soybeans are key to the country’s agriculture as an essential source of vegetable protein for the local diet, a raw material for numerous businesses and a valuable commodity on the market. Despite this promise, soybean farming is not without its difficulties. One major problem is plant diseases, which can significantly reduce productivity and result in losses for farmers.