Among essential crops, soybean (Glycine max (L.) Merrill) is the most significant seed legume globally. It contributes about 25% of the world’s edible oil production and serves as the primary source of protein concentrate for animal feed, including poultry and fish feeds. Soybeans also supply raw materials for the food, pharmaceutical and other industries. India ranks fifth in soybean output and fourth in cultivation area worldwide, based on estimates from the FAO and AMIS. During the 2020-21 growing season, soybeans in India were cultivated on 12.06 million hectares, producing 13.58 million tonnes at a productivity of 1,126 kg/ha. In contrast, the global average soybean productivity is approximately 2,900 kg/ha, with the highest yields (>3,000 kg/ha) recorded in the United States and Brazil (FAO, 2020). However, soybean plants are exposed to a range of leaf diseases caused by pathogens including fungi, bacteria and viruses. If these diseases are not recognised and controlled promptly, they may significantly reduce crop output and quality. Early identification of leaf diseases is essential for implementing timely interventions that reduce economic losses and promote sustainable farming practices. Traditional disease identification often depends on agricultural experts visually examining the plants, a process that is time-consuming and susceptible to human error. Consequently, there is an increasing need to use advanced methods such as machine learning and computer vision to automate and improve disease detection.
Computational methods such as Convolutional Neural Networks (CNNs), Support Vector Machines (SVMs) and Random Forests (RFs) are widely recognized for their exceptional ability to address computer vision tasks (Saleem et al., 2022). CNNs have gained widespread recognition for their proficiency in interpreting image data, particularly in applications such as image classification, segmentation and detection. The effectiveness of the CNN architecture in these tasks may be attributed to its capability to learn directly from raw data without any prior knowledge (Senan et al., 2020). Several researchers have contributed to improving the performance of CNNs. According to this research, enhancing the performance of a CNN model requires adjusting architectural parameters and weights (Gonzalez, 2007). More generally, model performance can be improved by increasing the size of the training dataset, applying transformations to the training data (data augmentation) and fine-tuning different parameters (Zhang et al., 2020; Semara et al., 2024; Maltare et al., 2023; Bagga et al., 2024).
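The idea of applying transformations to the training data can be illustrated with a minimal sketch. It assumes an image is stored as a plain list of pixel rows, and the function names (flip_horizontal, flip_vertical, rotate_90_clockwise, augment) are hypothetical helpers introduced only for illustration; they are not the augmentation procedure of any cited study.

```python
# Minimal data-augmentation sketch (illustrative only).
# Assumption: an image is a list of rows, each row a list of pixel values.

def flip_horizontal(image):
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]

def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [list(row) for row in reversed(image)]

def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(column) for column in zip(*reversed(image))]

def augment(image):
    """Return a copy of the original image plus three transformed copies."""
    return [
        [list(row) for row in image],
        flip_horizontal(image),
        flip_vertical(image),
        rotate_90_clockwise(image),
    ]

# A tiny 2 x 3 "image" standing in for a leaf photograph.
tiny = [[1, 2, 3],
        [4, 5, 6]]
variants = augment(tiny)
```

Each training image yields four samples here, so the effective dataset size grows without new field photographs; in practice the set of transformations would be chosen so that the disease label remains valid after each one.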
Various machine learning techniques have found applications in agricultural research (Rumpf et al., 2010). Traditionally, image classification relied on handcrafted features such as SIFT (van Dokkum et al., 2000), HoG (Zhang et al., 2012) and SURF (Bay et al., 2008), followed by learning algorithms applied within these feature spaces. However, the effectiveness of such approaches depended heavily on the predefined features (Ma et al., 2021). In recent years, learned representations, which are more effective and efficient, have become more popular. With representation learning, algorithms can automatically analyse large image collections and identify the characteristics that classify images with the fewest errors (Taye, 2023). Sequential neural network techniques, in particular CNNs, have become an effective tool for classifying images and recognizing objects. A CNN is a kind of deep neural network (DNN) designed especially for processing images, and it draws inspiration from the human visual system. For object recognition, a number of CNN architectures have been proposed; baseline models for these tasks include AlexNet (Mall et al., 2023) and LeNet (Alzubaidi et al., 2021).
This study demonstrates the benefits of a CNN model for identifying soybean diseases from leaf images captured in uncontrolled environments. The models follow a sequential architecture and the image dataset is sourced from the Mendeley database. The dataset images are preprocessed before training. The overall performance of the proposed model is evaluated using classification metrics such as accuracy, precision, recall, F1-score and the confusion matrix.
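For reference, the evaluation metrics named above can be derived from a confusion matrix as in the following minimal sketch. The class labels and the toy predictions are hypothetical and are not results from this study; rows of the matrix are taken to be true classes and columns predicted classes.

```python
# Sketch: accuracy, precision, recall and F1-score from a confusion matrix.
# Assumption: labels and predictions below are made up for illustration.

def confusion_matrix(y_true, y_pred, labels):
    """Rows index the true class, columns the predicted class."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, prediction in zip(y_true, y_pred):
        matrix[index[truth]][index[prediction]] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0

def per_class_metrics(matrix, labels):
    """Precision, recall and F1-score for each class (one-vs-rest)."""
    n = len(labels)
    results = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fp = sum(matrix[r][i] for r in range(n)) - tp  # predicted i, truly another class
        fn = sum(matrix[i]) - tp                       # truly i, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results[label] = {"precision": precision, "recall": recall, "f1": f1}
    return results

labels = ["healthy", "rust", "blight"]  # hypothetical class names
y_true = ["healthy", "healthy", "rust", "rust", "blight", "blight"]
y_pred = ["healthy", "rust", "rust", "rust", "blight", "healthy"]

cm = confusion_matrix(y_true, y_pred, labels)
overall_accuracy = accuracy(cm)
metrics = per_class_metrics(cm, labels)
```

Reporting per-class values alongside overall accuracy matters when disease classes are imbalanced, since a high accuracy can hide poor recall on a rare class.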
Related work
Farming was once primarily about feeding a growing population, but it is now also a major component of economies worldwide. However, plant diseases are a serious problem, causing substantial crop and financial losses in agriculture and forestry. One example is soybean rust, a fungal disease that has caused substantial financial damage to soybean crops. Eliminating just 20% of the infection could generate a profit of almost $11 million for farmers (Singh et al., 2020). Hence, early detection and identification of plant diseases is vital for prompt intervention. Multiple techniques are available for identifying plant diseases. Certain diseases may not present any observable symptoms, or only become apparent when it is already too late to act. In such situations, intricate examinations are required, frequently using high-powered microscopes. Furthermore, some signs may only be discernible in parts of the electromagnetic spectrum that are invisible to the human visual system (Liu and Wang, 2021). This section explores current trends in the use of deep learning and CNN architectures in agriculture. Before deep learning became popular, image processing and machine learning methods were used to classify plant diseases (Tugrul et al., 2022). Usually, these systems followed a series of steps: first, digital images were taken with a digital camera, and then image processing techniques such as enhancement, segmentation, color space conversion and filtering were applied to prepare the images for further analysis. Subsequently, feature extraction identified the important characteristics of the images. These features were then used as input for classification. The overall classification accuracy was highly dependent on the techniques used for image processing and feature extraction (da Silva and Mendonca, 2004). Nevertheless, recent research has shown that networks trained using generic data can achieve state-of-the-art performance.
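The classical pipeline described above (color space conversion, segmentation, feature extraction, classification) can be sketched in a few lines. This is a toy illustration under stated assumptions, not a method from the cited works: the green hue band, the saturation threshold, the single "lesion fraction" feature and the threshold classifier are all hypothetical choices made for the example.

```python
import colorsys

# Toy sketch of a pre-deep-learning pipeline. All thresholds are assumptions.

def to_hsv(image):
    """Color space conversion: RGB pixels (0-255) to HSV tuples (each in 0-1)."""
    return [[colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for (r, g, b) in row]
            for row in image]

def segment_non_green(hsv_image, green_hue=(0.20, 0.45), min_saturation=0.25):
    """Segmentation: flag saturated pixels whose hue lies outside the green band."""
    low, high = green_hue
    return [[s >= min_saturation and not (low <= h <= high) for (h, s, v) in row]
            for row in hsv_image]

def lesion_fraction(mask):
    """Feature extraction: fraction of pixels flagged by the mask."""
    flat = [flag for row in mask for flag in row]
    return sum(flat) / len(flat) if flat else 0.0

def classify_by_threshold(feature, threshold=0.1):
    """Stand-in classifier: a fixed threshold on the single handcrafted feature."""
    return "diseased" if feature > threshold else "healthy"

# A 2 x 2 "image": two green pixels, one brown pixel, one gray background pixel.
image = [[(40, 160, 40), (60, 180, 50)],
         [(150, 90, 30), (200, 200, 200)]]

mask = segment_non_green(to_hsv(image))
feature = lesion_fraction(mask)
label = classify_by_threshold(feature)
```

Every stage here depends on hand-chosen thresholds, which illustrates why the accuracy of such systems was so sensitive to the image processing and feature extraction steps.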
Artificial intelligence-based algorithms have consistently shown exceptional performance in almost all important classification tasks (Salvi et al., 2021). The same architecture allows both feature extraction and classification to be performed. Convolutional Neural Networks (CNNs), a distinct category of artificial neural networks, have been extensively used in several domains of pattern recognition, including computer vision and speech recognition.
Patil and Rane (2021) employed three architectural concepts to guarantee a certain level of invariance to shifts, scales and distortions: local receptive fields, shared weights and spatial or temporal sub-sampling. Several CNN architectures, such as LeNet, AlexNet and GoogLeNet, have been proposed for object recognition (Shamsaldin et al., 2019; Wasik and Pattinson, 2024; Cho, 2024; Porwal et al., 2024; AlZubi, 2023; Hai and Duong, 2024).
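The three concepts of local receptive fields, shared weights and spatial sub-sampling can be made concrete with a minimal pure-Python sketch. The function names conv2d_valid and max_pool, the 5 x 5 input and the 2 x 2 diagonal kernel are assumptions chosen for illustration; max pooling is used here as one common form of sub-sampling.

```python
# Sketch of a convolution with one shared kernel followed by sub-sampling.
# Assumption: images and kernels are lists of rows of numbers.

def conv2d_valid(image, kernel):
    """Apply one shared kernel to every local receptive field (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def max_pool(feature_map, size=2):
    """Spatial sub-sampling: maximum of each non-overlapping size x size block."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [[max(feature_map[i * size + a][j * size + b]
                 for a in range(size) for b in range(size))
             for j in range(cols)]
            for i in range(rows)]

# 5 x 5 input containing a diagonal line, and a 2 x 2 kernel that responds to it.
image = [[1 if r == c else 0 for c in range(5)] for r in range(5)]
kernel = [[1, 0],
          [0, 1]]

feature_map = conv2d_valid(image, kernel)  # 4 x 4 map of local responses
pooled = max_pool(feature_map, size=2)     # 2 x 2 map after sub-sampling
```

The same four kernel weights are reused at all sixteen positions, and pooling halves the resolution so that the exact position of a response within each block is discarded.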
The LeNet architecture, as described by Boulent et al. (2019), was among the first convolutional neural networks engineered specifically for recognizing handwritten digits. The model consists of a sequence of convolutional and sub-sampling layers, followed by fully connected (dense) layers that form a multilayer perceptron (MLP) (Boulent et al., 2019).
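How alternating convolutional and sub-sampling layers shrink the feature maps can be traced with simple arithmetic. The sketch below assumes the commonly cited LeNet-5 configuration (32 x 32 single-channel input, 5 x 5 kernels without padding, 2 x 2 sub-sampling with stride 2); these sizes are not stated in the text above, and the helper names are hypothetical.

```python
# Sketch: feature-map sizes through a LeNet-style stack.
# Assumption: layer sizes follow the commonly cited LeNet-5 configuration.

def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_output_size(size, window=2, stride=2):
    """Spatial size after sub-sampling."""
    return (size - window) // stride + 1

def trace_shapes(input_size, layers):
    """Return (channels, spatial size) after each layer in the stack."""
    shapes = []
    channels, size = 1, input_size
    for kind, out_channels, kernel in layers:
        if kind == "conv":
            channels, size = out_channels, conv_output_size(size, kernel)
        else:  # "pool" keeps the channel count and halves the spatial size
            size = pool_output_size(size, window=kernel, stride=kernel)
        shapes.append((channels, size))
    return shapes

lenet_layers = [
    ("conv", 6, 5),      # C1
    ("pool", None, 2),   # S2
    ("conv", 16, 5),     # C3
    ("pool", None, 2),   # S4
    ("conv", 120, 5),    # C5, reduces each map to 1 x 1
]

shapes = trace_shapes(32, lenet_layers)
dense_units = [84, 10]   # fully connected layers that follow C5
```

Under these assumptions the spatial size goes 32, 28, 14, 10, 5, 1, after which the 120 values feed the dense layers that produce the ten digit classes.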
Researchers have proposed the use of CNNs for plant disease classification and leaf identification. Atabay (2017) designed a CNN architecture for plant recognition from leaf images, reporting classification accuracies of 97.24% on the Flavia leaf dataset and 99.11% on the Swedish leaf dataset. Wallelign et al. (2018) trained a CNN using 1.8 million images from the ILSVRC 2012 dataset and obtained an average precision of 0.486 for plant identification. Mohanty et al. (2016) classified plant diseases using well-known deep CNN models such as AlexNet and GoogLeNet. With a publicly accessible dataset of 54,306 images, they attained an accuracy of 99.35%. However, when tested on images taken in a different environment, the model’s accuracy dropped to 31.4%. Still, these results highlight how well deep CNNs can classify plant diseases.