Soybean (Glycine max) is one of the most widely grown legume crops in Madhya Pradesh and Maharashtra, India, valued for its oil and protein products
(Jianing et al., 2022). The yield of the plant depends greatly on its health, which in turn depends on its resistance to leaf diseases. Common leaf diseases in soybean include leaf spot and soybean rust. Early detection of these diseases is important for better productivity. In the past, disease identification was done manually, and accurately identifying the disease type was a time-consuming process
(Wu et al., 2023).
Rapid developments in image processing and machine learning have led to algorithms that detect diseases automatically from leaf images of diseased plants. Image processing aims to derive useful information by performing operations on digital images
(Saradhambal et al., 2018). Image processing is applied in research fields such as cyber security, telemedicine and agriculture
(Prabaharan et al., 2020). In agriculture, machine learning is applied to design automatic harvesting machines, estimate production, manage irrigation needs and support pest and weed control
(Yao et al., 2023).
Machine learning algorithms used for feature extraction and leaf disease classification are discussed in the rest of this section. In one study, leaf images were segmented using K-means, a partition-based clustering algorithm, to identify the lesion region of the leaf, from which colour and shape features were then extracted. Machine learning techniques such as decision tree (DT), support vector machine (SVM) and K-nearest neighbours (K-NN) were applied to classify the leaf diseases (
Nandhini and Bhavani, 2020). Texture features such as the Gray-Level Run-Length Matrix and the Gray-Level Co-occurrence Matrix were extracted from chickpea leaf images in the RGB, HSV and L*a*b* colour spaces. Multi-class classification models such as K-NN, SVM and neural networks were used to classify the leaf diseases. The proposed model works well in identifying fusarium wilt in chickpea leaves
(Hayit et al., 2024). Colour features such as HSV values were obtained from segmented images of leaves with lesions to train an artificial neural network (ANN) to distinguish healthy from diseased cotton leaf samples
(Ranjan et al., 2015). Texture features such as contrast, energy, homogeneity, correlation and smoothness, along with region-based shape features, were extracted from leaf images, and an Adaptive Neuro-Fuzzy Inference System (ANFIS) was used for disease identification (
Nandhini and Srisathya, 2021). Statistical features such as the colour co-occurrence matrix, together with shape features obtained by blob analysis, were extracted from segmented leaf images, and disease classification was done using SVM
(Kappali et al., 2024). These five works extracted colour, shape or texture features from segmented plant leaves and applied SVM, K-NN, ANN, ANFIS, RF, LR or ensemble classification techniques to identify leaf diseases, achieving average accuracies in the range of 80% to 90%. The literature also shows that the autoencoder, a deep neural model, is used for image denoising, image reconstruction, image compression and feature extraction, among other tasks (
Li et al., 2023). A stacked denoising autoencoder (SDAE) was used to extract features from hyperspectral images, and a logistic regression (LR) approach was employed for classification
(Xing et al., 2016). A convolutional autoencoder was used for feature extraction from Optical Emission Spectroscopy (OES) data samples, and a Support Vector Regression (SVR) machine was used to predict the final etch rate
(Maggipinto et al., 2018). A multitask learning framework based on a Siamese network and an autoencoder was developed for the classification of hyperspectral images
(Miao et al., 2019). Autoencoders were applied to convert MRI images into feature vectors
(Chen et al., 2024).
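The segmentation-based pipelines reviewed in this paragraph share a common first stage: cluster the pixel colours with K-means, select the lesion cluster, then compute colour and shape descriptors from it. A minimal sketch of that stage follows, using only the Python standard library. The synthetic 6x6 leaf, the deterministic centroid initialisation, the green-excess rule for choosing the lesion cluster and all function names are illustrative assumptions, not details taken from the cited works.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two colour tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(pixels, k, iters=20):
    """Lloyd's K-means on (r, g, b) tuples; returns (centroids, labels)."""
    unique = sorted(set(pixels), key=lambda p: (sum(p), p))
    if len(unique) < k:
        raise ValueError("need at least k distinct colours")
    # Deterministic initialisation (an assumption made for reproducibility):
    # spread the starting centroids from the darkest to the brightest colour.
    centroids = [unique[i * (len(unique) - 1) // max(k - 1, 1)] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assignment step: each pixel joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in pixels]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                updated.append(tuple(sum(ch) / len(members) for ch in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def lesion_features(image, k=2):
    """Segment the image and return simple colour and shape features of the lesion."""
    pixels = [p for row in image for p in row]
    centroids, labels = kmeans(pixels, k)
    # Assumption: the lesion is the cluster with the least green excess, g - (r + b) / 2.
    lesion = min(range(k), key=lambda c: centroids[c][1] - (centroids[c][0] + centroids[c][2]) / 2)
    width = len(image[0])
    coords = [(i // width, i % width) for i, lab in enumerate(labels) if lab == lesion]
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return {
        "mean_colour": centroids[lesion],               # colour feature
        "area_fraction": len(coords) / len(pixels),     # shape feature
        "bbox_height": max(rows) - min(rows) + 1,       # shape feature
        "bbox_width": max(cols) - min(cols) + 1,        # shape feature
    }


def make_leaf(size=6):
    """Hypothetical test image: two shades of green with a 2x3 brown lesion."""
    dark, light, brown = (40, 160, 40), (44, 150, 38), (150, 90, 40)
    image = [[light if (r + c) % 2 else dark for c in range(size)] for r in range(size)]
    for r in range(2, 4):
        for c in range(1, 4):
            image[r][c] = brown
    return image


features = lesion_features(make_leaf())
print(features)
```

The resulting feature dictionary is the kind of vector that would then be passed to a classifier such as DT, SVM or K-NN. On real photographs, clustering is often done in a colour space such as HSV or L*a*b* rather than raw RGB.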
With advances in deep learning techniques, leaf disease classification has been performed using Convolutional Neural Network (CNN) models. Leaf diseases in faba bean plants were identified using a CNN model (
Jeong and Na, 2024). A pretrained VGG16 model was extended with a stack of one convolutional layer, one pooling layer and one fully connected layer to detect and classify wilting in soybean crops (
Na and Na, 2024). Foliar diseases in black gram plants were detected using a CNN
(Kalpana et al., 2023). A customized CNN model was designed to identify diseases in tomato leaves, and its results were compared with those of the pre-trained VGG16, InceptionV3 and MobileNet models
(Agarwal et al., 2020). A web-based application to identify fungal and bacterial diseases in potato leaves was built using image segmentation techniques and a CNN model (
Shukla and Sathiya, 2022). All these methods used a customized CNN model or improved an existing CNN model to classify leaf diseases in various types of plants. On average, these models achieved accuracies in the range of 85% to 90%.
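The CNN models surveyed above are built from the same basic operations: convolution, a non-linear activation and pooling. The sketch below illustrates one such stage on a tiny grayscale patch in plain Python. The 5x5 input and the hand-written vertical-edge kernel are hypothetical; a trained CNN learns its kernels from data and stacks many such stages before a fully connected classifier.

```python
def conv2d(image, kernel):
    """Valid 2-D cross-correlation, the operation CNN libraries call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling with a size x size window."""
    return [
        [
            max(fmap[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(fmap[0]) - size + 1, size)
        ]
        for r in range(0, len(fmap) - size + 1, size)
    ]


# Hypothetical 5x5 patch: dark on the left (0), bright on the right (1).
patch = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-written kernel that responds to dark-to-bright vertical edges.
edge_kernel = [[-1, 1], [-1, 1]]

feature_map = relu(conv2d(patch, edge_kernel))  # 4x4 map, non-zero only at the edge
pooled = max_pool(feature_map)                  # 2x2 map after 2x2 pooling
flat = [v for row in pooled for v in row]       # vector passed on to dense layers
print(pooled)
```

The pooled map keeps the edge response while halving the spatial resolution, which is the property that lets deeper layers summarise larger regions of the leaf.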
With the adoption of autoencoders for feature extraction in fields such as hyperspectral image classification, OES data analysis and MRI image retrieval, this paper aims to develop a classification model using a convolutional autoencoder and multinomial logistic regression. The convolutional autoencoder extracts features from soybean leaf images, and the multinomial logistic regression model classifies the type of disease from the extracted features.
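As a minimal sketch of the classification stage only, the code below fits a multinomial (softmax) logistic regression by batch gradient descent. The two-dimensional feature vectors stand in for the output of a convolutional encoder and are invented for illustration, as are the class names, the learning rate and the epoch count; the architecture and hyperparameters actually used are not specified in this section.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(W, b, x):
    """Linear score w_c . x + b_c for every class c."""
    return [sum(w * v for w, v in zip(W[c], x)) + b[c] for c in range(len(W))]


def train_softmax_regression(X, y, n_classes, lr=1.0, epochs=2000):
    """Minimise the mean cross-entropy loss with batch gradient descent."""
    n_features = len(X[0])
    W = [[0.0] * n_features for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        grad_W = [[0.0] * n_features for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, label in zip(X, y):
            probs = softmax(class_scores(W, b, x))
            for c in range(n_classes):
                # Gradient of cross-entropy w.r.t. the score: p_c - 1[c == label].
                err = probs[c] - (1.0 if c == label else 0.0)
                grad_b[c] += err
                for j in range(n_features):
                    grad_W[c][j] += err * x[j]
        for c in range(n_classes):
            b[c] -= lr * grad_b[c] / len(X)
            for j in range(n_features):
                W[c][j] -= lr * grad_W[c][j] / len(X)
    return W, b


def predict(W, b, x):
    """Return the index of the most probable class."""
    probs = softmax(class_scores(W, b, x))
    return max(range(len(probs)), key=lambda c: probs[c])


# Hypothetical class names and encoder outputs (not data from the paper).
CLASSES = ["healthy", "leaf spot", "rust"]
X_train = [
    (0.1, 0.2), (0.0, 0.1), (0.2, 0.0),   # healthy
    (1.0, 0.2), (0.9, 0.0), (1.1, 0.1),   # leaf spot
    (0.2, 1.0), (0.0, 0.9), (0.1, 1.1),   # rust
]
y_train = [0, 0, 0, 1, 1, 1, 2, 2, 2]

W, b = train_softmax_regression(X_train, y_train, n_classes=len(CLASSES))
print(CLASSES[predict(W, b, (1.0, 0.1))])
```

In a full pipeline, the rows of `X_train` would be replaced by the flattened bottleneck activations of the trained encoder for each leaf image, and a library implementation of multinomial logistic regression would normally be preferred over this hand-written loop.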