The model was trained for 100 epochs (Fig 3). During the first epoch, the training accuracy was 14.38% and the validation accuracy was 11.16%, indicating that the model had not yet learned meaningful features. By epoch 25, the training accuracy increased to 81.80% and validation accuracy reached 84.38%, showing substantial learning progress.
At epoch 50, training accuracy rose to 97.60%, while validation accuracy improved to 92.86%, accompanied by a marked reduction in both training and validation loss. The model achieved its highest validation performance at epoch 75, where validation accuracy reached 100% and validation loss dropped to 0.0087. The final epoch (epoch 100) showed a training accuracy of 96.37% and a validation accuracy of 98.21%, with low loss values of 0.1038 (training) and 0.0681 (validation). The model demonstrated consistent improvement in learning and generalization across epochs, and the gap between training and validation metrics remained small, suggesting minimal overfitting. These results indicate that the CNN effectively learned to classify pest images and performed reliably on unseen data.
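As a quick consistency check, the train–validation accuracy gap at each reported checkpoint can be computed directly from the logged values. The sketch below uses only the figures quoted above; epoch 75 is omitted because its training accuracy is not reported.

```python
# Reported (train_acc, val_acc) pairs at the logged epochs; epoch 75 is
# omitted because only its validation accuracy (100%) was reported.
history = {
    1:   (0.1438, 0.1116),
    25:  (0.8180, 0.8438),
    50:  (0.9760, 0.9286),
    100: (0.9637, 0.9821),
}

def generalization_gap(history):
    """Train-minus-validation accuracy per epoch; a large, growing positive
    gap would signal overfitting."""
    return {epoch: round(train - val, 4) for epoch, (train, val) in history.items()}

print(generalization_gap(history))
```

The gap peaks at 0.0474 at epoch 50 and turns negative (-0.0184) by epoch 100, consistent with the minimal-overfitting interpretation above.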
Fig 4 shows the confusion matrix for pest classification on the test dataset. The model classified aphids correctly in 48 out of 49 cases, with one image misclassified as stem borer. For armyworm, 39 images were correctly identified and one was misclassified as grasshopper. All beetle images were classified correctly, showing perfect recognition. For bollworm, 34 were predicted correctly, while two were confused with mites and one with stem borer. All grasshopper samples were correctly identified with no misclassification. The model predicted mites correctly for 41 out of 42 images, with one misclassified as a beetle. For stem borer, 38 images were correctly classified and three were wrongly labelled as mites. Overall, the confusion matrix shows that the model performed well across all seven classes with few misclassifications.
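The counts above can be assembled into the full matrix for verification. This is a sketch: the class order is assumed, and the per-class totals for beetle (43) and grasshopper (36) are inferred from the precision values in Table 1 rather than stated directly in the text.

```python
import numpy as np

# Rows = true class, columns = predicted class, in the (assumed) order:
# aphids, armyworm, beetle, bollworm, grasshopper, mites, stem borer.
cm = np.array([
    [48,  0,  0,  0,  0,  0,  1],   # aphids: one taken for stem borer
    [ 0, 39,  0,  0,  1,  0,  0],   # armyworm: one taken for grasshopper
    [ 0,  0, 43,  0,  0,  0,  0],   # beetle: all correct
    [ 0,  0,  0, 34,  0,  2,  1],   # bollworm: two mites, one stem borer
    [ 0,  0,  0,  0, 36,  0,  0],   # grasshopper: all correct
    [ 0,  0,  1,  0,  0, 41,  0],   # mites: one taken for beetle
    [ 0,  0,  0,  0,  0,  3, 38],   # stem borer: three taken for mites
])

accuracy = np.trace(cm) / cm.sum()      # 279 correct out of 288 samples
print(f"overall accuracy = {accuracy:.4f}")
```

The diagonal sums to 279 of 288 test samples, reproducing the reported 96.88% overall accuracy.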
Table 1 presents the classification metrics for each pest class. The CNN model showed strong performance in identifying aphids, with a precision of 1.0000 and recall of 0.9796. The F1-score reached 0.9897, indicating high accuracy. These results show the model could distinguish aphids from other pests with minimal false negatives. Armyworm detection was also reliable, achieving a precision of 1.0000 and recall of 0.9750. The F1-score of 0.9873 suggests that the model handled this class with few classification errors. For beetles, the recall reached 1.0000, while precision was slightly lower at 0.9773. The model successfully identified all beetle samples, giving an F1-score of 0.9885. This implies clear feature representation for this class.
The bollworm class had perfect precision (1.0000) but slightly lower recall at 0.9189, with an F1-score of 0.9577; a few bollworms were misclassified, possibly due to similarity with stem borers. Grasshopper identification was nearly flawless: with a precision of 0.9730 and recall of 1.0000, the F1-score stood at 0.9863, and the model recognized all test samples from this class correctly. The mites class showed the lowest precision (0.8913) among all classes, although its recall was high at 0.9762, giving an F1-score of 0.9318. The lower precision reflects samples of visually similar classes, such as bollworm and stem borer, being misclassified as mites. For stem borers, the model achieved a precision of 0.9500 and recall of 0.9268, resulting in an F1-score of 0.9383; three samples were misclassified as mites, but performance remained acceptable. The overall accuracy across all pest classes was 96.88%. Both macro and weighted averages remained high, indicating balanced model performance without favouring any single class.
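The per-class figures in Table 1 follow directly from the confusion-matrix counts; a minimal sketch of the computation, shown here for the mites and stem borer classes:

```python
def prf1(tp, fp, fn):
    """Precision, recall and F1 from true-positive, false-positive and
    false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return round(precision, 4), round(recall, 4), round(f1, 4)

# Mites: 41 correct, 5 other samples predicted as mites (2 bollworm,
# 3 stem borer), and 1 mite missed (predicted as beetle).
print(prf1(tp=41, fp=5, fn=1))   # (0.8913, 0.9762, 0.9318)

# Stem borer: 38 correct, 2 false positives, 3 missed (labelled mites).
print(prf1(tp=38, fp=2, fn=3))   # (0.95, 0.9268, 0.9383)
```

Both outputs match the Table 1 entries, confirming that the metrics and the confusion matrix are mutually consistent.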
Fig 5(a) shows the ROC curves for each pest class. All classes achieved an Area Under the Curve (AUC) of 1.00, indicating perfect discrimination; however, this result should be interpreted cautiously. Such perfect scores may reflect a high model capacity relative to the dataset size and variability, potentially indicating overfitting or limited diversity in the test set. Future work with larger and more diverse datasets is recommended to confirm the generalizability of the model under varied field conditions.
For every class, the model's scores ranked the true pest above all others with essentially no overlap, so the one-vs-rest ROC curves hug the top-left corner, indicating strong class separability. This suggests that the model learned highly separable features for all seven pest categories. Fig 5(b) presents the PR curves for all pest classes. The precision and recall values remained consistently high across the classes. Aphids, armyworm, beetle, bollworm and grasshopper showed average precision (AP) values of 1.00, indicating both low false positives and low false negatives. Mites and stem borer each achieved an AP of 0.99, slightly below perfect due to a few misclassifications. High AP values indicate that the model is confident and accurate even under class imbalance.
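The per-class AUC and AP values in Fig 5 correspond to a one-vs-rest computation of the kind sketched below, using scikit-learn and a small set of synthetic, perfectly separable scores for illustration (not the actual model outputs):

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score
from sklearn.preprocessing import label_binarize

# Toy 3-class example with perfectly separable softmax-style scores.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_score = np.array([
    [0.90, 0.05, 0.05],
    [0.80, 0.15, 0.05],
    [0.10, 0.85, 0.05],
    [0.20, 0.70, 0.10],
    [0.05, 0.10, 0.85],
    [0.10, 0.20, 0.70],
])

# One binary column per class: treat each class as "positive vs the rest".
y_bin = label_binarize(y_true, classes=[0, 1, 2])
for k in range(3):
    auc = roc_auc_score(y_bin[:, k], y_score[:, k])
    ap = average_precision_score(y_bin[:, k], y_score[:, k])
    print(f"class {k}: AUC = {auc:.2f}, AP = {ap:.2f}")
```

Because every positive score exceeds every negative score within each column, each class attains AUC = 1.00 and AP = 1.00, which is the situation the ROC curves in Fig 5(a) depict.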
Fig 6 presents predictions made by the trained CNN model on test images from each pest class. The model correctly identified grasshopper with a confidence of 90.37%. Although slightly lower than other predictions, this score still indicates a strong match. The stem borer and beetle samples were classified with 100% confidence, suggesting the model has learned distinct features for these categories. For armyworm, the confidence reached 99.77%, reflecting the high-quality visual features in the test sample. Mites were detected with 99.29% confidence, which supports the model’s ability to recognize even small and visually complex pests. The bollworm sample was identified with 99.48% confidence, showing that the model performs well even for fine-grained insect classes. These results validate the CNN model’s robustness across diverse pest types and confirm its practical utility in soybean pest detection scenarios. The high confidence levels across predictions indicate strong generalization on unseen data.
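The confidence scores quoted above are softmax probabilities of the predicted class. A minimal sketch of how such a score is obtained from raw network outputs follows; the logit values are hypothetical, not taken from the trained model.

```python
import numpy as np

def predict_with_confidence(logits):
    """Return (class index, softmax probability) for the top-scoring class."""
    z = np.asarray(logits, dtype=float)
    z -= z.max()                          # subtract max for numerical stability
    probs = np.exp(z) / np.exp(z).sum()
    return int(probs.argmax()), float(probs.max())

# Hypothetical logits from a 7-class output head.
cls, conf = predict_with_confidence([0.2, 0.1, 5.3, 0.4, 0.0, 1.1, 0.3])
print(cls, f"{conf:.2%}")
```

A single dominant logit yields a near-certain prediction (here class 2 at roughly 95.6%), while closely matched logits produce the lower confidences seen for harder samples such as the grasshopper image.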
The custom CNN model demonstrated strong performance in classifying seven soybean pest classes, achieving an overall accuracy of 96.88% and high precision, recall and F1-scores across classes. The ROC-AUC values of 1.00 indicate excellent discrimination; however, such perfect scores warrant careful interpretation, as they may reflect the limited variability in the dataset. Future studies with larger and more diverse datasets are necessary to fully assess generalizability under field conditions.
The confusion matrix and class-wise metrics reveal that some visually similar pests, such as mites and stem borers, were occasionally misclassified, highlighting the challenges of fine-grained pest differentiation. Despite these minor misclassifications, the model showed robust performance and high confidence in predictions, suggesting strong feature learning.
Several recent studies have explored deep learning and hyperspectral techniques for soybean pest detection. Table 2 compares existing studies on soybean pest detection.
Tetila et al. (2020) evaluated five deep learning models (Inception-v3, ResNet-50, VGG-16, VGG-19 and Xception) on UAV images, achieving a maximum accuracy of 93.82%. Gui et al. (2023) proposed an Attention-ResNet meta-learning model using hyperspectral data for detecting Leguminivora glycinivorella, reaching an accuracy of 94.57%. Similarly, Tailanián et al. (2015) used SVM and spectral signatures for early caterpillar detection, reporting a 95% classification rate. Ma et al. (2014) combined hyperspectral imaging with fuzzy-rough set-based wavelength selection and SVDD, achieving 98.8% accuracy on insect-damaged soybeans. Huang et al. (2012) applied hyperspectral transmittance imaging and SVDD, yielding 95.6% accuracy. In a more recent study, Shah et al. (2022) used a ResNet-50 model on an augmented image dataset, achieving 96.25% accuracy. In comparison, the present study developed a custom CNN trained on 2,450 RGB images across seven pest classes, achieving a competitive accuracy of 96.88%, demonstrating its effectiveness and scalability for real-time field deployment.
Practical deployment in Indian soybean fields presents challenges such as dense crop canopies, overlapping pest symptoms, mixed infestations and variable illumination. The model's test-set performance suggests it could handle these conditions, particularly for smallholder farmers using smartphones or edge devices, although this remains to be verified in the field. Nonetheless, the potential overfitting indicated by perfect ROC-AUC values should be addressed in future work by expanding dataset diversity and incorporating UAV-assisted imagery or cross-regional data.
Overall, the proposed CNN model is highly effective, scalable and field-ready for integrated pest management in soybean farming. Its lightweight architecture enables real-time applications, supporting timely decision-making and contributing to improved crop health monitoring and yield optimization.