Three Garo Hills districts of Meghalaya, India, were selected because of their large cashew plantations and the prevalence of various plant diseases. Fieldwork lasted 20 months (2023-25), during which leaf samples were collected under different environmental conditions and at different growth stages. The investigation was conducted in the Department of Computer Application, North-Eastern Hill University, Tura Campus, Meghalaya. Fig 1 shows the districts on the map of Meghalaya, India, from which the cashew leaf image data were collected.
To obtain high-quality images suitable for deep learning and machine learning (DL/ML), a smartphone with a 64-megapixel camera was used. This choice provided a good balance between portability and image resolution, facilitating the capture of the detailed leaf characteristics necessary for accurate disease classification. The resulting images had a resolution of 4640 × 2610 pixels at 72 dpi (Fig 2).
Table 1 summarizes recent related studies using various plant disease datasets. It highlights the attributes, number of samples and classes, as well as the dataset creator. The specification table presents the dataset’s key characteristics and metadata, including the subject area, data type and data collection process. It also provides details about the data format, storage method and source location (Table 2).
The dataset setup workflow depicts the sequential procedure followed for image acquisition, preprocessing, annotation and dataset structuring (Fig 3). Each phase ensures data integrity, uniformity and suitability for subsequent model training and performance evaluation.
Causes of the cashew leaf diseases
Some cashew varieties are susceptible to several leaf diseases that reduce photosynthetic capacity and fruit yield. The main causes of these diseases are fungal pathogens, bacterial infections and environmental conditions. Fungal pathogens are the greatest culprits and attack leaves under conditions such as high humidity and warm temperatures. In addition, abiotic stress factors such as prolonged leaf wetness, soil moisture imbalance and wounds caused by pruning or insect feeding create entry points for pathogens and increase infection (Akinwale and Esan, 2021). The four leaf categories considered here, three diseases and healthy leaves, are described below.
Powdery mildew
Powdery mildew is a widespread and damaging disease of cashew caused by Erysiphe quercicola (formerly Oidium anacardii). It manifests as white, powdery fungal growth on the upper portion of leaves, young shoots and inflorescences. Infected tissues become weakened, and infection leads to leaf curling, yellowing, early defoliation and suppression of new growth. When the pathogen attacks inflorescences, it causes flower abortion, resulting in poor fruit and nut set (Smith et al., 1995).
Anthracnose
Anthracnose, caused by Colletotrichum gloeosporioides, is a devastating foliar disease of cashew. It typically begins as small, water-soaked, sunken spots on young or mature leaves, which later expand into irregular brown to black necrotic lesions. As the infection advances, the lesions may coalesce, leading to extensive leaf blight and premature defoliation (Monteiro et al., 2022).
Leaf spot
Leaf spot disease is a common foliar problem in cashew plantations, primarily caused by Pestalotiopsis species, with recent reports identifying Neopestalotiopsis clavispora as an emerging pathogen. Initial symptoms typically appear as small, circular to irregular brown lesions scattered across the leaf surface. As the disease progresses, these lesions enlarge and may coalesce, forming extensive necrotic patches (Manjunatha et al., 2023).
Healthy
Leaves in this class appear completely healthy. They have a smooth texture, a consistent light green color with luster and no visible signs of damage or infection. The veins are clearly defined and the leaves are intact, without deformities or discoloration. Such leaves are a positive sign of good plant health and effective crop management (Akinwale and Esan, 2021). Fig 4 shows sample images from each class of the dataset.
Methodology
Data augmentation not only addresses the class imbalance problem but also improves the model’s generalization capability by exposing it to different variations of the same leaf state. The augmented dataset therefore provides a more solid foundation for deep learning classification, reduces overfitting and improves the model’s robustness in real-world scenarios. Table 3 shows the class distribution before and after augmentation.
Data augmentation
To balance the class distribution and increase dataset diversity, data augmentation was performed using the Keras ImageDataGenerator. The following transformations were applied:
• Rotation up to 20°.
• Shifting width and height by 10%.
• Shearing up to 10%.
• Scaling within a 20% range.
• Horizontal flipping.
• Filling empty pixels using nearest neighbors after the transformation.
These transformations produce synthetic yet realistic variations of the original leaf images and enrich the dataset with new samples that reflect real-world variations.
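The transformations listed above could be expressed as a Keras `ImageDataGenerator` configuration roughly as follows. This is a minimal sketch, not the authors' exact settings: the numeric mapping is an assumption, in particular `shear_range=0.1` for "shearing up to 10%" (Keras documents this argument as a shear angle in degrees), and `AUGMENTATION_KWARGS` and `make_augmenter` are hypothetical names. It also assumes a TensorFlow version that still ships `ImageDataGenerator`.

```python
# Keyword arguments mirroring the listed transformations.
# The numeric mapping from the text to Keras arguments is an assumption.
AUGMENTATION_KWARGS = {
    "rotation_range": 20,       # rotation up to 20 degrees
    "width_shift_range": 0.1,   # horizontal shift up to 10% of the width
    "height_shift_range": 0.1,  # vertical shift up to 10% of the height
    "shear_range": 0.1,         # assumed encoding of "shearing up to 10%"
    "zoom_range": 0.2,          # scaling within a 20% range
    "horizontal_flip": True,    # random horizontal flipping
    "fill_mode": "nearest",     # fill empty pixels with nearest neighbors
}


def make_augmenter():
    """Build the generator; TensorFlow is imported lazily, only when called."""
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    return ImageDataGenerator(**AUGMENTATION_KWARGS)
```

With a batch of images stored as a 4D NumPy array, `make_augmenter().flow(images, batch_size=1)` would then yield randomly transformed copies.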
Balancing the dataset
Each class was augmented to exactly 637 images, resulting in a balanced dataset of 2,548 images covering all four classes.
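The balancing step amounts to simple bookkeeping: each class needs as many synthetic images as it falls short of 637. The sketch below illustrates this under stated assumptions; the original counts in `example_counts` are hypothetical placeholders, not the values from Table 3, and `augmentation_plan` is an illustrative helper.

```python
TARGET_PER_CLASS = 637  # per-class size stated in the text


def augmentation_plan(original_counts, target=TARGET_PER_CLASS):
    """Return how many synthetic images each class needs to reach the target."""
    plan = {}
    for label, count in original_counts.items():
        if count > target:
            raise ValueError(f"{label} already exceeds the target of {target}")
        plan[label] = target - count
    return plan


# Hypothetical original counts, NOT the values reported in Table 3.
example_counts = {
    "Anthracnose": 500,
    "Healthy": 637,
    "Leaf spot": 420,
    "Powdery mildew": 310,
}
plan = augmentation_plan(example_counts)
balanced_total = sum(example_counts[k] + plan[k] for k in example_counts)
```

Whatever the starting counts, four classes of 637 images give 2,548 images in total.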
Dataset annotation by expert
The collected leaf images were manually classified by agricultural professionals into four categories according to their respective conditions: “Leave-spot”, “Anthracnose”, “Powdery Mildew” and “Healthy”. This classification is based on theoretical and practical evidence gathered before the data collection methodology was implemented. The dataset annotation process is shown in Fig 5. The complete classification criteria are outlined below.
The study pipeline consists of three phases. In the first phase, data gathering and data pre-processing are performed. In the second phase, four different deep transfer learning (DTL) based models are used to extract features from the input images during training (Fig 6). Finally, in the third phase, the classification task is performed using a fully connected layer. These four models were selected because they represent a balanced combination of efficiency (EfficientNet-B0), feature richness (DenseNet-121), reliability (ResNet-50) and innovation (ViT-B16). Together, they provide a comprehensive set of transfer learning approaches for evaluating the cashew leaf dataset.
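The second and third phases (a pretrained feature extractor followed by a fully connected classifier) might look like the sketch below for the three CNN backbones available in `tf.keras.applications`. Several details are assumptions not confirmed by the text: the ImageNet weights, the frozen backbone, global average pooling, the softmax output, and the names `CLASS_NAMES`, `BACKBONES` and `build_classifier`. Model-specific input preprocessing is omitted, and ViT-B16 is not part of `tf.keras.applications`, so it would need a separate implementation.

```python
CLASS_NAMES = ["Anthracnose", "Healthy", "Leaf spot", "Powdery mildew"]
BACKBONES = ("EfficientNetB0", "DenseNet121", "ResNet50")  # tf.keras.applications names


def build_classifier(backbone_name="EfficientNetB0", input_size=224):
    """Pretrained feature extractor followed by one fully connected softmax layer."""
    if backbone_name not in BACKBONES:
        raise ValueError(f"unsupported backbone: {backbone_name}")

    import tensorflow as tf  # lazy import: only needed when a model is built

    backbone_cls = getattr(tf.keras.applications, backbone_name)
    backbone = backbone_cls(
        include_top=False,   # drop the original ImageNet classification head
        weights="imagenet",  # assumption: ImageNet-pretrained weights
        input_shape=(input_size, input_size, 3),
        pooling="avg",       # global average pooling yields one feature vector
    )
    backbone.trainable = False  # assumption: backbone frozen during training

    inputs = tf.keras.Input(shape=(input_size, input_size, 3))
    features = backbone(inputs, training=False)
    outputs = tf.keras.layers.Dense(len(CLASS_NAMES), activation="softmax")(features)
    return tf.keras.Model(inputs, outputs)
```

Calling `build_classifier("ResNet50")` would return a model that maps a 224×224 RGB image to four class probabilities.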
The accuracy of a deep CNN model depends largely on the quality of the dataset used for training. To ensure reliable performance, a thorough data cleaning process is carried out after data collection. This process involves removing any faulty or irrelevant images from the dataset. Furthermore, all images are resized to a uniform dimension of 224×224 pixels, which helps reduce computational complexity during training and enhances the overall performance of the models. Some details of the applied DTL models are given below.
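To give a sense of how much the resizing step reduces the input, the following sketch compares the pixel count of the captured images (4640 × 2610) with that of the 224×224 model input. The names are illustrative. Note that the captures are 16:9 while the model input is square, so a direct resize changes the aspect ratio unless the image is cropped or padded first; the text does not say which approach was used.

```python
def pixel_count(width, height):
    """Number of pixels in an image of the given size."""
    return width * height


ORIGINAL = (4640, 2610)  # capture resolution reported for the smartphone images
RESIZED = (224, 224)     # model input size

# How many times fewer pixels the network sees per image after resizing.
reduction = pixel_count(*ORIGINAL) / pixel_count(*RESIZED)

aspect_original = ORIGINAL[0] / ORIGINAL[1]  # 16:9 capture
aspect_resized = RESIZED[0] / RESIZED[1]     # square model input
```

Here `reduction` comes out at roughly 241, that is, each resized image carries about 241 times fewer pixels than the original capture.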
EfficientNet-B0
EfficientNet-B0 was developed using a compound scaling method that uniformly scales a network’s depth, width and resolution in a systematic and efficient manner. Unlike traditional scaling methods that arbitrarily increase these dimensions, EfficientNet implements a balanced scaling strategy that improves accuracy at low computational cost. EfficientNet-B0, being the baseline version, provides an excellent balance between model size, computational efficiency and accuracy, making it well suited for deployment.
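Compound scaling can be illustrated with a few lines of arithmetic. The sketch below uses the coefficients reported in the EfficientNet paper (depth 1.2, width 1.1, resolution 1.15), chosen so that their combined cost factor is roughly 2 per scaling step; the function name is illustrative, and the published larger variants round the resulting values.

```python
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution coefficients


def compound_scaling(phi):
    """Depth, width and resolution multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


# Approximate growth in compute per unit increase of phi (constrained to about 2).
flops_factor = ALPHA * BETA ** 2 * GAMMA ** 2

b0_multipliers = compound_scaling(0)  # the B0 baseline applies no scaling
b1_multipliers = compound_scaling(1)  # one scaling step
```

With `phi = 0` all three multipliers equal 1, which is why B0 serves as the baseline from which the larger variants are scaled.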
DenseNet-121
DenseNet-121 is a deep CNN that introduces the concept of dense connectivity. Within each dense block, every layer receives the feature maps of all preceding layers and passes its own feature maps to all subsequent layers. The 121-layer version strikes a balance between efficiency and depth, achieving high performance with low redundancy, and is particularly effective in tasks requiring rich feature representation.
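Because every layer's output is concatenated with everything before it, the channel count grows linearly inside a dense block. The sketch below tracks this for the standard DenseNet-121 configuration (growth rate 32, blocks of 6, 12, 24 and 16 layers, transitions that halve the channels) and recovers the 121-layer count; the function and constant names are illustrative.

```python
GROWTH_RATE = 32                # feature maps added by each dense layer
BLOCK_LAYERS = (6, 12, 24, 16)  # dense layers per block in DenseNet-121
INITIAL_CHANNELS = 64           # channels after the stem convolution


def densenet_channels():
    """Channel count leaving each dense block, with transitions halving it in between."""
    channels = INITIAL_CHANNELS
    per_block = []
    for index, n_layers in enumerate(BLOCK_LAYERS):
        channels += n_layers * GROWTH_RATE  # concatenation of all layer outputs
        per_block.append(channels)
        if index < len(BLOCK_LAYERS) - 1:
            channels //= 2                  # transition layer with compression 0.5
    return per_block


channels_per_block = densenet_channels()

# 1 stem conv + 2 convs per dense layer + 3 transition convs + 1 classifier layer
layer_count = 1 + 2 * sum(BLOCK_LAYERS) + (len(BLOCK_LAYERS) - 1) + 1
```

The final block ends with 1,024 channels, which is the size of the feature vector a classifier head would receive after global pooling.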
ResNet-50
ResNet-50 is a widely adopted CNN architecture that introduced the concept of residual learning through shortcut (or skip) connections. The “50” in ResNet-50 refers to its 50 weight layers (49 convolutional layers and one fully connected layer), making it a moderately deep model that balances computational cost and accuracy. ResNet-50 has become a benchmark model in computer vision, demonstrating strong generalization across diverse image classification and feature extraction tasks.
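The idea of a shortcut connection is that a block outputs F(x) + x rather than F(x) alone, so it can fall back to the identity when the learned mapping F contributes nothing. The toy sketch below shows this on plain Python lists and also recounts the 50 weight layers from the standard stage layout of 3, 4, 6 and 3 bottleneck blocks; all names are illustrative.

```python
def residual_block(transform, x):
    """Return F(x) + x, where `transform` plays the role of the learned mapping F."""
    fx = transform(x)
    return [f + xi for f, xi in zip(fx, x)]


# If F outputs zeros, the block reduces to the identity mapping.
identity_out = residual_block(lambda v: [0.0] * len(v), [1.0, 2.0, 3.0])

# A non-trivial F only has to learn the correction added on top of x.
doubled_out = residual_block(lambda v: list(v), [1.0, 2.0, 3.0])

# Weight layers: stem conv + 3 convs per bottleneck block + final fully connected layer
BOTTLENECKS_PER_STAGE = (3, 4, 6, 3)
weight_layers = 1 + 3 * sum(BOTTLENECKS_PER_STAGE) + 1
```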
Vision transformer (ViT-B16)
ViT-B16 represents a significant departure from CNNs by adapting the transformer architecture, originally designed for NLP, to computer vision tasks. The model splits the input image into fixed-size patches (in this case, 16×16 pixels), which are embedded and processed as a sequence by the transformer. It offers superior performance on large-scale datasets. The ViT-B16 variant, with its base configuration and 16-pixel patch size, strikes a balance between computational efficiency and accuracy in vision tasks.
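The patch-splitting step can be made concrete with a small sketch. For a 224×224 RGB input and 16-pixel patches, the image becomes a 14 × 14 grid of patches, each flattened to 768 values before embedding. The `patchify` helper below is an illustrative pure-Python version working on a single-channel grid, not the implementation used in the study.

```python
def patchify(image, patch):
    """Split a 2D grid (list of rows) into non-overlapping patch x patch tiles, row-major."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    tiles = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tiles.append([row[left:left + patch] for row in image[top:top + patch]])
    return tiles


IMAGE_SIZE, PATCH_SIZE, CHANNELS = 224, 16, 3
num_patches = (IMAGE_SIZE // PATCH_SIZE) ** 2          # 14 x 14 grid of patches
values_per_patch = PATCH_SIZE * PATCH_SIZE * CHANNELS  # length of a flattened RGB patch

# Toy 4x4 grid split into 2x2 tiles to show the ordering.
toy = [[r * 4 + c for c in range(4)] for r in range(4)]
toy_tiles = patchify(toy, 2)
```

The sequence length of 196 patches, rather than the pixel count, is what determines the cost of the transformer's attention layers.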