Full Research Article

Automated Mango Disease Detection using ViT-CNN Fusion Across Multiple Public Datasets

Anuja A. Gharpure¹,*

Neha Jain¹

Vaibhav E. Narawade²

Email: aagharpure@gmail.com

Affiliations

¹Pacific Academy of Higher Education and Research University, Udaipur-313 024. Rajasthan, India.

²Ramrao Adik Institute of Technology, D Y Patil Deemed to be University, Nerul, Navi Mumbai-400 706, Maharashtra, India.

Submitted: 16-01-2026
Accepted: 14-04-2026
First Online: 30-04-2026
DOI: 10.18805/IJARe.A-6510

ABSTRACT

Background: Mango is a commercially important fruit, and diseases affecting it can cause commercial losses. Early detection is needed to protect production. Traditional disease detection is time-consuming and labour-intensive, which affects quality and yield. Machine learning algorithms can accelerate the detection of diseases in mango fruits.

Methods: Two publicly available datasets were used in this work: a mango disease dataset and a mango leaf dataset. The proposed work combines convolutional neural networks (CNNs) and vision transformers (ViTs) for the detection of mango diseases. Lightweight pretrained models (ShuffleNet, MobileNet and ResNet) were used to extract features, and a ViT was then used to detect and classify the diseases. An accuracy of 100% was obtained on the leaf dataset for all tested models.

Result: Compared with the leaf dataset, the mango disease dataset gave lower accuracy: 89.17% for ShuffleNet and MobileNet and 86.23% for ResNet. These findings indicate that lightweight CNN models can provide high accuracy while remaining computationally efficient. Automated disease detection offers a promising pathway for real-world mango production environments, supporting precision agriculture and reducing commercial losses.

KEYWORDS

INTRODUCTION

The fruit industry presents a lucrative opportunity for entrepreneurs and businesses. The fresh fruit trade is a growing market that requires processing, packaging, cold chain logistics and retail analytics. Climatic changes and pest infestations, combined with improper treatment, can negatively impact fruit cultivation. A wide range of diseases affect both fruits and their outer skin, and farmers suffer commercial losses as a result. Studies report that fruit diseases contribute to nearly 10% of postharvest losses (Patel and Patil, 2024). To prevent this loss, early detection of fruit diseases is necessary. Human diagnosis can be inconsistent because of individual perspectives and varying levels of expertise. Advances in technology, such as machine learning techniques, can improve the detection of diseases (Mehta et al., 2025). Technologies such as precision farming and the Internet of Things (IoT) can help farmers increase yield, monitor environmental conditions and improve fruit quality (Abbasi et al., 2022).

Common fruit diseases such as anthracnose, powdery mildew, mango malformation, mango scab, black Sigatoka, Colletotrichum musae, black rot and apple scab affect crops such as mango, apple, banana and pineapple (Meena et al., 2024). This research focuses on the detection of diseases found in mango, a task in which machine intelligence can enhance detection.

In this research, two datasets on mango diseases are used to assess the efficiency of machine learning models in predicting mango diseases. Recent research shows that convolutional neural networks (CNNs) achieve remarkable accuracy in image-based prediction (Li et al., 2022; Patel et al., 2023). Several algorithms have been combined with CNNs to improve prediction and classification accuracy, including LSTM, BiLSTM, the Honey Badger Optimisation Algorithm, SVM and Random Forest (Yuan et al., 2024; Patil and Deshpande, 2024). Transfer learning is another technique that has improved performance through fine-tuning (Reddy et al., 2020; Joseph et al., 2021).

Transfer learning typically involves using a pre-trained base model, performing feature extraction, adding optional layers such as dropout or batch normalization and applying a suitable training strategy. Pre-trained models commonly used in transfer learning include ResNet-50, EfficientNet, Vision Transformer and Swin Transformer (Zhuang et al., 2021). Fig 1 illustrates the architecture of transfer learning.

Fig 1: Architecture of transfer learning.
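The transfer learning steps described above (a pre-trained base, feature extraction, optional batch normalization and dropout layers, and a training strategy) can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the small `base` network is an untrained stand-in for a real pre-trained model, and the function name `build_transfer_model`, the dropout rate, the learning rate and the class count are all hypothetical.

```python
import torch
from torch import nn


def build_transfer_model(base: nn.Module, feat_dim: int, num_classes: int,
                         freeze_base: bool = True) -> nn.Module:
    """Attach a new classification head to a feature-extracting base network."""
    if freeze_base:
        # Feature extraction only: the base weights are not updated.
        for param in base.parameters():
            param.requires_grad = False
    head = nn.Sequential(
        nn.BatchNorm1d(feat_dim),   # optional layer: batch normalization
        nn.Dropout(p=0.3),          # optional layer: dropout (rate is an assumption)
        nn.Linear(feat_dim, num_classes),
    )
    return nn.Sequential(base, head)


# Stand-in base (assumption): in practice this would be a pre-trained CNN
# with its original classifier removed, returning one feature vector per image.
base = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

model = build_transfer_model(base, feat_dim=16, num_classes=8)

# Training strategy: optimise only the parameters that are still trainable.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)
```

Calling the function with `freeze_base=False` leaves the base trainable, which corresponds to fine-tuning the pre-trained layers rather than using them purely as a fixed feature extractor.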

Researchers need to tackle real-world challenges such as lighting variations, occlusion and noisy images. Preprocessing and augmentation help to reduce these difficulties (Bhat et al., 2023). Preprocessing techniques such as resizing, shifting and zooming are applied to enhance image quality (Rahman et al., 2023).

Despite these advances, a need remains for lightweight, computationally efficient models that can deliver robust accuracy across multiple datasets and disease classes. The current research focuses on the detection of diseases on mangoes and mango leaves. It combines pretrained lightweight CNN variants (MobileNet, ShuffleNet and ResNet) for feature extraction from RGB images with a ViT for disease detection, and fine-tunes the pre-trained layers. Performance metrics are compared to identify the most efficient predictor for each dataset. The proposed framework is designed to balance accuracy with efficiency, enabling scalable deployment in precision agriculture. The main contributions of this work are summarized as follows:
1. A ViT-CNN fusion that combines feature extraction and contextual reasoning is used for mango disease detection.
2. The comparative performance of three lightweight pretrained CNNs (MobileNet, ShuffleNet and ResNet) is evaluated across datasets related to mango diseases.
3. The efficiency of the tested models is assessed with a view to deployment in precision agriculture.

Related work

Machine learning and computer vision play a significant role in fruit classification and disease detection, enabling the automation of systems that rely on visual features. For classification and disease detection, research has focused on various image processing techniques and feature extraction methods, tested on both publicly available and self-collected repositories. Publicly available datasets are analysed with attention to resolution, lighting, variety and background complexity. This analysis highlights the strengths and limitations of the different approaches.

To detect defects in apples, the YOLOv4 algorithm was applied to images obtained using an NIR camera. YOLOv4 achieved over 92% accuracy in detecting defects across apple varieties, demonstrating its robustness. Variants of the YOLOv4 algorithm were also applied to the dataset and gave an average overall accuracy of 93.9% (Fan et al., 2022).

Surveys indicate that VGGNet generally outperforms AlexNet in fruit disease detection. Observations suggest that hybrid approaches combining both models can achieve an accuracy of nearly 99% (Goel and Pandey, 2022). Tomato leaf diseases have been successfully classified using DenseNet121, achieving high accuracy across multiple class configurations. The publicly available dataset was organised in three ways: 5-class, 7-class and 10-class classification. The accuracy obtained on the original dataset was 98.16%, 95.08% and 94.34% for 5-class, 7-class and 10-class classification, respectively (Abbas et al., 2021). Similarly, grape leaf diseases were detected using a DenseNet121 model with an accuracy of 99.86% (Patil and More, 2025).

Transfer learning with a convolutional neural network gave 94.8% accuracy on the public dataset FIDS 30 for fruit classification and detection (Geerthik et al., 2024). On the same dataset, the AlexNet algorithm gave 75% accuracy (Geerthik et al., 2024). An RNN gave 98.47% accuracy on FIDS 30 (Dhiman et al., 2021).

The public dataset Fruit-360, with more than 40,000 images and available on the Kaggle website, is widely used in CNN-based classification problems (Oltean, 2025). One such study, which classified images of apple, lemon and mango, reported 95% accuracy (Bobde et al., 2021). Banana ripeness classification using YOLOv8 variants on a dataset of 18,000 images achieved accuracy between 94% and 96% (Aishwarya and Kumar, 2023).

Hybrid approaches, such as combining CNNs with optimization algorithms like Honey Badger, have achieved near-perfect accuracy in pomegranate disease classification (Patil and Deshpande, 2024). Feature extraction was performed with ResNet-50 and Detectron2. The multiclass classification achieved 99.58% accuracy in predicting the diseases (Patil and Deshpande, 2024).

In another study, a self-collected repository, FruitQ, based on images of 11 fruits was created and tested with deep learning algorithms. Among them, ResNet18 achieved 99.80% in the classification problem (Abayomi-Alli et al., 2024).

Object detection frameworks such as YOLOv8 and Faster R-CNN have been used to localise and quantify mango fruit and leaf diseases (Srinivasan et al., 2025). For a dataset of Alphonso mangoes from Mysore, Karnataka, machine learning classifiers gave 83% and 82% accuracy in hierarchical classification and single-shot multiclass classification, respectively (Raghavendra et al., 2020).

Recent research reported accuracy above 98% on the MangoFruitDDS and MangoLeafBD datasets using ConvNeXt and Vision Transformers (ViTs), underscoring the potential of transformer-based architectures for mango disease detection (Alamri et al., 2025).

While prior studies demonstrate high accuracy in fruit disease detection, most focus on single datasets or computationally heavy models. There remains a need for lightweight, efficient architectures that generalize across multiple mango disease datasets. This study addresses this gap by proposing a ViT-CNN fusion framework.

MATERIALS AND METHODS

The research was carried out at Pacific Academy of Higher Education and Research University, Udaipur, Rajasthan, India and Ramrao Adik Institute of Technology, D. Y. Patil University, Navi Mumbai, Maharashtra, India from February 2025 to September 2025.

Datasets used in work

Two public datasets focused on mango diseases are used in this research. The dataset with images of diseases on mango fruits is referred to as the ‘Mango Disease Dataset’ and the dataset with images of diseases on mango leaves is referred to as the ‘Mango Leaf Disease Dataset’.

Following the recommendations of earlier research, both datasets are rearranged into training and testing folders (Mehta et al., 2021), where each folder contains one subfolder per class. The images are thus organised as a labelled classification problem to facilitate supervised learning. The mango disease dataset covers the diseases observed on the skin of mangoes. Its classes are alternaria, anthracnose, black mould rot, stem-end rot and healthy (Faye et al., 2023). The mango leaf dataset contains images of diseases observed on mango leaves: anthracnose, bacterial canker, cutting weevil, die back, gall midge, powdery mildew and sooty mould (Ali et al., 2022). It also contains images of healthy leaves. The details of the datasets are given in Table 1.

Table 1: Dataset description.

Both datasets are divided into training and testing sets in an 80:20 proportion. A validation subset is then taken from the training set. This division helps the algorithm to learn features and adjust the hyperparameters using the validation set (James et al., 2021; Han et al., 2022). A separate test set is needed for an unbiased estimate of the model's predictive performance (Han et al., 2022). Such partitioning enhances robustness and gives a reliable accuracy estimate. It is common practice in machine learning and deep learning applications in agriculture to ensure the reliability and reproducibility of results (Liu et al., 2025).
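The partitioning described above can be sketched in plain Python as follows. This is an illustrative sketch only: the text gives the 80:20 train/test proportion but not the size of the validation subset, so `val_frac=0.1` is an assumption, as are the per-class (stratified) splitting, the function name `split_dataset` and the fixed seed.

```python
import random


def split_dataset(paths_by_class, test_frac=0.2, val_frac=0.1, seed=0):
    """Split image paths into train/val/test lists of (path, label) pairs.

    paths_by_class maps a class name to a list of image paths.
    The split is done class by class so every class appears in each subset.
    val_frac is the share of the remaining training images held out for
    validation (an assumed value, not taken from the article).
    """
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for label, paths in sorted(paths_by_class.items()):
        items = sorted(paths)
        rng.shuffle(items)
        n_test = round(len(items) * test_frac)
        n_val = round((len(items) - n_test) * val_frac)
        splits["test"] += [(p, label) for p in items[:n_test]]
        splits["val"] += [(p, label) for p in items[n_test:n_test + n_val]]
        splits["train"] += [(p, label) for p in items[n_test + n_val:]]
    return splits
```

Fixing the seed keeps the partition reproducible, and holding the test images out before any tuning keeps the final accuracy estimate independent of the hyperparameter choices.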

Data augmentation

Machine learning and deep learning algorithms need high-quality images to give good results. To save time on manual annotation and data balancing, data augmentation is used (Mumuni and Mumuni, 2022). Augmentation techniques such as geometric transformation and colour transformation are used to enhance the images in the dataset (Mumuni and Mumuni, 2022). An image size of 224 × 224 is used in the current research. Random flipping, 15-degree rotation and random cropping with a padding of 4 are applied to all images of both datasets. Each image is then normalised per colour channel with mean values 0.485, 0.456, 0.406 and standard deviations 0.229, 0.224, 0.225.
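Three of the operations listed above can be illustrated in dependency-free Python. This is a sketch under assumptions, not the authors' pipeline: images are represented as nested lists of RGB tuples with values already scaled to [0, 1], "random padding 4" is read as zero-padding by 4 pixels followed by a random crop back to the original size, and all function names are hypothetical. The mean and standard deviation values are the ones quoted in the text.

```python
import random

MEAN = (0.485, 0.456, 0.406)  # per-channel means quoted in the text
STD = (0.229, 0.224, 0.225)   # per-channel standard deviations quoted in the text


def normalize_pixel(rgb):
    """Normalise one RGB pixel (values in [0, 1]) channel by channel."""
    return tuple((v - m) / s for v, m, s in zip(rgb, MEAN, STD))


def hflip(image):
    """Mirror an image stored as a list of rows of pixels."""
    return [row[::-1] for row in image]


def pad_and_random_crop(image, padding=4, rng=random):
    """Zero-pad every side, then crop back to the original size at a random offset."""
    height, width = len(image), len(image[0])
    zero = (0.0, 0.0, 0.0)
    blank_row = [zero] * (width + 2 * padding)
    padded = [list(blank_row) for _ in range(padding)]
    padded += [[zero] * padding + list(row) + [zero] * padding for row in image]
    padded += [list(blank_row) for _ in range(padding)]
    top = rng.randint(0, 2 * padding)   # randint includes both end points
    left = rng.randint(0, 2 * padding)
    return [row[left:left + width] for row in padded[top:top + height]]
```

The random operations would normally be applied to training images only, while normalisation is applied to every image so that training and test inputs share the same scale.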

Tested algorithms

In the implementation of the proposed ViT model, the model was built with pre-trained ResNet, MobileNet and ShuffleNet backbones. The pretrained models are used for feature extraction, and the ViT is used to detect and predict the result.

ShuffleNet is a lightweight architecture suited to efficient implementation on mobile devices. It uses pointwise group convolution and channel shuffling, which reduce computation and speed up calculation (Zhang et al., 2023).
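The two ShuffleNet ideas named above can be illustrated with a short sketch. The code below is not taken from any ShuffleNet implementation; the function names are hypothetical. It shows channel shuffling as an index permutation (view the channels as a groups × channels-per-group grid, transpose it, and flatten) and counts the weights of a 1 × 1 convolution with and without groups (biases ignored).

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups.

    Equivalent to reshaping the channel axis to (groups, per_group),
    transposing to (per_group, groups) and flattening again.
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


def pointwise_conv_weights(c_in, c_out, groups=1):
    """Weight count of a 1x1 convolution; each group sees only c_in / groups inputs."""
    if c_in % groups != 0 or c_out % groups != 0:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * (c_out // groups) * groups


print(channel_shuffle([0, 1, 2, 3, 4, 5], groups=2))  # [0, 3, 1, 4, 2, 5]
print(pointwise_conv_weights(240, 240, groups=1))     # 57600
print(pointwise_conv_weights(240, 240, groups=3))     # 19200
```

Grouping divides the weight count by the number of groups, and the shuffle lets the next grouped layer receive channels from every group, so information is not confined to a single group.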

MobileNetV2 is also tested in the current research work. This architecture is optimised for devices such as smartphones and for CPUs, GPUs and DSPs (Qin et al., 2024).

ResNet 18 is used in transfer learning because of its robustness (Bhogal and Singh, 2025). During implementation, however, ResNet 18 was observed to require more computational time than ShuffleNet and MobileNet. Because it is widely used in transfer learning and domain-specific applications, it is also tested in the current work to predict the classes of both datasets (Bhogal and Singh, 2025). Transfer learning also gave 92.76% accuracy in areca nut fruit disease detection, where a CNN was combined with an SVM (Shree et al., 2025).

The ViT-CNN fusion is applied to each dataset. Augmentation is applied to each dataset, which is then partitioned into training and testing sets, with a validation subset created from the training set. The ViT takes the features from the pre-trained model and forms patches, each of which acts as a token from which the transformer learns the patterns present in the images (Zhuang et al., 2021). The final prediction is made by the ViT.
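One way to read the fusion described above is sketched below in PyTorch: a CNN produces a feature map, each spatial position of that map becomes a token, and a transformer encoder with a class token produces the prediction. This is a hedged sketch, not the authors' architecture. `TinyBackbone` is an untrained stand-in for the pretrained CNN, the token dimension, depth and head count are assumed values, and positional embeddings are omitted for brevity.

```python
import torch
from torch import nn


class TinyBackbone(nn.Module):
    """Untrained stand-in (assumption) for a pretrained CNN feature extractor."""

    def __init__(self, out_channels: int = 64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=4, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, out_channels, kernel_size=3, stride=4, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.features(x)  # (batch, channels, height, width)


class CnnTransformerClassifier(nn.Module):
    """CNN feature map -> token sequence -> transformer encoder -> class scores."""

    def __init__(self, backbone, feat_channels, num_classes,
                 dim=64, heads=4, depth=2):
        super().__init__()
        self.backbone = backbone
        self.proj = nn.Linear(feat_channels, dim)           # token embedding
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=2 * dim, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, images):
        fmap = self.backbone(images)                        # (B, C, H, W)
        tokens = fmap.flatten(2).transpose(1, 2)            # (B, H*W, C)
        tokens = self.proj(tokens)                          # (B, H*W, dim)
        cls = self.cls_token.expand(tokens.size(0), -1, -1)
        encoded = self.encoder(torch.cat([cls, tokens], dim=1))
        return self.head(encoded[:, 0])                     # classify from class token
```

With a real pretrained backbone, `feat_channels` would be set to the channel count of its final feature map, and `num_classes` to the number of disease classes in the dataset being used.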

The loss is calculated using cross entropy, which is widely used in classification tasks because it penalises incorrect predictions and facilitates faster convergence and better discrimination (Zhang et al., 2024). The Adam optimiser is used for its computational efficiency and adaptive learning rate. Its adaptive nature enables faster convergence with minimal parameter tuning, making it particularly suitable for deep learning models in agricultural and image-based disease detection applications (Patil and Sawant, 2023). The architecture used is given in Fig 2. The same architecture is used with MobileNet and ShuffleNet for feature extraction.

Fig 2: Implementation of the ViT transform with pretrained ResNet 18.
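The cross-entropy loss mentioned above can be written out for a single image as the negative log of the softmax probability assigned to the true class. The sketch below uses only the standard library; the function names are illustrative and the logits are arbitrary numbers, not model outputs from this study.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability of the true class index `target`."""
    return -math.log(softmax(logits)[target])


# A confident correct prediction gives a small loss,
# a confident wrong prediction gives a large one.
low = cross_entropy([4.0, 0.0, 0.0], target=0)
high = cross_entropy([4.0, 0.0, 0.0], target=1)
```

When all five classes of a five-class problem receive the same score, the loss equals ln 5, which is the value expected from a model that has learned nothing yet.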

RESULTS AND DISCUSSION

Following the procedure described by Gharpure et al. (2026), the three pretrained models (ShuffleNet, MobileNet and ResNet) are used to extract features before the ViT is applied to disease detection. The threshold value is found by gradually increasing the number of epochs and checking the accuracy of mango disease detection (Gharpure et al., 2026).

The behaviour of each pretrained model on each dataset is given in Table 2. With ShuffleNet, the mango disease dataset reached its threshold value after the 60th epoch, where 89.13% accuracy was obtained, and the mango leaf dataset gave 100% accuracy after the 30th epoch. With MobileNet, the mango disease dataset gave the same accuracy as ShuffleNet, 89.13% after the 60th epoch, and the mango leaf dataset reached the threshold value of 100% after the 30th epoch. With the pretrained ResNet model, the mango disease dataset gave 86.23% accuracy, which is lower than the MobileNet and ShuffleNet results. Fig 3 shows that, on the mango disease dataset, the accuracy threshold was reached after the 60th epoch for MobileNet and ShuffleNet, whereas ResNet reached its threshold value after the 20th epoch. The mango leaf disease dataset reached the threshold value after the 40th epoch for all CNN backbones.

Table 2: Accuracy of the proposed ViT transformer with pretrained ShuffleNet, MobileNet and ResNet models.

Fig 3: Comparative chart of accuracy for both datasets using ViT-CNN fusion.
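The threshold search described above, in which the number of epochs is increased until accuracy stops improving, might be automated along the following lines. The exact criterion used in the study is not stated, so the rule here (no improvement over the next `patience` checkpoints) is an assumption, and the accuracy values in the example are made-up numbers for illustration only, not results from this study.

```python
def threshold_epoch(history, patience=2, tol=0.0):
    """Return the first epoch after which accuracy no longer improves.

    history is a list of (epoch, accuracy) checkpoints in increasing epoch order.
    An epoch qualifies when none of the next `patience` checkpoints beats its
    accuracy by more than `tol`. Returns None if no checkpoint qualifies.
    """
    for i, (epoch, acc) in enumerate(history):
        later = history[i + 1:i + 1 + patience]
        if len(later) == patience and all(a <= acc + tol for _, a in later):
            return epoch
    return None


# Hypothetical checkpoints (NOT the values reported in Table 2 or Fig 3).
hypothetical = [(10, 0.70), (20, 0.80), (30, 0.85), (40, 0.85), (50, 0.85)]
plateau = threshold_epoch(hypothetical)
```

A larger `patience` makes the rule more conservative, at the cost of training for more epochs before the plateau is declared.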

Accuracy for ShuffleNet and MobileNet is nearly identical on both datasets. On reaching the threshold value, the Mango Leaf Dataset gave 100% accuracy for all pretrained models. This indicates that the leaf images in the Mango Leaf Dataset possess distinct and well-separated disease features. The high accuracy indicates that the models capture both colour-based and texture-based features accurately. From Fig 3, ShuffleNet and MobileNet reach equivalent accuracy on the mango disease dataset after the 60th epoch, while the ResNet implementation is slightly less accurate. The relatively lower performance of ResNet might be due to its higher model complexity, which can lead to overfitting on smaller or less diverse datasets (Tan and Le, 2020; Dosovitskiy et al., 2021).

From the analysis of the three models, the lightweight CNN models integrated with ViTs are more suitable for real-time mango disease detection, as they balance accuracy with computational efficiency. The findings also highlight the importance of dataset quality and disease separation for robust performance.

CONCLUSION

The current work focuses on the performance of pre-trained models, including ResNet, MobileNet and ShuffleNet, on two publicly available datasets related to mango diseases. Performance on the Mango Leaf Dataset was consistent across all three variants, with high accuracy. When the two datasets are compared in terms of the number of images, the mango leaf dataset yields better accuracy than the mango disease dataset, as it consists of more images. Not only the number of images but also the cleaning, preprocessing and augmentation of the images help to increase accuracy. The effect of pairing the ViT with pre-trained CNN variants can be examined further to check performance. Additional diseases found in local geographical areas, with finer classification, can also be added in further study. The results suggest that efficient lightweight architectures such as MobileNet and ShuffleNet are more suitable for real-time applications because of their computational efficiency. Future work will evaluate accuracy on self-collected repositories in which varying lighting conditions, occlusions and other diseases are considered. The current work provides a pathway for detecting mango diseases to support precision agriculture and reduce commercial loss.

ACKNOWLEDGEMENT

The present study was supported by the guidance and encouragement of Dr. Vaibhav E. Narawade (Professor, RAIT, Mumbai) and Dr. Neha Jain (Assistant Professor, PAHER University). The authors also acknowledge the facilities and cooperation provided by the Department of Computer Science, PAHER University, Udaipur.

Disclaimers

The views and conclusions expressed in this article are solely those of the authors and do not necessarily represent the views of their affiliated institutions. The authors are responsible for the accuracy and completeness of the information provided, but do not accept any liability for any direct or indirect losses resulting from the use of this content.

CONFLICT OF INTEREST

The authors declare that there are no conflicts of interest regarding the publication of this article. No funding or sponsorship influenced the design of the study, data collection, analysis, decision to publish, or preparation of the manuscript.

REFERENCES

Abayomi-Alli, O.O., Damaševičius, R., Misra, S. and Abayomi-Alli, A. (2024). FruitQ: A new dataset of multiple fruit images for freshness evaluation. Multimedia Tools and Applications. 83(4): 11433-11460. https://doi.org/10.1007/s11042-023-16058-6.

Abbas, A., Jain, S., Gour, M. and Vankudothu, S. (2021). Tomato plant disease detection using transfer learning with C-GAN synthetic images. Computers and Electronics in Agriculture. 187: 106279. https://doi.org/10.1016/j.compag.2021.106279.

Abbasi, R., Shamsuddin, M.F.H. and Rahman, K.A. (2022). The digitization of the agricultural industry: A systematic review. Smart Agricultural Technology. 2: 100009. https://doi.org/10.1016/j.atech.2022.100009.

Aishwarya, N. and Kumar, R.V. (2023). Banana Ripeness Classification with Deep CNN on NVIDIA Jetson Xavier AGX. Proceedings of the 7th International Conference on I-SMAC. pp 663-668. https://doi.org/10.1109/I-SMAC58438.2023.10290326.

Alamri, F.S., Sadad, T., Almasoud, A.S., Aurangzeb, A. and Khan, A. (2025). Mango disease detection using fused vision transformer with ConvNeXt architecture. Computers, Materials and Continua. https://doi.org/10.32604/cmc.2025.061890.

Ali, S., Ibrahim, M., Ahmed, S.I., Nadim, M., Rahman, M.R., Shejunti, M.M. and Jabid, T. (2022). MangoLeafBD Dataset (Version 1) [Dataset]. Mendeley Data. https://doi.org/10.17632/hxsnvwty3r.1.

Bhat, A., Khan, F. and Mir, I.A. (2023). Impact of image preprocessing techniques on the performance of deep learning models for mango fruit classification. Information Processing in Agriculture. 10(3): 450-462.

Bhogal and Singh (2025). A Comprehensive Review of ResNet-18: Architecture and Applications. In Advances in Data-Driven Computing and Intelligent Systems (Lecture Notes in Networks and Systems). Springer.

Bobde, S., Jaiswal, S., Kulkarni, P., Patil, O., Khode, P. and Jha, R. (2021). Fruit quality recognition using deep learning algorithm. Proceedings of the International Conference on Smart Generation Computing, Communication and Networking (SMART GENCON 2021). https://doi.org/10.1109/SMARTGENCON51891.2021.9645793.

Das, J., Chakraborty, S. and Ghosh, D. (2022). Non-destructive mango quality assessment using support vector machine on near-infrared spectral data. Postharvest Biology and Technology. 185: 111780.

Dhiman, B., Kumar, Y. and Hu, Y.C. (2021). A general-purpose multi-fruit system for assessing fruit quality using recurrent neural networks. Soft Computing. 25(14): 9255-9272. https://doi.org/10.1007/s00500-021-05867-2.

Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J. and Houlsby, N. (2021). An image is worth 16x16 words: Transformers for image recognition at scale. International Conference on Learning Representations (ICLR).

Fan, S., Xiaoting, L., Wenqian, H., Vincent, J.Z., Qi, P., Xin, H., Lianjie, L. and Chi, Z. (2022). Real-time defect detection for apple sorting using NIR cameras with pruning-based YOLOv4 network. Computers and Electronics in Agriculture. 193: 106715. https://doi.org/10.1016/j.compag.2022.106715.

Faye, D., Diop, I., Mbaye, N., Diedhiou, M.M. and Dione, D. (2023). Sen Mango Fruit DDS (Version 4) [Dataset]. Mendeley Data. https://doi.org/10.17632/jvszp9cbpw.4.

Geerthik, S., Senthil, G.A., Oliviya, K.J. and Keerthana, R. (2024). A System and Method for Fruit Ripeness Prediction using Transfer Learning and CNN. Proceedings of the International Conference on Communication, Computing and Internet of Things (IC3IoT 2024). https://doi.org/10.1109/IC3IoT60841.2024.10550209.

Gharpure, A.A., Jain, N. and Narawade, V.E. (2026). ViT-CNN fusion for robust mango quality evaluation based on classification across multiple public datasets. Indian Journal of Agricultural Research. 1-7. doi: 10.18805/IJARe.A-6493.

Goel, S. and Pandey, K. (2022). A survey on deep learning techniques in fruit disease detection. International Journal of Distributed Systems and Technologies. 13(8): 1-19. https://doi.org/10.4018/IJDST.307901.

Han, J., Kamber, M. and Pei, J. (2022). Data Mining: Concepts and Techniques (4th ed.). Morgan Kaufmann.

James, G., Witten, D., Hastie, T. and Tibshirani, R. (2021). An Introduction to Statistical Learning with Applications in R (2nd ed.). Springer.

Joseph, T., Mathew, G. and Abraham, S. (2021). Development of a mobile application for mango variety identification using fine-tuned convolutional neural networks. Applied Engineering in Agriculture. 37(6): 987-995.

Li, W., Zhang, Q., Wei, R. and Chen, S. (2022). High-accuracy mango variety classification using deep convolutional neural networks with attention mechanisms. Computers and Electronics in Agriculture. 198: 107025.

Liu, K., Yang, X., Ding, W., Ju, H., Li, T., Wang, J. and Yin, T. (2025). A survey on rough feature selection: Recent advances and challenges. IEEE/CAA Journal of Automatica Sinica. 12(4): 842-863. https://doi.org/10.1109/JAS.2025.125231.

Meena, H.S., Singh, A.N. and Shukla, P.K. (2024). Fruit and vegetable disease detection and classification: Recent trends, challenges and future opportunities. Engineering Applications of Artificial Intelligence. 133: 108187. https://doi.org/10.1016/j.engappai.2024.108187.

Mehta, D., Sehgal, S., Choudhury, T. and Sarkar, T. (2021). Fruit quality analysis using modern computer vision methodologies. Proceedings of the IEEE Madras Section International Conference (MASCON 2021). https://doi.org/10.1109/MASCON51689.2021.9563427.

Mehta, R.A., Kumar, P., Prem, G., Aggarwal, S. and Kumar, R. (2025). AI-powered innovations in agriculture: A systematic review on plant disease detection and classification. Indian Journal of Agricultural Research. 59(9): 1321-1330. doi: 10.18805/IJARe.A-6371.

Mumuni, A. and Mumuni, F. (2022). Data augmentation: A comprehensive survey of modern approaches. Array. 16: 100258. https://doi.org/10.1016/j.array.2022.100258.

Oltean, M. (2025). Fruit-360 Dataset [Dataset]. Kaggle. https://www.kaggle.com/datasets/moltean/fruits.

Patel, H.B. and Patil, N.J. (2024). Enhanced CNN for fruit disease detection and grading classification using SSDAE-SVM for postharvest fruits. IEEE Sensors Journal. 24(5): 6719-6732. https://doi.org/10.1109/JSEN.2023.3342833.

Patel, R., Sharma, V. and Gupta, A. (2023). Real-time mango maturity grading based on lightweight CNN architectures for embedded systems. Journal of Agricultural Engineering Research. 34(2): 112-125.

Patil, R.G. and More, A. (2025). A comparative study and optimization of deep learning models for grape leaf disease identification. Indian Journal of Agricultural Research. 59(4): 654-663. doi: 10.18805/IJARe.A-6242.

Patil, S.P. and Deshpande, A.A. (2024). Disease detection and classification in pomegranate fruit using hybrid convolutional neural network with honey badger optimization algorithm. International Journal of Food Properties. 27(1): 815-837. https://doi.org/10.1080/10942912.2024.2365927.

Patil, S.S. and Sawant, P.V. (2023). Deep learning-based crop disease classification using transfer learning and optimization techniques. IEEE Access. 11: 121456-121468. https://doi.org/10.1109/ACCESS.2023.3311234.

Qin, D., Chas, L., Manolis, D., Marco, F., Shixin, L., Fan, Y. et al. (2024). MobileNetV4: Universal models for the mobile ecosystem. arXiv preprint arXiv:2404.10518. https://arxiv.org/abs/2404.10518.

Raghavendra, A., Guru, D.S., Rao, M.K. and Sumithra, R. (2020). Hierarchical approach for ripeness grading of mangoes. Artificial Intelligence in Agriculture. 4: 243-252. https://doi.org/10.1016/j.aiia.2020.10.003.

Rahman, M.M., Basar, M.A., Shinti, T.S., Khan, M.S.I., Babu, H.M.H. and Uddin, K.M.M. (2023). A deep CNN approach to detect and classify local fruits through a web interface. Smart Agricultural Technology. 5: 100321. https://doi.org/10.1016/j.atech.2023.100321.

Reddy, L., Devi, K. and Rao, M. (2020). Transfer learning for efficient mango disease classification with limited data. Plant Pathology Journal. 36(4): 380-387.

Shree, V.N.S., Rajarajeswari, S. and Basavaraj, G.N. (2025). Transfer learning-based Areca nut (Areca catechu) disease detection using CNN and SVM approaches with ResNet-50 for improved deep learning performance. Indian Journal of Agricultural Research. 59(9): 1385-1394. doi: 10.18805/IJARe.A-6404.

Srinivasan, S., Jain, A.K. and Sharma, P.R. (2025). DBA-ViNet: An effective deep learning framework for fruit disease detection. BMC Plant Biology.

Tan, M. and Le, Q.V. (2020). EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. Proceedings of the 37th International Conference on Machine Learning (ICML). PMLR. 6105-6114.

Yuan, Y., Chen, J., Polat, K. and Alhudhaif, A. (2024). Detecting fruit and vegetable freshness through integration of convolutional neural networks and bidirectional long short-term memory networks. Current Research in Food Science. 8: 100723. https://doi.org/10.1016/j.crfs.2024.100723.

Zhang, A., Lipton, Z.C., Li, M. and Smola, A.J. (2024). Dive into Deep Learning. Cambridge University Press.

Zhang, X., Zhu, X., Li, B., Guan, Z. and Che, W. (2023). LA-ShuffleNet: A strong convolutional neural network for edge computing devices. IEEE Access. 11: 116684-116695. https://doi.org/10.1109/ACCESS.2023.3324713.

Zhuang, F., Qi, Z., Duan, K., Xi, D., Zhu, Y., Zhu, H., Xiong, H. and He, Q. (2021). A comprehensive survey on transfer learning. Proceedings of the IEEE. 109(1): 43-76. https://doi.org/10.1109/JPROC.2020.2978386.

Disclaimer:

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

Copyright:

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Published In

Indian Journal of Agricultural Research


ORCID (Anuja A. Gharpure): 0009-0006-6993-118X

Neha Jain¹

Vaibhav E. Narawade²

Email aagharpure@gmail.com

Affiliations

¹Pacific Academy of Higher Education and Research University, Udaipur-313 024. Rajasthan, India.

²Ramrao Adik Institute of Technology, D Y Patil Deemed to be University, Nerul, Navi Mumbai-400 706, Maharashtra, India.

Submitted16-01-2026|
Accepted14-04-2026|
First Online 30-04-2026|
doi 10.18805/IJARe.A-6510

Abayomi-Alli, O.O., Damaševièius, R., Misra, S. and Abayomi-Alli, A. (2024). FruitQ: A new dataset of multiple fruit images for freshness evaluation. Multimedia Tools and Applications. 83(4): 11433-11460. https://doi.org/10.1007/s11042-023- 16058-6.

Abbas, A., Jain, S., Gour, M. and Vankudothu, S. (2021). Tomato plant disease detection using transfer learning with C- GAN synthetic images. Computers and Electronics in Agriculture. 187: 106279. https://doi.org/10.1016/ j.compag.2021.106279.

Abbasi, R., Shamsuddin, M.F.H. and Rahman, K.A. (2022). The digitization of the agricultural industry: A systematic review. Smart Agricultural Technology. 2: 100009. https:/ /doi.org/10.1016/j.atech.2022.100009.

Aishwarya, N. and Kumar, R.V. (2023). Banana Ripeness Classification with Deep CNN on NVIDIA Jetson Xavier AGX. Proceedings of the 7^th International Conference on I-SMAC. pp 663- 668. https://doi.org/10.1109/I-SMAC58438.2023.10290326.

Alamri, F.S., Sadad, T., Almasoud, A.S., Aurangzeb, A. and Khan, A. (2025). Mango disease detection using fused vision transformer with ConvNeXt architecture. Computers, Materials and Continua. https://doi.org/10.32604/ cmc.2025.061890.

Ali, S., Ibrahim, M., Ahmed, S.I., Nadim, M., Rahman, M.R., Shejunti, M.M. and Jabid, T. (2022). MangoLeafBD Dataset (Version 1) [Dataset]. Mendeley Data. https://doi.org/10.17632/ hxsnvwty3r.1.

Bhat, A., Khan, F. and Mir, I.A. (2023). Impact of image preprocessing techniques on the performance of deep learning models for mango fruit classification. Information Processing in Agriculture. 10(3): 450-462.

Bhogal and Singh (2025). A Comprehensive Review of ResNet- 18: Architecture and Applications. In Advances in Data- Driven Computing and Intelligent Systems (Lecture Notes in Networks and Systems). Springer.

Bobde, S., Jaiswal, S., Kulkarni, P., Patil, O., Khode, P. and Jha, R. (2021). Fruit quality recognition using deep learning algorithm. Proceedings of the International Conference on Smart Generation Computing, Communication and Networking (SMART GENCON 2021). https://doi.org/ 10.1109/SMARTGENCON51891.2021.9645793.

Das, J., Chakraborty, S. and Ghosh, D. (2022). Non-destructive mango quality assessment using support vector machine on near-infrared spectral data. Postharvest Biology and Technology. 185: 111780.

Dhiman, B., Kumar, Y. and Hu, Y.C. (2021). A general-purpose multi-fruit system for assessing fruit quality using recurrent neural networks. Soft Computing. 25(14): 9255-9272. https://doi.org/10.1007/s00500-021-05867-2.

Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J. and Houlsby, N. (2021). An image is worth 16x16 words: Transformers for image recognition at scale. International Conference on Learning Representations (ICLR).

Fan, S., Xiaoting, L., Wenqian, H., Vincent, J.Z., Qi, P., Xin, H., Lianjie, L. and Chi, Z. (2022). Real-time defect detection for apple sorting using NIR cameras with pruning-based YOLOv4 network. Computers and Electronics in Agriculture. 193: 106715. https://doi.org/10.1016/j.compag.2022.106715.

Faye, D., Diop, I., Mbaye, N., Diedhiou, M.M. and Dione, D. (2023). Sen Mango Fruit DDS (Version 4) [Dataset]. Mendeley Data. https://doi.org/10.17632/jvszp9cbpw.4.

Geerthik, S., Senthil, G.A., Oliviya, K.J. and Keerthana, R. (2024). A System and Method for Fruit Ripeness Prediction using Transfer Learning and CNN. Proceedings of the International Conference on Communication, Computing and Internet of Things (IC3IoT 2024). https://doi.org/10.1109/IC3Io T60841.2024.10550209.

Gharpure, A.A., Jain, N., Narawade, E.V. (2026). Vit-CNN fusion for robust mango quality evaluation based on classification across multiple public datasets. Indian Journal of Agricultural Research. 1-7. doi: 10.18805/IJARe.A-6493.

Goel, S. and Pandey, K. (2022). A survey on deep learning techniques in fruit disease detection. International Journal of Distributed Systems and Technologies. 13(8): 1-19. https://doi.org/10.4018/IJDST.307901.

Han, J., Kamber, M. and Pei, J. (2022). Data Mining: Concepts and Techniques (4^th ed.). Morgan Kaufmann.

James, G., Witten, D., Hastie, T. and Tibshirani, R. (2021). An Introduction to Statistical Learning with Applications in R (2^nd ed.). Springer.

Joseph, T., Mathew, G. and Abraham, S. (2021). Development of a mobile application for mango variety identification using fine-tuned convolutional neural networks. Applied Engineering in Agriculture. 37(6): 987-995.

Li, W., Zhang, Q., Wei, R. and Chen, S. (2022). High-accuracy mango variety classification using deep convolutional neural networks with attention mechanisms. Computers and Electronics in Agriculture. 198: 107025.

Liu, K., Yang, X., Ding, W., Ju, H., Li, T., Wang, J. and Yin, T. (2025). A survey on rough feature selection: Recent advances and challenges. IEEE/CAA Journal of Automatica Sinica. 12(4): 842-863. https://doi.org/10.1109/JAS.2025.125231.

Meena, H.S., Singh, A.N. and Shukla, P.K. (2024). Fruit and vegetable disease detection and classification: Recent trends, challenges and future opportunities. Engineering Applications of Artificial Intelligence. 133: 108187. https://doi.org/10.1016/j.engappai.2024.108187.

Mehta, D., Sehgal, S., Choudhury, T. and Sarkar, T. (2021). Fruit quality analysis using modern computer vision methodologies. Proceedings of the IEEE Madras Section International Conference (MASCON 2021). https://doi.org/10.1109/ MASCON51689.2021.9563427.

Mehta, R.A., Kumar, P., Prem, G., Aggarwal, S. and Kumar, R. (2025). AI-powered innovations in agriculture: A systematic review on plant disease detection and classification. Indian Journal of Agricultural Research. 59(9): 1321- 1330. doi: 10.18805/IJARe.A-6371.

Mumuni, A. and Mumuni, F. (2022). Data augmentation: A comprehensive survey of modern approaches. Array. 16: 100258. https:/ /doi.org/10.1016/j.array.2022.100258.

Oltean, M. (2025). Fruit-360 Dataset [Dataset]. Kaggle. https:// www.kaggle.com/datasets/moltean/fruits.

Patel, H.B. and Patil, N.J. (2024). Enhanced CNN for fruit disease detection and grading classification using SSDAE-SVM for postharvest fruits. IEEE Sensors Journal. 24(5): 6719-6732. https://doi.org/10.1109/JSEN.2023.3342833.

Patel, R., Sharma, V. and Gupta, A. (2023). Real-time mango maturity grading based on lightweight CNN architectures for embedded systems. Journal of Agricultural Engineering Research. 34(2): 112-125.

Patil, R.G. and More, A. (2025). A comparative study and optimization of deep learning models for grape leaf disease identification. Indian Journal of Agricultural Research. 59(4): 654- 663. doi: 10.18805/IJARe.A-6242.

Patil, S.P. and Deshpande, A.A. (2024). Disease detection and classification in pomegranate fruit using hybrid convolutional neural network with honey badger optimization algorithm. International Journal of Food Properties. 27(1): 815- 837. https://doi.org/10.1080/10942912.2024.2365927.

Patil, S.S. and Sawant, P.V. (2023). Deep learning-based crop disease classification using transfer learning and optimization techniques. IEEE Access. 11: 121456- 121468. https://doi.org/10.1109/ACCESS.2023.3311234.

Qin, D., Chas, L., Manolis, D., Marco, F., Shixin, L., Fan, Y. et al. (2024). MobileNetV4: Universal models for the mobile ecosystem. arXiv preprint arXiv. 2404.10518. https:// arxiv.org/abs/2404.10518.

Raghavendra, A., Guru, D.S., Rao, M.K. and Sumithra, R. (2020). Hierarchical approach for ripeness grading of mangoes. Artificial Intelligence in Agriculture. 4: 243-252. https:/ /doi.org/10.1016/j.aiia.2020.10.003.

Rahman, M.M., Basar, M.A., Shinti, T.S., Khan, M.S.I., Babu, H.M.H. and Uddin, K.M.M. (2023). A deep CNN approach to detect and classify local fruits through a web interface. Smart Agricultural Technology. 5: 100321. https://doi.org/ 10.1016/j.atech.2023.100321.

Reddy, L., Devi, K. and Rao, M. (2020). Transfer learning for efficient mango disease classification with limited data. Plant Pathology Journal. 36(4): 380-387.

Shree, V.N.S., Rajarajeswari, S. and Basavaraj, G.N. (2025). Transfer learning-based Areca nut (Areca catechu) disease detection using CNN and SVM approaches with ResNet-50 for improved deep learning performance. Indian Journal of Agricultural Research. 59(9): 1385-1394. doi: 10.18805/IJARe.A-6404.

Srinivasan, S., Jain, A.K. and Sharma, P.R. (2025). DBA-ViNet: An effective deep learning framework for fruit disease detection. BMC Plant Biology.

Tan, M. and Le, Q.V. (2020). EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. Proceedings of the 37^th International Conference on Machine Learning (ICML). PMLR. 6105-6114.

Yuan, Y., Chen, J., Polat, K. and Alhudhaif, A. (2024). Detecting fruit and vegetable freshness through integration of convolutional neural networks and bidirectional long short-term memory networks. Current Research in Food Science. 8: 100723. https://doi.org/10.1016/j.crfs.2024. 100723.

Zhang, A., Lipton, Z.C., Li, M. and Smola, A.J. (2024). Dive into Deep Learning. Cambridge University Press.

Zhang, X., Zhu, X., Li, B., Guan, Z. and Che, W. (2023). LA-ShuffleNet: A strong convolutional neural network for edge computing devices. IEEE Access. 11: 116684-116695. https:// doi.org/10.1109/ACCESS.2023.3324713.

Zhuang, F., Qi, Z., Duan, K., Xi, D., Zhu, Y., Zhu, H., Xiong, H. and He, Q. (2021). A comprehensive survey on transfer learning. Proceedings of the IEEE. 109(1): 43-76. https:/ /doi.org/10.1109/JPROC.2020.2978386.
