Tea as a crop holds significant economic value for its cultivators and provides a livelihood to the people of the regions where it is grown. Tea is widely consumed as a beverage around the world and offers a number of health benefits for the human body. Hence, both the quality and the quantity of tea production are crucial to maintaining a steady supply to world markets (Chaudhuri and Jamatia, 2021). However, tea production is affected by diseases resulting from pest attacks and environmental conditions. Timely and precise disease detection is necessary so that effective countermeasures can be deployed. Manual disease detection with the naked eye requires expert guidance, which is time-consuming and costly. Cutting-edge computing technologies can be employed for this purpose, as they have become highly efficient in terms of speed, accuracy and scalability (Cho, 2024). Computer vision is one such technology; it has profoundly transformed automated plant disease diagnosis, and the convolutional neural network (CNN) (Zhao et al., 2024), a visual recognition framework, is a powerful tool for this task.
CNNs contain a number of convolutional layers that perform automatic feature extraction by convolving kernels over the image, where each output value is the dot product between the kernel weights and the corresponding patch of input pixels. The output of a convolutional layer is a feature map, and the feature maps generated by the convolutional layers are fed to dense layers for classification. Researchers have proposed a number of CNN architectures by varying the number of convolutional layers, the kernel sizes, the connections between layers, and so on, with each model developed to improve on the performance of its predecessors. CNNs are deep neural networks built on the idea that a deeper network gives better performance, so models such as AlexNet, VGG16 and VGG19 were made progressively deeper. However, it was observed that beyond a certain depth the performance of the network begins to saturate and then degrades rapidly, owing to the vanishing gradient problem (Hu et al., 2021). ResNet (Residual Network) addresses this problem by introducing residual blocks. In addition, dropout layers were introduced into CNNs to reduce overfitting. Attention modules (Soydaner, 2022) are another computer vision technique that can be integrated with a CNN to enhance its performance.
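To tie these ideas together, the following is a minimal, illustrative Keras sketch (not the network used in this work): convolutional layers compute feature maps as dot products between kernels and image patches, a residual block adds the kind of skip connection introduced by ResNet, a dropout layer reduces overfitting, and a dense layer performs the classification. The input size and class count are placeholder assumptions.

```python
from tensorflow.keras import layers, models

def residual_block(x, filters=64):
    """y = F(x) + x: the skip connection helps gradients flow through deep stacks."""
    shortcut = x
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.Add()([shortcut, y])              # residual (skip) connection
    return layers.Activation("relu")(y)

def build_demo_cnn(input_shape=(224, 224, 3), num_classes=8):
    inputs = layers.Input(shape=input_shape)
    # Convolution: each output value is the dot product of a kernel with an image patch.
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(inputs)
    x = layers.MaxPooling2D(2)(x)
    x = residual_block(x, filters=64)
    x = layers.GlobalAveragePooling2D()(x)       # pooled feature maps feed the dense layer
    x = layers.Dropout(0.5)(x)                   # dropout to reduce overfitting
    outputs = layers.Dense(num_classes, activation="softmax")(x)  # classification
    return models.Model(inputs, outputs)
```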
An attention module mimics the human vision mechanism, in which humans attend only to the relevant features when identifying an object rather than to all of them. The attention module uses a weighting mechanism to assign higher priority to the most relevant features of an image. Channel attention (Wan et al., 2023), spatial attention (Awan et al., 2021) and self-attention (Li et al., 2023) are some of the attention modules used to enhance a model's performance. Here, we integrate a channel attention module with our CNN architecture.
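As an illustration of this weighting mechanism, the sketch below shows a squeeze-and-excitation style channel attention block in Keras: a global-average-pooled channel descriptor is passed through a small bottleneck and a sigmoid to produce per-channel weights that rescale the feature maps, so the most relevant channels receive higher priority. The reduction ratio is an assumed placeholder, and the exact formulation of the channel attention module adopted in this work is given in Section 6.0.

```python
from tensorflow.keras import layers

def channel_attention(feature_map, reduction=16):
    """Squeeze-and-excitation style channel attention (illustrative)."""
    channels = feature_map.shape[-1]
    # Squeeze: one descriptor per channel via global average pooling.
    descriptor = layers.GlobalAveragePooling2D()(feature_map)
    # Excitation: bottleneck MLP followed by a sigmoid gives weights in (0, 1).
    weights = layers.Dense(channels // reduction, activation="relu")(descriptor)
    weights = layers.Dense(channels, activation="sigmoid")(weights)
    weights = layers.Reshape((1, 1, channels))(weights)
    # Reweight: relevant channels are emphasized, the rest are suppressed.
    return layers.Multiply()([feature_map, weights])
```

A channel whose weight is close to one passes through almost unchanged, while a channel with a small weight is suppressed.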
In our work, we have used CNN architectures to detect diseases in tea leaves. We performed an empirical study of five CNN models, viz. VGG19, ResNet152V2, InceptionV3, MobileNetV2 and DenseNet201, assessed them and ranked them in order of performance. DenseNet201 was found to be the best performing; it reuses features efficiently and has lower computational overhead than other advanced models. It was therefore enhanced by integrating a channel attention module and named E-DenseNet201. The modified model yielded better performance when evaluated on the same metrics. The rest of the paper is structured as follows: Section 2.0 contains a review of the existing literature. Section 3.0 details the methodology employed in our study and Section 4.0 describes the experimental setup. The results of the experiments are presented in Section 5.0. The proposed method is given in Section 6.0. Section 7.0 includes a discussion of the findings and future work. Finally, Section 8.0 gives the concluding remarks.
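To make the idea of E-DenseNet201 concrete before it is detailed in Section 6.0, the sketch below shows one way a channel attention block of the kind described above could be attached to a pretrained DenseNet201 backbone in Keras. The attachment point, classifier head, reduction ratio and class count are illustrative assumptions rather than the exact configuration of E-DenseNet201.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_e_densenet201_sketch(input_shape=(224, 224, 3), num_classes=8, reduction=16):
    # Pretrained DenseNet201 backbone without its classification head.
    base = tf.keras.applications.DenseNet201(
        include_top=False, weights="imagenet", input_shape=input_shape)
    x = base.output                              # final DenseNet201 feature maps
    channels = x.shape[-1]
    # Channel attention (squeeze-and-excitation style), as in the earlier sketch.
    w = layers.GlobalAveragePooling2D()(x)
    w = layers.Dense(channels // reduction, activation="relu")(w)
    w = layers.Dense(channels, activation="sigmoid")(w)
    w = layers.Reshape((1, 1, channels))(w)
    x = layers.Multiply()([x, w])
    # Classification head (placeholder: num_classes depends on the dataset).
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return models.Model(inputs=base.input, outputs=outputs)
```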
Literature review
Deep learning, along with machine learning algorithms, has dominated the field of plant disease detection in recent times. CNN architectures such as AlexNet, VGG and ResNet have been shown to yield good performance for detecting diseases in plants. In one study, Chen et al. (2019) used a modified version of AlexNet for tea leaf disease detection; the model's performance was better than that of Support Vector Machine and Multilayer Perceptron classifiers. Hu et al. (2019) modified the CIFAR-10 quick CNN model by integrating depth-wise separable convolutions to reduce the number of parameters, which improved performance. CNNs use optimizers to update the weights and biases of the network during training, so a good optimizer is crucial for the performance of a CNN.
Ozden (2021) found the Adagrad optimizer to perform better with the MobileNet and EfficientNet architectures for detecting diseases in apple leaves. Soeb et al. (2023) used YOLOv7, an object detection algorithm, for detecting diseases in tea leaves and obtained comparatively better results than its peers. Bao et al. (2022) modified RetinaNet, another object detection algorithm, and termed it AX-RetinaNet; it uses a module that fuses multi-scale features to obtain high-quality feature maps, and it showed improved performance for disease recognition in tea leaves.
A significant quantity of samples is needed to train a deep learning architecture. To deal with a small number of training samples, Ramdan et al. (2020) employed transfer learning models for disease detection in tea leaves, fine-tuning the models on the target dataset.
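As a generic illustration of this transfer learning pattern (a Keras-style sketch with assumed hyperparameters, not the exact pipeline of Ramdan et al.), a backbone pretrained on ImageNet can be frozen and reused as a feature extractor while only a new classification head is trained on the small target dataset; the backbone may later be unfrozen for fine-tuning.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_transfer_model(num_classes=8, input_shape=(224, 224, 3)):
    # Any ImageNet-pretrained backbone could be used; MobileNetV2 is an arbitrary choice.
    base = tf.keras.applications.MobileNetV2(
        include_top=False, weights="imagenet", input_shape=input_shape)
    base.trainable = False                       # freeze the pretrained convolutional base
    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dropout(0.3),
        layers.Dense(num_classes, activation="softmax"),  # new head for the target classes
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Fine-tuning step: unfreeze the base and continue training with a smaller learning rate.
# model.layers[0].trainable = True
# model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
#               loss="categorical_crossentropy", metrics=["accuracy"])
```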
Abbas et al. (2021) generated synthetic images using a Conditional GAN (CGAN) to increase the number of samples for training DenseNet121.
Falaschetti et al. (2022) used a resource-constrained CNN on a low-powered, inexpensive machine vision camera for real-time plant disease classification.
Jung et al. (2023) implemented ResNet50, AlexNet, GoogleNet, VGG19 and EfficientNet for disease identification in bell pepper, potato and tomato.
Harakannanavar et al. (2022) employed a CNN, K-Nearest Neighbors (KNN) and Support Vector Machine for disease classification in tomato leaves. Images were preprocessed using histogram equalization, and features were extracted using the Discrete Wavelet Transform, Principal Component Analysis and the Gray Level Co-occurrence Matrix.
Andrew et al. (2022) employed transfer learning CNN models such as DenseNet121, ResNet50, VGG16 and InceptionV4 for disease identification in different plant species; DenseNet121 emerged as the best-performing model.
Saleem et al. (2022) presented a dataset containing images of kiwifruit, apple, pear, avocado and grapevine. An improved Region-based Fully Convolutional Network (RFCN) was proposed using a fixed-shape resizer with a bicubic interpolator, a random normal weight initializer, batch normalization and the stochastic gradient descent (SGD) optimizer with momentum. Translational and rotational data augmentation techniques were found to be the most effective for improving performance.
Mathew and Mahesh (2022) employed YOLOv5 for real-time detection of bacterial spot disease in bell pepper.
Benfenati et al. (2023) used an autoencoder for unsupervised disease detection in cucumber leaves from multispectral images.
Patil and More (2025) deployed five predefined CNN models, namely DenseNet121, VGG16, VGG19, InceptionV3 and ResNet50V2, for detecting grape leaf diseases.
Kalmani et al. (2025) proposed a hybrid model for crop yield prediction of wheat and rice, integrating a 1D CNN with Long Short-Term Memory (LSTM) and an attention layer. Table 1 summarizes the methods, plant types, datasets used and results obtained by the researchers discussed here.