Modern disease prediction systems can take a long time to identify the type of disease. Disease prediction draws on computer vision and machine learning and involves detecting the type of disease in different plants. It is a long-standing research problem with many real-world applications, such as pomegranate leaf disease and grape leaf disease. Accurate disease prediction is one of the crucial requirements in the field of data science, yet the performance of existing methods has not improved enough for disease prediction in challenging environments. A prediction scheme is proposed, an ML model that improves the efficiency of disease prediction. With recent technological advances, the study of agriculture has become more important, since crops serve not only as food for a large share of the population but also a number of other uses. Plants are crucial in our lives because they provide a source of energy and help to overcome the problem of climate change. Today, plants are affected by a number of diseases, some of which cause severe economic, social and ecological damage. Therefore, it is of great importance to detect plant diseases accurately and rapidly.
Amrita and Raul (2019) explained that visual observations are often used to determine the severity of disease in production areas. Image processing has made great progress in agriculture. Various neural network approaches, for example back propagation, together with principal component analysis (PCA), have been used to recognize fungal diseases and to detect plant leaf diseases by improving the recognition rate of classification approaches. Encouraging results have been reported in this area so far. A linear SVM has also been used for this multi-class problem, but because it separates the data into only two classes it has been ineffective and results in less precise classification. Pests destroy crops or plant parts, lowering food production and leading to food insecurity.
The equations and methods below are used to identify the type of disease and to measure how accurately it is predicted. Machine learning classification algorithms identify the type of disease and predict the percentage of the plant affected by it.
Evaluation models
Various measures have been introduced to check the performance of vision-based and classification techniques. Accuracy, precision, recall and F1-score are the most commonly used evaluation measures for classification models. Parameters such as precision, recall and F-score, together with the counts of true positives (TP), false positives (FP), false negatives (FN) and true negatives (TN), were calculated for the proposed model (Thakur et al., 2022).
Accuracy
Accuracy is defined as the ratio of the total number of correct predictions to the total number of samples (T). In the literature, accuracy is used interchangeably with recognition rate (RR), success rate (SR) or correct rate (CR).
$$\text{Accuracy} = \frac{TP + TN}{T} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$
Precision
Precision is defined as the proportion of true positives among all samples predicted as positive.
$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$
Recall
Recall is defined as the ratio of correctly predicted positive outcomes to actual positive outcomes.
$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$
F1-score
The harmonic mean of precision and recall is defined as the F1-score.
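$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}$$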
Specificity
Specificity is defined as the ratio of correctly predicted negatives to actual negatives.
$$\text{Specificity} = \frac{TN}{TN + FP} \tag{5}$$
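As a minimal sketch, the measures defined above can be computed directly from the four confusion-matrix counts. The function name and the example counts are hypothetical and are not taken from any of the cited studies.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, F1-score and specificity
    from confusion-matrix counts, guarding against division by zero."""
    total = tp + tn + fp + fn  # T, the total number of samples
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": specificity,
    }


# Hypothetical counts for a diseased-vs-healthy leaf classifier.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Returning a dictionary keeps the five measures together so they can be reported side by side for each model being compared.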
mAP
Average precision (AP) is defined as the area under the precision-recall curve. The mean average precision (mAP) is the mean of the average precisions AP_i, i ∈ {1, ..., n}, over a batch of n samples.
$$\text{mAP} = \frac{1}{n} \sum_{i=1}^{n} AP_i \tag{6}$$
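The sketch below illustrates one common way to estimate AP and then average it into mAP. It assumes the non-interpolated ranking form of AP (the mean of the precision values at each rank where a positive sample appears), which is only one of several approximations of the area under the precision-recall curve. The scores and labels are hypothetical.

```python
def average_precision(scores, labels):
    """Non-interpolated AP: average the precision at every rank
    where a positive (label == 1) sample is retrieved."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits if hits else 0.0


def mean_average_precision(ap_values):
    """mAP is the arithmetic mean of the individual AP values."""
    return sum(ap_values) / len(ap_values)


# Hypothetical confidence scores and ground-truth labels for two disease classes.
ap_class_a = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
ap_class_b = average_precision([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
print(mean_average_precision([ap_class_a, ap_class_b]))
```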
This paper is structured as follows. Section I gives a brief introduction to plant leaf disease detection concepts. Section II outlines related work in detail, covering supervised and unsupervised machine learning; within supervised learning it covers regression algorithms, decision trees, random forests and classification algorithms such as KNN, Naive Bayes and SVM. Section III gives a brief summary of all the machine learning algorithms, and Section IV presents the final conclusion.
Types of machine learning and the algorithms used to classify and detect plant disease
Fig 1 lists the various supervised and unsupervised machine learning algorithms that are used to detect different types of plant disease and to classify them.
• Traditional methods rely on manual inspection, take more time and detect plant disease based on the visible symptoms of the plant.
• ML methods such as SVM, K-NN and Naive Bayes are used to separate diseased leaves from uninfected ones.
• Currently, machine learning-based methods have improved the accuracy of predictions for crops such as grapes, pomegranates and corn; however, disease prediction performance still needs to be improved in harsh environments.
The following sections explain the different machine learning algorithms and how they are used in agriculture to identify the various types of disease and the percentage of the plant affected.
Supervised machine learning
In supervised machine learning, machines are trained on a "labeled" training dataset and then predict outputs based on that training. Labeled data means that the input data in the training set has been tagged with the correct output. The training data acts as a supervisor that teaches the machine to predict the correct output for a given set of input data, much as students learn under the supervision of a teacher. In other words, both the correct inputs and the corresponding outputs are fed to the machine learning algorithm. The goal of a supervised learning algorithm is to find a mapping function that maps the input variable (X) to the output variable (Y).
Regression
The equation used for linear regression can be written as:
$$y = Mx + C \quad \text{or} \quad y = b_0 + b\,x + e$$
Where,
M and C = Gradient and constant values (to be determined) defining the straight line.
b_0 = Intercept.
b = Regression weight or coefficient associated with the predictor variable x.
e = Residual error.
Regression is a technique for finding the relationship between independent variables or features and a dependent variable or outcome. It is used for predictive modeling in machine learning, where an algorithm is used to predict continuous outcomes.
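As an illustration, the intercept and regression weight of a straight line can be estimated by ordinary least squares. This is a minimal sketch; the data (days since infection against percentage of leaf area affected) are hypothetical and chosen only to show the calculation.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = b0 + b * x.
    Returns the intercept b0 and the regression weight b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    b0 = mean_y - b * mean_x
    return b0, b


# Hypothetical observations: day number vs. percentage of leaf area affected.
days = [1, 2, 3, 4, 5]
affected = [2.1, 3.9, 6.2, 7.8, 10.0]

b0, b = fit_line(days, affected)
# Residual error e for each observation.
residuals = [y - (b0 + b * x) for x, y in zip(days, affected)]
print(f"intercept={b0:.2f}, weight={b:.2f}")
```

The residuals list corresponds to the error term e: it is what remains of each observation after the fitted line is subtracted.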
In their investigation of machine learning regression techniques for leaf rust detection using hyperspectral measurements, Davoud et al. (2016) discussed how the combined effect of disease symptoms and disease stages on the main characteristics of the plant limits how well disease severity can be detected by different techniques. In addition, although ML techniques are used to detect plant diseases, the influence of these symptoms on their performance is usually not taken into account. Spectra of infected and uninfected leaves with various disease symptoms were measured with a non-imaging spectrophotometer in the electromagnetic region of 350-2500 nm.
Adhao et al. (2017) explained that cotton is one of the most important cash crops in India and has a major influence on its economy. Cotton production is decreasing every year due to various disease attacks. Usually, crop diseases are caused by harmful insects and pathogens, which reduce yields on a large scale if not promptly prevented. Their work presents a system for the detection and control of cotton leaf disease.
Logistic regression
Logistic regression calculates the probability of a binary outcome, that is, whether something happened or not, expressed as Yes/No or Pass/Fail. The independent variables are analyzed to decide on a binary outcome, with results falling into one of two categories. The independent variables can be categorical or numeric, but the dependent variable is always categorical. The result is written as P(Y = 1 | X), the probability of the dependent variable Y given the independent variable X.
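To make the probability concrete, the sketch below fits a one-variable logistic regression by plain gradient descent and evaluates the standard logistic (sigmoid) function. The learning rate, number of epochs and the lesion-size data are assumptions made for illustration only; practical work would normally rely on a tested library.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit P(Y=1 | x) = sigmoid(w * x + b) by batch gradient descent
    on the average log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict_proba(x, w, b):
    """Probability that the binary outcome is 1 for input x."""
    return sigmoid(w * x + b)


# Hypothetical data: lesion size (cm) and whether the leaf was diseased (1) or not (0).
sizes = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
diseased = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(sizes, diseased)
print(predict_proba(1.0, w, b), predict_proba(3.5, w, b))
```

A probability above 0.5 is usually read as "Yes" and one below 0.5 as "No", although the threshold can be moved when false negatives are more costly than false positives.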
In their work on a convolutional neural network with an optimized logistic decision regression, Priyanka et al. (2022) discussed how important it is for agriculture to cultivate crops with the aid of machine learning and artificial intelligence. Their work detects leaf diseases by implementing image classification and analysis algorithms. Conventional manual identification of medicinal plants is a slow process that requires the help of plant identification specialists to find plant diseases.
Decision tree
In a decision tree, when the cost and the probability of each outcome are known, the expected value can be calculated using the following formula:
Expected Value (EV) = (First possible outcome × Probability of first outcome) + (Second possible outcome × Probability of second outcome) - Cost
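The expected value formula can be worked through with a short sketch. The outcome values, probabilities and treatment cost below are hypothetical numbers chosen to show the arithmetic.

```python
def expected_value(outcomes, cost=0.0):
    """Sum of (outcome value * probability) over all outcomes, minus the cost.
    `outcomes` is a list of (value, probability) pairs."""
    return sum(value * probability for value, probability in outcomes) - cost


# Hypothetical decision: treating a field costs 150; the treatment succeeds
# with probability 0.7 (harvest worth 1000) and fails otherwise (harvest worth 200).
ev_treat = expected_value([(1000, 0.7), (200, 0.3)], cost=150)
print(ev_treat)
```

Here the calculation is 1000 × 0.7 + 200 × 0.3 - 150, which gives an expected value of about 610.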
Steps used for making a decision tree
Step-I: Take the list of rows (the dataset) from which the decision tree will be built.
Step-II: Calculate the uncertainty or impurity of the dataset, that is, how mixed its labels are.
Step-III: Generate the list of all questions that need to be asked at that node.
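The three steps above can be sketched as follows. The source does not name an impurity measure, so Gini impurity is assumed here, and each question is assumed to take the form "is feature f greater than or equal to value v?". The rows and feature meanings are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Step II: Gini impurity, 0.0 for a pure set and larger for mixed sets."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def candidate_questions(rows):
    """Step III: one (feature index, threshold) question per distinct feature value."""
    n_features = len(rows[0]) - 1  # the last column is the label
    return [(f, v) for f in range(n_features) for v in sorted({r[f] for r in rows})]


def best_question(rows):
    """Pick the question with the largest impurity reduction (information gain)."""
    parent = gini([r[-1] for r in rows])
    best_gain, best_q = 0.0, None
    for f, v in candidate_questions(rows):
        true_rows = [r for r in rows if r[f] >= v]
        false_rows = [r for r in rows if r[f] < v]
        if not true_rows or not false_rows:
            continue  # the question does not split the data
        weight = len(true_rows) / len(rows)
        gain = (parent
                - weight * gini([r[-1] for r in true_rows])
                - (1 - weight) * gini([r[-1] for r in false_rows]))
        if gain > best_gain:
            best_gain, best_q = gain, (f, v)
    return best_q, best_gain


# Step I: hypothetical rows of [spot count, yellowing fraction, label].
rows = [
    [12, 0.8, "diseased"],
    [10, 0.7, "diseased"],
    [1, 0.1, "healthy"],
    [0, 0.2, "healthy"],
    [9, 0.3, "diseased"],
]
print(best_question(rows))
```

Applying the same procedure recursively to the rows on each side of the chosen question grows the tree until the leaves are pure or no question reduces the impurity further.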
A decision tree is well suited to classification problems because it can order classes to a precise level. It works like a flowchart, separating data points into two similar categories at a time, from 'stem' to 'branch' to 'leaf', where the final categories are the most similar. It creates categories within categories, allowing for natural classification with limited human supervision. Rajesh et al. (2020) explained the detection and classification of foliar diseases using a decision tree, noting the link between agricultural production and the economy.
In the machine learning domain, Nalin et al. (2023) discussed the identification of plant disease using different types of algorithms, with the aim of obtaining the best results from the data; this is an interesting process with everyday applications. Implementing these algorithms meets the need for efficiency, the main objective being the practical prediction of plant diseases in the agricultural field, providing resources to the agricultural and business sectors. The conceptual design of their paper involves tracking each leaf of a tree through a machine learning model.
Fig 2 illustrates the core idea of predictive modeling: results are used to create, process and validate a model that is then used to forecast future events and suggest possible outcomes. Regression and neural networks are two of the most well-known and widely used predictive modeling techniques. Bayesian analysis, decision trees and statistical data processing are among the complementary techniques. Each of the many prediction models has an important role to play. A model is reusable: it is created by training an algorithm on data that has been collected, processed and stored, and it can then be applied in future evaluations of the final result.
Random forest algorithm
Random forest algorithm is a popular supervised machine learning technique; it is used to solve both classification and regression problems in machine learning.
Random forests converge
Given an ensemble of classifiers h_1(x), h_2(x), ..., h_K(x), and with the training set drawn at random from the distribution of the random vector Y, X, the margin function is defined as in (Leo Breiman, 2001):
$$mg(X, Y) = av_k\, I(h_k(X) = Y) - \max_{j \neq Y} av_k\, I(h_k(X) = j)$$
where I(·) is the indicator function. The margin measures the extent to which the average number of votes at X, Y for the right class exceeds the average vote for any other class. The larger the margin, the more confidence in the classification. The generalization error is given by
$$PE^* = P_{X,Y}\big(mg(X, Y) < 0\big)$$
where the subscripts X, Y indicate that the probability is over the X, Y space.
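The margin can be illustrated with a small sketch that takes the votes of the individual trees for one sample. The fraction of samples with a negative margin is used here as an empirical stand-in for the generalization error, which is an assumption made for illustration; the votes and class names are hypothetical.

```python
from collections import Counter


def margin(votes, true_label):
    """Average vote for the true class minus the largest average vote
    for any other class, for a single sample."""
    k = len(votes)
    counts = Counter(votes)
    correct_share = counts.get(true_label, 0) / k
    other_shares = [c / k for label, c in counts.items() if label != true_label]
    return correct_share - (max(other_shares) if other_shares else 0.0)


def empirical_error(all_votes, true_labels):
    """Fraction of samples whose margin is negative."""
    negative = sum(1 for votes, y in zip(all_votes, true_labels) if margin(votes, y) < 0)
    return negative / len(true_labels)


# Hypothetical votes from five trees for two leaf images whose true class is "rust".
votes_per_sample = [
    ["rust", "rust", "blight", "rust", "healthy"],      # majority is correct
    ["blight", "blight", "rust", "blight", "healthy"],  # majority is wrong
]
truth = ["rust", "rust"]

print([round(margin(v, y), 2) for v, y in zip(votes_per_sample, truth)])
print(empirical_error(votes_per_sample, truth))
```

A positive margin means the ensemble votes for the correct class, and a margin close to 1 means that almost every tree agrees.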
Hernández et al. (2020) presented a hybrid approach to detect and classify apple diseases using a random forest classifier. Today, foreign trade has increased significantly in some countries, which import a range of fruit products such as oranges and apples from their trading partners. Manually distinguishing infected fruit is extremely tedious. Meghana et al. (2019) discussed the diagnosis of tomato plant diseases using the random forest algorithm. Random decision forest is an ensemble learning method for regression, classification and other tasks. A random forest consists of a number of decision trees built at training time; it aggregates the results of all the trees, outputting the most frequent class in classification problems and the average prediction in regression. Random decision forests overcome the problem of overfitting to the training set, which is the main problem of decision tree algorithms. Their work was divided into two phases. In the first phase, feature extraction is performed on all images and the results are stored in a feature table.
Fig 3 shows how the random forest algorithm is used to diagnose rice leaf diseases. The procedure involves obtaining images of rice leaves, preprocessing the images, extracting features from them and classifying the images by disease name. The main dataset is divided into several subsets: two thirds of the records in the dataset, i.e., a total of 352 images (276 captured images plus newly added images), are used for training (Sristy et al., 2012).
Classification algorithms
Classification algorithms are a supervised machine learning technique used to identify the category of new observations on the basis of the training dataset. In classification, a program learns from the given dataset or observations and then classifies each new observation into one of a number of groups.
K-nearest neighbor (KNN)
K-nearest neighbor (k-NN) is a pattern recognition algorithm that uses the training dataset to find the k closest relatives of future examples. When k-NN is used for classification, a data point is placed in the category of its nearest neighbors. If k = 1, it is placed in the class of its single nearest neighbor. Otherwise, the class is decided by a plurality vote of its k neighbors.
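A minimal k-NN classifier can be sketched in a few lines. Euclidean distance is assumed as the similarity measure, and the two features (for example, normalized color and texture scores of a leaf image) and their values are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by a plurality vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical (color score, texture score) features for five labeled leaves.
points = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9), (0.85, 0.75)]
labels = ["healthy", "healthy", "diseased", "diseased", "diseased"]

print(knn_predict(points, labels, (0.7, 0.7), k=3))
print(knn_predict(points, labels, (0.15, 0.15), k=1))
```

With k = 1 the query simply inherits the label of its closest neighbor; larger values of k smooth out the effect of individual noisy samples.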
Table 1 summarizes the various KNN-based approaches and compares the results of the different papers. Eftekhar et al. (2019) presented a color and texture based approach for the detection and classification of plant leaf disease using a KNN classifier, noting that modern organic farming is gaining popularity in the agriculture of many developing countries. Their work proposed a way to detect and classify leaf diseases using a k-nearest neighbor (KNN) classifier.
Naive Bayes
Naive Bayes calculates the probability that a data point belongs to a particular category. In text analysis, it can be used to classify words and phrases as belonging to predefined tags (categories). It applies Bayes' theorem: the probability that X occurs given that Y is true equals the probability that Y occurs given that X is true, multiplied by the probability that X is true and divided by the probability that Y is true, i.e., P(X|Y) = P(Y|X) P(X) / P(Y).
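The rule can be illustrated with a short sketch that combines a prior with per-feature likelihoods under the "naive" assumption that features are independent given the class. All probabilities below are hypothetical values chosen for illustration, not estimates from real data.

```python
def naive_bayes_posterior(priors, likelihoods, observed):
    """Return P(class | observed features) for every class.

    priors:      {class: P(class)}
    likelihoods: {class: {feature: P(feature | class)}}
    observed:    list of features seen on the sample
    """
    joint = {}
    for cls, prior in priors.items():
        probability = prior
        for feature in observed:
            # Naive assumption: features are independent given the class.
            probability *= likelihoods[cls][feature]
        joint[cls] = probability
    evidence = sum(joint.values())  # P(observed), the denominator of Bayes' theorem
    return {cls: p / evidence for cls, p in joint.items()}


# Hypothetical probabilities for a leaf showing yellow spots and curled edges.
priors = {"diseased": 0.3, "healthy": 0.7}
likelihoods = {
    "diseased": {"yellow_spots": 0.8, "curled_edges": 0.6},
    "healthy": {"yellow_spots": 0.1, "curled_edges": 0.2},
}

posterior = naive_bayes_posterior(priors, likelihoods, ["yellow_spots", "curled_edges"])
print(posterior)
```

Even though diseased leaves are assumed to be the minority here, observing both symptoms shifts most of the probability mass to the diseased class.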
Wahyuni et al. (2020) described papaya plant disease detection using a fuzzy Naive Bayes classifier algorithm. Papaya is one of the common fruits grown in Indonesia. Because weather conditions favor infestation by pests and diseases, an expert system for papaya disease detection was developed. Expert knowledge is embedded in the system, allowing farmers to determine the status of a plant without consulting an expert.
Support vector machine (SVM)
SVM uses algorithms to train and classify data within degrees of polarity, taking it a step beyond X/Y prediction. For a simple visual explanation, we use two tags, red and blue, with two data features, X and Y, and train a classifier to assign an X/Y coordinate to either red or blue.
Rajleen and Kaur (2015), in "Enhancement of SVM classification to improve plant disease detection", proposed work based on leaf imaging.
SVM error = Margin error + Classification error.
Here, the larger the margin, the lower the margin error, and vice versa.
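One way to read this decomposition is through the standard soft-margin SVM objective, which is assumed here rather than taken from the cited work: a margin term, 0.5 * ||w||^2, that shrinks as the margin 2 / ||w|| grows, plus a classification term given by the hinge loss weighted by a constant C. The weights, bias and points below are hypothetical.

```python
import math


def svm_objective(w, b, points, labels, c=1.0):
    """Return (margin term, classification term) of the soft-margin SVM objective.
    Labels must be +1 or -1."""
    margin_term = 0.5 * sum(wi * wi for wi in w)
    hinge_total = 0.0
    for x, y in zip(points, labels):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        hinge_total += max(0.0, 1.0 - y * score)  # zero when the point is beyond the margin
    return margin_term, c * hinge_total


def margin_width(w):
    """Geometric width of the margin, 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical separating line w.x + b = 0 and three labeled points.
w, b = (1.0, 1.0), -1.0
points = [(2.0, 2.0), (0.0, 0.0), (0.5, 0.5)]
labels = [1, -1, 1]

margin_term, classification_term = svm_objective(w, b, points, labels)
print(margin_term, classification_term, margin_width(w))
```

In this example the first two points lie on or beyond the margin and contribute nothing, while the third lies on the decision boundary and contributes a hinge loss of 1.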
In that work, an SVM is run with two sets of records, one of which is used for training. First the original image is captured and then processed. The black and background pixels of the image are then segmented, and the hue and saturation components of the image are separated. Third, the disease and the diseased region are detected in the image and the healthy parts are segmented from it. The study reports the percentage of the area in which the disease occurs and can also give the name of the disease. Vagisha et al. (2020) discussed leaf comparisons of different plants using SVM.
Table 2 summarizes the various SVM-based approaches and compares the results and accuracy reported in the different authors' papers. Majji et al. (2021) used machine learning techniques to detect plant diseases. Crop diseases are a common cause of low and reduced yields. Accurate identification of plant diseases can help find cures as soon as possible to control losses. These authors attempted to develop a new method using ML techniques to predict plant diseases and compared different classification techniques; a comparison is given in Table 3.
Ramesh et al. (2020) described machine learning methods for the detection and classification of foliar diseases, giving a detailed review of the benchmarks of various advanced ML algorithms for the identification and classification of foliar diseases.
Comparison of machine learning techniques
The strengths and weaknesses of various machine learning techniques are discussed in Tables 3 and 4.
Table 5 compares the different machine learning approaches, showing the model each study designed, the highlights of that model and the remaining research gap. Agriculture provides food for everyone, even in times of rapid population growth. Unfortunately, however, disease occurs in the main stages of crop growth (Sunil S. Harakannanavar et al., 2022). The idea behind this paper is to inform farmers about state-of-the-art techniques for reducing leaf diseases in crops.