The descriptive statistics, viz., minimum, maximum, mean, standard deviation (SD) and coefficient of variation (CV), were measured for 13 characters under normal and water stressed conditions (Tables 2 and 3). The highest variation was observed for number of branches (CV of 12.23) under normal condition and for number of pods per plant (CV of 21.29) under water stressed condition. Under water stressed condition, the lowest coefficient of variation was recorded for leaf protein content (0.34).
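As a minimal sketch of how a coefficient of variation like those reported above is typically obtained, the snippet below computes CV as the standard deviation divided by the mean, expressed as a percentage. The function name and the sample values are hypothetical and are not taken from Tables 2 and 3. The text does not state whether the sample or the population standard deviation was used, so the sample form (n - 1 denominator) is an assumption.

```python
import statistics


def coefficient_of_variation(values):
    """Return CV (%) = standard deviation / mean * 100."""
    mean = statistics.mean(values)
    # Assumption: sample SD (n - 1 denominator); the source does not specify
    sd = statistics.stdev(values)
    return sd / mean * 100


# Hypothetical trait measurements, for illustration only
example_pods_per_plant = [18.0, 22.0, 20.0, 25.0, 15.0]
cv = coefficient_of_variation(example_pods_per_plant)
print(f"CV = {cv:.2f}%")
```

Because CV is scaled by the mean, it allows the spread of characters measured in different units (for example plant height and 100-seed weight) to be compared directly.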
The success of plant breeding depends on the availability of genetic variation, knowledge about desired traits and efficient selection strategies that make it possible to exploit existing genetic resources (Nachimuthu et al., 2014). To develop a new variety, germplasm must be collected and systematically evaluated for its morphological, physiological and developmental characters, including special features such as stress tolerance and pest and disease resistance. Thus, appropriate and efficient approaches should be used for germplasm evaluation and characterization. After evaluation and characterization, the genotypes and characters must be analysed statistically in order to draw valid conclusions. Principal component analysis (PCA) plays an important role in studying a large data set by extracting the most significant information from the data points.
Hotelling (1933) indicated that PCA is an exploratory tool designed by Pearson (1901) to identify unknown trends in a multi-dimensional data set. PCA can be used to uncover similarities between variables and to classify the genotypes (Leonard and Peter, 2009). PCA measures the importance and contribution of each component to the total variance.
Evaluation under normal condition (T0)
In this study, out of 13 components, the first five principal components explained most of the total variation present in the genotypes. Principal components were selected on the basis of eigen values greater than one, as suggested by Brejda et al. (2000). The first five principal components with eigen value >1 contributed about 78.89% of the total variability among the twenty-one blackgram genotypes evaluated for different morphological and physiological characters under normal condition (Table 4). The remaining eight components contributed only 21.11%. Principal Component (PC) 1 contributed the maximum variability of 26.95%, followed by PC 2 (17.75%), PC 3 (14.21%), PC 4 (10.99%) and PC 5 (8.99%). Jeberson et al. (2018) estimated the principal components among twenty-five blackgram genotypes and reported 84.52% of the total variation for the first three components, while the remaining four components were responsible for only 15.48% of the variation.
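The arithmetic behind the eigen value criterion can be sketched as follows, using only the percentages reported above for normal condition. The sketch assumes that PCA was run on the correlation matrix of the 13 standardized characters, in which case the eigen values sum to 13 and each eigen value equals the proportion of variance multiplied by 13. That assumption is not stated in the text, and the variable names are illustrative.

```python
# Percent of total variance reported for PC 1 to PC 5 under normal condition (T0)
percent_variance = [26.95, 17.75, 14.21, 10.99, 8.99]
n_traits = 13  # number of characters analysed

# Assumption: PCA on the correlation matrix, so the eigen values sum to n_traits
eigenvalues = [p / 100 * n_traits for p in percent_variance]

# Retain the components whose eigen value exceeds one
retained = [i + 1 for i, ev in enumerate(eigenvalues) if ev > 1]

cumulative = round(sum(percent_variance), 2)
remaining = round(100 - cumulative, 2)

for i, ev in enumerate(eigenvalues, start=1):
    print(f"PC {i}: implied eigen value = {ev:.2f}")
print("Retained components:", retained)
print("Cumulative %:", cumulative, "| Remaining %:", remaining)
```

Under this assumption, a component needs to explain more than 100/13 (about 7.69%) of the total variance to have an eigen value above one, which all five reported components do.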
Interpretation of the principal components is based on finding which variables are most strongly correlated with each component. Factor loading values close to -1 or 1 indicate that the variable strongly influences the component, whereas values close to 0 indicate a weak influence. The characters with the largest positive factor loadings on PC 1 were seed yield per plant (0.4774) and number of pods per plant (0.4616), followed by number of seeds per pod (0.3594) and plant height (0.3523). The trait days to first flowering (-0.3529) contributed negatively to PC 1. PC 2 was contributed positively by pod weight (0.4658), pod length (0.4240) and 100-seed weight (0.3269), while number of branches (-0.5115) contributed negatively. For PC 3, number of seeds per pod (0.4346) and leaf protein content (0.3839) contributed positively, whereas chlorophyll content (-0.5506) and number of clusters per plant (-0.4419) contributed negatively. The first three principal component axes explained more than half of the total variability (58.91%). Hence, it indicated a high degree of correlation among the traits studied (Jain and Patel, 2016).
However, PC 4 expressed only negative factor loading values, for leaf protein content (-0.4842), number of clusters per plant (-0.4304), plant height (-0.3874), days to first flowering (-0.3072) and pod weight (-0.3022). In PC 5, 100-seed weight (0.6442) had the maximum positive factor loading value, while seed size (-0.6462) had the maximum negative factor loading value. As a whole, PCA was able to identify the important characters responsible for the variability in the population. Similar studies were also conducted by Jeberson et al. (2018) and Sridhar et al. (2020) in blackgram.
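The way factor loadings are read off for each component can be illustrated with a short sketch that ranks the PC 1 loadings reported above by absolute value and separates positive from negative contributors. The 0.30 cut-off and the function name are hypothetical choices made for illustration and are not stated in the text.

```python
# Factor loadings on PC 1 under normal condition, as reported in the text
pc1_loadings = {
    "seed yield per plant": 0.4774,
    "number of pods per plant": 0.4616,
    "number of seeds per pod": 0.3594,
    "plant height": 0.3523,
    "days to first flowering": -0.3529,
}

THRESHOLD = 0.30  # hypothetical cut-off for an "important" loading


def split_by_sign(loadings, threshold=THRESHOLD):
    """Return (positive, negative) trait lists, strongest loading first."""
    strong = {t: v for t, v in loadings.items() if abs(v) >= threshold}
    ranked = sorted(strong.items(), key=lambda item: abs(item[1]), reverse=True)
    positive = [t for t, v in ranked if v > 0]
    negative = [t for t, v in ranked if v < 0]
    return positive, negative


positive, negative = split_by_sign(pc1_loadings)
print("Positive contributors:", positive)
print("Negative contributors:", negative)
```

The same function could be applied to the loadings of PC 2 to PC 5 to reproduce the lists of contributing characters given for each component.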
The scree plot showed the percentage of variation associated with each principal component by plotting the eigen values against the principal components (Fig 1).
The length of each vector reflects the contribution of the character to the principal components (Fig 2), and the angle between character vectors reflects the correlation between the variables. An angle of <90° (an acute angle) between two trait vectors indicates a positive correlation. The two vectors in the 4th quadrant, viz., seed yield per plant and number of pods per plant, were highly correlated variables. Similarly, the vectors in the 3rd quadrant, number of seeds per pod and plant height, were highly correlated variables. These four variables were also strongly correlated with the first principal component, as shown by the factor loading values. An angle of >90° (an obtuse angle) between two trait vectors indicates a negative correlation, while an angle of 90° indicates no correlation between the characters. The character days to first flowering recorded a negative correlation with seed yield per plant.
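The angle rule described above can be sketched in code. The snippet computes the angle between two trait vectors from their biplot coordinates and classifies the implied correlation. The coordinates are hypothetical and are not read from Fig 2, and the function names are illustrative. The cosine of the angle only approximates the correlation between two traits, and the approximation is better when the two plotted components capture a large share of the total variance.

```python
import math


def angle_between(u, v):
    """Angle in degrees between two 2-D trait vectors on a biplot."""
    dot = u[0] * v[0] + u[1] * v[1]
    norm = math.hypot(*u) * math.hypot(*v)
    cos_theta = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cos_theta))


def implied_correlation(angle_deg, tol=1e-9):
    """Classify the correlation implied by the angle between two vectors."""
    if abs(angle_deg - 90.0) <= tol:
        return "none"
    return "positive" if angle_deg < 90.0 else "negative"


# Hypothetical (PC 1, PC 2) coordinates, for illustration only
trait_a = (0.48, -0.10)
trait_b = (0.46, -0.15)
trait_c = (-0.35, 0.05)

print(implied_correlation(angle_between(trait_a, trait_b)))  # acute angle
print(implied_correlation(angle_between(trait_a, trait_c)))  # obtuse angle
```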
The genotype G8 projected onto the vectors of seed yield per plant and number of pods per plant above the origin, indicating a positive interaction (Fig 2). Thus, among the twenty-one genotypes compared, G8 was the superior genotype for the characters seed yield per plant and number of pods per plant. Moreover, the genotypes G7 and G10 also had a positive interaction with those characters.
Among the twenty-one genotypes, three genotypes, namely G8 (VBG-11011), G10 (VBG-12062) and G7 (VBG-10010), formed a distinct cluster on the right side of the 3rd and 4th quadrants (Fig 3). The genotypes G11 (VBG-13017), G14 (RU-16-13), G15 (RU-16-14), G18 (VBN(Bg)-4) and G20 (VBN(Bg)-7) formed one cluster, and G1 (IC-343943), G3 (IC-343962), G6 (TBG-104), G13 (RU-16-9), G17 (T-9), G19 (VBN(Bg)-6) and G21 (MDU Local) formed another, both lying between the 1st and 2nd quadrants. The genotypes G4 (ABG-11013), G5 (KU-11680), G12 (ADT-5) and G16 (KGB-28) formed a cluster in the 4th quadrant. The genotypes with the highest positive principal component scores for PC 1 were G7 (3.4322), followed by G10 (3.3172) and G8 (3.3007) (Table 6). These genotypes can be selected on the basis of their high principal component scores in this environment (T0).
Overall, seed yield per plant, number of pods per plant, number of seeds per pod, plant height and days to first flowering had a high influence on PC 1, and the genotypes G8, G7 and G10 had high principal component scores for PC 1. Based on the relationship of the characters and genotypes to PC 1, it can be concluded that the genotypes G8 (VBG-11011), G7 (VBG-10010) and G10 (VBG-12062) can be selected for the above-mentioned characters for breeding purposes in normal environments.
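The principal component scores used to rank genotypes are, in general, weighted sums of standardized trait values, with the factor loadings as weights. The sketch below illustrates this using the five PC 1 loadings reported in the text for normal condition. The two genotypes and their standardized (z-score) values are made up for illustration. A full score would use all 13 characters, so these partial sums are not comparable to the values in Table 6.

```python
# PC 1 loadings reported in the text for normal condition (five of 13 traits)
pc1_loadings = {
    "seed yield per plant": 0.4774,
    "number of pods per plant": 0.4616,
    "number of seeds per pod": 0.3594,
    "plant height": 0.3523,
    "days to first flowering": -0.3529,
}

# Hypothetical standardized (z-score) trait values for two made-up genotypes
z_values = {
    "genotype A": {
        "seed yield per plant": 1.5,
        "number of pods per plant": 1.2,
        "number of seeds per pod": 0.8,
        "plant height": 0.5,
        "days to first flowering": -1.0,
    },
    "genotype B": {
        "seed yield per plant": -0.5,
        "number of pods per plant": -0.4,
        "number of seeds per pod": 0.2,
        "plant height": -1.0,
        "days to first flowering": 1.2,
    },
}


def pc_score(z, loadings):
    """Weighted sum of standardized trait values, with loadings as weights."""
    return sum(z[trait] * weight for trait, weight in loadings.items())


scores = {g: pc_score(z, pc1_loadings) for g, z in z_values.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for genotype in ranked:
    print(f"{genotype}: partial PC 1 score = {scores[genotype]:.4f}")
```

In this illustration, a genotype with high yield-related values and early flowering receives a high positive score, which matches the direction of the loadings on PC 1.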
Evaluation under water stressed condition (T1)
In water stressed condition, the principal component analysis condensed the thirteen traits into four major principal components, which accounted for 77.94% of the total variation (Table 5 and Fig 4). The first four principal component axes recorded eigen values greater than one, whereas the fifth and subsequent principal components recorded values less than one. Thus, the components with eigen values <1 could be discarded to further reduce the data set.
PCA is able to identify the key traits that are responsible for the variability in a population (Subramanian et al., 2019). PC 1 accounted for 38.14% of the total variability; chlorophyll content (0.3905) contributed positively, while seed yield per plant (-0.3569), pod length (-0.3569), number of pods per plant (-0.3200), plant height (-0.3037) and number of clusters per plant (-0.3005) contributed negatively. PC 2 accounted for 15.4% of the total variability. The positively related traits were 100-seed weight (0.4197) and number of clusters per plant (0.3110), whereas number of seeds per pod (-0.4168) was negatively related to PC 2. Thus, the first PC was related to seed yield and yield related traits like pod length, number of pods per plant, number of clusters per plant and plant height, while PC 2 was related to 100-seed weight, number of clusters per plant and number of seeds per pod. The first two principal components together explained more than half of the total variability (53.54%). Similarly, Ghanbari and Javan (2015) reported that the first two principal components explained 58.28% of the variability under drought stress condition in mungbean.
PC 3 contributed 14.59% to the total variability; seed size (0.5301) and 100-seed weight (0.3640) contributed positively, while number of pods per plant (-0.4107) and number of clusters per plant (-0.3392) contributed negatively. PC 4 contributed 4.35% to the total variance. The characters number of clusters per plant, plant height, number of pods per plant and 100-seed weight grouped together in different principal components. Thus, the prominent characters that are placed together in different principal components and explain the variability have a tendency to remain together (Mahendran et al., 2015). This may be taken into consideration when utilizing these characters in drought breeding programs.
The two vectors in the 1st quadrant, namely seed yield per plant and pod length, were highly correlated variables that were strongly and negatively associated with the first principal component, as shown by the factor loading values (Fig 5). The characters leaf protein content and chlorophyll content showed a negative correlation with seed yield per plant of the blackgram genotypes. The PCA biplot has been used extensively by several researchers to dissect trait correlations in different crops (Aslam et al., 2017; Maqbool et al., 2016).
The genotype G9 projected in a positive direction along the vector of seed yield per plant. This suggests that the genotype G9 (VBG-12005) is well adapted to water stressed condition for the trait seed yield per plant.
A scatter plot drawn between the first and second principal components depicted a clear pattern of genotype grouping in the factor plane (Fig 6). The distribution of genotypes based on PC 1 and PC 2 exhibits the phenotypic variation among the population and shows how widely the genotypes were dispersed along both axes. The genotypes G1 (IC-343943), G2 (IC-343947), G7 (VBG-10010), G8 (VBG-11011), G10 (VBG-12062), G12 (ADT-5) and G20 (VBN(Bg)-7) clustered as a group in the 1st quadrant. G13 (RU-16-9), G17 (T-9) and G18 (VBN(Bg)-4) grouped in the 1st and 4th quadrants. The genotypes G3 (IC-343962), G5 (KU-11680), G16 (KGB-28) and G21 (MDU Local) were clustered in the 2nd and 3rd quadrants. In the 3rd and 4th quadrants, the genotypes G6 (TBG-104), G14 (RU-16-13), G15 (RU-16-14) and G19 (VBN(Bg)-6) were clustered as another group. PCA did not show any distinct clustering in Fig 6. This could be due to the fact that the principal component analysis was based on water stress. The genotype with the highest negative principal component score for PC 1 was G9 (-4.4696) (Table 6). This genotype can be selected on the basis of its high principal component score for the water stressed environment (T1).
On the whole, the characters seed yield per plant, pod length, number of pods per plant, number of clusters per plant, plant height and chlorophyll content had a high influence on PC 1, and the genotype G9 had a high principal component score for PC 1. Based on the relationship of the characters and genotype to PC 1, it can be concluded that the genotype G9 (VBG-12005) can be recommended for the above-mentioned characters in drought breeding programs.