The data collected on 20 yield and yield-contributing characters from 70 mung bean breeding lines were subjected to genetic divergence analysis employing Mahalanobis’ D² statistic, principal component analysis and hierarchical cluster analysis. The magnitude of the D² values suggested considerable variability among the genotypes studied, indicating genetic diversity.
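As a minimal sketch of the D² statistic named above, the snippet below computes pairwise D² values between genotype mean vectors, assuming a pooled error variance-covariance matrix is available. The function name `mahalanobis_d2`, the toy means and the toy covariance matrix are illustrative assumptions, not the study's data.

```python
import numpy as np

def mahalanobis_d2(means, pooled_cov):
    """Pairwise D^2 between rows of `means` (genotypes x traits).

    D^2_ij = (m_i - m_j)' S^-1 (m_i - m_j), with S the pooled covariance.
    """
    inv_cov = np.linalg.inv(pooled_cov)
    n = means.shape[0]
    d2 = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            diff = means[i] - means[j]
            d2[i, j] = d2[j, i] = diff @ inv_cov @ diff
    return d2

# Hypothetical toy data: 3 genotypes measured for 2 traits
means = np.array([[40.0, 5.0], [44.0, 6.0], [52.0, 9.0]])
pooled_cov = np.array([[4.0, 1.0], [1.0, 2.0]])
d2 = mahalanobis_d2(means, pooled_cov)
```

With an identity covariance matrix the same function reduces to the squared Euclidean distance, which is a convenient sanity check.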
The per cent contribution of each of the 20 characters towards genetic divergence among the 70 genotypes is presented in Table 1. Among the characters studied, calcium content (43.40%) showed the maximum contribution, followed by pods per plant (17.35%), plant height (17.27%), clusters per plant (6.09%), seed yield per plant (5.22%), harvest index (3.64%), seeds per pod (2.15%), seed hardness (1.95%), 100-seed weight (0.83%), pods per cluster (0.66%), primary branches per plant (0.62%), days to 50% flowering (0.25%), shelling % (0.21%), biological yield per plant (0.21%), protein content (0.12%) and leaf width (0.04%). The contribution of the remaining traits, namely days to maturity, days to shattering, leaf length and pod length, was zero per cent.
Parthasarathy et al. (2005) reported that days to maturity contributed the most towards total genetic divergence, followed by seed yield per plant. Vivekanandan et al. (2005) observed a 98% contribution of 100-seed weight. Gadakh et al. (2013) found that protein content made the maximum contribution towards divergence. Singh et al. (2015) reported contributions from biological yield per plant, seed yield and pods per cluster.
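The text does not state how the percentage contributions were derived. A common approach in D² studies, assumed here, is to count for each character the number of genotype pairs in which it ranks first in its share of D², and to express that count as a percentage of all pairs. The sketch below illustrates this with hypothetical toy values; `percent_contribution` and the trait names are illustrative only.

```python
from itertools import combinations
from math import comb

def percent_contribution(values, trait_names):
    """values: dict genotype -> list of uncorrelated (transformed) trait values."""
    first_rank_counts = {name: 0 for name in trait_names}
    pairs = list(combinations(values, 2))
    for a, b in pairs:
        # Each squared difference is that trait's share of D^2 for this pair
        d2_parts = [(x - y) ** 2 for x, y in zip(values[a], values[b])]
        top = max(range(len(d2_parts)), key=d2_parts.__getitem__)
        first_rank_counts[trait_names[top]] += 1
    return {name: 100.0 * count / len(pairs)
            for name, count in first_rank_counts.items()}

# Hypothetical toy data: 3 genotypes, 2 traits
toy = {"G1": [0.0, 0.0], "G2": [3.0, 1.0], "G3": [3.5, 4.0]}
contrib = percent_contribution(toy, ["trait_A", "trait_B"])

# With 70 genotypes there are C(70, 2) pairs; one first rank is about 0.04%
n_pairs = comb(70, 2)
one_pair_pct = round(100.0 / n_pairs, 2)
```

Under this assumption, 70 genotypes give 2415 pairs, so a single first rank corresponds to about 0.04%, which is consistent with the smallest non-zero contribution reported above (leaf width).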
Grouping of genotypes into various clusters
The 70 inter-specific derivatives were grouped into eight clusters using Tocher’s method (Table 2). The derivatives were distributed at random across the eight clusters, with the maximum number of genotypes in cluster I (26 genotypes) followed by cluster III (21 genotypes). Cluster IV consisted of eight genotypes. Cluster II and cluster VIII had six genotypes each, whereas cluster V, cluster VI and cluster VII were monogenotypic, each consisting of a single genotype.
Average intra- and inter-cluster D² values
The average intra- and inter-cluster D² values were estimated as per the procedure given by Singh and Chaudhary (2010) and are presented in Fig 1. The nearest and farthest clusters for each of the eight clusters are indicated in Table 3. Inter-cluster divergence expresses the diversification among clusters (groups of genotypes resembling each other, hence with low intra-cluster divergence). Intra-cluster D² values ranged from 0.00 (cluster V, cluster VI and cluster VII) to 429.53 (cluster VIII). The intra-cluster distance indicates the diversity among the genotypes grouped in that cluster. Genotypes grouped into the same cluster presumably differ little from one another in the aggregate of characters measured. It is generally held that the larger the divergence between the parental genotypes, the higher the heterosis in crosses (Falconer, 1964). Therefore, it would be desirable to attempt crosses between genotypes belonging to distant clusters to obtain highly heterotic crosses, which are likely to yield a wide range of segregants on which selection can be practiced.
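The averaging step can be sketched as follows, assuming the intra-cluster value is the mean D² over all genotype pairs within a cluster and the inter-cluster value is the mean D² over all cross-cluster pairs. The function name and the toy D² values are hypothetical; whether Table 3 reports D² or its square root is not stated in the text.

```python
from itertools import combinations

def average_cluster_d2(d2, clusters):
    """d2: dict frozenset({g1, g2}) -> D^2; clusters: dict name -> genotype list."""
    def pair_d2(a, b):
        return d2[frozenset((a, b))]

    result = {}
    names = list(clusters)
    for name in names:
        pairs = list(combinations(clusters[name], 2))
        # A single-genotype cluster has no pairs, so its intra-cluster D^2 is 0
        result[(name, name)] = (
            sum(pair_d2(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
        )
    for c1, c2 in combinations(names, 2):
        cross = [pair_d2(a, b) for a in clusters[c1] for b in clusters[c2]]
        result[(c1, c2)] = sum(cross) / len(cross)
    return result

# Hypothetical toy example: three genotypes in two clusters
toy_d2 = {
    frozenset(("A", "B")): 10.0,
    frozenset(("A", "C")): 50.0,
    frozenset(("B", "C")): 70.0,
}
toy_clusters = {"I": ["A", "B"], "II": ["C"]}
averages = average_cluster_d2(toy_d2, toy_clusters)
```

The zero returned for a one-member cluster mirrors the intra-cluster value of 0.00 reported for the monogenotypic clusters.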
In the present study, inter-cluster distances were worked out considering 20 characters, and these distances ranged from 240.96 (between cluster V and cluster VII) to 1080.72 (between cluster V and cluster VIII). Wide diversity was also reported by earlier workers: Ramana and Singh (1987) grouped 39 genotypes into eight clusters and, similarly, Manivannan et al. (1998) grouped 30 genotypes into eight clusters.
Cluster I, comprising eleven genotypes, was nearest to cluster VII (314.56) followed by cluster V (352.85) and was farthest from cluster VIII (732.61) followed by cluster II (527.39). Cluster II consisted of five genotypes. It was nearest to cluster VI (494.93) followed by cluster I (527.39) and was farthest from cluster III (932.25) followed by cluster V (839.08). Cluster III, the second largest cluster, consisted of 12 genotypes. It was nearest to cluster VI (462.15) followed by cluster I (471.01) and was farthest from cluster II (932.25) followed by cluster VIII (714.36). Four genotypes were grouped into cluster IV. It was nearest to cluster I (429.60) followed by cluster VI (614.19) and was farthest from cluster VIII (910.11) followed by cluster VI (792.78).
Cluster V also consisted of four genotypes, like cluster IV. It was nearest to cluster VII (240.96) followed by cluster I (352.85) and was farthest from cluster VIII (1080.72) followed by cluster II (839.08). Cluster VI comprised three genotypes. It was nearest to cluster I (280.93) followed by cluster II (755.34) and was farthest from cluster VII (4451.42) followed by cluster VI (3163.07). Cluster VI consisted of five genotypes. It was closest to cluster VII (300.16) followed by cluster VIII (421.52) and was farthest from cluster V (660.49) followed by cluster IV (614.19). Cluster VII consisted of nine genotypes. It was closest to cluster V (240.96) followed by cluster VI (300.16) and was farthest from cluster VIII (738.30) followed by cluster IV (640.97). Cluster VIII was the largest cluster, consisting of 20 genotypes. It was nearest to cluster VI (421.52) followed by cluster II (618.80) and was farthest from cluster V (1080.72) followed by cluster IV (910.11).
The maximum inter-cluster distance was between cluster V and cluster VIII (1080.72), followed by cluster II and cluster III (932.25), cluster IV and cluster VIII (910.11), cluster VII and cluster VIII (738.30), cluster I and cluster VIII (732.61), cluster VI and cluster V (660.49) and cluster II and cluster VI (494.93). This suggested that there is wide genetic diversity between these clusters. Based on these studies, crosses can be made between genotypes of these clusters to obtain desirable results either in transgressive breeding or in heterosis breeding.
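As an illustration of how cluster pairs might be shortlisted for crossing, the sketch below ranks the inter-cluster distances quoted in the text from largest to smallest. Only values stated above are used; the variable names are illustrative, and the full matrix in Table 3 would be the proper input.

```python
# Inter-cluster D^2 values as quoted in the text (not the full Table 3 matrix)
reported_d2 = {
    ("V", "VIII"): 1080.72,
    ("II", "III"): 932.25,
    ("IV", "VIII"): 910.11,
    ("VII", "VIII"): 738.30,
    ("I", "VIII"): 732.61,
    ("V", "VI"): 660.49,
    ("II", "VI"): 494.93,
    ("V", "VII"): 240.96,
}

# Rank cluster pairs from most to least divergent
ranked = sorted(reported_d2.items(), key=lambda item: item[1], reverse=True)
most_divergent = ranked[0][0]
least_divergent = ranked[-1][0]

# Pairs involving cluster VIII, which recurs among the most distant pairs
cluster_viii_pairs = [pair for pair, _ in ranked if "VIII" in pair]
```

Cluster VIII appears in four of the five largest distances quoted, which is why its genotypes feature prominently as candidate parents.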
Cluster mean values
The cluster means of all 20 characters are presented in Table 4. Cluster means indicate the average performance of all genotypes present in a particular cluster. The data indicated a wide range of mean values between the clusters.
For days to 50% flowering, the lowest cluster mean was recorded for cluster II (36.83) followed by cluster II (36.90) and cluster VI (37.00), and the highest for cluster VIII (42.50). For days to maturity, cluster II and cluster III (67.33) showed the lowest cluster mean and cluster IV (73.44) the highest. For days to shattering, the lowest cluster mean was recorded for cluster I (77.19) followed by cluster III (77.29), and the highest for cluster VIII (83.50). For plant height, the highest cluster mean was recorded for cluster IV (57.95) followed by cluster II (44.08), and the lowest for cluster V (30.84). For primary branches per plant, the lowest cluster mean was recorded for cluster V (3.06) followed by cluster VII (3.51) and cluster III (3.53), and the highest for cluster II and cluster VIII (4.26). For leaf length, cluster VI (15.31) showed the highest cluster mean followed by cluster IV (12.23) and cluster V (12.08), and cluster VII (9.92) the lowest. For leaf width, the highest cluster mean was recorded for cluster IV (11.53) followed by cluster VI (11.24), and the lowest for cluster VII (6.72).
The number of pod clusters per plant recorded the maximum cluster mean for cluster II (8.28) followed by cluster VIII (7.84), and the minimum for cluster V (4.86). The number of pods per cluster showed the highest cluster mean for cluster II (5.53) followed by cluster VII (5.25), and the lowest for cluster IV (4.61). The number of pods per plant showed the maximum cluster mean for cluster VI (48.90) followed by cluster II (45.95), and the minimum for cluster V (20.58). Seeds per pod showed the highest cluster mean for cluster VI (17.17) followed by cluster V (16.23), and the lowest for cluster IV (13.94). Pod length showed the maximum cluster mean for cluster VI (9.17) followed by cluster V (8.73), and the minimum for cluster VII (6.32). For 100-seed weight, the highest cluster mean was observed in cluster VIII (6.20) followed by cluster VI (5.74), and the lowest in cluster VII (3.48). For shelling %, the highest cluster mean was found in cluster VIII (6.20) followed by cluster VI (63.83), and the lowest in cluster V (28.93). Biological yield showed the maximum cluster mean for cluster VI (45.68) followed by cluster VIII (42.92), and the minimum for cluster VII (13.32). For harvest index, the maximum cluster mean was observed in cluster VIII (39.42) followed by cluster IV (36.18), and the minimum in cluster V (25.45). For protein content, the highest cluster mean was for cluster VII (25.00) followed by cluster VI (24.27), and the lowest was found in cluster VIII (20.29). For calcium content, the highest cluster mean was for cluster V (120.63) followed by cluster II (119.68), and the lowest was found in cluster III (102.14). For seed hardness, the highest cluster mean was found in cluster V (4.53) followed by cluster II (3.17), and the lowest in cluster I (2.61). For seed yield per plant, the highest cluster mean was observed in cluster VIII (18.09) followed by cluster VI (12.49), and the lowest in cluster VII (4.48).
Thus, cluster VIII and cluster IV showed high mean values for most of the yield-contributing traits, such as 100-seed weight, shelling %, harvest index, pod length, primary branches per plant, days to 50% flowering, days to maturity, leaf width and days to shattering. Therefore, the genotypes from cluster IV (SPS-1, SPS-2-4, SPS-6-1-11, BPMR-145) and cluster VIII (SPS-6-4-2, SPS-45-1, SPS-42-1-10, SPS-40-1-11, SPS-40-1-13, SPS-6-20-3, SPS-6-44-13, SPS-6-47-3, SPS-6-53-13, SPS-6-67-3, SPS-7-4-16, SPS-8-20-7, SPS-42-1-12-1, SPS-6-42-9, SPS-6-52-5, SPS-6-52-6, SPS-6-52-15, SPS-6-67-8, SPS-7-4-10, SPS-8-20-4) can be used in mung bean yield improvement programmes. Similar results were reported by Raje and Rao (2001), Haritha and Reddy (2002), Gupta et al. (2004), Ahmed et al. (2006), Yimram et al. (2009), Ajay et al. (2012), Gokulakrishnan et al. (2012), Gadakh et al. (2013), Sarkar and Kundagrami (2016) and Razzaque et al. (2016).
Principal component analysis
PCA confirms the group constellations obtained by D² analysis. It determines the effective number of axes of differentiation (primary, secondary and so on) based on the number of canonical vectors. As a multivariate statistical technique, principal component analysis (PCA) transforms a number of possibly correlated variables into a smaller number of uncorrelated variables called principal components. The eigen values are often used to determine how many components to retain. An eigen value lower than one means that the explanatory power of that principal component is lower than the average explanatory power of the original variables.
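A minimal sketch of PCA on the correlation matrix, with the eigen-value-greater-than-one retention rule described above, is given below. It uses NumPy on randomly generated toy data; the function name `pca_correlation` and the data are assumptions for illustration and do not reproduce the study's analysis.

```python
import numpy as np

def pca_correlation(data):
    """PCA via eigen-decomposition of the correlation matrix.

    data: (genotypes x traits) array. Returns eigen values (descending),
    loadings (eigenvectors as columns) and principal component scores.
    """
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)  # standardize
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # returned in ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = z @ eigvecs
    return eigvals, eigvecs, scores

# Hypothetical toy data: 30 genotypes x 5 traits, two of them correlated
rng = np.random.default_rng(0)
toy = rng.normal(size=(30, 5))
toy[:, 1] += toy[:, 0]

eigvals, loadings, scores = pca_correlation(toy)
retained = int((eigvals > 1.0).sum())        # components with eigen value > 1
pct_var = 100 * eigvals / eigvals.sum()      # per cent variability per component
```

Because the eigen values of a correlation matrix sum to the number of traits, their average is one, which is what makes an eigen value of one a natural cut-off.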
The eigen values (variances), per cent variability, cumulative per cent variability and component loadings of the different characters are given in Table 5. The principal component analysis retained only the significant principal components out of the total of 20. The characters contributing most to the variance are easily identified as those loaded on PC1 with high loading values.
Generally, an eigen value higher than one can be used as an inclusion criterion; principal components with eigen values less than one were considered non-significant. In the present investigation, seven principal components (PC1 to PC7) with eigen values greater than one were extracted from the original data and together contributed 76.70% of the total variation. The remaining 13 components contributed only 23.30% of the total variation for this set of 70 breeding lines evaluated for 20 quantitative traits. These principal component scores might be used to summarize the original 20 variables in any further analysis of the data. PCA scores for the first seven principal components are given in Table 5.
PC1 to PC7 accounted for 25.12%, 14.13%, 10.33%, 8.99%, 7.04%, 5.88% and 5.21% of the variability, respectively, among different traits in the advanced breeding lines. Among the seven principal components, the first two made the major contribution (39.25%) towards total variation. The eigen value and variance associated with each successive principal component decreased gradually, while the cumulative variability increased gradually. The result of the principal component analysis is depicted as character loading scores in Fig 1.
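The reported percentages can be checked with simple arithmetic. The sketch below accumulates them and, assuming the PCA was run on the correlation matrix of 20 traits (so that the eigen values sum to 20), back-calculates the eigen value implied by each percentage. These implied eigen values are a consistency check, not figures taken from Table 5.

```python
# Per cent variability reported for PC1 to PC7
pct = [25.12, 14.13, 10.33, 8.99, 7.04, 5.88, 5.21]
n_traits = 20  # eigen values of a 20-trait correlation matrix sum to 20

# Cumulative per cent variability
cumulative = []
running = 0.0
for p in pct:
    running += p
    cumulative.append(round(running, 2))

# Eigen value implied by each percentage under the correlation-matrix assumption
implied_eigenvalues = [round(p / 100 * n_traits, 3) for p in pct]

# Share left for the remaining 13 components
remaining = round(100 - cumulative[-1], 2)
```

The cumulative values reach 39.25% after two components and 76.70% after seven, leaving 23.30%, and every implied eigen value exceeds one (the smallest being about 1.04 for PC7), in line with the retention rule.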
The characters effective in the first factor had a high level of loading coefficients. The first principal component (PC1) had high positive component loadings from 100-seed weight (0.401) followed by pods per plant (0.333), harvest index (0.326), pod length (0.289) and seed yield per plant (0.277). On the other hand, negative component loadings were observed in PC1 for protein content (-0.196) followed by seeds per pod (-0.145), shelling % (-0.056) and leaf length (-0.081). The positive component loadings indicated that selection for yield-associated traits would be effective based on PC1. This revealed that PC1 accounted for the maximum variability of yield-associated traits.
The major contributing characters for variation in the second principal component (PC2) were plant height (0.484) followed by days to maturity (0.285), leaf width (0.283), days to 50% flowering (0.251), protein content (0.191) and calcium content (0.066). High negative component loadings were shown by the characters leaf length (-0.306) followed by clusters per plant (-0.278), pods per cluster (-0.273), seed hardness (-0.259) and pods per plant (-0.223).
Likewise, the important traits shelling % (0.451) followed by biological yield per plant (0.408), seed yield per plant (0.369), seeds per pod (0.153) and clusters per plant (0.094) contributed more variation to PC3. Characters such as calcium content (-0.390) followed by seed hardness (-0.346), leaf length (-0.244), pods per plant (-0.194) and leaf width (-0.181) showed high negative loadings.
Similarly, days to shattering (0.508) followed by days to 50% flowering (0.424), days to maturity (0.372), leaf length (0.263) and seeds per pod (0.167) contributed more variation to PC4, while the characters with high negative loadings in PC4 were pod length (-0.358) followed by plant height (-0.255), shelling % (-0.210), calcium content (-0.189) and harvest index (-0.176).
PC5 had high positive loadings for the characters calcium content (0.305) followed by harvest index (0.277), days to maturity (0.231), days to 50% flowering (0.198) and leaf length (0.176), whereas it had high negative loadings for seeds per pod (-0.526) followed by primary branches per plant (-0.438), leaf width (-0.275), protein content (-0.222) and pod length (-0.154).
PC6 had high positive loadings for the characters leaf length (0.395) followed by leaf width (0.283), pod length (0.218), biological yield per plant (0.129) and primary branches per plant (0.111). High negative loadings were shown by protein content (-0.460) followed by pods per cluster (-0.389), seed hardness (-0.290), days to maturity (-0.275) and clusters per plant (-0.271).
PC7 had high positive loadings for characters such as calcium content (0.444) followed by biological yield per plant (0.423), seed yield per plant (0.385), protein content (0.240) and days to 50% flowering (0.050). In the same principal component, the characters that showed negative loadings were shelling % (-0.423) followed by pods per cluster (-0.086), leaf width (-0.206), leaf length (-0.142) and days to shattering (-0.094).
Usually, only one variable is selected from each of these identified groups. Hence, for the first group, 100-seed weight was the best choice, as it had the largest loading on PC1. Likewise, plant height, shelling %, days to shattering, calcium content, leaf length and calcium content were the best choices for the second, third, fourth, fifth, sixth and seventh principal components, respectively.
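The choice of one representative character per component can be sketched from the loadings quoted above. The selections in the text match the largest positive loading on each component; the snippet also shows what the largest absolute loading would give, since the text does not say which rule was intended. Only a subset of the reported loadings for three components is included, and the variable names are illustrative.

```python
# Subset of loadings quoted in the text for three components
reported_loadings = {
    "PC1": {"100-seed weight": 0.401, "pods per plant": 0.333,
            "harvest index": 0.326, "protein content": -0.196},
    "PC5": {"calcium content": 0.305, "harvest index": 0.277,
            "seeds per pod": -0.526, "primary branches per plant": -0.438},
    "PC6": {"leaf length": 0.395, "leaf width": 0.283,
            "protein content": -0.460, "pods per cluster": -0.389},
}

# Rule 1: character with the largest positive loading (matches the text)
by_positive = {pc: max(load, key=load.get)
               for pc, load in reported_loadings.items()}

# Rule 2: character with the largest absolute loading (alternative convention)
by_absolute = {pc: max(load, key=lambda trait: abs(load[trait]))
               for pc, load in reported_loadings.items()}
```

Both rules agree for PC1 (100-seed weight), but the absolute-value rule would pick seeds per pod for PC5 and protein content for PC6 instead of calcium content and leaf length.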
The PCA scores for the 70 breeding lines on the first three principal components, each with an eigen value greater than one, were computed. These three PCA scores for the 70 genotypes were plotted to obtain a 3D scatter diagram (PCA I as X axis, PCA II as Y axis and PCA III as Z axis) (Fig 2).
Genotypes from divergent clusters, such as SPS-42-1-8, SPS-6-1-26 and SPS-7-4-10, were scattered far apart in the 3D plot (Fig 2). Crosses between these genotypes would be expected to give better recombinants, which could be utilized for the development of elite cultivars in mung bean.
The principal component scores of genotypes were used as input for clustering procedures in order to group the genotypes into various clusters and to confirm the results of principal component analysis. Principal component scores were used as characters instead of attributes for clustering procedures, making the results equivalent to those from initially standardized data as the correlation matrix was used for principal component analysis.
From the 3D graph, it can be concluded that the breeding lines placed far apart are SPS-42-1-8, SPS-6-1-26 and SPS-7-4-10. These lines were also grouped in different clusters by D² analysis. These lines are therefore quite divergent from each other and can be utilized in future for developing elite cultivars in mung bean breeding programmes. The results of the D² analysis are confirmed by the PCA, which indicates that these genotypes are divergent from each other with maximum inter-cluster distance.