Eigenvalues and eigenvectors were computed by principal component analysis (PCA) of the ninety-eight accessions using a correlation matrix. Table 1 presents the eigenvalues along with the cumulative percentage of variation explained by the eigenvectors. The first six principal components contributed 93.3% of the total variation. PC1 may represent overall yield potential. PC2 could indicate quality or compositional traits (e.g., kernel colour or seed size/weight). This separation could be helpful in genotype selection, where high PC1 and moderate PC2 scores identify high-yielding varieties. PC1 appears to capture variation related to traits such as PFW, HPW, PDW and HSHW, since these have the highest positive loadings. PC2 reflects traits like seed weight and kernel characteristics (HSW, KC), with strong negative loadings. SP (shelling percentage) and KC (kernel colour) contribute negatively to PC1, possibly indicating contrast or trade-offs with yield components. These findings are in agreement with
Vishnuprabha et al., (2024) and
Sukrutha et al., (2023), who highlighted pod and kernel traits as primary contributors in groundnut diversity studies.
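A minimal sketch of the first step described above, extracting the leading principal component from a correlation matrix, may help readers who want to reproduce the idea. It uses power iteration in plain Python rather than whatever statistical software the study used. The trait columns are hypothetical illustration data, not the study's measurements, and the function names are illustrative.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix for equal-length trait columns (none constant)."""
    n = len(columns[0])
    means = [sum(col) / n for col in columns]
    centered = [[x - m for x in col] for col, m in zip(columns, means)]
    norms = [math.sqrt(sum(x * x for x in col)) for col in centered]
    p = len(columns)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
         for j in range(p)]
        for i in range(p)
    ]

def leading_component(matrix, iterations=500):
    """Power iteration: largest eigenvalue and its unit-length eigenvector."""
    p = len(matrix)
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        length = math.sqrt(sum(x * x for x in product))
        vec = [x / length for x in product]
    # Rayleigh quotient gives the eigenvalue for the converged unit vector
    product = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(v * w for v, w in zip(vec, product))
    return eigenvalue, vec

# Hypothetical data: three traits measured on five accessions
traits = [
    [10.0, 12.0, 14.0, 16.0, 18.0],
    [20.0, 24.0, 28.0, 32.0, 36.0],  # perfectly correlated with the first trait
    [5.0, 3.0, 6.0, 2.0, 4.0],
]

eigenvalue, loadings = leading_component(correlation_matrix(traits))
# With a correlation matrix the eigenvalues sum to the number of traits,
# so the share of variance explained by PC1 is eigenvalue / number of traits.
share = eigenvalue / len(traits)
```

Because the two correlated traits move together, they receive equal loadings on the first component, while the third trait loads with the opposite sign. This is the same kind of pattern the text interprets as a contrast between trait groups.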
Fig 1 illustrates the scree plot, depicting the variation explained by the successive principal components for the groundnut accessions. The eigenvalue of the first principal component represented 36.7% of the total variance. The second to seventh principal components accounted for 15.2%, 14.6%, 10.4%, 9.7%, 6.7% and 4% of the total, respectively. The scree plot supports the use of 2 to 3 principal components for further analysis, depending on the desired trade-off between simplicity and information retention. Beyond PC3, the marginal gain in explained variance is minimal, indicating that those dimensions can likely be excluded without significant loss of important data structure.
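The cumulative figures behind the scree plot can be checked with a few lines of arithmetic. The sketch below takes the seven percentages listed above in order as PC1 to PC7; that ordering is an assumption, chosen because it reproduces the stated 93.3% for the first six components. The function names are illustrative.

```python
# Percentage of total variance per component, as listed in the text
# (assumed to be PC1..PC7 in order)
explained = [36.7, 15.2, 14.6, 10.4, 9.7, 6.7, 4.0]

def cumulative(values):
    """Running total of explained variance, rounded to one decimal place."""
    total = 0.0
    running = []
    for value in values:
        total += value
        running.append(round(total, 1))
    return running

def components_needed(values, threshold):
    """Smallest number of leading components whose cumulative share reaches threshold."""
    for count, share in enumerate(cumulative(values), start=1):
        if share >= threshold:
            return count
    return None

cumulative_share = cumulative(explained)
n_for_90 = components_needed(explained, 90.0)
n_for_65 = components_needed(explained, 65.0)
```

Under this reading, three components cover about two thirds of the variation and six are needed to pass 90%. That is the trade-off between simplicity and information retention that the scree plot is used to judge.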
The relative weight given to each variable in each of the nine principal components is represented by the eigenvector, which has been scaled so that the greatest element in each vector is set to one. The orientation of each vector depends on the variance and covariance associated with the principal component. The aim of principal component analysis is to develop new variables that together account for about 90% of the total variance in the dataset
(Vishnuprabha et al., 2025 and
Sudhir et al., 2010).
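The scaling convention mentioned above, in which the greatest element of each eigenvector is set to one, can be sketched as follows. The input vector is hypothetical, and choosing the element of largest absolute value as the divisor is an assumption about how "greatest" is meant.

```python
def scale_to_unit_max(vector):
    """Divide by the element of largest magnitude so that element becomes 1."""
    pivot = max(vector, key=abs)
    if pivot == 0:
        raise ValueError("cannot scale a zero vector")
    return [x / pivot for x in vector]

# Hypothetical eigenvector elements for illustration
raw = [0.47, 0.45, 0.44, 0.26, -0.26]
scaled = scale_to_unit_max(raw)
```

If the largest-magnitude element is negative, this scaling flips every sign. That does not change the component, because the sign of an eigenvector is arbitrary.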
The study incorporates nine quantitative traits in the PCA, and the dataset was decomposed into nine principal components (PCs). Table 2 indicates that the first three PCs possess eigenvalues greater than 1, collectively accounting for 66.5% of the variability. PC1, with an eigenvalue of 1.81, accounted for 36.7% of the total variability. On PC1, PDW (0.47), HSHW (0.45), PFW (0.44) and HPW (0.44) show the highest positive loadings, followed by HSW (0.26), DFF (0.16) and PH (0.16). In contrast, PC1 loads negatively on KC and SP, with magnitudes of 0.08 and 0.26, respectively. A similar result was reported by
Dama, (2022) for the PC1 eigenvalue.
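To see which traits dominate PC1, the loadings quoted above can be ranked by absolute value. The sketch below uses the values from the text. The negative signs on KC and SP are an assumption based on the text describing them as negatively related to PC1.

```python
# PC1 loadings as quoted in the text; negative signs for KC and SP are assumed
pc1_loadings = {
    "PDW": 0.47, "HSHW": 0.45, "PFW": 0.44, "HPW": 0.44,
    "HSW": 0.26, "DFF": 0.16, "PH": 0.16,
    "KC": -0.08, "SP": -0.26,
}

def rank_by_magnitude(loadings):
    """Traits sorted from largest to smallest absolute loading."""
    return sorted(loadings.items(), key=lambda item: abs(item[1]), reverse=True)

ranked = rank_by_magnitude(pc1_loadings)
top_traits = [trait for trait, _ in ranked[:4]]

# For a unit-length eigenvector the squared loadings sum to 1, so each
# squared loading approximates that trait's share of the component.
sum_of_squares = sum(value ** 2 for value in pc1_loadings.values())
```

The squared loadings sum to roughly one, which is what a unit-length eigenvector would give. The four pod and shell traits together make up most of that total.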
PC2 has an eigenvalue of 1.16 and represents 15.2% of the total variability, as shown in Table 2. PC2 shows weak positive loadings for DFF (0.18), PDW (0.17), PFW (0.14), PH (0.04) and HSHW (0.02). It shows strong negative loadings for KC (-0.65) and HSW (-0.62) and weaker negative loadings for HPW (-0.28) and SP (-0.17); PH is also reported with a weak negative loading (-0.086). Similar results were reported by
Mubai et al., (2020) and
Danalakoti et al., (2024).
Biplot analysis
Fig 2 illustrates the PCA biplot, divided into four quadrants, with Dimension 1 on the X-axis and Dimension 2 on the Y-axis, the two dimensions accounting for the maximum variance. The biplot shows the variance among traits and genotypes, with colour indicating the degree of this variance. Green represents high variance, whereas yellow, brown and black denote moderate to low variance. Quadrant 1 (Q-1) is located to the upper right of the centre and has a positive effect on both Dimension 1 and Dimension 2. Seventeen genotypes in Q-1, specifically 1, 75, 82, 54, 63, 25, 12, 4, 78, 60, 41, 88, 72, check 1, check 2, check 3 and check 5, are strongly associated with PDW, followed by HSHW, HSW and PH.
Mekonen, (2022),
Kohar et al., (2023) and
Isler et al., (2023) reported similar findings from a biplot of groundnut accessions, in which hundred-seed weight, plant height and shelling percentage were positioned on the positive side of the graph.
Thirty-two accessions, namely genotypes 58, 51, 49, 43, 85, 23, 59, 86, 94, 52, 56, 39, 14, 23, 26, 3, 61, 70, 22, 81, 95, 44, 90, 67, 87, 31, 68, 38, 35, 24, 20 and 42, are classified in quadrant 2 (Q-2), located to the upper left of the centre. This position indicates a positive effect on Dim.2 and a negative effect on Dim.1, thus influencing the variance in traits such as SP and DFF. Genotypes 58 and 52 and the trait SP exhibit significant diversity and are suitable for selection purposes.
The third quadrant (Q-3) is located to the lower left of the centre and has a negative effect on both Dimension 1 and Dimension 2. Twenty-five genotypes fall within this quadrant: 77, 92, 27, 15, 69, 89, 36, 91, 18, 58, 84, 62, 8, 48, 6, 46, 17, 9, 2, 19, 66, 93, 10, 79 and check 4.
In quadrant 4 (Q-4), to the lower right of the centre, Dim.1 exhibits positive effects while Dim.2 shows negative effects. The variables PFW and HPW are strongly associated with 24 genotypes: 7, 28, 47, 37, 30, 64, 40, 50, 76, 5, 80, 13, 45, 57, 34, 32, 88, 55, 21, 83, 11, 71 and 73. Genotypes 73, 83 and 21 and the trait HPW contribute substantially to the variance in Q-4.
Mamun et al., (2022) and
Sukrutha et al., (2023) conducted principal component analyses that revealed a significant contribution of hundred-kernel weight, alongside plant height, to total variability.
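The quadrant assignment used in the biplot description follows directly from the signs of the two dimension scores. The sketch below assumes each genotype has a (Dim.1, Dim.2) score pair. The genotype labels and scores are hypothetical, and treating a score of exactly zero as positive is an arbitrary choice.

```python
def quadrant(dim1, dim2):
    """Quadrant labels as in the text: Q-1 (+,+), Q-2 (-,+), Q-3 (-,-), Q-4 (+,-)."""
    if dim1 >= 0 and dim2 >= 0:
        return "Q-1"
    if dim1 < 0 and dim2 >= 0:
        return "Q-2"
    if dim1 < 0 and dim2 < 0:
        return "Q-3"
    return "Q-4"

# Hypothetical (Dim.1, Dim.2) scores for four illustrative genotypes
scores = {
    "G-A": (1.8, 0.6),
    "G-B": (-0.9, 1.2),
    "G-C": (-1.4, -0.7),
    "G-D": (2.3, -1.1),
}

groups = {}
for name, (dim1, dim2) in scores.items():
    groups.setdefault(quadrant(dim1, dim2), []).append(name)
```

Grouping the genotypes this way gives the per-quadrant lists reported in the text. A genotype should appear in exactly one quadrant, which also makes the grouping a quick consistency check on such lists.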
The biplot representation clearly discriminated the genotypes and revealed that PFW, PDW and NPPP are the most important variables in determining diversity. Hence, these traits can be used to select genetically diverse and superior genotypes. Similar findings were reported by
Vishnuprabha et al., (2024) and
Sukrutha et al., (2023), who also identified pod and kernel traits as major contributors to diversity.
Ward’s hierarchical cluster analysis, using nine principal component scores that accounted for over 90% of total variation, identified eight distinct clusters (Fig 3). The first cluster consisted of four accessions, including one check variety and one novel germplasm line. The second cluster included 21 accessions, comprising 4 check varieties and 17 new germplasm lines. The third cluster includes 12 new varieties, while the fourth cluster consists of 29 new germplasm lines. The fifth cluster contains 14 new germplasm lines, while the sixth cluster consists of 13 new germplasm lines. The seventh cluster comprises 8 accessions, while the eighth cluster contains 15 germplasm accessions. Similarly,
Upadhyaya et al., (2006) evaluated a groundnut core collection and identified three clusters through principal component analysis. These results agree with earlier reports by
Kumar et al., (2024) and
Patel et al., (2024) who also emphasized the role of pod and yield traits in genetic diversity analysis.
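Ward's method merges, at each step, the two clusters whose union gives the smallest increase in within-cluster sum of squares. The sketch below is a naive plain-Python illustration of that criterion applied to principal component scores. It is not the software routine used in the study, and the score values are hypothetical.

```python
def ward_clusters(points, n_clusters):
    """Agglomerate points until n_clusters remain, merging by Ward's criterion."""
    # Each cluster is stored as (member indices, centroid)
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                members_a, centroid_a = clusters[a]
                members_b, centroid_b = clusters[b]
                na, nb = len(members_a), len(members_b)
                dist_sq = sum((x - y) ** 2 for x, y in zip(centroid_a, centroid_b))
                # Increase in within-cluster sum of squares if a and b merge
                cost = na * nb / (na + nb) * dist_sq
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        members_a, centroid_a = clusters[a]
        members_b, centroid_b = clusters[b]
        na, nb = len(members_a), len(members_b)
        merged_centroid = [
            (na * x + nb * y) / (na + nb) for x, y in zip(centroid_a, centroid_b)
        ]
        merged = (members_a + members_b, merged_centroid)
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return [sorted(members) for members, _ in clusters]

# Hypothetical scores on two principal components for six accessions
pc_scores = [
    (2.1, 0.5), (1.9, 0.4), (2.0, 0.6),
    (-1.8, -0.3), (-2.2, -0.5), (-2.0, -0.4),
]
groups = sorted(ward_clusters(pc_scores, n_clusters=2))
```

For a dataset of the size described here, a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="ward"` would be the more usual choice. The loop above is only meant to make the merging rule explicit.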