Indian Journal of Agricultural Research

  • Chief Editor: V. Geethalakshmi

  • Print ISSN 0367-8245

  • Online ISSN 0976-058X

  • NAAS Rating 5.60

  • SJR 0.293

Frequency :
Bi-monthly (February, April, June, August, October and December)
Indexing Services :
BIOSIS Preview, ISI Citation Index, Biological Abstracts, Elsevier (Scopus and Embase), AGRICOLA, Google Scholar, CrossRef, CAB Abstracting Journals, Chemical Abstracts, Indian Science Abstracts, EBSCO Indexing Services, Index Copernicus

Principal Component Analysis and Diversity Studies in Sunflower Lines (Helianthus annuus)

Harshavardan J. Hilli1,*, Shobha Immadi1
1Department of Genetics and Plant Breeding, University of Agricultural Sciences, Dharwad-580 025, Karnataka, India.
Background: Sunflower is a highly cross-pollinated crop with high yield potential. Because it is photo-insensitive, it can be grown in all seasons, fits a variety of intercropping and sequential cropping systems and adapts to a wide range of environmental conditions. Keeping this potential and future use in mind, the available lines were studied through principal component analysis (PCA) and diversity analysis to identify material that might be used in hybridization programmes.

Methods: Forty-five sunflower lines were grown in a randomised block design with three replications at the Research Farm, Department of Genetics and Plant Breeding, College of Agriculture, Vijayapur during summer 2018-19 and evaluated by principal component analysis and diversity analysis.

Result: At the genotypic level, the analysis of variance showed highly significant variation among the genotypes for all characters. Divergence analysis using principal component analysis and hierarchical cluster analysis proved effective in determining which genetically distant parents should be used for hybridization. The genotypes were grouped into nine clusters and, out of nine principal components, four had an eigenvalue greater than one (Table 2). The first principal component explained 33.40 per cent of the total variation, while the second to ninth principal components explained 23.30, 14.10, 11.19, 6.30, 4.90, 3.60, 1.90 and 0.60 per cent, respectively.
Sunflower (Helianthus annuus L.), the world’s second-largest oilseed crop after soybean, is a member of the Asteraceae (Compositae) family that originated in temperate North America and has a high content of unsaturated fatty acids with a desirable consistency. It is a diploid crop with a 3000 Mb haploid genome and a chromosome number of 2n = 34. Sunflower has been grown successfully in a wide variety of locations around the world. It is a highly cross-pollinated crop with high yield potential that adapts to a variety of environmental conditions and, being photo-insensitive, can be grown in all seasons and in a variety of intercropping and sequential cropping systems. At present it is cultivated on 20.00 million hectares globally, with a production of 30.00 million tonnes and a productivity of 1,500 kg ha⁻¹. Asia accounts for nearly 20-22 per cent of the world’s sunflower area and contributes 18 per cent of total production.

India is the leading country in the Asian sub-continent, with an area of 0.29 million hectares, an annual production of 0.21 million tonnes and a productivity of 738 kg ha⁻¹. The leading sunflower-growing states in India are Karnataka, Punjab, Bihar, Orissa, Andhra Pradesh and Maharashtra, and in recent years cultivation has been expanding to non-traditional states such as Haryana, Uttar Pradesh, Gujarat and West Bengal. Karnataka is the leading state in India, contributing 64 per cent of the total area and 54 per cent of the total production, and is popularly known as the “sunflower state”. In Karnataka, the crop is cultivated over an area of 0.44 million hectares, with an annual production of 0.29 million tonnes and an average productivity of 530 kg ha⁻¹.

Adequate information about heritability, variability and the degree of association among different traits is needed for further yield improvement. Genetic tools have been proposed for developing sustainable solutions to basic crop constraints, but this is difficult because of the large number of sources of variation and the lack of adequate evaluation and classification techniques. A variety of statistical methods are available for grouping lines in order to select different types, although the data can become unmanageable at times. PCA and cluster analysis (Peeters and Martinelli, 1989) are therefore useful tools for identifying important characters. The parents for a hybridization programme should be chosen based on the magnitude of genetic distance, the contribution of different characters to total divergence and the magnitude of cluster means for distinct characters with the most heterosis. Effective hybrid breeding and selection require a large amount of variability in the traits under study. Hence it is necessary to study the variability of quantitative characters with reference to genetic parameters.

PCA is a multivariate technique that converts the data into a set of new orthogonal variables, called principal components, which are linear combinations of the original variables and account for most of their variance (Abdi and Williams, 2010). In order to treat qualitative variables, PCA can be generalised as correspondence analysis (CA) and as multiple factor analysis (MFA). Divergence analysis using principal component analysis and hierarchical cluster analysis has been shown to be effective in determining which genetically distant parents should be used for hybridization (Chaudhary et al., 2015).
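The transformation described above can be illustrated with a minimal sketch. It assumes PCA on the correlation matrix of standardized traits and extracts only the first component by power iteration. The trait values are hypothetical, the function names are illustrative, and the sketch is written in Python for illustration only (the study itself reports using R).

```python
import math

def standardize(rows):
    """Centre each trait (column) and scale it to unit sample standard deviation."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    sds = [math.sqrt(sum((x - m) ** 2 for x in col) / (n - 1))
           for col, m in zip(columns, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def correlation_matrix(z):
    """Correlation matrix R = Z'Z / (n - 1) of standardized data Z."""
    n, p = len(z), len(z[0])
    return [[sum(z[i][a] * z[i][b] for i in range(n)) / (n - 1)
             for b in range(p)] for a in range(p)]

def first_component(r, iterations=500):
    """Power iteration: largest eigenvalue of R and its unit-length eigenvector."""
    p = len(r)
    v = [1.0 + 0.1 * j for j in range(p)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = [sum(r[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(r[a][b] * v[b] for b in range(p)) for a in range(p))
    return eigenvalue, v

# HYPOTHETICAL data: six lines x three traits
# (plant height in cm, head diameter in cm, seed yield in kg per hectare).
lines = [
    [150.0, 14.0, 1800.0],
    [162.0, 15.5, 2100.0],
    [141.0, 12.5, 1500.0],
    [170.0, 17.0, 2400.0],
    [155.0, 13.0, 1900.0],
    [148.0, 16.0, 1700.0],
]

z = standardize(lines)
r = correlation_matrix(z)
eigenvalue, loadings = first_component(r)

# The trace of a correlation matrix equals the number of traits,
# so eigenvalue / p is the share of total variance explained by PC1.
share = eigenvalue / len(r)

# PC1 score of each line: linear combination of its standardized traits.
scores = [sum(zi * l for zi, l in zip(row, loadings)) for row in z]

print(f"PC1 explains {share:.1%} of the total variance")
```

Further components would be obtained the same way after removing the first one from the matrix (deflation); a statistical package would normally do this in a single call.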
The current investigation was conducted with 45 sunflower lines (Table 7) at the College of Agriculture, Vijayapura, during summer 2018 in H block, which is situated at 16°N latitude and 75°E longitude at an altitude of 593 m above mean sea level, with a mean annual rainfall of 750 mm. The agroclimatic region and soil conditions are considered good for crop establishment. Each line was grown in a 3 m long row, with 30 cm between plants and 60 cm between rows. Recommended agronomic practices were followed to raise a good crop. Days to 50% flowering, days to maturity, relative chlorophyll content measured with a SPAD chlorophyll meter at 45 and 60 days after sowing (DAS), head diameter, test weight, seed yield per hectare and oil content were recorded on five randomly tagged plants of each line in each replication. To assess genetic diversity among the lines, principal component analysis (PCA) (Hotelling, 1936) and cluster analysis were used. The data were analysed using the R programming language.
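For a trial laid out in a randomised block design, the mean squares for a single trait are obtained by partitioning the total sum of squares into line, replication and error components. The sketch below shows that partition on hypothetical values for four lines and three replications. The function name and the data are illustrative and are not taken from the study, which reports using R.

```python
def rbd_anova(data):
    """ANOVA for a randomised block design.

    data[i][j] is the value of line (treatment) i in replication (block) j.
    """
    t, r = len(data), len(data[0])
    grand_total = sum(sum(row) for row in data)
    correction = grand_total ** 2 / (t * r)

    total_ss = sum(x * x for row in data for x in row) - correction
    line_ss = sum(sum(row) ** 2 for row in data) / r - correction
    block_ss = sum(sum(data[i][j] for i in range(t)) ** 2
                   for j in range(r)) / t - correction
    error_ss = total_ss - line_ss - block_ss

    df_line, df_block = t - 1, r - 1
    df_error = df_line * df_block
    ms_line = line_ss / df_line
    ms_error = error_ss / df_error
    return {
        "df_line": df_line,
        "df_block": df_block,
        "df_error": df_error,
        "ms_line": ms_line,
        "ms_block": block_ss / df_block,
        "ms_error": ms_error,
        "f_line": ms_line / ms_error,  # compared with F(df_line, df_error)
    }

# HYPOTHETICAL trait values: 4 lines (rows) x 3 replications (columns).
example = [
    [10.0, 12.0, 11.0],
    [14.0, 15.0, 16.0],
    [9.0, 8.0, 10.0],
    [20.0, 19.0, 21.0],
]
result = rbd_anova(example)
print(result)
```

With 45 lines and three replications the same partition gives 44 degrees of freedom for lines, 2 for replications and 88 for error.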
 
Principal component analysis
 
The goal of principal component analysis is to find a small number of linear combinations that account for most of the variation in the data. The ANOVA for seed yield and its components in the sunflower lines (Table 1) showed significant differences for all traits studied. Principal component analysis was used in this study to identify the most important characters and to view them in a more visual, lower-dimensional form through linear combinations of variables that account for most of the variance in the original set of variables. Out of nine components, four had an eigenvalue greater than one (Table 2). The first principal component explained 33.40 per cent of the total variation, while the second to ninth principal components explained 23.30, 14.10, 11.19, 6.30, 4.90, 3.60, 1.90 and 0.60 per cent, respectively (Table 2 and Fig 1).
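The link between the percentages and the eigenvalue-greater-than-one rule can be checked with a short calculation. The sketch assumes the PCA was run on the correlation matrix of the nine traits, so that the eigenvalues sum to nine; the source does not state this. It uses the percentages reported in the abstract. Because those percentages sum to 99.29 rather than 100, the back-calculated eigenvalues are approximate.

```python
# Percentage of total variance for PC1..PC9, as reported in the abstract.
percent_variance = [33.40, 23.30, 14.10, 11.19, 6.30, 4.90, 3.60, 1.90, 0.60]
n_traits = len(percent_variance)

# ASSUMPTION: correlation-matrix PCA, so each eigenvalue is its share of
# variance multiplied by the number of traits.
eigenvalues = [p / 100 * n_traits for p in percent_variance]

# Running total of variance explained.
cumulative = []
running = 0.0
for p in percent_variance:
    running += p
    cumulative.append(round(running, 2))

# Components kept under the eigenvalue-greater-than-one rule.
retained = [i + 1 for i, ev in enumerate(eigenvalues) if ev > 1.0]

print("Retained components:", retained)
print("Cumulative variance of retained components:", cumulative[len(retained) - 1])
```

Under this assumption the first four components pass the threshold and together account for about 82 per cent of the total variation, which agrees with the four components reported in Table 2.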

Table 1: Analysis of variance (mean sum of square) for seed yield and its components in sunflower lines.



Table 2: Total variance explained by different principal components in sunflower lines.



Fig 1: Variances shown by different components.



The principal factor analysis (Kaiser, 1958) did not yield a simple picture of the interactions among the characters, as some factors had very high variable loadings and others low ones (Table 3). PF-1 loaded on plant height and SPAD reading at 45 days; similarly, PF-8 loaded on SPAD reading at 60 days and test weight. The highest loadings were recorded for PF-9 and PF-8.

Table 3: Factor loading of different characters with respect to different principal factor in sunflower lines.



Principal component analysis was also used in oat by Vaisi et al. (2013) and Krishna et al. (2014), who proposed transforming several correlated variables into a few independent principal components that explained much of the variation in the original data set. Hemavathy (2020) carried out a principal component analysis in sweet corn and found similar results. In Fig 2, a wider angle between trait vectors indicates greater divergence and a negative correlation, whereas a smaller (acute) angle indicates a stronger positive correlation. The results are also presented graphically in the figures.
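The angle rule for reading such a plot follows from the fact that, in a PCA biplot, the cosine of the angle between two trait vectors approximates the correlation between those traits. A minimal sketch, using hypothetical two-dimensional loading vectors that are not taken from Fig 2:

```python
import math

def angle_between(u, v):
    """Return (angle in degrees, cosine) between two trait vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine)), cosine

# HYPOTHETICAL loadings on (PC1, PC2) for three traits.
trait_a = (0.80, 0.30)
trait_b = (0.70, 0.45)    # points the same way as trait_a: acute angle
trait_c = (-0.75, -0.20)  # points the opposite way: obtuse angle

angle_ab, cos_ab = angle_between(trait_a, trait_b)
angle_ac, cos_ac = angle_between(trait_a, trait_c)

print(f"a vs b: {angle_ab:.1f} degrees, cosine {cos_ab:.2f}")
print(f"a vs c: {angle_ac:.1f} degrees, cosine {cos_ac:.2f}")
```

An angle below 90 degrees gives a positive cosine (positive association), an angle near 90 degrees a cosine near zero and an angle above 90 degrees a negative cosine. The approximation is only as good as the share of variance captured by the two plotted components.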

Fig 2: Figure depicting PCA and cluster distance between different components.



PCA allows individual differences to be depicted and possible groups to be identified. The reduction is achieved by linearly transforming the original variables into a new set of uncorrelated variables known as principal components. Maruthi Sankar et al. (1999) assessed the variability of eight plant traits for growth of sunflower and reduced the dimensionality to two principal components, which extracted about 80% of the variance in the original data.
 
Genotypic and phenotypic correlation
 
Although correlations are useful for identifying the main factors that determine final yield, they give only a partial picture of the relative importance of the direct and indirect effects of the individual components. Yield is a complex trait influenced by a number of other characters that can affect it both positively and negatively. Understanding the mechanism, consequences and causes of these relationships therefore aids the choice of breeding methods for increasing yield in sunflower. The correlation analysis showed that phenotypic correlations were lower in magnitude than genotypic correlations but similar in direction. The lower phenotypic associations indicate that the environment affects the expression of these traits. Some traits showed significant positive correlations and others significant negative correlations. Yield per plant and plant height were significantly and positively correlated, while the remaining traits showed significant negative correlations. Similar results were obtained by Arunkumar et al. (2004). This indicates that traits such as plant height and yield per plant can be used as major selection criteria for improving yield (Tables 5 and 6).
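The distinction between the two kinds of correlation can be made explicit with the usual variance-component estimators for a randomised block design with r replications. The source does not state which estimators were used, so the expressions below are an assumption. Here MS and MSP denote the mean squares and mean sums of products for genotypes (g) and error (e) for traits x and y.

```latex
% Variance components for trait x (assumed estimators)
\hat{\sigma}^2_{g}(x) = \frac{MS_{g}(x) - MS_{e}(x)}{r}, \qquad
\hat{\sigma}^2_{p}(x) = \hat{\sigma}^2_{g}(x) + MS_{e}(x)

% Covariance components for traits x and y, from mean sums of products
\hat{\sigma}_{g}(x,y) = \frac{MSP_{g}(x,y) - MSP_{e}(x,y)}{r}, \qquad
\hat{\sigma}_{p}(x,y) = \hat{\sigma}_{g}(x,y) + MSP_{e}(x,y)

% Genotypic and phenotypic correlation coefficients
r_{g}(x,y) = \frac{\hat{\sigma}_{g}(x,y)}{\sqrt{\hat{\sigma}^2_{g}(x)\,\hat{\sigma}^2_{g}(y)}}, \qquad
r_{p}(x,y) = \frac{\hat{\sigma}_{p}(x,y)}{\sqrt{\hat{\sigma}^2_{p}(x)\,\hat{\sigma}^2_{p}(y)}}
```

Because the phenotypic variances in the denominator of the second coefficient include the error variance, a large environmental component tends to pull the phenotypic correlation below the genotypic one, which is the pattern described above.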

Table 4: Cluster groups and average intra cluster distance.



Table 5: Genotypic correlation coefficient among various traits in sunflower lines.



Lines with high heritability and a high genotypic coefficient of variation should be selected, as should lines with greater plant height and higher seed yield per plant, which lead to an increase in total yield per hectare. Some traits show negative and others positive correlations, but breeders need to concentrate on the traits with significant positive correlations in order to improve crop yield.
 
Diversity analysis
 
Genetic diversity is required for any crop improvement programme, since it aids the generation of superior recombinants by allowing the selection of parents with greater variability for various traits. Many crops have lost diversity in recent decades as local types have been replaced by high-yielding varieties. A genetic divergence study assesses the genetic diversity of a population.

The D² statistic is a numerical method for quantifying genetic divergence in germplasm collections. The parents for a hybridization programme should be chosen based on the magnitude of genetic distance, the contribution of different characters to total divergence and the magnitude of cluster means for distinct characters with the most heterosis. The D² analysis of the 45 sunflower lines resulted in nine clusters: clusters I and II were the largest, with 20 and 14 lines respectively, while clusters VI to IX had one line each. Clusters III and IV had the greatest intra-cluster distance. Clusters II, III and IV had higher cluster means for most of the characters; hence lines 17, 11, 12, 45 and 43 can be used in crossing for crop improvement (Tables 4 and 7). Similar results were obtained by Poonia and Phogat (2017) while working in oat.
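As an illustration of the distance underlying such a grouping, the sketch below computes D² between two lines for two traits. It assumes the statistic is Mahalanobis' generalized distance, as is usual in divergence studies of this kind. The function name, trait means and error variance-covariance matrix are hypothetical.

```python
def mahalanobis_d2(mean_a, mean_b, cov):
    """D^2 = d' S^-1 d for two traits.

    mean_a, mean_b: trait means of two lines (length 2).
    cov: 2 x 2 pooled error variance-covariance matrix S.
    """
    d0 = mean_a[0] - mean_b[0]
    d1 = mean_a[1] - mean_b[1]
    (s00, s01), (s10, s11) = cov
    det = s00 * s11 - s01 * s10
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Closed-form inverse of a 2 x 2 matrix.
    inv = [[s11 / det, -s01 / det],
           [-s10 / det, s00 / det]]
    return (d0 * (inv[0][0] * d0 + inv[0][1] * d1)
            + d1 * (inv[1][0] * d0 + inv[1][1] * d1))

# HYPOTHETICAL means (head diameter in cm, test weight in g) for two lines
# and a HYPOTHETICAL pooled error variance-covariance matrix.
line_17 = (16.0, 5.2)
line_43 = (13.0, 4.0)
error_cov = [[2.0, 0.3],
             [0.3, 0.5]]

d2 = mahalanobis_d2(line_17, line_43, error_cov)
print(f"D^2 between the two lines: {d2:.2f}")
```

With more traits the same quadratic form is evaluated with a full matrix inverse. The lines are then grouped so that the average D² within a cluster is smaller than between clusters, and parents are typically drawn from clusters that lie far apart.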

Table 6: Phenotypic correlation coefficient among various traits in sunflower lines.



Table 7: Sunflower lines used for experiment.

Knowledge of germplasm variability and of the genetic relationships among breeding materials can be a valuable asset in crop improvement. A variety of approaches for analysing genetic diversity in germplasm accessions, breeding lines and populations are currently available. Principal component analysis is a powerful multivariate data-reduction technique that removes interrelationships among components and is helpful in detecting structure in data sets, grouping genotypes and estimation. Breeders also need to concentrate on the traits with significant positive correlations in order to improve crop yield.
None.

  1. Abdi, H. and Williams, L.J. (2010). Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics. 2(4): 433-459.

  2. Arunkumar, B., Biradar, B.D. and Salimath, P.M. (2004). Genetic variability and character association studies in rabi Sorghum. Karnataka J. Agric. Sci. 17(3): 471-475.

  3. Chaudhary, S.P., Sagar, B.K., Hooda. and Arya, R.K. (2015). Multivariate analysis of pearl millet data to delineate genetic variation. Forage Res. 40: 201-208.

  4. Hemavathy, A.T. (2020). Principal component analysis in sweet corn [Zea mays (L.) saccharata.]. Forage Res. 45(4): 264-268.

  5. Hotelling, H. (1936). Relations between two sets of variates. Biometrika. 28(3-4): 321-377.

  6. Kaiser, H.F. (1958). The varimax criterion for analytic rotation in factor analysis. Psychometrika. 23:  187.

  7. Krishna, A., Ahmed, S., Pandey, H.C. and Kumar, V. (2014). Correlation, path and diversity analysis of oat (Avena sativa L.) genotypes for grain and fodder yield. Journal of Plant Science and Research. 1(2): 9. ISSN: 2349-2805.

  8. Maruthi Sankar, G., Narasimha Murthy, D., Vanaja, M., Raghuram Reddy, P. (1999). A multiple selection index for selecting sunflower genotypes using principal component analysis. Indian Journal of Dryland Agricultural Research and Development. 14(2): 93-10.

  9. Peeters, J.P. and Martinelli, J.A. (1989). Hierarchical cluster analysis as a tool to manage variation in germplasm collections. Theor. Appl. Genet. 78: 42-48.

  10. Poonia, A. and Phogat, D.S. (2017).  Genetic divergence in fodder oat (Avena sativa L.) for yield and quality traits. Forage Research. 43: 101-105.

  11. Vaisi, H., Golparvar, A.R., Resaie, A. and Bahraminejad, S. (2013). Factor analysis of some quantitative attributes in oat (Avena sativa L.) genotypes. Scientia Agriculturae. 3(3): 62-65.
