Legume Research

  • Chief Editor J. S. Sandhu

  • Print ISSN 0250-5371

  • Online ISSN 0976-0571

  • NAAS Rating 6.80

  • SJR 0.391

  • Impact Factor 0.8 (2024)


Development and Validation of Soybean [Glycine max (L.) Merrill] Core Sets and Identification of Trait-specific Accessions from the Best Core Set

Kaveri Chawan1, P. Ravishankar1, S. Ramesh1, T. Onkarappa1, H.H. Sowmya1,*
1Department of Genetics and Plant Breeding, College of Agriculture, Gandhi Krishi Vigyana Kendra, University of Agricultural Sciences, Bengaluru-560 065, Karnataka, India.
  • Submitted 27-07-2023

  • Accepted 07-03-2024

  • First Online 14-05-2024

  • doi 10.18805/LR-5215

Background: A core collection of germplasm accelerates progress towards the breeding objectives of a crop. A core set with trait-specific accessions saves time, money and manpower in any crop breeding programme, and the standard stratified clustering (SSC) approach is the most preferred method for constructing a core set.

Methods: During summer 2020-21, the genetic variability of 2000 soybean germplasm accessions was assessed at the University of Agricultural Sciences, Bengaluru. The clustering approach was deployed to develop 8 core sets from the base collection based on 13 qualitative and 7 quantitative traits. The core sets were validated using various univariate and multivariate statistics to assess their representativeness of the base collection. During kharif 2021, 300 germplasm accessions (15% core size) were evaluated at two sites, viz., GKVK, Bengaluru and KVK, Doddaballapur, to identify trait-specific accessions from the best core set.

Result: Eight core sets were developed in the current study using the SSC approach. The core set of 15% size based on logarithmic sampling with preferred allocation was identified as the best representative of the base collection. Many trait-specific accessions from the best core set were found promising for combinations of desirable traits, suggesting their preferential use in breeding programmes.

Soybean [Glycine max (L.) Merrill] is an important pulse and oilseed crop rich in protein (40%) and oil (20%). Over 170,000 soybean germplasm accessions are maintained at various gene banks throughout the world and form the basis of the diversity available in soybean (Carter et al., 2004). Increased use of available genetic resources is required to diversify the genetic base, cater to changing consumer and end-user preferences and address biotic and abiotic stresses emerging as an effect of climate change. The large size of germplasm collections, the possibility of duplicate or redundant accessions and/or repeated sampling of the same accessions are obstacles to effective management and evaluation (Upadhyaya et al., 2001; Jain et al., 2017). The use of only a small portion of the available germplasm within any crop is attributed to the lack of quality evaluation data (Brown, 1989), emphasising the need for a core set. A good core collection includes cultivars, breeding lines, landraces and wild species and should have distinguishing characteristics such as representativeness, low redundancy, manageability, data completeness and usefulness (Van Hintum et al., 2000). Core collections developed in Medicago (Diwan et al., 1995), chickpea (Upadhyaya et al., 2001), pigeonpea (Reddy et al., 2005), cowpea (Mahalakshmi et al., 2007) and rice (Yan et al., 2007) have been effectively utilized in the improvement of the respective crops.

Several methods have been developed and adopted to construct core sets, such as standard stratified clustering (SSC) (Brown, 1989), MSTRAT (Gousenard and Bataillon, 2001), genetic distance sampling (Jansen and Van Hintum, 2007) and Core Hunter (Thachuk et al., 2009). The SSC approach is preferred by most researchers to ensure selection of common alleles (Crossa et al., 1995). Several workers (Minmin et al., 2020; Darai et al., 2020) have identified trait-specific accessions in soybean.

The current investigation was carried out to develop core sets from a large group of 2000 germplasm accessions using the SSC approach; the best set was selected using various univariate and multivariate validation statistics. Trait-specific accessions were then listed as useful information for the scientific community.
A set of 2000 germplasm accessions along with three checks was characterized for 13 qualitative and 7 quantitative traits during summer 2020-21 in an augmented design at the University of Agricultural Sciences, Bengaluru. Each block consisted of 100 germplasm accessions and three checks (replicated twice). Observations on the 13 qualitative traits (scored visually) and 7 quantitative traits were recorded on five randomly selected plants. A standard stratified clustering approach based on quantitative trait data was used to develop the core sets in the present study. The method used in the SSC approach for creating core sets is discussed below:
 
Stratification of the germplasm accessions
 
Accessions were classified into 10 clusters following Ward’s hierarchical clustering algorithm based on adjusted means for 7 quantitative traits. Two sampling strategies, i.e., proportional and logarithmic, and two allocation strategies, viz., random and preferred, were followed at each of two core sizes (10% and 15% of the base collection) to determine the number of accessions to be selected from each cluster for inclusion in core sets. Thus, a total of 8 core sets were developed following the SSC approach.
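
The two sampling strategies differ only in how the core size is divided among clusters. The following is a minimal sketch, assuming the usual definitions in which a cluster's quota is proportional either to its size or to the natural logarithm of its size. The function name `allocate`, the cluster sizes and the largest-remainder rounding rule are illustrative assumptions, not details reported in this study.

```python
import math

def allocate(cluster_sizes, core_size, strategy="proportional"):
    """Return the number of accessions to draw from each cluster.

    proportional: quota_i = core_size * N_i / sum(N)
    logarithmic:  quota_i = core_size * ln(N_i) / sum(ln(N))
    Quotas are not capped at the cluster size in this sketch.
    """
    if strategy == "proportional":
        weights = [float(n) for n in cluster_sizes]
    elif strategy == "logarithmic":
        weights = [math.log(n) for n in cluster_sizes]
    else:
        raise ValueError("strategy must be 'proportional' or 'logarithmic'")

    total = sum(weights)
    raw = [core_size * w / total for w in weights]

    # Largest-remainder rounding (an assumed rule) so that the integer
    # quotas add up exactly to the requested core size.
    quotas = [math.floor(r) for r in raw]
    shortfall = core_size - sum(quotas)
    by_remainder = sorted(range(len(raw)),
                          key=lambda i: raw[i] - quotas[i],
                          reverse=True)
    for i in by_remainder[:shortfall]:
        quotas[i] += 1
    return quotas

# Hypothetical sizes of 10 clusters that together hold 2000 accessions.
sizes = [520, 410, 330, 250, 180, 120, 90, 50, 30, 20]
prop = allocate(sizes, 300, "proportional")
loga = allocate(sizes, 300, "logarithmic")
print(prop)
print(loga)
```

With these made-up sizes, logarithmic sampling assigns more accessions to the small clusters and fewer to the large ones than proportional sampling does, which is why it tends to retain more of the range and variance of the base collection.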
 
Validation of core sets
 
The chi-square statistic was used to test the homogeneity of accessions for qualitative trait-based frequency distribution of base and core collections. Retention of qualitative trait classes by the core collection was determined using ‘Shannon-Weaver diversity index’ (Shannon and Weaver, 1949) and ‘class coverage’ (Kim et al., 2007) statistics. To assess the representativeness of eight core sets, quantitative trait-based validation statistics such as standardized mean difference (SMD %), variance difference (VD %), coincidence rate (CR %) and variable rate (VR %) (Hu et al., 2000) were used.
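
The four quantitative statistics can be illustrated with a short sketch. It assumes one widely used formulation in which each statistic is averaged over traits: mean difference and variance difference are absolute differences relative to the core value, coincidence rate is the ratio of ranges and variable rate is the ratio of coefficients of variation. The exact standardization behind SMD (%) in this study is not stated in the text, so the `MD%` value below is only an approximation of it; the function name and data are hypothetical.

```python
from statistics import mean, variance

def validation_stats(base, core):
    """Compare a core set with its base collection.

    base, core: dict mapping trait name -> list of trait values.
    Returns MD%, VD%, CR% and VR%, each averaged over traits
    (assumed formulation; see the lead-in).
    """
    traits = list(base)
    md = vd = cr = vr = 0.0
    for t in traits:
        b, c = base[t], core[t]
        mean_b, mean_c = mean(b), mean(c)
        var_b, var_c = variance(b), variance(c)
        md += abs(mean_b - mean_c) / mean_c          # mean difference
        vd += abs(var_b - var_c) / var_c             # variance difference
        cr += (max(c) - min(c)) / (max(b) - min(b))  # range retained
        cv_b = var_b ** 0.5 / mean_b
        cv_c = var_c ** 0.5 / mean_c
        vr += cv_c / cv_b                            # CV retained
    m = len(traits)
    return {"MD%": 100 * md / m, "VD%": 100 * vd / m,
            "CR%": 100 * cr / m, "VR%": 100 * vr / m}

# Hypothetical single-trait example: the core keeps both extremes.
base = {"plant_height": [30, 35, 40, 45, 50, 55, 60, 65, 70, 75]}
core = {"plant_height": [30, 50, 55, 75]}
print(validation_stats(base, core))
```

Because the hypothetical core retains the smallest and largest base values, its coincidence rate is 100%; a core that is identical to the base gives MD% = VD% = 0 and CR% = VR% = 100.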
 
Identification of trait-specific accessions from the core set
 
During kharif 2021, 300 (15% core size) soybean germplasm accessions and three check entries were evaluated at two locations to identify the trait-specific accessions from the best core set. The germplasm accessions were classified following a model-based ‘k-means’ clustering approach (MacQueen, 1967) to unravel the organization of variability. Single trait-specific and multiple trait-specific accessions were identified based on early flowering (mean - 1 SD, i.e., ≤56.58 days from sowing), shorter plant height (mean - 1 SD, i.e., ≤36.45), early maturity (mean - 1 SD, i.e., ≤109.71 days from sowing) and higher expression of the other four traits (mean + 2 SD).
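
The threshold rule above can be sketched as follows. This is an illustration only: the function name, the accession labels and the trait values are hypothetical, and the sample standard deviation is assumed because the text does not say which estimator was used.

```python
from statistics import mean, stdev

def select_trait_specific(values, direction, k):
    """Return (cutoff, accessions) for a mean +/- k*SD rule.

    values:    dict mapping accession name -> trait value
    direction: "low"  keeps accessions at or below mean - k*SD
               "high" keeps accessions at or above mean + k*SD
    """
    data = list(values.values())
    mu, sd = mean(data), stdev(data)
    if direction == "low":
        cutoff = mu - k * sd
        chosen = [a for a, v in values.items() if v <= cutoff]
    elif direction == "high":
        cutoff = mu + k * sd
        chosen = [a for a, v in values.items() if v >= cutoff]
    else:
        raise ValueError("direction must be 'low' or 'high'")
    return cutoff, sorted(chosen)

# Hypothetical days-to-flowering data for five accessions.
days_to_flowering = {"ACC-1": 50, "ACC-2": 60, "ACC-3": 60,
                     "ACC-4": 60, "ACC-5": 70}
cutoff, early = select_trait_specific(days_to_flowering, "low", 1)
print(round(cutoff, 2), early)
```

Accessions that pass the rule for more than one trait would then be candidates for the multiple trait-specific list, for example by intersecting the sets returned for each trait.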
Representativeness of core sets
 
The classes of 13 qualitative traits from eight distinct core sets were compared to those of the base collection. Except for proportional and logarithmic sampling with random allocation of sizes 10 percent and 15 percent, the frequency distribution of qualitative traits was comparable to that of the base collection (chi-square was significant for <4 traits). All of the core sets in the current study had H’ estimates that were comparable to the base collection and covered more than 80% of the defined qualitative trait classes, indicating their representativeness of the base collection for qualitative traits (Table 1).
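
For readers unfamiliar with the two qualitative statistics, a minimal sketch is given here. It assumes the common forms H’ = -Σ p_i ln(p_i) over the classes of a trait and class coverage as the mean percentage of base-collection classes that also occur in the core; the natural logarithm, the function names and the example data are assumptions, not values from Table 1.

```python
import math
from collections import Counter

def shannon_index(states):
    """Shannon-Weaver diversity index H' for one qualitative trait."""
    n = len(states)
    counts = Counter(states)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def class_coverage(base, core):
    """Mean percentage of base-collection classes retained in the core.

    base, core: dict mapping trait name -> list of class labels.
    """
    ratios = [len(set(core[t])) / len(set(base[t])) for t in base]
    return 100 * sum(ratios) / len(ratios)

# Hypothetical flower-colour scores.
base = {"flower_colour": ["purple", "white", "purple", "white", "purple"]}
core = {"flower_colour": ["purple", "white"]}
print(shannon_index(base["flower_colour"]))
print(shannon_index(core["flower_colour"]))
print(class_coverage(base, core))
```

A core set whose H’ is close to that of the base collection and whose class coverage is high has kept both the classes and their relative balance.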

Table 1: Summary of validation statistics to identify representative and best core set (s) of soybean germplasm accessions.



All the core sets were comparable to the base collection for quantitative trait means (‘t’ test was significant for ≤4 traits). The SMD (%) of all eight core sets was less than 4, confirming their representativeness for quantitative trait means. The core set of 15% size based on logarithmic sampling with preferred allocation retained higher VD (%), CR (%) and VR (%) than the core sets based on other approaches (Table 1).
 
Comparison of SSC strategies
 
In the current study, the core size of 15% retained higher VD (%), CR (%) and VR (%) than the core size of 10%, indicating greater representativeness of the core sets (Table 2). Logarithmic sampling strategy-based core sets represented the base collection better than proportional sampling strategy-based core sets since they retained higher CR (%), VR (%) and VD (%) (Table 2). Between the two allocation strategies, preferred allocation was superior to random allocation, as evidenced by a lower SMD (%) and higher CR (%), VR (%) and VD (%) (Table 2) (Crossa et al., 1995).

Table 2: Comparison of core sizes, sampling strategies and allocation strategies of developing core sets in soybean.


 
Efficiency of SSC approaches
 
Among the eight representative core sets, the core set of 15 per cent size based on logarithmic sampling with preferred allocation was identified as the best representative of the base collection: it retained higher CR (%), VD (%) and VR (%) for quantitative traits, its H’ estimates were comparable to those of the base collection and its “class coverage” exceeded 80 per cent of the defined qualitative trait classes.
 
Identification of trait-specific accessions from the core set
 
Qualitative traits
 
Plant growth, leaf and floral traits
 
Genotypes with purple hypocotyl (75.91%), purple flowers (53.8%), indeterminate leaf shape (50.47%) and light green leaves (43.89%) dominated the collection (Table 3a).

Table 3a: Variability for plant growth, leaf and floral traits and their frequency in core set of 300 soybean germplasm accessions.


 
Pod and seed traits
 
Accessions bearing pubescent pods (97.36%) with tawny (64.69%) or light tawny (24.75%) pubescence, erect-type pubescence (49.5%) and dark brown pods (38.61%) dominated the collection. Genotypes with yellow seed coats (60.07%) were prominent (Table 3b) (Shruthi et al., 2022).

Table 3b: Variability for pod and seed traits and their frequency in core set of 300 soybean germplasm accessions.


 
Quantitative traits
 
At KVK, Doddaballapur, analysis of variance (ANOVA) revealed highly significant mean squares attributable to germplasm accessions for all quantitative traits except days to 80 per cent maturity, while mean squares due to checks were significant for all traits except pods plant⁻¹ (Table 4). At K-Block, GKVK, University of Agricultural Sciences, Bengaluru, mean squares attributable to germplasm accessions were highly significant for all traits, while mean squares due to checks were significant for all traits except pods plant⁻¹. These results indicated differential performance of accessions and checks at both locations.

Table 4: Analysis of variance of a core set of 300 soybean germplasm accessions for seven quantitative traits.



The accessions were highly variable for pods plant⁻¹, 100-seed weight and seed yield plant⁻¹. Broad-sense heritability was high (>60%) for all the traits. Estimates of expected genetic advance as per cent of mean (GAM) were high for all the traits except days to 80 per cent maturity (10.89%) (Table 5) (Darai et al., 2020; Shruthi et al., 2020).
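
Broad-sense heritability and expected genetic advance as per cent of mean are conventionally derived from variance components. The sketch below assumes the standard relations h² = σ²g / σ²p and GAM = (k × h² × σp / mean) × 100, with the selection differential k = 2.06 corresponding to 5% selection intensity. The value of k, the function name and the variance components are assumptions for illustration and are not taken from Table 5.

```python
def heritability_and_gam(genotypic_var, phenotypic_var, grand_mean, k=2.06):
    """Return (broad-sense heritability %, GAM %).

    k = 2.06 is the standardized selection differential at 5%
    selection intensity (assumed; not stated in the source text).
    """
    h2 = genotypic_var / phenotypic_var
    genetic_advance = k * h2 * phenotypic_var ** 0.5
    gam = 100 * genetic_advance / grand_mean
    return 100 * h2, gam

# Hypothetical variance components for one trait.
h2_pct, gam_pct = heritability_and_gam(genotypic_var=16.0,
                                       phenotypic_var=25.0,
                                       grand_mean=50.0)
print(round(h2_pct, 2), round(gam_pct, 2))
```

A trait can show high heritability yet only moderate GAM when its phenotypic standard deviation is small relative to its mean, which is the usual explanation for maturity-type traits.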

Table 5: Descriptive statistics for seven quantitative traits in a core set of 300 soybean germplasm accessions.


 
Organization of variability among 300 soybean germplasm accessions
 
With the exception of secondary branches plant⁻¹, the quantitative trait mean differences and variances between the ten clusters were significant for all traits (Tables 6 and 7). The estimates of the means of the 7 quantitative traits were highest among the accessions included in clusters X and VII and lowest among the accessions included in cluster IX.

Table 6: Estimates of quantitative traits means of a core set of soybean germplasm accessions belonging to different clusters.



Table 7: Estimates of quantitative traits variances among a core set of soybean germplasm accessions belonging to different clusters.


 
Trait-specific accessions
 
In the present study, some of the germplasm accessions were comparable or superior to the check JS-335 with respect to the seven quantitative traits (Table 8) (Minmin et al., 2020). The accessions listed in Table 9 were promising for combinations of desirable traits.

Table 8: Promising trait-specific accessions in a core set of 300 soybean germplasm.



Table 9: Promising accessions identified for multiple traits in a core set of 300 soybean germplasm.

The soybean core collection developed in this study will provide valuable genetic resources for soybean breeders and researchers to screen germplasm, identify desirable genotypes for economically important traits and address the challenges of climate change. The core collection can also be used in association mapping studies to identify the genes and QTLs linked to numerous economically important traits. The trait-specific accessions from the best core set are suggested for preferential use in crossing programmes to generate variability for developing farmer-acceptable varieties with consumer- and end-user-preferred traits.
All authors declared that there is no conflict of interest.

  1. Brown, A.H.D. (1989). Core collections: A practical approach to genetic resources management. Genome. 31: 818-824.

  2. Carter, T.E., Nelson, R.L., Sneller, C.H. and Cui, Z. (2004). Genetic Diversity in Soybean. In: Soybeans: Improvement, Production and Uses. [Boerma, H., Specht, J. (eds)], American Society of Agronomy, Madison, (Soybean monograph). p. 303-450.

  3. Crossa, J., Delacy, I.H. and Taba, S. (1995). The use of Multivariate Methods in Developing a Core Collection. In: Core Collections of Plant Genetic Resources [Hodgkin, T., Brown, A.H.D., Van Hintum, TH.J.L. and Morales, E.A.V. (eds.)]. John Wiley and Sons, UK. Pp. 77-92.

  4. Darai, R., Dhakal, K.H. and Sah, R.P. (2020). Genetic variability of soybean accessions for yield and yield attributing traits through using multivariate analysis. Int. J. Plant Res. 4(3): 10-16.

  5. Diwan, N., Mcintosh, M.S. and Bauchan, G.R. (1995). Methods of developing a core collection of annual Medicago species. Theor. Appl. Genet. 90: 755-761.

  6. Gousenard, B. and Bataillon, T.M. (2001). MSTRAT: An algorithm for building germplasm core collections by maximizing allelic or phenotypic richness. J. Hered. 92(1): 93-94.

  7. Hu, J., Zhu, J. and Xu, H.M. (2000). Methods of constructing core collections by stepwise clustering with three sampling strategies based on the genotypic values of crops. Theor. Appl. Genet. 101: 264-268.

  8. Jain, R.K., Joshi, A., Chaudhary, H.R., Dashora, A. and Khatik, C.L. (2017). Study on genetic variability, heritability and genetic advance in soybean [Glycine max (L.) Merrill]. Legume Res. 41(4): 532-536. doi: 10.18805/LR-3874.

  9. Jansen, J. and Van Hintum, J.L. (2007). Genetic distance sampling: A novel sampling method for obtaining core collections using genetic distances with an application to cultivated lettuce. Theor. Appl. Genet. 114(3): 421-428.

  10. Kim, K.W., Chung, H.K., Cho, G.T., Ma, K.H., Chandrabalan, D., Gwag, J.G., Kim, T.S., Cho, E.G. and Park, Y.J. (2007). Power core: A program applying the advanced M strategy with a heuristic search for establishing core sets. Bioinformatics. 23: 2155-2162.

  11. MacQueen, J.B. (1967). Some Methods for Classification and Analysis of Multivariate Observations. In: Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, University of California Press. 1: 281-297.

  12. Mahalakshmi, V., Ng, Q., Lawson, M. and Ortiz, R. (2007). Cowpea [Vigna unguiculata (L.) Walp.] core collection defined by geographical, agronomical and botanical descriptors. Plant Genetic Resources: Characterization and Utilization. 5(3): 113-119.

  13. Minmin, L., Ying, L., Chunsheng, W., Xue, Y., Dongmei, L., Xiaoming, Z., Chongjing, X., Yan, Z., Wenbin and Lin, Z. (2020). Identification of traits contributing to high and stable yields in different soybean varieties across three Chinese latitudes. Front. Plant Sci. 10: 1642. https://doi.org/10.3389/fpls.2019.01642.

  14. Reddy, L.J., Upadhyaya, H.D., Gowda, C.L.L. and Singh, S. (2005). Development of core collection in pigeonpea [Cajanus cajan (L.) Millspaugh] using geographic and qualitative morphological descriptors. Genet. Resour. Crop Evol. 52: 1049-1056.

  15. Shannon, C.E. and Weaver, W. (1949). The Mathematical Theory of Communication. University of Illinois Press, Urbana, USA.

  16. Shruthi, K., Siddaraju, R., Naveena, K., Ramanappa, T.M. and Vishwanath, K. (2020). Assessment of variability based on morphometric characteristics in the core set of soybean germplasm accessions. Legume Res. 44(4): 375-381. doi: 10.18805/LR-4286.

  17. Shruthi, K., Siddaraju, R., Naveena, K., Ramanappa, T.M., Gireesh, C., Vishwanath, K. and Nagaraju, K.S. (2022). Trait based modelling approach for selection of elite germplasm accessions in soybean [Glycine max (L.) Merrill]. Legume Res. 45(7): 822-827. doi: 10.18805/LR-4567.

  18. Thachuk, C., Crossa, J., Franco, J., Dreisigacker, S., Warburton, M. and Davenport, G.F. (2009). Core hunter: An algorithm for sampling genetic resources based on multiple genetic measures. BMC Bioinformatics. 10: 243. doi: 10.1186/1471-2105-10-243.

  19. Upadhyaya, H.D., Bramel, P.J. and Singh, S. (2001). Development of a chickpea core subset using geographic distribution and quantitative traits. Crop Sci. 41: 206-210.

  20. Van Hintum, T.H.J.L., Brown, A.H.D., Spillane, C. and Hodgkin, T. (2000). Core collections of plant genetic resources. IPGRI Technical Bulletin 3. International Plant Genetic Resources Institute, Rome, Italy.

  21. Yan, W., Rutger, J.N., Bryant, R.J., Bockelman, H.E., Fjellstrom, R.G., Chen, M.H., Tai, T.H. and Mcclung, A.M. (2007). Development and evaluation of a core subset of the USDA rice germplasm collection. Crop Sci. 47: 869-878.
