Soybean protein and fat content in F2 population
Protein and fat content differed greatly between the parents and, in the F2 population, showed a wide, near-normal frequency distribution (Fig 1), as is typical of quantitatively inherited traits. Protein content in the F2 population ranged from 37.29% to 44.50%, with a mean of 40.90%; the values segregated around the mid-parent value and were biased towards the female parent. Fat content in the F2 population ranged from 17.31% to 23.34%, with a mean of 19.73% (Table 1); it likewise segregated around the mid-parent value and was biased towards the female parent. Transgressive segregants (individual plants exceeding the parental values) were detected among the offspring for both protein and fat content.
Fig 1: Histogram of soybean protein and fat content in F2. (A) Frequency distribution of soybean protein content. (B) Frequency distribution of soybean fat content.
Table 1: Distribution of soybean protein and fat content in parent plants and F2.
Genetic analysis of soybean protein and fat content
Using the major gene plus polygene genetic model, the maximum likelihood values and Akaike information criterion (AIC) values for protein and fat content under different genetic models were obtained with the IECM algorithm. Following the minimum-AIC principle, models C and E-2 were preliminarily selected as candidate models for protein content, and models C and E-1 as candidate models for fat content. Goodness-of-fit tests then showed that model C (the polygenic genetic model) was the most suitable model for both protein and fat content (Supplementary Tables 1 and 2), with polygenic heritabilities of 71.15% and 79.15%, respectively.
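The minimum-AIC shortlisting step can be illustrated with a minimal sketch. It assumes the standard definition AIC = -2 ln L + 2k, where L is the maximum likelihood and k the number of estimated parameters. The function names, model names and numeric values below are hypothetical placeholders, not the values obtained in this study.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: AIC = -2 ln L + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_params


def shortlist_by_aic(fits: dict[str, tuple[float, int]], keep: int = 2) -> list[str]:
    """Return the `keep` model names with the smallest AIC, best first."""
    ranked = sorted(fits, key=lambda name: aic(*fits[name]))
    return ranked[:keep]


# Placeholder fits: model name -> (maximum log-likelihood, number of parameters).
# These numbers are invented for illustration and are NOT the study's values.
example_fits = {
    "model_1": (-520.0, 4),
    "model_2": (-505.0, 3),
    "model_3": (-503.5, 6),
}

print(shortlist_by_aic(example_fits))
```

The shortlisted models would then be compared with goodness-of-fit tests, since the model with the smallest AIC is not guaranteed to pass them.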
Supplementary Table 1: Suitability test for protein content genetic models.
Supplementary Table 2: Suitability test for fat content genetic models.
QTL mapping of soybean protein and fat content
In this study, 380 pairs of SSR primers were screened for polymorphism between the parents. Of these, 118 pairs were polymorphic, giving a polymorphism rate of 31.05%. SSR-PCR was carried out on 236 individual plants of the F2 segregating population using the polymorphic primers, and an SSR linkage map containing 102 markers was constructed. Fourteen QTLs related to protein content were detected, distributed across six linkage groups: 4 (C2), 6 (A1), 12 (G), 13 (C1), 17 (M) and 22 (F). Four of these QTLs each explained more than 20% of the phenotypic variation, and one was stable across two years. Ten QTLs related to fat content were detected, distributed across five linkage groups: 1 (A1), 4 (C2), 12 (G), 17 (M) and 22 (F). Among them, one QTL that was stable over three consecutive years was detected in the Sat_287~Sat_342 marker interval of linkage group 12 (G), and one QTL that was stable over two consecutive years was detected in the Satt100~Sat_238 marker interval of linkage group 4 (C2). Additionally, three major QTLs related to soybean fat content were detected (Supplementary Table 3 and Fig 2).
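The reported polymorphism rate follows directly from the primer counts given above. The short sketch below reproduces that arithmetic; the function name is a hypothetical helper introduced only for illustration.

```python
def polymorphism_rate(polymorphic: int, screened: int) -> float:
    """Percentage of screened primer pairs that are polymorphic between the parents."""
    if screened <= 0:
        raise ValueError("screened must be a positive count")
    return 100.0 * polymorphic / screened


# Counts taken from the text: 118 polymorphic pairs out of 380 screened.
rate = polymorphism_rate(118, 380)
print(f"{rate:.2f}%")  # prints "31.05%"
```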
Supplementary Table 3: QTLs of soybean protein and fat content detected by ICIM-ADD.
Fig 2: Location of additivity effect QTLs on linkage groups.
Note: Red triangles represent QTL of protein content in F2 generation. Blue triangles represent QTL of protein content in F2:3 families. Purple triangles represent QTL of protein content in F3:4 families. Red diamonds represent QTL of fat content in F2 generation. Blue diamonds represent QTL of fat content in F2:3 families. Purple diamonds represent QTL of fat content in F3:4 families.
Marker-assisted selection of soybean protein and fat content
Six stable SSR markers related to soybean protein and fat content (Satt100, Sat_287, Satt150, Satt636, Sat_238 and Sat_342) were used to screen 108 soybean materials from the Biotechnology Center of Jilin Agricultural University for high fat and high protein content. For the high-oil materials, the detection coincidence degree of the markers was, from high to low: Sat_238 (95.12%) > Satt100 (87.80%) > Satt150 (75.61%) > Satt636 (71.95%) > Sat_287 = Sat_342 (68.29%). For the high-protein materials, the detection coincidence degree was, from high to low: Satt150 = Sat_342 (84.00%) > Satt100 (77.33%) > Satt636 (70.67%) > Sat_287 (52.00%) > Sat_238 (50.33%) (Table 2 and Fig 3).
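The two rankings can be reproduced from the reported percentages with a short sketch. The dictionaries hold the values quoted above; the function names are hypothetical helpers, and breaking ties alphabetically by marker name is an assumption made only to keep the ordering deterministic.

```python
# Detection coincidence degree (%) per marker, as reported in the text.
HIGH_OIL = {
    "Sat_238": 95.12, "Satt100": 87.80, "Satt150": 75.61,
    "Satt636": 71.95, "Sat_287": 68.29, "Sat_342": 68.29,
}
HIGH_PROTEIN = {
    "Satt150": 84.00, "Sat_342": 84.00, "Satt100": 77.33,
    "Satt636": 70.67, "Sat_287": 52.00, "Sat_238": 50.33,
}


def rank_markers(coincidence: dict[str, float]) -> list[str]:
    """Order markers from highest to lowest coincidence; ties sorted by name."""
    return sorted(coincidence, key=lambda marker: (-coincidence[marker], marker))


def markers_above(coincidence: dict[str, float], threshold: float) -> list[str]:
    """Markers whose coincidence degree is strictly above `threshold` percent."""
    return [m for m in rank_markers(coincidence) if coincidence[m] > threshold]


print(rank_markers(HIGH_OIL))
print(rank_markers(HIGH_PROTEIN))
print(markers_above(HIGH_OIL, 90.0))
```

Filtering with a threshold in this way makes it easy to see which markers meet a chosen cut-off for each trait.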
Table 2: High fat marker selection screening in 108 soybean materials using six SSR markers.
Fig 3: Electrophoresis using Sat_342 and Satt150 primers of soybean materials.
Note: M: Marker; 1: Jiyu 50; 2: Jinong 18; 3–20: selected soybean materials.
Comparison of major gene plus polygene model analysis with QTL mapping for soybean main quality traits
Genetic model analysis and molecular marker (QTL) analysis can both be applied to quantitative traits, and the results of the two analyses can be used to confirm each other. The inheritance of protein content was shown to follow a polygene genetic model according to the results of the fifth generation (P1) populations. It should therefore be possible to locate several QTLs with LODs of similar size. Consistent with this, QTL mapping in the soybean F2 population detected nine minor QTLs with similar LODs (each explaining <10% of the phenotypic variation) and only one major QTL (explaining >10% of the phenotypic variation) for protein content. The inheritance of fat content was also shown to follow a polygene genetic model according to the results of the fifth generation (P1, P2, F2) populations. In the F2 population, seven minor QTLs with similar LODs and only two major QTLs were detected. In general, the results of the model analysis are similar to those of QTL mapping. Our findings are also consistent with those reported by Xu (2006) and Wang (2001), but differ from those reported by Zheng et al. (2007). This may be because segregation analysis can detect only those genes that show strong effects in QTL mapping, while the remaining genes are classified as minor polygenes. Thus, the number of major genes detected by QTL mapping usually exceeds the number detected by model analysis, which is consistent with the findings of Wang (2000) and Xu (2006). Additionally, because the population used in our study was an early-generation F2 population after hybridization rather than a stable RIL population, fewer genetic parameters were available than for an RIL population, and the F2 data used for statistical analysis were not averaged values. The experimental results were therefore affected by the environment and should be verified by extending the parental analysis.
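The distinction drawn above between minor QTLs (<10% of phenotypic variation explained) and major QTLs (>10%) amounts to a simple threshold rule, sketched below. The function names and the example values are hypothetical placeholders rather than the QTLs in Supplementary Table 3, and treating a value of exactly 10% as major is an assumption, since the text does not define that boundary case.

```python
def classify_qtl(pve_percent: float, threshold: float = 10.0) -> str:
    """Label a QTL by the phenotypic variation it explains (PVE, in percent).

    Assumption: a PVE exactly at the threshold is counted as major.
    """
    return "major" if pve_percent >= threshold else "minor"


def count_classes(pve_values: list[float], threshold: float = 10.0) -> dict[str, int]:
    """Count major and minor QTLs in a list of PVE values."""
    counts = {"major": 0, "minor": 0}
    for pve in pve_values:
        counts[classify_qtl(pve, threshold)] += 1
    return counts


# Placeholder PVE values, invented for illustration only.
example_pve = [3.2, 4.8, 5.1, 6.0, 12.4]
print(count_classes(example_pve))
```

A pattern of many minor QTLs and few major ones, as in this placeholder list, is the kind of result that agrees with a polygenic model.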
Marker-assisted selection of soybean main quality traits
Using the six SSR markers that were mapped and stable in relation to soybean protein and fat content, 108 soybean germplasm resources were screened for high oil and high protein content. The detection coincidence degree of all six SSR markers exceeded 50%, with a maximum of 95.12%. We found that the Satt100 marker was closely linked to fat content and had been mapped repeatedly (more than twice), which is consistent with previous studies (Hou et al., 2014).
The marker related to fat content was stable under different genetic backgrounds, indicating that it could be used in marker-assisted selection breeding for high oil and protein content in soybean. The detection coincidence degree was high (>90%) for Sat_238 but lower (<90%) for the other five markers (Satt100, Satt150, Sat_342, Satt636 and Sat_287). This could be because protein and fat content are quantitative traits controlled by multiple genes. Future work should therefore further investigate markers that are closely linked with protein and fat content (Yang et al., 2008).
Additionally, although we identified QTLs that were stable across generations, we did not verify their stability under different environmental conditions. Further testing and verification are under way to confirm the selection effect of these molecular markers.