Overview of the Dolang sheep ovarian transcriptome
The objective of this study was to identify the changes in gene expression in Dolang sheep ovaries during the estrus and anestrus (10 days after estrus) periods and at ~45 days of pregnancy. Toward this end we carried out high-throughput sequencing using the Illumina HiSeq 2000 platform, which generated ~32.99 million raw reads and ~31.69 million clean reads across all samples (Table 1). Subsequently, the clean reads from each sample were aligned to the
Ovis aries genome sequence, resulting in 76.45% to 77.95% of clean reads with no more than a 3 bp mismatch.
To profile ovarian gene expression, we counted the number of clean reads that aligned to the
Ovis aries gene sequences and performed a normalization step using the fragments per kilobase of transcript per million reads mapped (FPKM) method. After filtering for weakly expressed genes (<5 RPKM), we identified a total of 28,717 genes across all samples. Of these, 586 genes showed differential expression. A total of 3 genes were commonly expressed across all samples, and 192, 247 and 18 genes were expressed exclusively during the estrus, anestrus, and pregnancy periods, respectively (Fig 1). Upon using a significance level cutoff of p<0.05, we identified 242 genes that were upregulated and 66 that were downregulated during the estrus period relative to their expression level in the anestrus period. Similarly, 22 genes were upregulated and 36 were downregulated during the anestrus period relative to their expression level in the pregnancy period. Further, 81 genes were upregulated and 271 were downregulated during the estrus period relative to their expression level in the pregnancy period (Fig 2). Of these, 36 DEGs were potential transcription factors and 15 candidates remained after removing duplicates (XM_101121392, bHLH; XM_101120004, zf-C2H2; XM_101120262, Homeobox; XM_101107885, HMG; XM_101113274, Transcription_Cofactors,
etc.) We identified 347,659, 312,750, and 287,191 single nucleotide polymorphisms (SNPs) in ovary samples from estrus, anestrus and gestating sheep, respectively. The direct sequencing of transcriptomes coupled with downstream analysis is an effective method to discover new or important functional genes
(Qian et al., 2014; Wickramasinghe et al., 2014; Fan et al., 2015) and has been applied to various organisms
(Gui et al., 2013; Kordonowy et al., 2017; Ruan et al., 2015; Tian et al., 2019). Usually, only a few common genes or DEGs are identified, although the absolute numbers may be changed by using a wider grouping or different parameter settings. However, these genes can provide valuable information for further identification of genes related to the regulation of fertilization, ovulation, estrus, immunity, or disease in model organisms.
GO and KEGG pathway analysis
In the GO analysis, 16,557 of the identified genes were classified as belonging to the ontology component, with a p≤ 1. Of these, 79.5% clustered in the cell and cellular component class and 73.2% clustered with genes related to intracellular components. Additionally, we found that 15,895 genes clustered in the molecular function class and the most enriched molecular function was molecular binding, with 12,482 annotated genes (78.5% of 15 895 genes), followed by catalytic activity (38.5% of genes). From the biological processes class, we identified 15,488 genes related to cellular process, metabolic process and primary metabolic process, representing ~77.10%, ~52.8% and ~43.8% of the genes in these groups, respectively, while ~7.3% and ~4.0% genes were involved in reproduction and embryo development processes, respectively. Partial GO annotations are shown in Fig 3.
The KEGG pathway database is a powerful tool for analyzing gene and protein function in regulation networks
(Kanehisa et al., 2012). To identify the biological pathways connected to the genes expressed in the ovaries at different physiological stages, all functional genes were submitted for a KEGG pathway enrichment analysis. A total of 9731 genes were assigned to 298 KEGG pathways (Fig 4) and clustered into 30 main categories, including metabolic pathways (575 genes), endocytosis (139 genes), pathways in cancer (136 genes), the PI3K-Akt signaling pathway (120 genes), Huntington’s disease (104 genes), focal adhesion (104 genes), spliceosome (104 genes) and RNA transport (101 genes). Among the 30 enriched categories, the metabolism pathway had the largest number of represented genes. The metabolic pathways with DEGs included genes for energy metabolism, carbohydrate metabolism, lipid metabolism, intracellular respiratory metabolism, and amino acid metabolism. Of these, DEGs enriched for the lipid metabolism and amino acid metabolism processes were associated with the synthesis of steroid hormones. Thus, the results of the GO and KEGG analysis provided a valuable resource for investigating the high prolificacy and year-round estrus processes.
Bioinformatics analysis of the DQA, DQB and LOC101106374 genes
We identified the
DQA,
DQB and LOC101106374 genes as being upregulated during the gestation period. GO and KEGG analysis indicated that these genes may be involved in the improvement of resistance to diseases. This finding indicated that these genes play a role in maintaining the sheep’s health during pregnancy and possibly help toward preventing abortion, brucellosis, toxoplasmosis and globidiosis.
The 903 bp
DQA gene is located on chromosome 20. In the KEGG pathway analysis, this gene was linked to various roles such as in the production of cell adhesion molecules (CAMs; ID:oas04514), graft-versus-host disease (ID:oas05332), viral myocarditis (ID:oas05416), intestinal immune network for IgA production (ID:oas04672), asthma (ID:oas05310), autoimmune thyroid disease (ID:oas05320), inflammatory bowel disease (IBD; ID:oas05321), leishmaniasis (ID:oas05140), antigen processing and presentation (ID:oas04612),
Staphylococcus aureus infection (ID:oas05150), phagosome (ID:oas04145), influenza A (ID:oas05164), herpes simplex infection (ID:oas05168), systemic lupus erythematosus (ID:oas05322) and tuberculosis (ID:oas05152). A GO functional analysis revealed that
DQA is associated with antigen processing and presentation (GO:0019882), the MHC protein complex (GO:0042611), the MHC class II protein complex (GO:0042613), immune system process (GO:0002376), plasma membrane protein complex (GO:0098797), immune response (GO:0006955), cell periphery (GO:0071944), protein binding (GO:0005515), protein complex (GO:0043234), membrane protein complex (GO:0098796), binding (GO:0005488), response to stimulus (GO:0050896), membrane (GO:0016020), cell (GO:0005623) and other functions. The DQA protein binds to specific epitopes and this binding is important for triggering the immune system to attack virus-infected cells through proteins
(Juul-Madsen et al., 2012; Fan et al., 2019; Xie et al., 2019).
The 558 bp
DQB gene is located on chromosome 20. In the KEGG pathway analysis, this gene was associated with several functions, including in the production of CAMs (ID:oas04514), graft-versus-host disease (ID:oas05332), allograft rejection (ID:oas05330), viral myocarditis (ID:oas05416), intestinal immune network for IgA production (ID:oas04672), asthma (ID:oas05310), autoimmune thyroid disease (ID:oas05320), inflammatory bowel disease (IBD) (ID:oas05321), leishmaniasis (ID:oas05140), antigen processing and presentation (ID:oas04612),
Staphylococcus aureus infection (ID:oas05150), toxoplasmosis (ID:oas05145), phagosome (ID:oas04145), influenza A (ID:oas05164), herpes simplex infection (ID:oas05168), systemic lupus erythematosus (ID:oas05322) and tuberculosis (ID:oas05152). A GO functional analysis indicated that
DQB is associated with processes for protein binding (GO:0005515), binding (GO:0005488) and molecular function (GO:0003674). In a study by
Diana et al., (2016), a 172 bp fragment of exon 2 of the MHC Class II genes at the
DQB locus was genotyped for 80 blue whales and 22 putatively functional
DQB allotypes were identified. These results indicate that the immunogenic variation in blue whales is comparable to that in terrestrial mammals.
Additionally, the
HLA-DQB2 gene may contribute to genetic susceptibility to tuberculosis
(Wang et al., 2018) and recurrent aphthous stomatitis
(Najafi et al., 2016).
The 792 bp LOC101106374 gene is located on chromosome 20. In the KEGG pathway analysis, this gene was associated with various functions, including in the production of CAMs (ID:oas04514), antigen processing and presentation (ID:oas04612), type I diabetes mellitus (ID: oas04940), autoimmune thyroid disease (ID:oas05320), viral myocarditis (ID:oas05416), graft-versus-host disease (ID:oas05332), allograft rejection (ID:oas05330), phagosome (ID:oas04145), herpes simplex infection (ID:oas05168), HTLV-I infection (ID:oas05166) and endocytosis (ID:oas04144). A GO functional analysis indicated that
LOC101106374 is associated with the immune system process (GO:0002376), immune response (GO:0006955), binding (GO:0005488), molecular function (GO:0003674), biological_process (GO:0008150), response to stimulus (GO:0050896) and protein binding (GO:0005515). There were no reports in the literature for LOC101106374 function, but the gene was found to be related to improving immunity.
Collectively, these findings indicate that
DQA,
DQB and LOC101106374 are related to immunity or disease processes. Because all three genes were significantly upregulated in the pregnant sheep in this study, it can be concluded that these genes are involved in maintaining a high disease resistance state during gestation in sheep. Moreover, based on these and previous findings,
DQA,
DQB and LOC101106374 genes can be considered potential biomarkers for monitoring disease progression, brucellosis, toxoplasmosis, globidiosis and abortion risk during pregnancy in sheep.