Experimental materials
Twelve pig breeds were included in this experiment: Duroc, Landrace, Large White, Sujiang, Banna, Erhualian, Wuzhishan, Bama, Tibetan, Meishan and Fengjing pigs, and wild boars. Ear tissues were collected in parallel with routine agricultural procedures (i.e., ear tagging). All collected samples were stored at -80°C, and detailed sample information is shown in Table S1. The experiment was carried out from September 2020 to August 2021 in a laboratory of the College of Animal Science and Technology, Yangzhou University.
Primer design and synthesis
Primers were designed with Oligo7 software according to the scheme in Fig 1A. For each SINE insertion site, primers were placed in the flanking regions on both sides of the insertion position. Homozygous SINE insertion (+/+) individuals yield a single larger band, homozygous SINE deletion (-/-) individuals yield a single smaller band, and heterozygotes (+/-) yield both bands (Fig 1B). The LINE and endogenous retrovirus (ERV) insertions involved in this study are shorter than 2000 bp, so their primers were designed in the same way as for SINE insertion detection. Primers were synthesized by Nanjing Kinco Biotech Co., Ltd (Nanjing, China), and their coordinates in the reference genome, annealing temperatures and sequences are provided in Table S2. The PCR (Kaushik et al., 2017) amplification time was determined according to the target fragment size.
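The band-pattern rule described above can be sketched in Python. This is only an illustration: the function name `call_genotype`, the expected band sizes and the `tolerance` parameter are assumptions introduced here, not part of the original protocol, and the two expected sizes are assumed to differ by more than twice the tolerance.

```python
def call_genotype(band_sizes, insertion_size, deletion_size, tolerance=30):
    """Call a RIP genotype from the band sizes (bp) observed on a gel.

    insertion_size / deletion_size are the expected amplicon lengths with
    and without the retrotransposon. The tolerance (bp) is a hypothetical
    allowance for imprecise size estimation on a gel.
    """
    has_insertion = any(abs(b - insertion_size) <= tolerance for b in band_sizes)
    has_deletion = any(abs(b - deletion_size) <= tolerance for b in band_sizes)
    if has_insertion and has_deletion:
        return "+/-"   # two bands: heterozygote
    if has_insertion:
        return "+/+"   # only the larger band: homozygous insertion
    if has_deletion:
        return "-/-"   # only the smaller band: homozygous deletion
    return None        # no expected band: failed or non-specific PCR
```

For example, with expected amplicons of 600 bp (insertion) and 300 bp (no insertion), a lane showing bands near both sizes would be scored as a heterozygote.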
RIP annotation of porcine IGF2BP genes
Sequence acquisition of porcine IGF2BP genes and their flanks
The Ensembl website (http://asia.ensembl.org/index.html) was used to obtain the sequences of the three Duroc pig IGF2BP genes as reference sequences (IGF2BP1: ENSSSCG0000002312; IGF2BP2: ENSSSCG00000011795; IGF2BP3: ENSSSCG00000036695), and each sequence was extended by 5000 bp into the 5' flanking region and by 3000 bp into the 3' flanking region. NCBI BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastn&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome) was used to align the reference sequences against the pig non-reference genomes in the WGS database and to obtain the sequence fragments of the three IGF2BP genes from each non-reference genome. Finally, the fragments were assembled according to the reference sequence to obtain a complete sequence of each gene from each genome.
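The flank extension can be sketched as a small coordinate calculation. The sketch below assumes 1-based inclusive coordinates and that the 5' and 3' flanks are defined relative to the gene's strand; the function name `flanked_interval` and its parameters are hypothetical and used only for illustration.

```python
def flanked_interval(start, end, strand, up=5000, down=3000, chrom_len=None):
    """Extend a gene interval by `up` bp on its 5' side and `down` bp on its 3' side.

    Coordinates are assumed to be 1-based and inclusive. For a gene on the
    minus strand, the 5' flank lies at the higher genomic coordinates.
    """
    if strand == "+":
        new_start, new_end = start - up, end + down
    elif strand == "-":
        new_start, new_end = start - down, end + up
    else:
        raise ValueError("strand must be '+' or '-'")
    # Clip to the chromosome boundaries.
    new_start = max(1, new_start)
    if chrom_len is not None:
        new_end = min(chrom_len, new_end)
    return new_start, new_end
```

A gene annotated at 10000-20000 on the plus strand would be retrieved as 5000-23000; the same gene on the minus strand would be retrieved as 7000-25000.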
Retrotransposon annotation
RepeatMasker (Caballero et al., 2014) (version 4.0.7, -cutoff 250 -nolow) was used to annotate retrotransposons in the gene sequences (including flanking regions) obtained from all genomes, and only annotations with an alignment score above 1000 and a length above 100 bp were retained. Multiple sequence alignment of each gene across the 16 genomes was performed with ClustalX (version 2.0), and structural variations (longer than 50 bp) were then counted manually. All counts were recorded, but they may not be exact because of sequencing uncertainty: regions containing gaps or long runs of N were not counted. Structural variations that overlapped with retrotransposons over more than 80% of their length were identified as predicted retrotransposon insertion polymorphic sites (RIPs) and were further verified by PCR.
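The filtering and overlap criteria above can be sketched as follows. This is an illustration under stated assumptions: intervals are 0-based and half-open, the 80% overlap is measured against a single annotated element as a fraction of the structural variation's length, and all names (`Interval`, `keep_annotation`, `overlap_fraction`, `is_predicted_rip`) are hypothetical. In the study itself the structural variations were counted manually.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def keep_annotation(score, start, end, min_score=1000, min_len=100):
    """Retain annotations with score above 1000 and length above 100 bp."""
    return score > min_score and (end - start) > min_len


def overlap_fraction(sv, te):
    """Fraction of the structural variation covered by one retrotransposon."""
    sv_len = sv.end - sv.start
    if sv_len <= 0:
        raise ValueError("structural variation must have positive length")
    overlap = min(sv.end, te.end) - max(sv.start, te.start)
    return max(0, overlap) / sv_len


def is_predicted_rip(sv, retrotransposons, min_sv_len=50, min_frac=0.8):
    """Apply the >50 bp and >80% overlap criteria to one structural variation."""
    if sv.end - sv.start <= min_sv_len:
        return False
    return any(overlap_fraction(sv, te) > min_frac for te in retrotransposons)
```

A 300 bp structural variation at 1000-1300 that is covered by an element annotated at 990-1290 overlaps over 290 bp (about 97% of its length) and would therefore be flagged as a predicted RIP.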
Verification of the predicted RIPs in porcine IGF2BP genes
DNA was extracted from the 12 breeds described above using the Tiangen Genome Extraction Kit (DP304), strictly following the manufacturer's instructions, and two DNA pools were prepared for each breed. Each pool consisted of equal amounts of DNA from three individuals, and the final concentration was adjusted to 40 ng/mL. Using the pooled DNA as template, PCR amplification was performed to verify each predicted RIP (Zheng et al., 2020) (Fig 2).
Conservation analysis of porcine IGF2BP genes
The FASTA files of the corresponding gene sequences in 7 different species (cow, sheep, dog, horse, human, mouse; see Table S2 for detailed species information) and the annotation information of each porcine IGF2BP gene were downloaded from Ensembl. Conservation analysis was then performed with the online program mVISTA (http://genome.lbl.gov/vista/mvista/submit.shtml). The pig gene sequence was used as the reference sequence for the conservation analysis, and the resulting conservation peak map is shown in Fig S2.
Population genetic analysis
Information on the six pig breeds sampled is shown in Table S1.
HWE and polymorphic information content (PIC) analysis
Genotype and allele frequencies were calculated, and Hardy-Weinberg equilibrium (HWE) (Jadhav et al., 2020) was tested using the chi-square test in Popgene32 software (Yeh et al., 1999). PIC was calculated according to the formula:
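A minimal sketch of these two calculations for a biallelic RIP is given here. It assumes the widely used definition PIC = 1 - Σ p_i^2 - Σ_{i<j} 2 p_i^2 p_j^2, where p_i is the frequency of allele i, and a plain Pearson chi-square test on genotype counts. The function names are illustrative, and the code is not the Popgene32 implementation, which may apply corrections not shown here.

```python
def allele_freqs(n_ins_ins, n_ins_del, n_del_del):
    """Frequencies of the insertion (+) and deletion (-) alleles from genotype counts."""
    n = n_ins_ins + n_ins_del + n_del_del
    p = (2 * n_ins_ins + n_ins_del) / (2 * n)
    return p, 1 - p


def hwe_chi_square(n_ins_ins, n_ins_del, n_del_del):
    """Pearson chi-square statistic for deviation from Hardy-Weinberg equilibrium.

    With two alleles the test has 1 degree of freedom, so a statistic
    above 3.841 corresponds to P < 0.05.
    """
    n = n_ins_ins + n_ins_del + n_del_del
    p, q = allele_freqs(n_ins_ins, n_ins_del, n_del_del)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_ins_ins, n_ins_del, n_del_del)
    # Skip genotype classes with zero expectation (monomorphic locus).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def pic(freqs):
    """Polymorphic information content under the assumed standard definition."""
    homozygosity = sum(p * p for p in freqs)
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1 - homozygosity - cross
```

For a biallelic site the PIC under this definition peaks at 0.375 when both alleles have a frequency of 0.5.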
Linkage disequilibrium analysis of the four RIPs in GHR genes was performed with Haploview (Barrett et al., 2005).