Genome features
The mt genome of G. spengleri was 17,448 bp long and conformed to the consensus vertebrate mitochondrial form (Fig 2). It consisted of 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and one control region (D-loop), all of which were similar in length to their counterparts in other turtles (Supplementary Table 1). Noncoding intergenic spacers were few and mostly short, apart from a 26 bp intervening sequence between the tRNAAsn and tRNACys genes and a 16 bp intervening sequence between the ND4 and tRNAHis genes. Three pairs of genes partially overlapped: ATP8-ATP6, ND4L-ND4 and ND5-ND6 shared 22, 7 and 5 nucleotides, respectively (Table 1).
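Spacer and overlap sizes of this kind follow directly from annotated gene coordinates. The sketch below is illustrative only: the `Gene` tuple, the `junctions` function and every coordinate are hypothetical, chosen so that two junctions reproduce the reported 22 nt overlap and 26 bp spacer. Strand is ignored, and real coordinates would have to come from the annotation summarized in Table 1.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def junctions(genes):
    """Return (left, right, gap) for each adjacent pair.

    gap > 0 is an intergenic spacer in bp; gap < 0 is an overlap in nt.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    return [(a.name, b.name, b.start - a.end - 1)
            for a, b in zip(ordered, ordered[1:])]


# Hypothetical coordinates, not taken from the G. spengleri annotation
toy = [
    Gene("ATP8", 100, 267),
    Gene("ATP6", 246, 928),
    Gene("tRNA-Asn", 2000, 2072),
    Gene("tRNA-Cys", 2099, 2164),
]

result = junctions(toy)
for left, right, gap in result:
    kind = "overlap" if gap < 0 else "spacer"
    print(f"{left}/{right}: {kind} of {abs(gap)}")
```

A negative gap counts the positions shared by both genes, so ATP8 ending at 267 and ATP6 starting at 246 share positions 246 to 267, that is, 22 nucleotides.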
In total, the genome encodes 3,775 amino acids. The most frequently used amino acids were Leu (16.2%), Thr (9.2%), Ile (8.1%) and Ser (7.5%), while the least common were Cys (0.9%), Asp (1.8%), Arg (1.8%) and Lys (2.3%) (Fig 1A). Relative synonymous codon usage (RSCU) values for the third codon positions of the 13 protein-coding genes (PCGs) are shown in Fig 1B. The usage of both two-fold and four-fold degenerate codons was biased toward codons rich in T or A, in accord with other turtles.
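For readers unfamiliar with RSCU, the value for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The following is a minimal sketch under stated assumptions, not the procedure used in this study: it assumes NCBI translation table 2 (vertebrate mitochondrial), groups all codons of an amino acid into one family (so Leu and Ser are treated as six-fold families), and uses a short hypothetical sequence. The function name `rscu` and the toy sequence are illustrative.

```python
from collections import Counter, defaultdict
from itertools import product

# NCBI translation table 2 (vertebrate mitochondrial); codons in TCAG order
AMINO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product("TCAG", repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for one in-frame coding sequence (A/C/G/T)."""
    usable = len(cds) - len(cds) % 3  # ignore an incomplete final codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if CODE.get(c, "*") != "*")

    # Group sense codons by the amino acid they encode
    family = defaultdict(list)
    for codon, aa in CODE.items():
        if aa != "*":
            family[aa].append(codon)

    out = {}
    for syn in family.values():
        total = sum(counts[c] for c in syn)
        if total:  # skip amino acids absent from the sequence
            for c in syn:
                out[c] = counts[c] * len(syn) / total
    return out


toy = "ATGCTACTACTTCTAACCACT"  # hypothetical sequence, 7 codons
values = rscu(toy)
for codon in ("ATG", "CTA", "CTT", "ACC", "ACT"):
    print(codon, values[codon])
```

An RSCU above 1 marks a codon used more often than expected under equal usage, and the values within one family sum to the number of synonymous codons in that family.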
The nucleotide compositions of the 18 mitochondrial genomes are shown in Table 2. The base composition of the major coding strand of G. spengleri mtDNA was A, 33.7%; G, 13.1%; C, 25.6%; and T, 27.6%, showing a high A+T content and a low G content, as in most other vertebrates. Strand asymmetry of nucleotide composition is usually described by AT and GC skews. The mitochondrial genome AT skews of these chelonian species ranged from 0.098 to 0.136, while the GC skews were all negative, ranging from -0.375 to -0.320. Most of the G. spengleri protein-coding genes started with ATG, whereas ND2 started with ATA, COI with GTG and ND6 with CCT. Six protein-coding genes terminated with TAA, three with TAG, two with CAT and one with AGG; the remainder had an incomplete stop codon consisting of a single T (Table 1), which post-transcriptional polyadenylation could complete to a TAA stop codon. In addition, we found an extra “G” base at a specific position in the ND3 gene of G. spengleri.
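The skew values can be checked from the base composition given above. The sketch below assumes the commonly used definitions AT skew = (A - T)/(A + T) and GC skew = (G - C)/(G + C); the text does not state which formulas were applied, so this is an assumption, and the function name `skews` is illustrative.

```python
def skews(a, c, g, t):
    """Return (AT skew, GC skew) from base counts or percentages."""
    at_skew = (a - t) / (a + t)
    gc_skew = (g - c) / (g + c)
    return at_skew, gc_skew


# Base composition (%) reported for the G. spengleri major coding strand
at, gc = skews(a=33.7, c=25.6, g=13.1, t=27.6)
print(f"AT skew = {at:.3f}, GC skew = {gc:.3f}")
# prints: AT skew = 0.100, GC skew = -0.323
```

Under these definitions both values fall inside the ranges reported for the 18 chelonian genomes (0.098 to 0.136 and -0.375 to -0.320).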
Ribosomal and transfer RNA genes
The 16S rRNA gene is located between the tRNAIle and tRNAVal genes, and the 12S rRNA gene between the tRNAVal and tRNATrp genes. The 12S rRNA and 16S rRNA genes of G. spengleri are 964 bp and 1,681 bp long, respectively. All 18 chelonian mitochondrial genomes contain 22 tRNA genes, 14 encoded on the H-strand and 8 on the L-strand. Among these, two forms of tRNALeu (UUR and CUN) and two forms of tRNASer (AGY and UCN) were observed in all chelonian species. The tRNA genes ranged in size from 62 to 76 bp, and a few overlapped at their ends with neighboring tRNA or rRNA genes. All tRNAs except tRNASer (AGY) could be folded into the typical clover-leaf secondary structure.
Control region (CR)
The major noncoding CR of G. spengleri was 1,937 bp long and was flanked by the tRNAPro and tRNAPhe genes (Table 1). In this A+T-rich region (67.99% A+T), several conserved motifs, including three conserved sequence blocks (CSB1, CSB2 and CSB3), were identified. The most striking feature of this control region was a long tandem repeat, (ATATTATTATATTATTATATATC)n (n = 3), at the 3' end, downstream of CSB3. This repeat was followed by an AT-rich microsatellite sequence adjacent to the tRNAPhe gene (Supplementary Fig 1).
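A tandem repeat of a known unit, such as the 23 bp motif above, can be located with a simple pattern search. The sketch below is only an illustration and not the method used in this study: the function name `tandem_runs` and the flanking bases in the toy sequence are hypothetical, and only exact, back-to-back copies are detected.

```python
import re

MOTIF = "ATATTATTATATTATTATATATC"  # repeat unit reported for the CR


def tandem_runs(seq, motif, min_copies=2):
    """Return [(start, copies)] for each run of back-to-back motif copies.

    start is a 0-based offset into seq; only exact copies are counted.
    """
    pattern = re.compile(f"(?:{re.escape(motif)}){{{min_copies},}}")
    return [(m.start(), (m.end() - m.start()) // len(motif))
            for m in pattern.finditer(seq)]


# Hypothetical flanks around three copies of the reported unit
toy = "GGCC" + MOTIF * 3 + "ATATATAT"
runs = tandem_runs(toy, MOTIF)
print(runs)
```

Degenerate or imperfect copies would need an approximate-matching approach or a dedicated repeat finder, which this sketch does not attempt.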
Phylogenetic analysis
The ML trees are illustrated in Fig 3. In the resultant ML tree, the ingroup species were divided into two major clades: the Pleurodira (Chelodina longicollis, Pelusios castaneus and Podocnemis unifilis) and an assemblage of 20 cryptodires. Within the Cryptodira, the 9 geoemydids formed a clade with a bootstrap value of 100%. The tree thus supports the monophyly of the Geoemydidae, within which the genera and species clustered successively. There were four clades in the family: (1) the C. dentata-S. quadriocellata assemblage and N. platynota; (2) C. mouhotii and M. annamensis; (3) Heosemys annandalii alone; and (4) G. spengleri and the B. trivittata-P. sylhetensis assemblage. Overall, the Geoemydidae was the sister group to the tortoise M. emys (Testudinidae), and this group then clustered with the D. coriacea-N. depressus assemblage, followed by a clade comprising P. megacephalum and three representatives of the Emydidae. This assemblage clustered in turn with M. temminckii, followed by K. leucostomum and P. cantorii. Finally, the Carettochelyidae, represented by C. insculpta, was the sister group to all remaining Cryptodira used in this study.
Nine species representing nine controversial geoemydid genera were studied. Based on morphological characters, Bramble (1974) hypothesized that Cyclemys, Pyxidea and Cuora formed a closely related assemblage (the Cyclemys group) that originated from a Heosemys-like ancestor. In our resultant trees, Cyclemys, Sacalia and Notochelys formed one monophyletic group, and Cuora and Mauremys formed another. These two groups together formed a larger monophyletic group, and the close relationship between this larger group and Heosemys was well supported (100% BP under ML), which seems consistent with the morphological evidence.
As for the relationships among the Geoemydidae, Testudinidae and Emydidae, some authors suggested a closer affinity of the geoemydid turtles with the emydids (Ernst and Barbour, 1989; Iverson, 1992), whereas others inferred from morphological and molecular data that the Geoemydidae and Testudinidae originated from a common ancestor and are sister families (e.g., Shaffer et al., 1997; Hirayama, 1984; Honda et al., 2002; Lee, 2019). Our ML analyses supported the latter view, that the geoemydid turtles have closer affinities to the testudinids than to the emydids, with high statistical support (Fig 3).
Based on morphological and partial mtDNA data, the Chelydridae and Platysternidae were considered sister taxa (Gaffney and Meylan, 1988; Gaffney et al., 1991; Shaffer et al., 1997). In contrast, most other workers concluded that Platysternon was more similar to emydids and testudinoids based on morphological evidence (Whetstone, 1978), karyotype characteristics (Haiduk and Bickham, 1982) and molecular data (Parham et al., 2006). The present ML analysis indicated that Platysternon is the sister taxon to the Emydidae (C. picta, M. terrapin and T. scripta) with strong support (98%). According to the phylogram, Platysternon is thus more closely related to the Emydidae than to the Chelydridae. However, some workers have placed Platysternon within Testudinoidea (Wu, 2004). Further verification is therefore needed to determine their exact phylogenetic relationships.
In this study, distribution maps were drawn for the 9 species representing 9 genera of the Geoemydidae. The habitat distributions of G. spengleri, B. trivittata and Pangshura sylhetensis showed a notable pattern (Fig 4). All three species are found in countries or regions with abundant water resources, but their habitats do not overlap. In addition, these three species are distributed only in the northern hemisphere, mostly between 8° and 28° North latitude.
Several other species of the family span a wider range of latitudes than these three species. Latitudinal zonality is mainly manifested in the regular change of climate, soil, organisms and their environment from the equator to the poles (Das, 2018). Because the latitudinal distributions of these three species are more similar to one another than to those of the other turtles, and the longitudinal distribution of B. trivittata is more similar to that of G. spengleri than to that of P. sylhetensis, we infer that G. spengleri and B. trivittata are more closely related. This is also consistent with the results shown in the phylogenetic tree.