Gene organization and arrangement
The mitochondrial genome of
B. dero is a circular molecule with 16,613 base pairs (Fig 1).
Vertebrate mitogenomes are typically 16-20 kb long, with a conserved gene order (
Boore, 1999;
Roe et al., 1985;
Bibb et al., 1981). The complete nucleotide sequence was submitted to NCBI GenBank under accession number MK461139. The mitogenome of
B. dero was comparable in length to those of other Labeoninae fishes such as
Bangana decora (16,607 bp),
Cirrhinus reba (16,597 bp) and
Parasinilabeo longicorpus (16,596 bp) (
Li et al., 2016;
Islam et al., 2020;
Pan et al., 2020). The base composition of the mitogenome was A = 32.49%, T = 25.23%, C = 26.53% and G = 15.73%, showing a bias towards A+T (57.72%). In the present study, the AT and GC skews were 0.126 and -0.225, respectively.
Sharma et al. (2020) also found similar skewness in other cyprinid fishes. The mitogenome comprises the typical set of 37 genes, namely 13 protein-coding genes (PCGs), 22 tRNA genes and 2 rRNA genes (12S and 16S), along with 1 non-coding control region (Table 1).
Among them, 9 genes (ND6 and 8 tRNAs) are encoded on the light strand (L) and the remaining 28 genes are located on the heavy strand (H), similar to the arrangement in
B. decora and other cyprinid fish (
Li et al., 2016).
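The composition and skew values reported above can be illustrated with a minimal sketch. It assumes the commonly used definitions AT skew = (A - T)/(A + T) and GC skew = (G - C)/(G + C); the function names and the toy sequence are hypothetical and are not the B. dero data or the pipeline used in this study.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return the percentage of A, T, G and C in a nucleotide sequence."""
    counts = Counter(b for b in seq.upper() if b in "ATGC")
    total = sum(counts.values())
    return {b: 100.0 * counts[b] / total for b in "ATGC"}


def skews(comp: dict) -> tuple:
    """Return (AT skew, GC skew) from a composition dictionary.

    Assumed definitions:
      AT skew = (A - T) / (A + T)
      GC skew = (G - C) / (G + C)
    """
    at = (comp["A"] - comp["T"]) / (comp["A"] + comp["T"])
    gc = (comp["G"] - comp["C"]) / (comp["G"] + comp["C"])
    return at, gc


# Toy sequence (hypothetical, NOT the B. dero mitogenome): 4 A, 3 T, 4 C, 2 G.
toy = "AAAATTTCCCCGG"
comp = base_composition(toy)
at_skew, gc_skew = skews(comp)
print(comp)
print(round(at_skew, 3), round(gc_skew, 3))
```

A positive AT skew indicates more A than T on the analysed strand, and a negative GC skew indicates more C than G, which is the pattern described for the H strand here.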
In the mitogenome, overlapping nucleotides between adjacent genes were observed (Table 1). Overlaps occurred between tRNA-Ile and tRNA-Gln (2 bp), ATP8 and ATP6 (7 bp), ND4L and ND4 (7 bp), ND5 and ND6 (4 bp), tRNA-Thr and tRNA-Pro (1 bp) (2 bp) and at other locations on the H strand, with lengths ranging from 1 to 7 bp. Such overlaps help compact the mitogenome and are a distinctive feature of teleost mitogenomes (
Satoh et al., 2016;
Yang et al., 2018). Fish in the Cyprinidae family often exhibit overlapping sequences between ATP8/ATP6 and ND4L/ND4 (
Sharma et al., 2020).
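Overlaps and intergenic spacers of the kind listed in Table 1 can be derived from annotation coordinates. The sketch below assumes 1-based inclusive coordinates and ignores the circular junction; the function name and all coordinates are hypothetical, chosen only so that ATP8 is 165 bp long and overlaps ATP6 by 7 bp as reported above.

```python
def junctions(genes):
    """Return (left, right, gap) for each pair of consecutive genes.

    genes: list of (name, start, end), 1-based inclusive coordinates.
    gap < 0  -> the two genes overlap by -gap bp
    gap > 0  -> intergenic spacer of gap bp
    gap == 0 -> the genes abut
    The junction that closes the circular molecule is not considered.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    result = []
    for (n1, _s1, e1), (n2, s2, _e2) in zip(ordered, ordered[1:]):
        result.append((n1, n2, s2 - e1 - 1))
    return result


# Hypothetical coordinates for illustration only.
toy_genes = [
    ("ATP8", 100, 264),   # 165 bp
    ("ATP6", 258, 940),   # starts 7 bp before ATP8 ends
    ("COIII", 943, 1727),
]

for left, right, gap in junctions(toy_genes):
    kind = "overlap" if gap < 0 else "spacer"
    print(f"{left}/{right}: {kind} of {abs(gap)} bp")
```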
Protein coding genes
The PCGs in the
B. dero mitogenome encode 7 subunits of NADH dehydrogenase (ND1-6 and ND4L), 3 subunits of cytochrome c oxidase (COI, COII and COIII), 2 subunits of ATP synthase (ATP8 and ATP6) and cytochrome b (CYT B). The lengths of the 13 PCGs varied from 165 bp (ATP8) to 1824 bp (ND5), paralleling those of
Cyprinion semiplotum (
Sharma et al., 2020). However, the length of ATP8 is different from that of
B. decora, which is 168 bp (
Li et al., 2016). ATP8 and ND5 are the shortest and longest PCGs, respectively, in most cyprinid fishes (
Gutiérrez et al., 2015;
Satoh et al., 2016). The PCGs had an average A+T content of 57.74%; COIII had the lowest value (53.69%) and ATP8 the highest (63.03%). Fig 2 displays the AT and GC skew values of all 13 PCGs.
All the PCGs except ND6 showed negative GC skew, in agreement with other teleost mitogenomes (
Lu et al., 2019;
Li et al., 2019). While 12 PCGs used ATG as the start codon, like other cyprinid mitogenomes, COI used GTG, which is frequently seen in fish mitogenomes
(Siva et al., 2018). Eight PCGs ended with a complete termination codon (TAA or TAG), whereas the other five ended with a truncated stop codon (TA or T), which is presumably completed to TAA by post-transcriptional polyadenylation (
Siva et al., 2018;
Yu et al., 2019). Fig 3 shows the relative synonymous codon usage (RSCU) of the PCGs.
Serine and leucine are represented by six codons in contrast to other cyprinid fish, while tryptophan and methionine are encoded by a single codon (
Zhang et al., 2022). Two or four codons are employed to encode the remaining amino acids. The most frequently used codon was CGA (arginine), followed by CUA (leucine), CCA (proline), UCA (serine), GGA (glycine), GUA (valine) and ACA (threonine). The most used codons were thus biased towards A at the third position, which is consistent with other cyprinid fishes (
Sharma et al., 2020).
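For readers unfamiliar with RSCU, a minimal sketch follows. It assumes the usual definition, in which the RSCU of a codon is its observed count divided by the mean count of all synonymous codons for the same amino acid. The function names, the toy coding sequence and the choice of the CGN (arginine) family are illustrative assumptions, not the software or data used in this study.

```python
from collections import Counter


def codon_counts(cds: str) -> Counter:
    """Count in-frame codons (RNA alphabet) of a coding sequence.

    Trailing bases that do not form a full codon, such as a truncated
    T or TA stop codon, are ignored.
    """
    rna = cds.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3
    return Counter(rna[i:i + 3] for i in range(0, usable, 3))


def rscu_family(counts: Counter, family: list) -> dict:
    """RSCU for one synonymous family: observed count / mean family count."""
    total = sum(counts[c] for c in family)
    if total == 0:
        return {c: 0.0 for c in family}
    mean = total / len(family)
    return {c: counts[c] / mean for c in family}


# Toy CDS (hypothetical): ATG, CGA x3, CGC, CGT, then a truncated "TA" stop.
toy_cds = "ATG" + "CGA" * 3 + "CGC" + "CGT" + "TA"
counts = codon_counts(toy_cds)
arg_family = ["CGA", "CGC", "CGG", "CGU"]  # four-fold arginine codons
arg_rscu = rscu_family(counts, arg_family)
print(arg_rscu)
```

An RSCU above 1 marks a codon used more often than expected under equal usage of its synonyms, and the values within a family sum to the number of codons in that family.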
Transfer and ribosomal RNAs
The
B. dero mitogenome has 22 tRNA genes, ranging in size from 67 bp (tRNA-Cys) to 76 bp (tRNA-Leu1 and tRNA-Lys). Eight of the 22 tRNA genes (tRNA-Gln, tRNA-Ala, tRNA-Asn, tRNA-Cys, tRNA-Tyr, tRNA-Ser1, tRNA-Glu and tRNA-Pro) are encoded on the L strand, whereas the remaining 14 are encoded on the H strand. The anticodons of tRNA-His (GTC) and tRNA-Glu (CTC) differed from the GTG (tRNA-His) and TTC (tRNA-Glu) reported previously (
Sharma et al., 2020;
Li et al., 2022). There were two anticodons each for serine (UGA and GCU) and leucine (UAA and UAG). A typical characteristic of fish mitogenomes is the presence of multiple tRNAs with distinct anticodons for a single amino acid (
Cui et al., 2017;
Villela et al., 2017). Fig 4 shows the secondary structures predicted by tRNAscan-SE 2.0.
All tRNAs except tRNA-Ser (GCT), which lacks the dihydrouridine (DHU) arm, exhibit the conventional cloverleaf structure. A similar case was found in
B. decora (
Li et al., 2016).
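The relationship between the serine and leucine anticodons mentioned above and the codons they pair with can be sketched as follows. The sketch assumes strict Watson-Crick pairing with both sequences written 5' to 3', so the codon is the reverse complement of the anticodon; wobble pairing at the third codon position is deliberately not modelled, and the function name is hypothetical.

```python
# Translation table for RNA complementation (A<->U, C<->G).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def codon_for(anticodon: str) -> str:
    """Return the codon (5'->3') that pairs exactly with an anticodon (5'->3')."""
    rna = anticodon.upper().replace("T", "U")
    return rna.translate(COMPLEMENT)[::-1]


# Anticodons reported in the text for the two serine and two leucine tRNAs.
for amino_acid, anticodon in [("Ser", "UGA"), ("Ser", "GCU"),
                              ("Leu", "UAA"), ("Leu", "UAG")]:
    print(amino_acid, anticodon, "->", codon_for(anticodon))
```

The two serine anticodons map to UCA and AGC, and the two leucine anticodons to UUA and CUA, that is, to the two separate codon families of each amino acid.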
As in other vertebrates, the ribosomal RNA genes (12S rRNA and 16S rRNA) were located on the H strand, separated by tRNA-Val (
Yang et al., 2018;
Zhang et al., 2019). The 12S rRNA gene, located between tRNA-Phe and tRNA-Val, was 957 bp long. The 16S rRNA gene, positioned between tRNA-Val and tRNA-Leu1, measured 1691 bp in length. In comparison to other fish, the length of the rRNA genes in the
B. dero mitogenome was observed to be longer (
Sam et al., 2021;
Zhang et al., 2023).
Non-coding regions
The origin of light-strand replication (OL) was found between tRNA-Asn and tRNA-Cys. The OL was 33 bp long and formed a hairpin structure similar to that of
B. decora (Li et al., 2016). A stem with 19 bases and a loop with 14 bases make up the secondary structure of OL (Fig 5a).
The stem of the hairpin structure had the sequence 5′-GGCGG-3′, similar to
Garra species but different from
Scomber species, which have 5′-GCCGG-3′ (
Zhang et al., 2022;
Catanese et al., 2010).
The mitochondrial genome of
B. dero contains a control region (CR) that is 942 bp long, situated between the tRNA-Pro and tRNA-Phe genes. The CR is the longest non-coding region of the vertebrate mitogenome and contains the origins of transcription and replication (
Bronstein et al., 2018). The A+T content, AT skew and GC skew of the CR were 67.83%, 0.045 and -0.198, respectively, consistent with other cyprinid fishes (
Sharma et al., 2020). Conserved sequence blocks (CSBs), extended termination-associated sequences (ETAS) and a central domain (CD) were identified within the CR (Fig 5b). The central domain contained three boxes (CSB-F, CSB-E and CSB-D) and the CSB domain contained three regions (CSB1, CSB2 and CSB3), similar to the arrangement in the mitogenomes of many fishes (
Satoh et al., 2016).
In addition to the OL and CR, there were non-coding intergenic spacers ranging in size from 1 to 46 bp (Table 1). The presence of long non-coding sequences has been reported in other fishes, where it may reflect gene rearrangement consistent with the tandem duplication-random loss (TDRL) model (
Satoh et al., 2016).
Phylogenetic analysis
Phylogenetic analysis was performed in the current work using 39 mitogenome sequences from various species, including
B. dero. The phylogenetic tree was rooted with
Psilorhynchus species as an outgroup (Fig 6).
In the tree, two major clades were observed, which further split into subclades. In Clade I, the genera present were
Labeo, Bangana, Cirrhinus, Incisilabeo and Schismatorhynchos. Clade II consists of
Garra, Discocheilus, Discogobio, Labiobarbus, Lobocheilos, Parasinilabeo, Prolixicheilus, Ptychidio, Semilabeo and Sinocrossocheilus. A bootstrap value of 100 supports the close grouping that
B. dero formed with
Bangana species (AP013327) and Labeo boggut (NC_029450) in subclade I of Clade I. Based on the mitochondrial 16S rRNA sequence of
B. dero, a similar outcome was found in an earlier investigation (
Basudha et al., 2019). The same subclade also contains
Bangana tungting (KF752481). However,
Bangana decora was present in Clade II along with
Garra and other species, similar to the previous report (
Zhang et al., 2022). They found that all
Garra species, with the exception of
G. pingi pingi and
G. imberba, are grouped within a single clade, while these two species cluster with other Labeoninae species, including
B. decora. They suggested that mitochondrial gene rearrangement in one group of
Garra has caused a distant relationship with another group that has not undergone similar changes. The occurrence of species from the same genus in different clades of a phylogenetic tree can be attributed to the non-monophyletic nature of the genera. It has been reported that the genera
Labeo, Garra, Bangana, Cirrhinus and
Crossocheilus are non-monophyletic (
Yang et al., 2012). This suggests that certain species within a particular genus may be more closely related to species of other genera.