Nucleotide sequence and genetic divergence analysis
The study covered 29 species belonging to 26 genera, 21 families and 7 orders. The species list and the GenBank accession numbers are provided in Table 1. As in the current study, many researchers have used the BLASTn similarity approach to verify sequences (Rathipriya et al., 2019; Lee and Kim, 2020; Das and Choudhury, 2021; Saravanan et al., 2021). The 16S rRNA gene sequences studied showed both conserved and variable regions. This could account for the differences in the lengths of the sequences generated and may also enable the identification of appropriate taxon-specific primers for phylogenetic applications. The average GC content (48.1%) was similar to that reported by Ward et al., (2005) in teleosts (47.10%). The 16S rRNA sequences of the fishes were aligned to yield a final length of 523 bp. A total of 260 sites were constant (Fig 1) and 263 sites were variable, of which 210 were parsimony informative and 53 were singletons. The estimated transition/transversion bias (R) was 1.297. The sequence analysis revealed average nucleotide frequencies of 28.85% (A), 21.82% (T/U), 25.18% (C) and 24.15% (G). The overall mean distance was 0.103. The highest distance was observed between Cynoglossus arel and Stolephorus commersonnii (0.184) and the lowest between Lutjanus fulviflamma and L. johnii (0.020). In this study, the Kimura 2-parameter (K2P) distance (%) was used to assign an unknown specimen to a known species, to detect novel sequences and to determine whether an unknown specimen is a distinct new species. The average genetic distances between species, genera, families and orders were 18.4%, 21.4%, 23.2% and 25.6%, respectively (Table 2), which is comparable with the patterns observed in several fish barcoding studies (Bingpeng et al., 2018; Wang et al., 2018).
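The site categories and the K2P distance reported above can be illustrated with a minimal Python sketch. It is not the software used in this study: the function names and the toy sequences are hypothetical, and columns containing gaps or ambiguity codes are simply skipped, which is an assumption. A variable site is counted as parsimony informative when at least two nucleotide states each occur in at least two sequences, otherwise as a singleton. The K2P distance is computed as d = -0.5 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions.

```python
import math
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def _clean(seq):
    # Upper-case and treat U as T so RNA and DNA notation behave alike.
    return seq.upper().replace("U", "T")


def classify_sites(seqs):
    """Count constant, parsimony-informative and singleton alignment columns.

    Assumption: columns with gaps or ambiguity codes are ignored.
    """
    counts = {"constant": 0, "parsimony_informative": 0, "singleton": 0}
    for column in zip(*(_clean(s) for s in seqs)):
        if any(base not in "ACGT" for base in column):
            continue
        freq = Counter(column)
        if len(freq) == 1:
            counts["constant"] += 1
        elif sum(1 for n in freq.values() if n >= 2) >= 2:
            # At least two states, each present in two or more sequences.
            counts["parsimony_informative"] += 1
        else:
            counts["singleton"] += 1
    return counts


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned sequences."""
    compared = transitions = transversions = 0
    for a, b in zip(_clean(seq_a), _clean(seq_b)):
        if a not in "ACGT" or b not in "ACGT":
            continue  # pairwise deletion of gaps/ambiguities (assumption)
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the K2P correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy alignment (hypothetical data, not taken from Table 1).
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTGCGTAT", "ACGTGCGTAC"]
print(classify_sites(toy))
print(round(k2p_distance(toy[0], toy[2]), 4))
```

Multiplying the returned distance by 100 gives the percentage form in which the values in this section are reported.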
Phylogenetic tree analysis
The phylogenetic relationships among the species were established, with comparable genera clustering under the same nodes and different genera clustering under separate nodes. High bootstrap values (90-99%) supported the nodes. Basheer et al., (2015), Bineesh et al., (2015) and Lakra et al., (2011) all reported similar findings for marine fish species in Indian waters. Several researchers have employed 16S rRNA gene sequences to investigate the phylogeny of various groups, including fish. Craig and Hastings (2007) used 12S and 16S rRNA sequences to explain the molecular phylogeny of the groupers of the subfamily Epinephelinae (Serranidae) and presented a revised classification of the Epinephelini. Other examples of such phylogenetic studies include Sparks and Smith (2004) on cichlid fishes, Vinson et al., (2004) on sciaenid fishes, Wiley et al., (1998) on lampridiform fishes and Ilves and Taylor (2009) on Osmeridae. The summary forms of the neighbour-joining (NJ) tree and the maximum likelihood (ML) tree are given in Fig 2 and Fig 3.
Character-based species classification
The character-based method aims to identify specimens from the diagnostic nucleotides at DNA barcode sites that are characteristic of each species. In this study, we developed character-based keys comprising 112 positions using 16S rRNA gene sequences (Table 3). After the species formula was identified, the Primer 3.0 tool was used to check for possible secondary structures and to finalise the species-specific probes for each species under study (Table 4). Primers with a melting temperature (Tm) in the 55-60°C range, used with an annealing temperature about 5°C below the Tm, generally give the best results, so the primers were selected within this range.
Paine et al., (2007) developed a character-based key for identifying the 17 members of the family Scombridae that are common to the western Atlantic Ocean. Lowenstein et al., (2009) described 40 distinguishing positions for bluefin tuna. Puncher et al., (2015) developed a character-based key for identifying Atlantic bluefin tuna (Thunnus thynnus) larvae. Vargheese et al., (2019) utilised BLOG 2.0 to distinguish 82 species of elasmobranchs and found 214 diagnostic nucleotides, which is similar to the findings of Rathipriya et al., (2021). Mahapatra et al., (2020) developed character-based keys with 25 positions for scombrid identification in Indian waters. These diagnostic molecular keys could be translated into a customised DNA chip for precise species identification.
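The idea behind a character-based key can be sketched in a few lines of Python. This is a simplified illustration, not the procedure used in this study or the algorithm implemented in BLOG: it only reports single alignment positions at which every sequence of a species shares one nucleotide that occurs in no other species, whereas dedicated tools can also combine several positions into one species formula. The function name, the species labels and the toy sequences are hypothetical.

```python
from collections import defaultdict


def diagnostic_positions(aligned):
    """Find simple diagnostic nucleotides per species.

    aligned: dict mapping a species label to a list of aligned sequences
             of equal length (upper-case A/C/G/T, '-' for gaps).
    Returns a dict mapping species to a list of (1-based position, base)
    pairs where the base is fixed within that species and absent from
    all other species. Species without such positions are omitted.
    """
    length = len(next(iter(aligned.values()))[0])
    keys = defaultdict(list)
    for pos in range(length):
        # Set of nucleotide states observed in each species at this column.
        states = {sp: {seq[pos] for seq in seqs} for sp, seqs in aligned.items()}
        for sp, bases in states.items():
            if len(bases) != 1:
                continue  # polymorphic within the species: not diagnostic
            base = next(iter(bases))
            if base not in "ACGT":
                continue  # ignore gaps and ambiguity codes (assumption)
            if all(base not in other
                   for other_sp, other in states.items() if other_sp != sp):
                keys[sp].append((pos + 1, base))
    return dict(keys)


# Toy alignment (hypothetical labels and sequences, not data from Table 3).
toy = {
    "species_A": ["ACGTAC", "ACGTAC"],
    "species_B": ["ACGTGC", "ACGTGC"],
    "species_C": ["TCGTGC", "TCGTGT"],
}
for species, chars in sorted(diagnostic_positions(toy).items()):
    print(species, chars)
```

In this toy case species_B receives no key, which shows why single positions are often insufficient and why combinations of positions are needed to separate every species in a data set.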