Population Structure Analysis Using PCA
The GBS data files were submitted to NCBI SRA (BioProject: PRJNA843534; SRA accession numbers SRR26636791 to SRR26636836). The hierarchical distance tree constructed for the forty-six canine samples revealed distinct clustering patterns that align with geographic origins. Branch lengths and confidence values (100%) indicated genetic distances and sample similarities. Breeds from Punjab and Haryana exhibited closer genetic relationships, whereas Karnataka breeds showed less resemblance to them, underscoring regional differentiation (Fig 1).
The Minimum Spanning Network (MSN), developed from reference-based SNP data, elucidated breed-specific genetic variations. Node Id 76, representing the Labrador retriever from Punjab, occupied a central position, signifying ancestral connections with shared SNPs across breeds. Branches radiating from this node highlighted regional diversity, with elongated branches corresponding to specific breeds such as Labrador retriever (21 and 79), Pug (44) and Belgian Malinois (90), affirming the genetic resemblance between Punjab, Haryana and Karnataka breeds (Fig 2).
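The network idea can be illustrated with a minimum spanning tree, which forms the backbone of an MSN (an MSN additionally keeps alternative links of equal weight). The sketch below is only an illustration under stated assumptions: genotypes are coded as alternate-allele counts (0, 1, 2), the distance is a simple sum of allele-count differences, and the sample names and function names are hypothetical. It is not the software used in this study.

```python
def allele_distance(a, b):
    """Sum of absolute differences in alternate-allele counts (assumed 0/1/2 coding)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def minimum_spanning_tree(genotypes):
    """Prim's algorithm over pairwise allele distances.

    genotypes: dict mapping a sample name to its list of allele counts.
    Returns a list of (sample_u, sample_v, distance) edges.
    """
    names = sorted(genotypes)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Pick the cheapest edge that connects the tree to a new sample.
        dist, u, v = min(
            (allele_distance(genotypes[u], genotypes[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, dist))
        in_tree.add(v)
    return edges


# Toy data (hypothetical samples, not from the study).
toy = {
    "s1": [0, 0, 0, 0],
    "s2": [0, 0, 0, 1],
    "s3": [0, 0, 1, 1],
    "s4": [2, 2, 2, 2],
}
tree_edges = minimum_spanning_tree(toy)
```

In such a tree, a sample joined by many short edges sits centrally, whereas a long edge marks a sample that differs at many SNPs from its nearest neighbour.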
Principal component analysis (PCA) further delineated genetic diversity, with PC2 effectively separating Punjab and Karnataka samples from a cluster predominantly including Punjab and Haryana breeds. The Karnataka samples formed a distinct orange ellipse, whereas the Punjab and Haryana samples formed overlapping clusters (Fig 3). Moreover, discriminant analysis of principal components (DAPC) produced clustering consistent with the PCA results, assigning breeds to geographic states (Fig 4). The identification of genetic differences through discriminant analysis emphasized the contribution of breed-specific SNPs to regional clustering.
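As a rough illustration of how PCA places samples along components, the following sketch computes sample scores on the leading principal components of a genotype matrix. It assumes genotypes coded as alternate-allele counts (0, 1, 2) with no missing calls, and uses plain power iteration with deflation in place of a library eigen-solver. The function names and toy matrix are hypothetical, and the software actually used for Fig 3 is not specified in this section.

```python
import math


def center_columns(matrix):
    """Subtract each SNP's mean allele count from its column."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[v - m for v, m in zip(row, means)] for row in matrix]


def pc_scores(genotypes, n_components=2, n_iter=500):
    """Return one list of per-sample scores for each leading component."""
    x = center_columns(genotypes)
    n = len(x)
    # Sample-by-sample Gram matrix; its eigenvectors give the sample scores.
    gram = [[sum(a * b for a, b in zip(x[i], x[j])) for j in range(n)]
            for i in range(n)]
    scores = []
    for _ in range(n_components):
        vec = [1.0 + 0.1 * i for i in range(n)]  # deterministic start vector
        for _ in range(n_iter):
            new = [sum(g * v for g, v in zip(row, vec)) for row in gram]
            norm = math.sqrt(sum(v * v for v in new))
            if norm < 1e-12:  # no variance left in this direction
                vec = [0.0] * n
                break
            vec = [v / norm for v in new]
        eigval = sum(vec[i] * gram[i][j] * vec[j]
                     for i in range(n) for j in range(n))
        scores.append([v * math.sqrt(max(eigval, 0.0)) for v in vec])
        # Deflate so the next pass finds the following component.
        gram = [[gram[i][j] - eigval * vec[i] * vec[j] for j in range(n)]
                for i in range(n)]
    return scores


# Toy matrix: rows are samples, columns are SNPs (hypothetical values).
toy_genotypes = [
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [2, 2, 2, 2, 1, 0],
    [2, 2, 2, 2, 0, 0],
    [2, 2, 2, 2, 0, 1],
]
pc1, pc2 = pc_scores(toy_genotypes)
```

In this toy matrix the first three and last three samples differ at four SNPs, so they receive PC1 scores of opposite sign. The sign of a component is arbitrary, so only the relative positions of samples are meaningful.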
Membership probability analysis, depicted through colored vertical lines (Fig 5), quantified the likelihood of each sample belonging to a specific region. Visualizing posterior probabilities enhanced the interpretation of genetic affiliations, confirming shared ancestry between breeds from Punjab, Haryana and Karnataka. This shared lineage was supported by SNP genotype similarity, as reflected in the plot of genetic diversity (Fig 6).
Haplotype identification
A systematic approach was employed to identify haplotypes using stringent criteria: SNP distance thresholds of 80 kb or cumulative differences exceeding 150 kb, with a minimum of four SNPs constituting a haplotype. This methodology uncovered 15,552 haplotypes from 2,18,429 SNP locations.
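The criteria above admit more than one reading. The sketch below takes one possible interpretation, stated here as an assumption: a block is closed when the gap between adjacent SNPs exceeds 80 kb or when the span from the first SNP of the block exceeds 150 kb, and only blocks with at least four SNPs are kept. The function name, parameter names and positions are hypothetical and do not reproduce the pipeline used in this study.

```python
def group_haplotype_blocks(positions, max_gap=80_000, max_span=150_000, min_snps=4):
    """Group SNP positions on one chromosome into candidate haplotype blocks.

    Assumed rule: start a new block when the adjacent-SNP gap exceeds max_gap
    or the cumulative span of the current block exceeds max_span.
    """
    blocks = []
    current = []
    for pos in sorted(positions):
        if current and (pos - current[-1] > max_gap or pos - current[0] > max_span):
            if len(current) >= min_snps:
                blocks.append(current)
            current = []
        current.append(pos)
    if len(current) >= min_snps:
        blocks.append(current)
    return blocks


# Toy positions in bp (hypothetical, not taken from the dataset).
toy_positions = [1_000, 1_040, 1_120, 1_300,            # four close SNPs
                 500_000, 500_050,                      # only two SNPs
                 900_000, 900_010, 900_020, 900_030, 900_200]
blocks = group_haplotype_blocks(toy_positions)
block_spans = [b[-1] - b[0] for b in blocks]
```

With these toy positions, the pair of SNPs near 500 kb is discarded for having fewer than four members, and the two remaining groups are retained as blocks.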
Isolation and quality check of genomic DNA
High-quality genomic DNA was extracted from 21 canine blood samples (Table 3) using the conventional phenol:chloroform:isoamyl alcohol (25:24:1) technique (Green and Sambrook, 2018) and verified through spectrophotometric analysis (OD 260/280 nm values: 1.6-1.8). The integrity of the DNA was confirmed via gel electrophoresis on a 0.8% agarose gel (70 V for 45-50 min), which displayed unambiguous high-molecular-weight bands under a GelDoc system, indicating suitability for downstream applications.
Functional analysis of haplotypes
Three of the identified haplotypes were prioritized based on SNP frequency and minor allele frequency (MAF > 5%).
· Haplotype 1: Located on chromosome 3, spanning 187 bp with four SNPs associated with the AFAP1 gene.
· Haplotype 2: Spanning 111 bp on chromosome 9 with five SNPs linked to the GBGT1 gene.
· Haplotype 3: Covering 117 bp on chromosome 10 with six SNPs connected to the CELSR1 gene.
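The MAF criterion used for prioritization can be illustrated with a short calculation. This is a minimal sketch, assuming biallelic SNPs coded as alternate-allele counts (0, 1, 2) with missing calls stored as None. The function names and genotype vectors are hypothetical.

```python
def minor_allele_frequency(genotypes):
    """MAF for one biallelic SNP from per-sample alternate-allele counts.

    Missing calls (None) are ignored; each called sample contributes two alleles.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def passes_maf(genotypes, threshold=0.05):
    """True when the SNP's MAF exceeds the threshold (MAF > 5% by default)."""
    return minor_allele_frequency(genotypes) > threshold


# Toy genotype vectors (hypothetical).
common_snp = [0, 0, 1, 2]            # alternate allele count 3 of 8
rare_snp = [0] * 20 + [1]            # alternate allele count 1 of 42
with_missing = [2, 2, None, 2, 1]    # one missing call
```

Taking the minimum of the allele frequency and its complement means the result does not depend on which allele was designated as the reference.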
Identification of Polymorphisms
To identify SNPs, the reference sequences of the corresponding chromosomes for the GBGT1, CELSR1 and AFAP1 genes were aligned with the FASTA sequences obtained after sequencing, using Clustal Omega. Eleven SNPs were discovered for GBGT1 and CELSR1, whereas no SNP was detected in AFAP1. In the GBGT1 gene, SNPs such as 503 g>A and 510 g>A were prevalent across multiple samples, whereas 516 a>G and 426 g>C were breed-specific (Table 4). For the CELSR1 gene, six SNPs were identified, with specific variants (e.g., 414 g>A, 469 g>A) consistently present in specific samples (Table 5).
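Once a reference and a sample sequence have been aligned, substitutions can be read off column by column. The sketch below assumes the two sequences are already aligned to equal length with "-" marking gaps (for instance, rows taken from a multiple alignment), counts positions along the reference, and reports substitutions in the "position ref>ALT" style used above. The function name and toy sequences are hypothetical, and the alignment step itself is not reproduced.

```python
def call_snps(ref_aligned, sample_aligned):
    """List substitutions between two aligned sequences of equal length.

    Positions are 1-based along the ungapped reference. Columns where either
    sequence has a gap ("-") are skipped, so indels are not reported.
    """
    if len(ref_aligned) != len(sample_aligned):
        raise ValueError("aligned sequences must have the same length")
    snps = []
    ref_pos = 0
    for r, s in zip(ref_aligned, sample_aligned):
        if r != "-":
            ref_pos += 1
        if r == "-" or s == "-":
            continue
        if r.upper() != s.upper():
            snps.append(f"{ref_pos} {r.lower()}>{s.upper()}")
    return snps


# Toy aligned pair (hypothetical sequences).
toy_ref = "ACGT-ACG"
toy_sample = "ACATTAC-"
toy_snps = call_snps(toy_ref, toy_sample)
```

Because reference positions advance only on non-gap reference columns, an insertion in the sample does not shift the coordinates of later SNPs.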
Dogs (Canis lupus familiaris) have been integral companions to humans for thousands of years, fulfilling roles ranging from hunting and protection to emotional and therapeutic support. Domesticated from wolves over 14,000 years ago, dogs are considered the first domesticated animals and have become indispensable members of society (Morell, 1997). Despite their historical and cultural significance, genetic studies on dog breeds reared in India remain limited, with most work focusing on breeds of Western and European origin. This research aims to bridge this gap by analyzing the genetic structure and diversity of significant dog breeds from North and South India, utilizing high-quality SNP data derived from ddRAD-GBS technology.
Genotyping-by-sequencing (GBS) has become a powerful tool for obtaining genome-wide SNPs, facilitating genetic diversity and population structure studies. The present study identified 8,13,580 SNPs, of which 2,18,433 high-quality SNPs were used to analyze the population structure of divergent dog breeds. Previous studies, such as Kaur et al. (2023), demonstrated the utility of GBS in analyzing genome-wide SNPs in indigenous Indian breeds like Gaddi dogs, identifying over 75,000 high-quality SNPs. This reinforces the role of GBS as a robust tool for population genetic studies and association mapping.
The population structure analysis revealed close ancestral relationships among Indian dog breeds, with geographical proximity often correlating with genetic similarity. These findings align with Zhang et al. (2018), who highlighted that regional factors strongly influence genetic structure in domestic animals. In line with this, the phylogenetic analysis confirmed breed-specific genetic differences shaped by environmental adaptations and historical breeding practices. Similar observations were reported by Freedman et al. (2014), who documented how selection pressures and geography impact breed evolution in dogs globally.
A significant outcome of this study was the haplotype analysis, which revealed 15,552 haplotypes. These genomic regions are critical for understanding genetic architecture, as they represent blocks of alleles inherited together due to linkage disequilibrium. Three haplotypes associated with the AFAP1, GBGT1 and CELSR1 genes were of particular interest. The AFAP1 gene on chromosome 3 plays a pivotal role in cellular motility and has been linked to physiological adaptations that might benefit specific dog populations (Cunnick et al., 2015). Similarly, GBGT1 and CELSR1, located on chromosomes 9 and 10, are implicated in glycosylation (Indellicato and Trinchera, 2021) and planar cell polarity pathways (Chen et al., 2022), respectively. These pathways are essential for neurological development and cellular signaling, potentially impacting behavioral traits and environmental adaptability. The functional implications of these genes underscore their importance in understanding breed-specific characteristics and genetic diversity.
These findings not only provide insights into the genetic structure of Indian dog breeds but also contribute to broader efforts in breed conservation and functional genomics. Identifying specific haplotypes linked to traits such as disease resistance or behavioural attributes could guide breeding programs and improve breed welfare (Gutierrez-Reinoso et al., 2021). Furthermore, this study complements global efforts to map dog genetic diversity, as highlighted in studies of conserved haplotypes in humans and animals (Guryev et al., 2006).