This molecular docking study predicted the interaction between the chicken BRD2 protein and the viral matrix protein using in silico methods. The amino acid sequences of the chicken BRD2 protein (NCBI GenBank Accession No. NP001025845) and the matrix protein of Newcastle disease virus (NDV) (GenBank Accession No. AFX98108) were retrieved from the NCBI website. The host BRD2 protein and the viral matrix protein were 729 amino acids (aa) and 364 aa long, respectively. The BRD2 sequence belonged to the White Leghorn chicken breed, whereas the matrix protein sequence belonged to the NDV R2B mesogenic vaccine strain of India.
Tertiary structure prediction
The structures were obtained using the ab initio method, i.e., all parameters were selected automatically by the server, and the best available models were chosen based on maximum percentage identity (Fig 1 and 2). The molecular models “4yug.1.A” and “4g1g.1.A” were selected as the best structures for the host BRD2 and the viral matrix proteins, respectively. The sequence similarity of the input host BRD2 and viral matrix protein sequences with the selected models (4yug.1.A and 4g1g.1.A) was 88.64% and 92.58%, respectively (Fig 1A and 1B). One copy of a ligand, 4-((2S,4R)-1-Acetyl-4-((4-Chlorophenyl)Amino)-2-Methyl-1,2,3,4-Tetrahydroquinolin-6-yl) Benzoic Acid, was present on the BRD2 protein.
Hsieh et al. (2016) studied molecular docking between the chicken protein disulfide isomerase A3 (PDIA3) and the Fusion (F) protein of the Newcastle disease virus. They predicted the tertiary structures of the NDV F protein and the chicken PDIA3 protein using the SWISS-MODEL tool. Their results showed that the catalytic domain of chicken PDIA3 binds the F protein of NDV. The outer side of the catalytic domain was covered by a unique binding pocket made up of 11 residues on the F protein of NDV. They concluded that this binding pocket can be investigated further to develop new vaccines that reduce the harmful impact of Newcastle disease.
Tertiary structure validation
The validation results for predicted structures of host-BRD2 and the viral matrix protein indicated that 100% and 97% of the amino acids fell within the highly preferred zone of conformations, respectively (Fig 2A and 2B; Table 1).
Tertiary structure uploading on docking server
The result files from the SWISS-MODEL tool were downloaded and uploaded to the online PatchDock server to predict the sites of protein-protein interaction, which were then visualized in suitable software. The results obtained from PatchDock were based on shape complementarity criteria.
Protein-Protein interaction visualization
The files were opened in BIOVIA Discovery Studio Visualizer software (Dassault Systèmes). The results showed an interaction between the host BRD2 protein and the viral matrix protein. Tyrosine-148 of the viral matrix protein interacted with the benzoic acid ligand present on the chicken BRD2 protein via a Pi-Alkyl bond. At the point where Tyrosine-148 attached to the benzoic acid ligand, Leucine-330 of the chicken BRD2 protein was also attached via an alkyl bond (Fig 3A and 3B). Another matrix protein residue, Valine-339, interacted with the ligand via an alkyl bond; at the same point, Leucine-330 of the BRD2 protein was also attached via alkyl bonds (Fig 4). A further matrix protein residue, Alanine-341, interacted with the hydroxyl group of the benzoic acid ligand via an alkyl bond (Fig 5A). Tryptophan-317 of the BRD2 protein was also attached to the hydroxyl group along with Alanine-341 (Fig 5B), and it interacted with the ligand ring via a Pi-Alkyl bond. In addition, Alanine-341 was attached at two other sites on the ligand together with other amino acids.
Dhurga et al. (2016) carried out molecular docking between the DNA gyrase, Topoisomerase IV and Penicillin Binding Protein (PBP3) proteins of the bacterium Pseudomonas aeruginosa and an ester derivative of Eugenol (an oil found in cloves). They visualized the docking results using BIOVIA Discovery Studio 4.1 Visualizer software and found that the Eugenol ester derivative had a significant binding affinity for DNA gyrase, Topoisomerase IV and PBP3. The residues Serine-111 and Asparagine-298 of DNA gyrase; Alanine-290, Valine-289, Leucine-286 and Alanine-278 of Topoisomerase IV; and Tyrosine-124, Glutamic acid-150 and Histidine-55 of PBP3 bound the Eugenol ester derivative. They recommended further investigation of the Eugenol ester derivative as a bactericidal agent.
Evolutionary analysis of host (Chicken BRD2) and pathogen (NDV Viral Matrix protein) gene
Multiple sequence alignment
The multiple sequence alignment (MSA) of chicken BRD2 nucleotide sequences indicated that the cds are mostly conserved among species; however, some divergent species such as gecko and alligator showed clear differences from the chicken BRD2 cds. The MSA of the viral matrix protein cds showed that most regions of the matrix protein are conserved among all the sequences, with mismatches at some sites.
Kaur et al. (2015) carried out a multiple sequence alignment of 60 nucleotide sequences of the Drosha gene and found that sequences from different types of organisms share conserved domain regions, with minor differences between some species.
Determination of best evolutionary model
The best model was selected based on the lowest BIC (Bayesian information criterion) score. For each model, the AICc (Akaike information criterion, corrected) value, the maximum likelihood value (lnL) and other parameters were also reported. All positions containing gaps and missing data were removed. The GTR+G+I model (General Time Reversible, with a discrete Gamma distribution and invariable sites) was selected as the best evolutionary model for the BRD2 nucleotide sequences, and the GTR+G model was selected as the best evolutionary model for the viral matrix protein cds.
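The selection rule described here can be illustrated with a minimal sketch. It uses the standard definitions BIC = -2 lnL + k ln(n) and AICc = AIC + 2k(k+1)/(n-k-1), where k is the number of free parameters and n the sample size (taken here, as an assumption, to be the number of alignment sites). The function names and the log-likelihood values below are hypothetical and are not taken from the study.

```python
import math


def bic(ln_l: float, k: int, n: int) -> float:
    """Bayesian information criterion: -2 lnL + k ln(n)."""
    return -2.0 * ln_l + k * math.log(n)


def aicc(ln_l: float, k: int, n: int) -> float:
    """Akaike information criterion with small-sample correction."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    aic = -2.0 * ln_l + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def best_model_by_bic(fits: dict, n: int) -> str:
    """Return the model name with the lowest BIC.

    `fits` maps a model name to a (lnL, number_of_parameters) tuple.
    """
    return min(fits, key=lambda name: bic(fits[name][0], fits[name][1], n))


# Hypothetical fits for illustration only (not the study's values).
example_fits = {
    "GTR+G+I": (-1000.0, 60),
    "GTR+G": (-1010.0, 59),
    "JC": (-1100.0, 50),
}

if __name__ == "__main__":
    for name, (ln_l, k) in example_fits.items():
        print(name, round(bic(ln_l, k, 2000), 2), round(aicc(ln_l, k, 2000), 2))
    print("lowest BIC:", best_model_by_bic(example_fits, 2000))
```

Because BIC penalizes each extra parameter by ln(n), a richer model such as GTR+G+I is preferred only when its gain in lnL outweighs that penalty.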
Calculation of pair-wise distance
The results showed that the sequences of chicken and other avian species had the smallest pairwise distances, whereas the sequences of outgroup species such as Gekko japonicus (Schlegel’s Japanese gecko lizard) (more than 0.20) and Alligator sinensis (Chinese alligator) (more than 0.17) had the greatest distances from the chicken BRD2 cds (Fig 6A). The pairwise distances for the matrix protein of the NDV R2B strain indicated a close relationship with many of the Indian strains, and NDV isolated from other avian species such as peacock, quail and goose was closely related to the matrix protein of chicken NDV. The R2B strain was also closely related to NDV strains isolated from Egypt, USA, China and Ukraine (less than 0.01), whereas it showed the greatest distance from NDV strains isolated from other geographical locations such as Ireland and Vietnam (0.056 to 0.114) (Fig 6B).
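To make the idea of a pairwise distance concrete, the sketch below computes the simplest such measure, the p-distance (the proportion of compared sites that differ between two aligned sequences). The text does not state which distance measure or substitution model was used, so this is an illustrative assumption; the function name and the toy sequences are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base
    ('-', 'N' or '?') are skipped (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N?" or b in "-N?":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


if __name__ == "__main__":
    # Toy aligned fragments, for illustration only.
    print(p_distance("ATGCCGTA-A", "ATGACGTAGA"))
```

On this scale, a value such as 0.20 means that roughly one in five compared sites differs, while a value below 0.01 means fewer than one in a hundred.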
Determining the evolutionary selection of sequences
Selection pressure analysis of the chicken BRD2 indicated that the test statistic values of dN-dS among the galline BRD2 cds were less than 0.05 and not significant (P>0.05). This indicates that the null hypothesis of “strict” neutral selection (dN=dS) is not rejected for these pairs. For the remaining combinations, the null hypothesis was rejected (P<0.05) and the alternative hypothesis of positive selection (dN>dS) was accepted (Fig 7A). In the case of the viral matrix protein, the null hypothesis of “strict neutrality” of selection (dN=dS) was rejected for almost all the sequence pairs, with exceptions (P>0.05) that included NDV-R2B vs NDV|Chicken|Egypt and NDV|Quail|Chennai|India; similarly, NDV|Quail|Chennai|India vs ND and NDV|Peacock and NDV|Chicken|Egypt; and NDV|B1|Takaaki vs certain strains of chicken and duck of China, etc. As the number of synonymous substitutions (dS) exceeded the number of non-synonymous substitutions (dN), the matrix coding sequences appear to have experienced purifying (negative) selection (Fig 7B).
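The logic of testing dN=dS for a sequence pair can be sketched as a Z-test on the difference dN-dS. This is a minimal illustration under stated assumptions: the variances are supplied by the caller (in practice they are usually estimated by bootstrap), the p-value uses a two-sided normal approximation, and the function names and numbers are hypothetical rather than taken from the study.

```python
import math


def z_test_dn_ds(dn: float, ds: float, var_dn: float, var_ds: float):
    """Return (Z, two-sided p-value) for the null hypothesis dN = dS."""
    total_var = var_dn + var_ds
    if total_var <= 0:
        raise ValueError("the summed variance must be positive")
    z = (dn - ds) / math.sqrt(total_var)
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


def interpret(dn: float, ds: float, var_dn: float, var_ds: float,
              alpha: float = 0.05) -> str:
    """Classify a sequence pair by the outcome of the Z-test."""
    z, p = z_test_dn_ds(dn, ds, var_dn, var_ds)
    if p >= alpha:
        return "neutrality not rejected"
    if z > 0:
        return "positive selection (dN > dS)"
    return "purifying selection (dN < dS)"


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(interpret(0.01, 0.10, 0.0001, 0.0004))
    print(interpret(0.05, 0.05, 0.0001, 0.0001))
```

A significant negative Z (dS larger than dN) corresponds to the purifying selection reported for the matrix coding sequences, whereas a non-significant Z leaves strict neutrality unrejected.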
Jimenez et al. (2009) carried out a molecular evolutionary analysis of Dormancy-associated MADS-box (DAM) genes from peach. MADS-box gene sequences of Arabidopsis were retrieved from The Arabidopsis Information Resource (TAIR), the peach genes were cloned by the research team in their laboratory and the poplar MADS-box genes were downloaded from the Populus trichocarpa genome dataset v1.1. Using MEGA4 software, they found purifying selection in the MADS-box genes of all three species and no evidence of positive selection in the MADS-box genes of Arabidopsis, peach or poplar. They concluded that purifying selection has a strong effect on the rates of molecular evolution.
Construction of phylogenetic tree
The phylogenetic analysis showed that the chicken BRD2 protein shared 99% similarity with transcript variants of Red Jungle fowl BRD2 protein, whereas it was distant from outgroup species such as Chinese alligator and Japanese lizard (Fig 8A). The NDV R2B strain was closely related to NDV isolated from India and Egypt, whereas it was distantly related to NDV isolated from other countries such as Ireland, China, Vietnam and Ukraine. NDV isolated from other avian species such as peacock, toco toucan and quail was closely related to the R2B strain, while NDV isolated from pigeon, duck and goose was distantly related to it (Fig 8B).
Singh et al. (2015) studied the molecular phylogeny of the bubaline Dicer enzyme. They analysed 115 amino acid sequences of Dicer enzymes from different species using the maximum likelihood method with 500 bootstrap replicates. Their phylogenetic tree showed that the Dicer1 transcript variants of some species fell within the same branch. They found that the bubaline Dicer1 sequence was closely related to the yak and cattle Dicer1 sequences, with a bootstrap value of 92.
IFEL analysis
The codon-based selection pressure analysis was done on Datamonkey Server (https://www.datamonkey.org). This analysis was performed for chicken BRD2 and viral matrix protein by using different statistical methods, namely, Internal Branch Fixed Effects Likelihood (IFEL), Random Effects Likelihood (REL) and Evolutionary Fingerprinting.
The IFEL analysis for chicken BRD2 showed 22 positively selected sites and 129 negatively selected sites at a p-value of less than 0.05 (Fig 12), whereas the IFEL analysis of the viral matrix protein at a p-value of less than 0.05 showed only one positively selected codon site and 26 negatively selected sites. Only 52 codons showed synonymous substitutions (Fig 9A).
Swiderska et al. (2018) worked on the chicken TLR4 and TLR7 genes. They performed selection pressure analysis using the online Datamonkey server and, using IFEL at a p-value of less than 0.05, identified five sites in TLR4 and two sites in TLR7 that had undergone positive selection.
REL analysis
REL analysis of the BRD2 nucleotide sequences showed 50 positively selected sites and 371 negatively selected sites (Fig 9B). The REL analysis of the viral matrix protein showed no positively selected site; all sites were negatively selected. Because all sites were under negative selection, this suggests that the virus may eliminate deleterious changes in this gene to ensure its survival.
Xia et al. (2020) selected six genes from feline coronavirus for selection pressure analysis: the non-structural protein genes (nsp12-nsp14) and the Spike (S), Nucleocapsid (N) and 7b genes. They performed selection pressure analysis using FEL, REL and MEME, with a Bayes factor greater than 50 for the REL method. They found four positively selected sites in the nsp12-nsp14 genes and 12, 4 and 4 positively selected sites in the S, N and 7b genes, respectively. They identified 106, 168, 25 and 17 negatively selected sites for the nsp12-14, S, N and 7b genes, respectively. The study concluded that most of the sequences may be conservatively maintained for the survival of the virus.
Evolutionary fingerprinting
The evolutionary fingerprinting results for chicken BRD2 showed that the best-fitting model had five rate classes, with an AICc value of 84194.61 and 37 parameters considered by the server. The best-fitting model for the NDV matrix protein had two rate classes, with an AIC value of 32597.83 and 28 parameters.
Murray et al. (2013) worked on viral suppressors of RNA interference (VSRs), which inhibit RNA interference (RNAi) and thereby clear the route for entry of the virus into the host. They selected single-stranded RNA viruses of plants for their study. Their evolutionary fingerprinting analysis showed no relationship between the VSRs of plant viruses. The coat proteins and polymerases of plant viruses also did not form any clusters.
GA branch analysis
The GA-branch analysis for the BRD2 protein showed that, out of 9 rate classes, rate class 8 had the best dN/dS score, with a value of 87227.4 AICc points, and 6 branches showed the highest probability of dN>dS (Fig 10A). For the viral matrix protein, the best c-AIC score was 32508.7, and 3 branches showed the highest probability of dN>dS (Fig 10B).
Sangula et al. (2010) selected eleven isolates of foot-and-mouth disease (FMD) SAT1 virus from the Embakasi FMD laboratory in Nairobi (Kenya). In addition, 42 virus sequences were retrieved from NCBI GenBank, comprising 17 sequences from East Africa and 25 from other countries of the African continent. To examine the selective environments on phylogenetic branches, they applied the GA-branch method to estimate dN/dS values. They found that FMD is more prevalent in Kenya and Tanzania owing to transboundary wildlife movements. Uganda also has higher rates of FMD than other countries, but a lower prevalence than Kenya and Tanzania. Because these three countries share borders, the researchers emphasized the need for a coordinated approach to controlling this transboundary animal disease.
Branch site REL analysis
For the BRD2 protein, 19 of 26 sequences were under episodic diversifying selection at a p-value of less than 0.05 (Fig 11A). For the matrix protein gene, the analysis used 16 nucleotide sequences, of which 2 had undergone episodic diversifying selection at a p-value of less than 0.05 (Fig 11B).