Abstract

Genetic variation (mutation, crossing over, and recombination) acts as a source of gradual alteration in phenotype along a geographic transect where the environment changes. Posttranslational modifications (PTMs) have been predicted successfully both within and between species of living organisms, and they contribute to protein diversity. Environmental stresses change nucleotides and thereby alter protein structures, and these alterations have been examined with bioinformatics tools. The goal of the current study was to investigate the genetic diversity and posttranslational modifications of protease serine endopeptidase in various strains of Sordaria fimicola. The genomic DNA of S. fimicola was used to amplify the protease serine endopeptidase (SP2) gene; the product sizes were 700 and 1400 base pairs. Neurospora crassa was taken as the reference strain for multiple sequence alignment of the nucleotide sequences. Eight polymorphic sites were observed in the six strains of S. fimicola with respect to N. crassa. Bioinformatics tools, i.e., NetPhos 3.1, NetNES 1.1, YinOYang 1.2, and Mod Pred, were used to predict PTMs, namely, phosphorylation, acetylation, nuclear export signals, O-glycosylation, and methylation. The study found 35 phosphorylation sites on serine residues of protease SP2 in the south-facing slope (SFS) and north-facing slope (NFS) strains of S. fimicola and in N. crassa. The current study helped to characterize the genes involved in protease production in the experimental fungi. It examined the genetic diversity of six strains of S. fimicola that arose under stressful environments; such variation is a strong driver of evolution. In this manuscript, we predicted, for the first time, the posttranslational modifications of protease serine endopeptidase in S. fimicola obtained from different sites, to see the effect of environmental stress on nucleotides, amino acids, and proteases and to study PTMs using various bioinformatics tools. This research confirmed the genetic diversity and PTMs in six strains of S. fimicola, and the designed primers also provided strong evidence for the presence of protease serine endopeptidase in each strain of S. fimicola.

1. Introduction

After translation, proteins in higher organisms undergo chemical changes that modify their functions. PTMs modulate protein function by attaching functional groups such as lipids, acetate, phosphate, and carbohydrates. The modified sites, at which functional groups are attached and removed, can be characterized with bioinformatics and proteomics [1]. PTMs such as acetylation, phosphorylation, methylation, and glycosylation are well documented in higher multicellular eukaryotes, because current analytical tools detect these modifications successfully. Posttranslational modifications influence how proteins interact with DNA, RNA, other proteins, lipids, and cofactors [2]. Glycosylation, phosphorylation, acetylation, carboxylation, S-nitrosylation, and methylation are the most commonly studied modifications [3]. PTMs alter the structure and dynamics of a protein, which ultimately changes its function [4, 5].

Modified proteins act as regulatory factors in several cellular processes, such as gene expression, protein regulation, cell differentiation, and protein degradation. The genus Sordaria belongs to the class Ascomycetes and the family Sordariaceae and is closely related to Neurospora sp. [6]. Sordaria sp. produces many industrially important enzymes, such as proteases, cellulases, catalases, laccases, and xylanases. Proteases have a significant role in protein turnover in plants throughout their life [7-9]. Protease serine endopeptidases are vital hydrolytic enzymes that use a catalytic serine residue to cleave peptide bonds. They occur in all eukaryotes and prokaryotes [10]. They perform a large number of cellular tasks, ranging from housekeeping functions such as signal peptide cleavage [11] to immune response, protein maturation, reproduction, and apoptosis [12].

Six strains of S. fimicola were obtained from the two slopes of Evolution Canyon (EC). Because of its contrasting environmental conditions, this canyon provides a suitable platform to study genetic variation in living organisms. Environmental stress, spontaneous mutation, recombination, and gene conversion are strong drivers of genetic variation [13]. Strains of S. fimicola from the south-facing slope (SFS) express a higher rate of genetic diversity than strains from the north-facing slope (NFS) [4, 14]. The SFS has very dry and severe conditions, whereas the NFS has gentle environmental conditions. Environmental stress is thus a solid source of genetic variation [15]. The changes in DNA nucleotide sequence caused by genetic variation are carried over into proteins and have a strong effect on posttranslational modifications. Omics research has revealed a deep understanding of the relationship between genetic diversity, environmental factors, and metabolite concentrations [16]. As far as we know, no study on PTMs of protease serine endopeptidase in the experimental fungus has been conducted so far. Therefore, the current study focused on modifications of protease serine endopeptidase, such as glycosylation, acetylation, phosphorylation, and methylation, in the experimental and reference fungi.

2. Methodology

2.1. The Subculturing of Experimental Fungal Strains

The NFS and SFS strains of S. fimicola were provided by the Molecular Genetics Research Laboratory, Department of Botany, University of the Punjab (PU), Lahore. The experiments were performed at the Molecular Genetics Laboratory, Faculty of Veterinary and Animal Sciences, Lasbela University of Agriculture, Water and Marine Sciences, Uthal, Balochistan, and the Molecular Genetics Research Laboratory, Department of Botany, University of the Punjab, Lahore, Pakistan. The strains were subcultured from fungal stock (preserved at -20°C) that was initially obtained from “Evolution Canyon” by Prof. Nevo. Strains of S. fimicola had been collected from three stations on the north-facing slope (NFS) and named N5, N6, and N7, and from three stations on the south-facing slope (SFS) and named S1, S2, and S3. The strains were subcultured on Potato Dextrose Agar (PDA) under sterile conditions to avoid contamination and kept in an incubator at 20°C for 10-12 days to obtain perithecia of S. fimicola for DNA extraction.

2.2. Genomic DNA Extraction

Genomic DNA was extracted from the S. fimicola strains following the protocol of [17]. The isolated DNA was run on a 0.8% agarose gel stained with ethidium bromide alongside a 1-kilobase DNA ladder, and the bands were examined under a gel documentation system (U:Genius3, Syngene).

Primers for the amplification of the protease serine endopeptidase gene were designed with the NCBI Primer Blast and Primer 3 Plus tools. In the current study, two pairs of primers were used (Table 1). Primer stocks (100 μM) were prepared, and a 10 μM working stock of each primer was prepared and stored at -20°C. The targeted protease serine endopeptidase genes were amplified by PCR. The reaction volume was made up to 30 μl and contained 15 μl PCR master mix, 2.5 μl fungal genomic DNA, 1.50 μl each of forward and reverse primer, 1.50 μl of 25 mM MgCl2, and 9 μl of double-distilled H2O. Amplification consisted of initial denaturation at 95°C for 10 min, followed by 30 cycles of denaturation at 95°C for 25 sec, annealing at 56°C for 45 sec, and extension at 72°C for 1 min, with a final extension step at 72°C for 8 min. The PCR products were run on a 0.8% agarose gel to confirm amplification.

The eluted bands of the PCR products were then sequenced. The online server EMBOSS Transeq (https://www.ebi.ac.uk/Tools/st/emboss_transeq/) was used to obtain the protein sequences of the genes. The nucleotide sequence of the protease serine peptidase gene of the reference organism was obtained from NCBI (http://blast.ncbi.nlm.nih.gov/Blast). Clustal Omega (http://ebi.ac.uk/Tools/msa/clustalo/) was used to analyze the sequence data and check the nucleotide and amino acid variations in the gene and protein sequences of the different strains of the experimental and reference fungi. The designed primers confirmed the presence of dipeptidyl-aminopeptidase protease in S. fimicola.
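The primer and MgCl2 quantities given for the PCR in this section can be related to final concentrations through the dilution relation C1·V1 = C2·V2. The following is a minimal sketch only: the function name `final_concentration` is hypothetical, and the calculation assumes that the 10 μM working stock (not the 100 μM stock) was pipetted into the stated 30 μl reaction. Any MgCl2 already present in the master mix is not accounted for.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Concentration after dilution, from C1 * V1 = C2 * V2.

    The result is in the same unit as stock_conc.
    """
    if total_volume_ul <= 0:
        raise ValueError("total volume must be positive")
    return stock_conc * stock_volume_ul / total_volume_ul


REACTION_VOLUME_UL = 30.0  # stated reaction volume

# Assumption: the 10 uM working stock of each primer was used in the reaction.
primer_um = final_concentration(10.0, 1.5, REACTION_VOLUME_UL)

# MgCl2 contributed by the 25 mM solution only (master mix content unknown).
mgcl2_mm = final_concentration(25.0, 1.5, REACTION_VOLUME_UL)

# Dilution factor from the 100 uM stock to the 10 uM working stock.
dilution_factor = 100.0 / 10.0

print(f"primer: {primer_um:.2f} uM each")
print(f"added MgCl2: {mgcl2_mm:.2f} mM")
print(f"stock-to-working dilution: 1:{dilution_factor:.0f}")
```

Under these assumptions, each primer is present at 0.5 μM and the added MgCl2 contributes 1.25 mM to the reaction.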

2.3. PTMs: Prediction Tools

Different bioinformatics tools were used for PTM prediction. Phosphorylation was predicted with NetPhos 3.1, glycosylation with YinOYang 1.2, methylation with Mod Pred, and nuclear export signals (NES) with the NetNES 1.1 server. The online tool EMBOSS Transeq was used to obtain the amino acid sequences of the amplified SP2 genes, whereas the amino acid sequence of the reference strain was retrieved from UniProt.

2.4. Pymol: Prediction Tool

Pymol was used to generate and visualize 3D models of protease SP2 with a high degree of confidence.

3. Results

3.1. Multiple Sequence Alignment

The extracted DNA of the S. fimicola strains was used to amplify the protease serine endopeptidase gene. Clustal Omega was used to study polymorphism by aligning the nucleotide sequences of the different strains of the experimental fungus with that of the reference fungus. Eight polymorphic sites were observed in the SP2 gene of the S. fimicola strains compared to the reference fungus (Figure 1). At the 1st polymorphic site, position 137 (alignment block 121-180) of the reference fungus, thymine is replaced with adenine in the experimental fungus (NFS, SFS); the CTC codon is changed into CAC, which changes the amino acid leucine (Leu) into histidine (His). At the 2nd polymorphic site, position 261 (241-300) of the reference fungus, guanine is replaced with cytosine in the experimental fungus (NFS, SFS); the TGT codon is changed into TCT, which changes cysteine (Cys) into serine (Ser). At the 3rd polymorphic site, position 757 (721-780), the thymine of the reference fungus and the NFS strains is replaced with adenine in the SFS strains; the CTC codon is changed into CAC, which changes leucine (Leu) into histidine (His). At the 4th polymorphic site, position 977 (961-1020), the cytosine of the reference fungus and the NFS strains is replaced with guanine in the SFS strains; the ACT codon is changed into AGT, which changes threonine (Thr) into serine (Ser). At the 5th polymorphic site, position 1470 (1441-1500) of the reference fungus, thymine is replaced with adenine in the SFS and NFS strains of S. fimicola; the CTG codon is changed into CAG, which changes leucine (Leu) into glutamine (Gln). At the 6th polymorphic site, position 1700 (1681-1740) of the reference fungus, thymine is replaced with adenine in the SFS and NFS strains; the TTT codon is changed into TAT, which changes phenylalanine (Phe) into tyrosine (Tyr). At the 7th polymorphic site, position 1938 (1921-1980) of the reference fungus, thymine is replaced with adenine in the NFS and SFS strains; the CTA codon is changed into CAA, which changes leucine (Leu) into glutamine (Gln). At the 8th polymorphic site, within alignment block 2341-2400 of the reference fungus, cytosine is replaced with guanine in the SFS and NFS strains; the TCA codon is changed into TGA, which changes serine (Ser) into a stop codon (Figure 1).
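The amino acid consequence of each single-base substitution can be checked against the standard genetic code. The sketch below is illustrative only: the helper `classify_substitution` and the `SITES` list are hypothetical names, and each variant codon is derived by applying the stated base change (for example, thymine to adenine) to the reference codon given in the text.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon, alt_codon):
    """Return (reference amino acid, variant amino acid, substitution type)."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        kind = "synonymous"
    elif alt_aa == "*":
        kind = "nonsense"
    else:
        kind = "nonsynonymous"
    return ref_aa, alt_aa, kind


# (site label, reference codon, variant codon after the stated base change)
SITES = [
    ("137", "CTC", "CAC"),
    ("261", "TGT", "TCT"),
    ("757", "CTC", "CAC"),
    ("977", "ACT", "AGT"),
    ("1470", "CTG", "CAG"),
    ("1700", "TTT", "TAT"),
    ("1938", "CTA", "CAA"),
    ("2341-2400", "TCA", "TGA"),
]

for label, ref, alt in SITES:
    ref_aa, alt_aa, kind = classify_substitution(ref, alt)
    print(f"{label}: {ref}->{alt}  {ref_aa}->{alt_aa}  ({kind})")
```

With the standard code, seven of the eight substitutions are nonsynonymous and the last one (TCA to TGA) introduces a stop codon.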

The sequences were submitted to the BLAST tool at NCBI to check for homology between the sequences of the experimental fungus and the reference fungus. Amino acid sequence alignment of the SFS and NFS strains of the experimental fungus with the reference fungus showed a total of 8 polymorphic sites (Figure 2).

3.2. O-Glycosylation and YinOYang: Prediction Sites

The O-glycosylation (YinOYang) sites at threonine and serine residues of the serine endopeptidase proteases of the six experimental strains and the reference fungus were predicted with YinOYang 1.2 and are presented in Table 2. In N. crassa, the YinOYang 1.2 server predicted glycosylation on serine residues at seven positions, i.e., 15, 44, 277, 283, 360, 649, and 743, and on threonine residues at five positions, i.e., 443, 461, 630, 636, and 643. In S. fimicola, glycosylation was predicted on serine residues at eight positions, i.e., 15, 44, 277, 283, 360, 505, 649, and 743, and on threonine residues at six positions, i.e., 317, 443, 461, 630, 636, and 643 (Table 2). A graphical representation of glycosylation in SP2 of the six experimental strains and the reference fungus is shown in Figure 3. Acetylation sites were observed on internal lysine (K) at positions 50, 65, 72, 76, 83, 91, 119, 146, 149, 254, 314, 404, 411, 414, 427, 437, 545, and 649 in the strains of the experimental and reference fungi (Table 2).
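The glycosylation positions listed above can be compared between the two species with simple set operations. This is a minimal sketch that uses only the positions reported in this section; the variable names and the `site_labels` helper are hypothetical.

```python
# Predicted O-glycosylation positions, keyed by residue (S = serine, T = threonine).
N_CRASSA = {
    "S": {15, 44, 277, 283, 360, 649, 743},
    "T": {443, 461, 630, 636, 643},
}
S_FIMICOLA = {
    "S": {15, 44, 277, 283, 360, 505, 649, 743},
    "T": {317, 443, 461, 630, 636, 643},
}


def site_labels(sites):
    """Flatten {residue: positions} into labels such as 'S-505'."""
    return {f"{res}-{pos}" for res, positions in sites.items() for pos in positions}


reference = site_labels(N_CRASSA)
experimental = site_labels(S_FIMICOLA)

shared = reference & experimental             # sites predicted in both species
only_experimental = experimental - reference  # sites absent from the reference
only_reference = reference - experimental     # sites absent from S. fimicola

print(f"N. crassa: {len(reference)} sites, S. fimicola: {len(experimental)} sites")
print("shared:", len(shared))
print("only in S. fimicola:", sorted(only_experimental))
print("only in N. crassa:", sorted(only_reference))
```

On these lists, the comparison gives 12 sites for N. crassa and 14 for S. fimicola, with S-505 and T-317 found only in S. fimicola.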

3.3. Phosphorylation and NetPhos 3.1’s Prediction Sites

The results of the NetPhos 3.1 server revealed that phosphorylation of protease serine endopeptidase in the reference fungus and the six experimental fungal strains occurred on tyrosine (Y), threonine (T), and serine (S) residues. The predicted sites on serine were S-6, S-39, S-44, S-48, S-82, S-88, S-100, S-108, S-128, S-141, S-195, S-240, S-266, S-308, S-332, S-335, S-342, S-344, S-352, S-425, S-431, S-491, S-606, S-742, and S-848; on threonine, 4, 144, 159, 195, 264, 345, 398, 426, 488, 548, 610, 624, 665, 682, 710, 772, 799, 815, 834, and 858; and on tyrosine, 36, 220, 336, 415, 642, 715, 747, and 829, in the experimental fungus and N. crassa (Table 3).

3.4. Methylation and Mod Pred’s Prediction Sites

In the reference fungus, methylation was predicted at ten lysine residues (83, 389, 414, 427, 680, 734, 796, 798, 869, and 871), whereas four arginine residues (3, 517, 809, and 872) and 12 lysine residues (83, 314, 389, 414, 427, 680, 734, 796, 798, 836, 869, and 871) were predicted to be methylated in the six strains of the experimental fungus (Table 4).

3.5. Nuclear Export Signals (NES): Prediction Sites

In the current study, three NES positions of protease SP2 (675-L, 844-I, and 394-I) were predicted in the reference fungus. In the south-facing slope strains of the experimental fungus, positions 675-L, 688-L, and 844-I were observed, whereas positions 675-L, 844-I, 11-L, and 10-L were observed in the north-facing slope strains. The two sites 675-L and 844-I are highly conserved in the experimental fungus, as shown in Table 3 and Figures 4 and 5.
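The conserved NES positions can be recovered as the intersection of the per-group predictions. The sketch below is illustrative and rests on one reading of the lists above, namely SFS: 675-L, 688-L, 844-I and NFS: 675-L, 844-I, 11-L, 10-L; the dictionary and helper names are hypothetical.

```python
# Predicted NES positions per group (position-residue labels as used in the text).
NES_SITES = {
    "N. crassa (reference)": {"675-L", "844-I", "394-I"},
    "S. fimicola SFS": {"675-L", "688-L", "844-I"},
    "S. fimicola NFS": {"675-L", "844-I", "11-L", "10-L"},
}


def by_position(label):
    """Sort key: numeric position before the hyphen."""
    return int(label.split("-")[0])


# Sites predicted in every group.
conserved = set.intersection(*NES_SITES.values())
print("conserved:", sorted(conserved, key=by_position))

# Sites predicted in a group but not shared by all groups.
group_specific = {
    group: sorted(sites - conserved, key=by_position)
    for group, sites in NES_SITES.items()
}
for group, sites in group_specific.items():
    print(f"{group}: {sites}")
```

Under this reading, 675-L and 844-I are the only positions shared by the reference fungus and both slope groups.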

3.6. Pymol for 3D Protein Structure

The SP2 proteins of the S2 and N6 strains of the experimental fungus and of the reference fungus were visualized with Pymol (Figure 6). Coils are represented by sticks and β-sheets by arrows, whereas α-helices are shown as sticks and cartoons. The three dimensions of the reference fungus model are 73.428, 97.011, and 75.603 Å (Figure 6(a)). Those of the S2 strain of S. fimicola are 70.248, 90.689, and 75.603 Å (Figure 6(b)), and those of the N6 strain of S. fimicola are 70.248, 97.011, and 75.603 Å (Figure 6(c)).

4. Discussion

To our knowledge, the protease serine endopeptidase gene is examined here for the first time in the experimental fungus. In the present research, gene variations and PTMs of protease serine peptidase were observed in different strains of S. fimicola. The SFS and NFS strains of S. fimicola showed 8 polymorphic sites (Figure 1). Protease serine peptidase regions have 9 sites of nonsynonymous substitutions (Figure 2). The south-facing slope (SFS) has xeric and stressful environmental conditions, so its strains exhibited more genetic variation than those of the north-facing slope. Closely related organisms have a greater tendency toward polymorphism at specific locations of their DNA [18]. Aspergillus niger and Penicillium sp. from Evolution Canyon 1 also showed pronounced polymorphism, which suggests that genetic diversity (polymorphism) is caused by a stressful environment. The genetic diversity of S. fimicola from Evolution Canyon was reported by [13, 19], who stated that genetic diversity is also caused by gene conversion and mutation. They concluded that the SFS strains of S. fimicola had a high rate of mutation due to crossing over and spontaneous mutations. Climatic conditions impose natural selection on living organisms, and as a result, parental and genetic variants are produced. These variations drive evolution and underlie the evolutionary potential of living organisms. The frequency clock and mating type a-1 proteins of different strains of S. fimicola obtained from “Evolution Canyon,” Israel, were described by [4, 20], who found that genetic variations have strong effects on the PTMs of proteins. Hence, the polymorphism at positions of protease serine endopeptidase in the six S. fimicola strains is due to the strong impact of genetic variation (Tables 1-3). Genetic variations change DNA nucleotide sequences, and the changed sequences encode proteins with unique PTM sites. After translation, these PTMs are responsible for protein diversity. Posttranslational modifications alter protein conformation. Furthermore, it is necessary to study PTMs and understand how they contribute to the function of proteins [21]. PTMs such as glycosylation, acetylation, phosphorylation, methylation, and carboxylation are commonly studied in living organisms [22]. In the current study, glycosylation, phosphorylation, acetylation, and methylation of SP2 of Sordaria sp. were mainly examined.

Protein phosphorylation regulates biological processes. Phosphorylation is characterized by the reversible enzymatic addition of a phosphate group to the side chain of serine (Ser), threonine (Thr), or tyrosine (Tyr), which alters the structure and stability of proteins [23]. Phosphorylation plays a significant role in conformational changes, such as the activation and deactivation of proteins, and in binding specificity [24, 25].

Our work observed 27 sites of phosphorylation on serine residues of SP2 in the experimental fungus and N. crassa (Table 2). In most eukaryotes, phosphorylation events occur at specific serine residues and affect signaling [6]. The current research revealed that the serine phosphorylation sites S-39, S-141, and S-352 were conserved in the experimental and reference fungi. Previously, three phosphorylation sites (S-32, S-160, and S-365) were predicted in matrix metalloproteinase 2 (MMP2) and confirmed by mass spectrometry [26, 27]. In multicellular organisms, Ser-248 has a conserved role in the control of cell development [28]. Threonine phosphorylation plays a vital role in the regulation of cell functions. The current study found 20 phosphorylation sites on threonine residues of protease serine endopeptidase in the six strains of S. fimicola and N. crassa (Table 2). Among these 20 sites, T-195, T-548, and T-624 were predicted, whereas phosphorylation of T-567 promotes MMP14 (matrix metalloproteinase 14), which induces cellular invasion and migration [29-33]. Moreover, phosphorylation at tyrosine is reported to trigger cell signaling [34]. The current study found seven sites of phosphorylation on tyrosine residues of protease serine endopeptidase in the six strains of S. fimicola and the reference fungus (Table 1). Among these, Y-220, Y-336, and Y-715 were investigated. Protein kinases play a vital role in phosphorylation. Their main task is to transfer a phosphate group from ATP to the substrate and thereby phosphorylate it. CDC2, CK2, UNSP, PKC, PKA, and DNA-PK are involved in the phosphorylation of SP2 in the experimental and reference fungi. Phosphorylation of cytochrome C oxidase (COX1) is carried out by PKA, PKC, CDC2, and UNSP [4]. Phosphorylation sites on serine residues of CpA1 (protease carboxypeptidase A1) numbering 19, 15, and 17 in S1 and N7 (strains of S. fimicola) and N. crassa, respectively, were reported by [35]. PKA, PKC, CKII, and UNSP are actively involved in the phosphorylation of the RKM4 protein of Sordaria species. Phosphorylation within a cell is carried out by protein kinase CK2, with ATP or GTP as the major source of phosphate. In the eukaryotic deubiquitinating enzyme DUBA, CK2 kinases are involved in the phosphorylation of Ser 77. However, CK2 phosphorylation does not cause any structural change. The phosphate group reduces the substrate-protease interaction but does not involve the active site [36, 37].

Acetylation of lysine is a dynamic and reversible PTM of proteins in eukaryotes [38-42]. About 50% of the proteins in Sordaria sp. are acetylated, and the acetylation sites observed on internal lysine (K) in the current study include positions 50, 65, 72, 76, 83, 91, 119, 146, 149, 254, 427, 545, and 649 in the experimental fungus and the reference strain (Table 1). There are 16 acetylation sites in CpA1 of S. fimicola and 14 in that of N. crassa [35]. The COX1 (cytochrome C oxidase-1) protein of Sordaria sp. has an acetylated lysine at position 437 [43], and in the current study, a lysine site at 437 was also found in SP2 of the experimental fungus. In addition, a lysine site is present at position 119 in CyC-1 (cytochrome C-1) of different strains of Sordaria sp. [44], and in the current study, this site is also present in SP2 of the experimental fungus. These predictions indicate that the sites are conserved and that they participate in the regulation of SP2, COX1, and CyC-1.

In glycosylation, enzymes link saccharides to proteins and lipids. N- and O-linked glycosylation are very common [45-47]. Protein glycosylation plays a vital role in cell adhesion, protein folding, and trafficking inside and outside the cell. Diabetes, carcinoma, and brain diseases are linked with changes in the glycosylation pattern [48, 49]. In the current study, 14 sites of O-glycosylation of SP2 were predicted in the experimental organism and 12 sites in the reference fungus. T-461 is a threonine glycosylation site in SP2, whereas T-460 is the corresponding site in MMP14 (matrix metalloproteinase 14), as observed by [29, 50, 51]. Interestingly, two novel sites (S-505 and T-317) were found in S. fimicola but are absent in the reference fungus (Table 1). All glycosylation sites are shown in Figure 3.

Methylation of proteins is an important reversible PTM. N-methylation of lysine and arginine residues has been examined because of its significance [52, 53]. Ascomycetes have many methylation sites. Methyltransferases are closely involved in biological pathways [54, 55]. The current study predicted four methylated arginine residues (R-3, R-517, R-809, and R-872), which are highly conserved in the experimental and reference fungi. The methylated lysine residues K-83, K-389, K-414, K-427, K-680, K-734, K-796, K-798, K-869, and K-871 are also highly conserved in the experimental and reference fungi. The methylated sites K-314 and K-836 occur only in the experimental fungus.

There are many nuclear export pathways, and the one based on leucine-rich signals is the best characterized [56]. Nuclear export signals were first reported in the human immunodeficiency virus type 1 (HIV-1) Rev protein [57] and were also investigated in the cAMP-dependent protein kinase inhibitor. The subcellular localization of molecules is regulated by nuclear export signals [58]. Nuclear export signals help proteins and their interacting factors leave the nucleus [59]. In this study, an NES was predicted at position 675-L of SP2 in S. fimicola and at position 675-L of the same protein in the reference fungus, which indicates regulation of this protein through NES (Figures 4 and 5). The prediction of an NES at position 328 of the frequency clock protein in S. fimicola and at position 323 in N. crassa is likewise an indication of protein regulation through nuclear export signals [4].

5. Conclusion

This study suggests that genetic diversity and posttranslational modifications help the six strains of S. fimicola tolerate environmental stress, a property that may be useful in industries such as brewing, detergents, leather, dairy, and food processing. The findings support the commercial production of fungal proteases.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no known conflicts of interest.

Authors’ Contributions

The authors have made the following declarations about their contributions: Uzma Naureen and Muhammad Saleem contributed to the study’s conception, design, material preparation, and data collection for the experiments. Ahmed Nawaz Khosa and Muhammad Azfar Mukhtar analyzed the data. The first draft of the manuscript was written by Uzma Naureen. Fazul Nabi and Nisar Ahmed commented on the revised paper.

Acknowledgments

The authors are thankful to the Higher Education Commission, Pakistan for providing funding for this study through the PSDP project titled “Establishment of National Center for Livestock Breeding, Genetics and Genomics at Lasbela University of Agriculture, Water and Marine Sciences, Uthal.”