Abstract

The crystal structure of the porcine reproductive and respiratory syndrome virus (PRRSV) leader protease nsp1α is available, which makes it possible to analyze the roles of pig tRNA abundance and of codon usage in the nsp1α gene in the formation of this protease. Using the structural information from the Protein Data Bank (PDB: 3IFU) and the nsp1α genes of 191 PRRSV strains, we analyzed the effects of pig tRNA abundance, synonymous codon usage, and context-dependent codon bias (CDCB) of the nsp1α gene on shaping the specific folding units (α-helix, β-strand, and coil) of nsp1α. By mapping the overall tRNA abundance along the nsp1α gene, we found no link between the fluctuation of the overall tRNA abundance and the specific folding units, although the tRNA abundance implies a low ribosomal translation speed along the gene. A strong correlation was found between the usage of some synonymous codons and the specific folding units, and CDCB was observed in the specific folding units of nsp1α. These findings provide insight into the roles of synonymous codon usage and CDCB in the formation of the PRRSV nsp1α structure.

1. Introduction

Porcine reproductive and respiratory syndrome virus (PRRSV) is an economically important pathogen of swine. PRRSV belongs to the order Nidovirales, family Arteriviridae, genus Arterivirus [1]. The PRRSV genome contains at least 9 open reading frames, including ORF1a, encoding a papain-like cysteine protease; ORF1b, encoding the RNA-dependent RNA polymerase; ORFs 2–6, encoding envelope proteins; and ORF7, encoding the nucleocapsid protein [2, 3]. PRRSV strains can be divided into two distinct serotypes, namely, the North American isolate (US) and the European isolate (EU) [4–8].

The replicative enzymes of PRRSV are encoded in ORF1a and ORF1b, which are located in the 5′-proximal three quarters of the viral genome. The two polyproteins encoded by ORF1a and ORF1b are cleaved extensively by nonstructural protein 4 (nsp4), which derives from ORF1a, yielding a series of nonstructural proteins [9]. In particular, the nsp1 and nsp2 proteases are the first to release themselves from the ORF1a polyprotein, and nsp1 can be further processed into two multifunctional proteases, namely, nsp1α and nsp1β [10, 11]. The arterivirus nsp1 region contains a tandem of papain-like autoprotease domains (PCPα and PCPβ), and both domains were found to be active in reticulocyte lysates and in E. coli systems [12, 13]. This feature might indicate that the activity of PCPα and PCPβ does not depend on the type of expression system but on their own correct folding. nsp1α plays an important role in regulating the accumulation of both genome- and subgenome-length minus-strand RNA, thereby fine-tuning the relative abundance of each viral mRNA in infected cells [10, 14, 15]. The correct secondary structure of nsp1α is required for the biological functions of the protease. The crystal structure of nsp1α shows that this nonstructural protein has three domains, namely, the N-terminal zinc finger (ZF) domain, the papain-like cysteine protease domain, and the carboxyl-terminal extension [16]. Recently, a role of nsp1α in impairing the host immune response has been reported [17]; however, little information is available to date about the relationship between synonymous codon usage and the secondary structure of PRRSV nsp1α.

Synonymous codon usage and the translational speed of a gene play important roles in many biological functions, such as translation efficiency, genetic diversity, amino acid conservation, transfer RNA abundance, coevolution of a virus and its hosts, and context-dependent codon bias (CDCB) [18–22]. The nucleotide composition of a coding sequence (CDS) is nonrandom, and this nonrandomness is influenced by preferences in the selection among synonymous codons encoding the same amino acid (termed synonymous codon usage bias, SCUB). The link between SCUB and the specific folding units of a protein gives new insight into the correct formation of protein secondary structure [23–26]. mRNA sequences generally have an additional potential to carry structural information in the form of SCUB, which can involve a single codon or a nucleotide context of the target coding sequence [27, 28]. Within SCUB, the neighboring nucleotides flanking a codon influence which codon of a synonymous family is used, a phenomenon termed context-dependent codon bias (CDCB) [20, 29–31]. It has been reported that the most important nucleotide determining CDCB is the first nucleotide after a codon [32]. Although several lines of evidence indicate a link between SCUB and the formation of the specific folding units of viral proteins, little has been reported to date about the role of CDCB in the formation of specific folding units. In this study, we used the structural information about PRRSV nsp1α and several simple formulas to analyze the relationship between the CDCB of the PRRSV nsp1α gene and the folding units of the protease.

2. Materials and Methods

2.1. Information of PRRSV Gene and Structure of the nsp1α

The 191 coding sequences of PRRSV containing the nsp1α gene were downloaded from the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/Genbank/), and the accession numbers of the sequences are listed in Table S1, available online at http://dx.doi.org/10.1155/2014/765320. To investigate the SCUB of nsp1α, the nsp1α genes were extracted from these 191 coding sequences by multiple sequence alignment performed with the Clustal W (1.7) program [33]. The information about the secondary structure of PRRSV nsp1α was obtained from the Protein Data Bank (PDB: 3IFU).

2.2. Analysis of the Overall tRNA Abundance of Each Codon Position along the nsp1α Gene

To identify the translational selection caused by the various tRNA copy numbers (reflecting tRNA abundance) of the pig (http://gtrnadb.ucsc.edu/) at each codon position in the PRRSV nsp1α gene, we devised an index representing the overall tRNA abundance for a particular codon position in a target gene. The index is built from three quantities: the tRNA copy number of the synonymous codon used for the corresponding amino acid at that position, the tRNA copy number of the optimal synonymous codon for the same amino acid, and the number of genes analyzed. The index ranges from 0 to 1.0. An index value below 0.3 for a codon position represents low tRNA abundance, and an index value above 0.7 represents high tRNA abundance.
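The formula for this index is not legible in this text. As a minimal sketch, assuming the index is the mean, over the aligned genes, of the ratio between the tRNA copy number of the codon observed at a position and the largest copy number within its synonymous family, the calculation could look as follows. The copy numbers in `TRNA_COPIES` are placeholders for illustration only, not pig tRNA gene counts, and all names are hypothetical.

```python
# Placeholder tRNA copy numbers per codon, grouped by amino acid.
# Illustrative values only (NOT real pig data from the tRNA database).
TRNA_COPIES = {
    "Lys": {"AAA": 4, "AAG": 8},
    "Ile": {"AUU": 6, "AUC": 3, "AUA": 1},
}
CODON_TO_AA = {codon: aa for aa, family in TRNA_COPIES.items() for codon in family}


def position_index(codons_at_position):
    """Assumed index: mean over genes of (copy number of the codon used) /
    (largest copy number among its synonymous codons)."""
    ratios = []
    for codon in codons_at_position:
        family = TRNA_COPIES[CODON_TO_AA[codon]]
        ratios.append(family[codon] / max(family.values()))
    return sum(ratios) / len(ratios)


def index_profile(aligned_genes):
    """aligned_genes: list of equal-length codon lists, one list per gene."""
    return [position_index(column) for column in zip(*aligned_genes)]


def classify(value):
    """Apply the thresholds stated in the text (below 0.3 low, above 0.7 high)."""
    if value < 0.3:
        return "low"
    if value > 0.7:
        return "high"
    return "intermediate"


genes = [["AAG", "AUA"], ["AAA", "AUA"]]
profile = index_profile(genes)
labels = [classify(v) for v in profile]
```

Under this assumed form the value is bounded by 0 and 1.0, in agreement with the stated range, because each ratio is at most 1.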

2.3. Estimation of the Relationship between the Synonymous Codon Usage Bias and the Secondary Structure of the nsp1α

Based on the alignment between the amino acid sequence of PRRSV nsp1α (PDB: 3IFU) and the 191 nsp1α genes involved in this study, we located the different folding units in the target protein. We devised an index of codon preference for each folding unit, based on previous research that analyzed the relationship between codon usage bias and the structure of the target protein [25]. The index is computed from the following quantities: the count of a specific synonymous codon for the corresponding amino acid in a specific folding unit (α-helix, β-strand, or coil) of the protein; the count of that amino acid in the same folding unit; the total number of amino acids in the folding unit; and the total number of codons in the target genes. When the index is greater than zero, the corresponding synonymous codon has a potential to be selected in the specific folding unit. When the index is less than zero, the synonymous codon has no tendency to be chosen in that folding unit. Furthermore, we defined that when the index is greater than 0.1, the synonymous codon has a strong tendency to occur in the specific folding unit; conversely, when the index is less than −0.1, the synonymous codon has a strong tendency to avoid the specific folding unit.
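The counts named in this definition can be tallied directly from a coding sequence and a per-residue folding-unit assignment. The sketch below only gathers those counts and applies the stated thresholds to an index value supplied by the caller; it does not reproduce the index formula, which is not legible in this text. The unit labels (`H`, `E`, `C`) and the function names are illustrative assumptions.

```python
from collections import Counter


def tally_counts(codons, amino_acids, units):
    """Tally the counts used by the index.

    codons, amino_acids and units are parallel per-residue lists;
    units holds a folding-unit label for every residue.
    """
    codon_in_unit = Counter()  # (unit, codon) -> count
    aa_in_unit = Counter()     # (unit, amino acid) -> count
    unit_size = Counter()      # unit -> number of residues
    for codon, aa, unit in zip(codons, amino_acids, units):
        codon_in_unit[(unit, codon)] += 1
        aa_in_unit[(unit, aa)] += 1
        unit_size[unit] += 1
    return codon_in_unit, aa_in_unit, unit_size, len(codons)


def interpret(value):
    """Map an index value to the categories described in the text."""
    if value > 0.1:
        return "strongly preferred"
    if value > 0:
        return "preferred"
    if value < -0.1:
        return "strongly avoided"
    if value < 0:
        return "not preferred"
    return "neutral"


codon_in_unit, aa_in_unit, unit_size, total = tally_counts(
    ["GUA", "GUG", "GUA", "AAG"],
    ["Val", "Val", "Val", "Lys"],
    ["H", "E", "H", "H"],
)
```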

2.4. Calculation of the Relative Abundance of Codons with Context

To identify the synonymous codons that play an important role in the formation of the specific folding units, codons with a significant tendency to occur in a specific folding unit of PRRSV nsp1α were analyzed using the formula for the relative abundance of codons with context. Berg and Silva [32] defined the context as the first nucleotide after the target codon (written XYZ~N). Following this notation, we also defined a context as the last nucleotide before the target codon (written N~XYZ). Based on the formula previously reported [20, 34], we calculated the relative abundance (R value) for both contexts: R(XYZ~N) = F(XYZ~N)/[F(XYZ)F(N)] and R(N~XYZ) = F(N~XYZ)/[F(N)F(XYZ)], where F(XYZ) is the frequency of the codon XYZ and F(N) is the frequency of nucleotide N in the corresponding context position. F(XYZ~N) and F(N~XYZ) are the frequencies of a codon with its context. X, Y, Z, and N are nucleotides, and the codon is composed of XYZ. Here and elsewhere the tilde character separates codons or oligonucleotides from their mononucleotide context.
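As a minimal sketch of the relative abundance of a codon with the nucleotide that follows it, assuming the ratio form (frequency of the codon with its context) divided by (frequency of the codon times frequency of the context nucleotide), with all frequencies taken over codon positions that have a following nucleotide. The function name and the toy sequence are illustrative and not taken from the study, which pooled 191 genes and also split positions by folding unit.

```python
from collections import Counter


def codon_context_abundance(cds):
    """Relative abundance of each codon with the nucleotide that follows it.

    cds: in-frame coding sequence (length a multiple of 3).
    Returns a dict keyed as 'XYZ~N'.
    """
    # Pair every codon with the first nucleotide after it; the last codon
    # has no following nucleotide and is skipped.
    pairs = [(cds[i:i + 3], cds[i + 3]) for i in range(0, len(cds) - 3, 3)]
    n = len(pairs)
    pair_counts = Counter(pairs)
    codon_counts = Counter(codon for codon, _ in pairs)
    context_counts = Counter(ctx for _, ctx in pairs)
    return {
        f"{codon}~{ctx}": (count / n)
        / ((codon_counts[codon] / n) * (context_counts[ctx] / n))
        for (codon, ctx), count in pair_counts.items()
    }


abundance = codon_context_abundance("GUAAAGGUAAAGGUACUU")
```

A value above 1 means the codon and its context co-occur more often than expected if they were independent. The context before the codon can be handled the same way by pairing each codon with `cds[i - 1]`.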

2.5. Calculation of the Relative Abundance of Mononucleotide and Dinucleotides in the nsp1α Gene

To investigate whether the contexts before and after the codon are shaped by randomness or not, we calculated the frequency F of each mononucleotide and dinucleotide, where X, Y, Z, and N are each one of the four nucleotides A, U, G, and C. Then we calculated the relative abundances (R value) of the mononucleotides and dinucleotides with a single-nucleotide context: R(Z~N) = F(Z~N)/[F(Z)F(N)], for mononucleotide Z with context N; R(YZ~N) = F(YZ~N)/[F(YZ)F(N)], for dinucleotide YZ with context N.
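The same ratio can be sketched for mononucleotides and dinucleotides. The version below is an assumption-laden illustration: it slides over every position of a sequence rather than only codon-aligned positions (the text does not say which was used), and the function name and toy sequence are hypothetical.

```python
from collections import Counter


def oligo_context_abundance(seq, k):
    """Relative abundance of each k-mer with the nucleotide that follows it.

    k=1 gives mononucleotides with context, k=2 dinucleotides with context.
    Returns a dict keyed as 'W~N'.
    """
    pairs = [(seq[i:i + k], seq[i + k]) for i in range(len(seq) - k)]
    n = len(pairs)
    pair_counts = Counter(pairs)
    oligo_counts = Counter(oligo for oligo, _ in pairs)
    context_counts = Counter(ctx for _, ctx in pairs)
    return {
        f"{oligo}~{ctx}": (count / n)
        / ((oligo_counts[oligo] / n) * (context_counts[ctx] / n))
        for (oligo, ctx), count in pair_counts.items()
    }


mono = oligo_context_abundance("UAAUAC", 1)
di = oligo_context_abundance("UAAUAC", 2)
```

Comparing these background values with the value for the full codon with context indicates whether a codon preference could be explained by nucleotide composition alone.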

2.6. Statistical Analysis

One-way analysis of variance (one-way ANOVA) is a technique used to compare the means of two or more samples. In this study, the ANOVA test was applied to identify whether the overall tRNA abundance of the positions in one specific folding unit differs from that in the other folding units. In addition, the ANOVA test was used to estimate whether the frequencies of codon usage in one specific folding unit differ from those in the other folding units. The statistical analysis was carried out with SPSS 11.5.
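The study ran this test in SPSS. Purely to illustrate what a one-way ANOVA computes, the F statistic for values grouped by folding unit can be sketched in plain Python. The group values below are made up, and the P value is omitted because it requires the F distribution.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up values standing in for helix, strand and coil positions.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

A large F relative to the F distribution with (df_between, df_within) degrees of freedom indicates that at least one group mean differs from the others.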

3. Results

3.1. The Overall tRNA Abundance for Each Codon Position of the nsp1α Gene

Based on the index values, the tRNA abundance for each codon position along the PRRSV nsp1α gene was mapped. The mapping suggests that the translation speed for the synthesis of nsp1α is not uniform in the pig (Figure 1). Codon positions with index values well below 0.30 tend to cluster in the nsp1α gene, including positions 4–6, 8–10, 22–25, 27–30, 32–34, 38–40, 42–47, 50–53, 55–58, 68–70, 77–79, 83–85, 110–112, 119–122, 126–128, 139–141, 157–160, and 171–173. In contrast, codon positions with index values well above 0.70 rarely cluster in the nsp1α gene. Because most codon positions in the target gene have index values well below 0.70, these positions might reduce the translation rate of this protein when the nsp1α gene is scanned by ribosomes in pig cells. There are no significant differences in the overall tRNA abundance among the codon positions in the regions of the three specific folding units of nsp1α. This result suggests that the fluctuation of the overall tRNA abundance at each codon position along the nsp1α gene might not regulate the formation of the specific folding units but might decrease the scanning speed of ribosomes in pig cells.
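Clusters of this kind can be extracted from a per-position profile as runs of consecutive positions below the low-abundance threshold. The sketch below assumes a minimum run length of three positions, which matches the shortest clusters listed but is not stated as a criterion in the text; the profile values are made up.

```python
def low_abundance_clusters(profile, threshold=0.3, min_len=3):
    """Return 1-based inclusive (start, end) runs where the index is below threshold."""
    clusters = []
    start = None
    for pos, value in enumerate(profile, start=1):
        if value < threshold:
            if start is None:
                start = pos  # a new run begins here
        else:
            if start is not None and pos - start >= min_len:
                clusters.append((start, pos - 1))
            start = None
    # Close a run that extends to the end of the profile.
    if start is not None and len(profile) - start + 1 >= min_len:
        clusters.append((start, len(profile)))
    return clusters


# Made-up profile for illustration.
toy_profile = [0.5, 0.2, 0.1, 0.25, 0.8, 0.1, 0.2, 0.9, 0.1, 0.1, 0.1]
clusters = low_abundance_clusters(toy_profile)
```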

3.2. The Relationship between the Synonymous Codon Usage Bias and the Structure of the nsp1α

Based on the index values for the synonymous codons involved in the formation of the specific folding units in nsp1α, we found a link between SCUB and the specific folding units. In detail, the synonymous codons with a strong propensity toward the α-helix unit include AUC for Ile, GUA for Val, AGC for Ser, AAG for Lys, and AUG for Met (Table 1). The synonymous codons with a strong propensity toward the β-strand unit are UUA for Leu, AUA for Ile, GUG for Val, UCA and AGU for Ser, ACA for Thr, UAC for Tyr, CGC for Arg, and the two synonymous codons for His (Table 1). Interestingly, no codon has a strong tendency to occur in the coil of nsp1α (Table 1). All codons that have a strong tendency (value > 1.0) to occur in nsp1α strongly tend to occur only in the α-helix or the β-strand of this protein.

3.3. The Relative Abundance of the Codon with Context in the nsp1α Gene

For the codons that have a strong tendency to occur in a specific folding unit of nsp1α, the relative abundances of the codons with context were calculated from the 191 nsp1α genes (Table 2). The data show that the occurrence of a codon with the context after it or before it is not random, and many codons with context have a strong tendency to occur in the specific folding units of nsp1α. Based on the SCUB data for the specific folding units (Table 1), the corresponding codons with context were found to tend to occur in the same folding unit of nsp1α. In detail, the codons with context GUA~A, AGC~A, AAG~C, AGA~C, A~AUA, U~AGC, U~AAG, C~AAG, G~AGA, and U~AUG show an obvious trend to occur in the helix unit of nsp1α. Some codons with context have a strong tendency to occur in the β-strand of nsp1α, including UUA~A, AUA~G, GUG~U, UCA~C, AGU~G, ACA~C, UAC~U, UAC~C, CAU~G, CAC~G, CGC~U, G~UUA, U~AUA, U~GUG, G~GUG, U~UCA, C~AGU, C~ACA, C~UAC, G~UAC, U~CAU, U~CAC, and U~CGC.

To identify the roles of nucleotide composition (dinucleotides with context and mononucleotides with context) in shaping the codons with context, the relative abundances of the codons with context that have a strong tendency to occur in the helix or the β-strand were compared with the relative abundances of the corresponding dinucleotides and mononucleotides with context (Tables 3 and 4). A codon with context was retained when its value was higher than that of the corresponding dinucleotide and mononucleotide with the same context. For example, GUA tends to occur in the helix of the nsp1α gene, and GUA~A has a tendency to occur in the helix unit, because the value of GUA~A for the helix unit (1.4751) is higher than the values of GUA~A for the β-strand and the coil (Table 2) and higher than the values for UA~A and A~A (Tables 3 and 4). UUA tends to occur in the β-strand of this gene, and UUA~A has a tendency to occur in the strand unit, because the value of UUA~A for the strand (4.9268) is higher than the values of UUA~A for the helix and the coil and higher than the values for UA~A and A~A (Tables 3 and 4). AGC tends to occur in the helix of this gene, and U~AGC has a tendency to occur in the helix rather than in the strand or the coil, because the value of U~AGC for the helix (5.9004) is higher than the values of U~AGC for the strand and the coil and higher than the values for U~AG and U~A (Tables 3 and 4). Likewise, G~UUA has a tendency to occur in the strand unit, because the value of G~UUA for the strand unit (3.7019) is higher than the values of G~UUA for the helix and the coil and higher than the values for G~UU and G~U (Tables 3 and 4).

Based on the criterion described above, GUA~A, AGC~A, AAG~C, AGA~C, A~GUA, U~AGC, U~AAG, C~AAG, G~AGA, and U~AUG have a strong trend to occur in the helix of the PRRSV nsp1α gene, and UUA~A, AUA~G, GUG~U, UCA~C, ACA~C, UAC~U, UAC~C, CAU~G, CGC~U, G~UUA, U~GUG, U~UCA, C~AGU, C~ACA, G~UAC, U~CAU, and U~CGC have a strong tendency to occur in the β-strand of the nsp1α gene.

4. Discussion

In this study, we mapped the fluctuation of the overall tRNA abundance for each codon position along the PRRSV nsp1α gene and estimated the correlation between synonymous codon usage and the different folding units of nsp1α. Mapping the fluctuation of the overall tRNA abundance for each codon position along the target gene likely reflects, to some degree, the speed at which ribosomes scan the gene as determined by pig tRNA abundance, since tRNA abundance plays an important role in ribosome scanning along the target coding sequence [35, 36]. A previous report showed that the α-helix is preferentially coded by translationally fast mRNA regions, while slow segments often encode β-strands and coil regions [37]. In this study, the absence of a link between the fluctuation of the overall tRNA abundance at the codon positions along the nsp1α gene and the specific folding units might suggest that translational fine-tuning is not achieved by varying the translation speed at each codon position along the nsp1α gene. Fine-tuning of in vivo protein folding exists in genes, and it is largely believed to occur cotranslationally [38]. However, PRRSV nsp1α derives from posttranslational processing of pp1a [10, 39]. The posttranslational cleavage of nsp1α from the PRRSV pp1a polyprotein might therefore be independent of the fluctuation of tRNA abundance at each codon position along the nsp1α gene. For ribosomes scanning the nsp1α gene, there is no significant link between the fluctuation of the overall tRNA abundance and the specific folding units, and the translation elongation rate of this gene is not high. These results suggest that low tRNA abundance controls ribosomal traffic along the translated message to achieve effective synthesis of PRRSV pp1a. Slow translational elongation at the beginning of translation directs the target gene to generate the corresponding protein effectively [40].

Turning to the role of synonymous codon usage in the formation of the specific folding units of nsp1α, there is a significant relationship between synonymous codon usage bias and the specific folding units in the target protein. Synonymous codons help messenger RNA carry information about the specific folding units, and a single codon or a contiguous nucleotide region plays a role in shaping these units [24, 25, 41, 42]. In PRRSV nsp1α, no synonymous codon tends to occur in the coil unit. However, many synonymous codons tend to occur in the α-helix and β-strand regions of this gene, and no synonymous codon has a strong tendency to be selected by both the α-helix and the β-strand simultaneously. These results indicate that SCUB might play a role in shaping this protease with the properties required for the life cycle of PRRSV. SCUB in the formation of the specific folding units of PRRSV nsp1α is influenced by natural selection. As an example of the role of natural selection, gene expressivity is an important factor in shaping SCUB in both prokaryotic and eukaryotic organisms [18, 22, 43, 44]. Although the link between SCUB and the formation of specific folding units has been reported [25, 35, 37, 38, 45], the role of CDCB in the formation of specific folding units is not clear. In this study, we found that CDCB plays a role in the formation of the specific folding units in PRRSV nsp1α. Synonymous codon usage bias and CDCB, which play important roles in achieving accuracy and efficiency in protein synthesis, are particular manifestations of coding sequence nonrandomness [23, 46, 47]. Spatial interaction of ribosomal proteins with codon-anticodon RNA pairs inside the A and P sites of the ribosome could favor particular codons with context [20, 48].

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Authors’ Contribution

Yao-zhong Ding, Ya-nan You, and Dong-jie Sun contributed to the original draft of the paper and approved the final version. Hao-tai Chen, Yong-lu Wang, Hui-yun Chang, Li Pan, Yu-zhen Fang, Zhong-wang Zhang, Peng Zhou, Jian-liang Lv, Xinsheng Liu, Jun-jun Shao, Fu-rong Zhao, and Tong Lin downloaded the sequences and analyzed the data. Laszlo Stipkovits, Zygmunt Pejsak, Yong-guang Zhang, and Jie Zhang provided suggestive information to the Discussion and revised the paper. All authors read and approved the final version.

Acknowledgments

This work was supported in part by grants from the International Science and Technology Cooperation Program of China (no. 2012DFG31890) and the Gansu Provincial Funds for Distinguished Young Scientists (1111RJDA005). This study was also supported by the National Natural Science Foundation of China (no. 31172335 and no. 31072143).

Supplementary Materials

Table S1: The accession numbers of the 191 strains of PRRSV.