Abstract

In this theoretical study, the role of the side chain moiety of the C-terminal residue in influencing the structural and molecular properties of dipeptides is analyzed by considering a series of seven dipeptides. The C-terminal positions of the dipeptides are varied with seven different amino acid residues, namely Val, Leu, Asp, Ser, Gln, His, and Pyl, while their N-terminal positions are kept constant with Sec residues. Full geometry optimization and vibrational frequency calculations are carried out at the B3LYP/6-311++G(d,p) level in the gas and aqueous phases. The stereo-electronic effects of the side chain moieties of the C-terminal residues are found to influence the values of the dihedral angles, the planarity of the peptide planes, and the geometry around the C7 α-carbon atoms of the dipeptides. The gas phase intramolecular H-bond combinations of the dipeptides are similar to those in the aqueous phase. The theoretical vibrational spectra of the dipeptides reflect the nature of the intramolecular H-bonds existing in the dipeptide structures. Solvation effects of the aqueous environment are evident in the geometrical parameters of the amide planes, the dipole moments, the HOMO–LUMO energy gaps, and the thermodynamic stability of the dipeptides.

1. Introduction

Generally, the twenty canonical amino acid residues are sufficient to build the proteins and enzymes needed to support most cellular functions in all three domains of life on earth. Selenocysteine (Sec) and pyrrolysine (Pyl) are two rarely occurring, genetically encoded amino acids whose presence in the active sites of some enzymes enables them to sustain life in some extraordinarily unique ways [1–7]. The chemical structures of Sec and Pyl are portrayed in Figure 1. Although the distribution of Pyl is limited to methanogenic archaea and certain bacteria, Sec is commonly found in eubacteria, archaea, and eukarya [5]. Sec and Pyl are cotranslationally inserted into proteins in response to the UGA (opal) and UAG codons, respectively, both of which are canonical stop codons that generally terminate protein biosynthesis.

The dynamic properties and functional specificity of proteins and polypeptides are known to depend primarily on the linear sequence of amino acid residues [8]. Therefore, over the last few decades, small amino acid sequences such as di- or tripeptides have been used extensively as model systems in experimental and theoretical studies of protein structure and the energetics of protein folding. Moreover, since theoretical or computational approaches are difficult to apply directly to large macromolecular systems such as polypeptides or proteins, model systems offer an easier and computationally less expensive alternative for understanding protein structure. Some of the most important structural features of the protein backbone have been reproduced theoretically by considering dipeptides as model systems. It is now realized that computational techniques are indispensable in elucidating atomic-level structural information about biologically active molecules [9–11].

Gas-phase structural studies on dipeptides provide the opportunity to understand their intrinsic properties free from solvent or crystal-phase effects. In the gas phase, the structural features of the dipeptides depend mainly on a delicate balance between stabilizing intramolecular H-bonds, destabilizing lone-pair repulsions, and steric effects. Gas phase structural studies on dipeptides arising from the genetically encoded amino acids [12–15] have pointed out that in most of the dipeptides the amide planes are not completely planar; this has been explained in terms of the cumulative effect of the steric hindrance of the –R group and H-bonding. However, it is of fundamental importance to determine the conformational details of a biologically important molecule in aqueous solution, since most biochemical processes occur in an aqueous environment.

Solvent effects play a crucial role in shaping the secondary structure of proteins by modifying the interplay between the intra- and intermolecular H-bond interactions existing in the primary amino acid sequences. The effects of solvation on the conformations and energies of dipeptides are well documented in the literature [16–23]. In these studies, the energetics and structural features of the dipeptides in the gas and solvent phases are analyzed to understand the effect of the surrounding environment on the stabilities and conformational preferences of the dipeptides. In a strongly polar solvent like water, the interactions among the nearest-neighbor residues of the dipeptides are dramatically modified compared to those in the gas phase, which consequently affects the Ramachandran dihedrals (φ, ψ) [24, 25] and confers markedly different conformations on the dipeptides in the aqueous phase. Investigation of the numerous parameters involved in dipeptide structure prediction is now regarded as a pivotal part of computational studies of protein structure and the energetics of protein folding [26].

This study examines the effects of solvation and of the identity of the varying C-terminal residue on the energetics, the structural features of the peptide planes, the geometry about the α-carbon atoms, the dihedral angle values, the theoretically predicted vibrational spectra, the dipole moments, the rotational constants, the HOMO and LUMO energies and their energy gaps, and the types of intramolecular H-bonding interactions that may play crucial roles in determining the structure and stability of a series of dipeptides. The C-terminal positions of these dipeptides are varied with seven different amino acid residues, namely valine (Val), leucine (Leu), aspartic acid (Asp), serine (Ser), glutamine (Gln), histidine (His), and pyrrolysine (Pyl), while the N-terminal position is kept constant with a selenocysteine (Sec) residue. All these amino acid residues are taken as neutral (nonionic) species. The standard three-letter abbreviations are used to represent the amino acids, and a particular dipeptide is named by listing the N-terminal residue first. Thus, the Sec-Val dipeptide corresponds to a structure in which Sec is in the N-terminal position and Val is in the C-terminal position. Figure 2 schematically represents the chemical structures of all seven dipeptides. C4–N6 is the peptide bond of a given dipeptide, while C3 and C7 are the α-carbon atoms of its N- and C-terminal residues, respectively. To facilitate a clear representation of the intramolecular hydrogen bond interactions present in the dipeptides, some of the hydrogen atoms are labeled Ha or Hb. This theoretical structural study of the seven dipeptides in the gas phase and in a simulated aqueous phase is expected to clarify the influence of local interactions on the structural aspects of dipeptides at an atomic level, which in turn may help in understanding the dynamics and functional specificity of proteins and polypeptides and in advancing this rapidly expanding area of research.

2. Computational Methodology

The molecular geometries of all the selected dipeptides were subjected to full geometry optimization and vibrational frequency calculations at the B3LYP/6-311++G(d,p) level of theory [27, 28] using the Gaussian 03 package [29]. The efficacy of B3LYP/6-311++G(d,p) in studying the conformational behavior and various other properties of amino acids has been explained in the literature [30]. The computations were conducted in the gas phase as well as in the aqueous phase using a polarizable continuum model (PCM) [31]. The accuracy of the self-consistent reaction field (SCRF) model in predicting the structure and energetics of dipeptides has already been justified in the literature [32]. The absence of imaginary frequencies in the vibrational frequency calculations confirmed that the optimized geometries are true minima. Zero point energy (ZPE) corrections were applied to the total energies of all the dipeptides using a correction factor of 0.9877 [33]. The theoretically predicted vibrational frequencies were scaled with appropriate scaling factors (1.01 below 1800 cm−1 and 0.9679 above 1800 cm−1) [33]. Diffuse functions are important for describing the relative diffuseness of lone-pair electrons when the molecule under investigation contains lone pairs [34], while polarization functions are useful in studying conformational aspects where stereoelectronic effects play an important role [35].
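The scaling scheme described above can be illustrated with a short Python sketch. It is not taken from the original work: the function names are hypothetical, the treatment of a mode lying exactly at 1800 cm−1 is an assumption (the text does not specify it), and the ZPE and total energy are assumed to be given in the same unit.

```python
def scale_frequency(freq_cm1, low_factor=1.01, high_factor=0.9679, cutoff_cm1=1800.0):
    """Scale one harmonic frequency (cm^-1) with the two-range scheme.

    Assumption: a mode exactly at the cutoff is treated as belonging to
    the upper range; the text only says "below" and "above" 1800 cm^-1.
    """
    factor = low_factor if freq_cm1 < cutoff_cm1 else high_factor
    return freq_cm1 * factor


def zpe_corrected_energy(total_energy, zpe, zpe_factor=0.9877):
    """Add the scaled zero point energy to the total electronic energy.

    Both inputs must be in the same unit (for example hartree).
    """
    return total_energy + zpe_factor * zpe


if __name__ == "__main__":
    # Illustrative input values only, not data from the paper.
    for freq in (1650.0, 3400.0):
        print(f"{freq:8.1f} cm^-1 scaled to {scale_frequency(freq):8.1f} cm^-1")
```

Keeping the factors as keyword arguments makes it straightforward to apply a single factor over the whole spectrum instead, as is done for the plotted IR spectra.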

3. Results and Discussion

Table 1 presents the gas and aqueous phase data on the total energies, rotational constants, and dipole moments of the dipeptides calculated at the B3LYP/6-311++G(d,p) level of theory. Tables 2 and 3 list the values of the bond lengths and bond angles of the amide planes of the dipeptides, respectively (the gas phase values are given in brackets). Table 4 collects the values of the four dihedral angles considered to monitor the planarity of the peptide planes of the dipeptides (viz. C3–C4–N6–C7, O11–C4–N6–H10, C3–C4–N6–H10, and O11–C4–N6–C7), the values of the C4–N6–C7–C9 dihedrals considered to specify the orientations of the side-chain moieties, and the values of the two well-known Ramachandran backbone dihedral angles ψ (N5–C3–C4–N6) and φ (C4–N6–C7–C8), which are useful in studying the effects of solvation on the dipeptide structures as well as in predicting the overall structure of proteins. Table 5 presents the gas and aqueous phase data on the geometrical parameters considered to examine the geometry around the α-carbon atoms. Table 6 lists some important intramolecular H-bonding interactions that play crucial roles in the energetics and in conferring the observed conformations on the dipeptides in both phases. Some of the characteristic frequency and intensity values (given in brackets) of the dipeptides calculated at the B3LYP/6-311++G(d,p) level of theory are given in Table 7. Table 8 presents the DFT results on the HOMO and LUMO energies and their energy gaps for the dipeptides in both phases. Figure 3 shows the optimized structures of the dipeptides in the aqueous phase, while Figures 4, 5, 6, and 7 show the theoretical IR spectra of the seven dipeptides in both the gas and aqueous phases (scaled with a correction factor of 0.9679).

3.1. Structure and Stability of the Dipeptides

All seven dipeptide geometries exhibit large total dipole moments (listed in Table 1), ranging from 2.910 to 9.218 D in the gas phase and from 5.423 to 12.542 D in the aqueous phase, which indicates that they are strongly polar and consequently have a greater affinity for polar solvents. Thus, the total energy data correctly predict that the dipeptide geometries are thermodynamically more stable in a strongly polar solvent such as water than in the gas phase, by an energy difference ranging from 12.07 to 19.60 kcal/mol. The accuracy of the DFT method in predicting the rotational constants of conformers of some aliphatic amino acids has been discussed in the literature [36, 37]. In the absence of any experimental data on rotational constants and dipole moments, these theoretically predicted values may assist experimentalists in determining the other conformers of the seven dipeptides studied here.
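As an illustration of how a stabilization energy in kcal/mol is obtained from two total energies, the following Python sketch may be helpful. It assumes that the total energies are available in hartree and uses the common conversion factor 1 hartree = 627.5095 kcal/mol; the function name and the numerical inputs are hypothetical and are not data from Table 1.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # widely used conversion factor


def solvation_stabilization(e_gas_hartree, e_aq_hartree):
    """Return how much lower the aqueous phase energy is, in kcal/mol.

    A positive result means the aqueous phase structure is more stable
    than the gas phase structure.
    """
    return (e_gas_hartree - e_aq_hartree) * HARTREE_TO_KCAL_PER_MOL


if __name__ == "__main__":
    # Placeholder energies chosen only to demonstrate the arithmetic.
    delta = solvation_stabilization(-1.000, -1.020)
    print(f"Stabilization in water: {delta:.2f} kcal/mol")
```

Whether ZPE-corrected or uncorrected total energies are supplied is left to the caller; the same choice should be used for both phases.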

The gas and aqueous phase values of the five bond lengths of the amide planes, that is, C3–C4, C4=O11, C4–N6, N6–H10, and N6–C7, listed in Table 2, show very little variation as the identity of the C-terminal residue changes. Maximum deviations of 0.006 Å in the gas phase and 0.004 Å in the aqueous phase from their respective average values indicate that the bond lengths are essentially fixed. However, owing to solvation effects, the aqueous phase bond lengths deviate from their respective gas phase values. For example, in the aqueous phase, the exposed C4=O11 and N6–H10 bonds are elongated by up to 0.009 and 0.002 Å, respectively, whereas the buried C4–N6 bonds are shortened by 0.007 to 0.011 Å for all the systems. Table 3 lists the values of the six bond angles of the amide planes, that is, C3–C4–O11, C3–C4–N6, O11–C4–N6, C4–N6–C7, C4–N6–H10, and H10–N6–C7; the data in both phases indicate very little change in the bond angles as the identity of the C-terminal residue changes. Maximum deviations of 1.3° in the gas phase and 0.7° in the aqueous phase indicate that the bond angles are also essentially fixed. The solvent effects on these bond angles become apparent when the aqueous phase data are compared with the corresponding gas phase values; a maximum deviation of 2.0° is observed for the C4–N6–C7 angle in Sec-Asp.

The predicted gas and aqueous phase values of the four dihedral angles of the dipeptides, namely C3–C4–N6–C7, O11–C4–N6–H10, C3–C4–N6–H10, and O11–C4–N6–C7, listed in Table 4, provide valuable information regarding the planarity of the peptide planes. If the amide plane is indeed planar, the values of C3–C4–N6–C7 and O11–C4–N6–H10 should be close to 180° and those of C3–C4–N6–H10 and O11–C4–N6–C7 should be close to 0°. The data presented in Table 4 show that in the aqueous phase the four dihedral angles deviate by at most 4.7° from the expected values, whereas in the gas phase the maximum deviation observed is 11.4°. Thus, these dihedral angles do not deviate dramatically from their expected values in either phase; however, the extent of the deviations suggests that the amide planes are not perfectly planar regardless of whether the systems are in the gas phase or in a strongly polar solvent like water. The conformations of the seven dipeptides predicted at the B3LYP/6-311++G(d,p) level are expected to be reliable, since it has been pointed out that full geometry optimizations of gaseous tryptophan conformers at the B3LYP/6-311G(d) and MP2/6-311++G(d,p) levels do not produce any noticeable structural differences; only the conformer energies change by small amounts [38]. Therefore, it is reasonable to assume that solvation effects cannot drastically improve the planarity of the amide planes, and that the extent of the deviations from planarity depends primarily on two factors: (a) steric interactions of the side chain moieties of the C-terminal residues (–SC groups) and (b) intramolecular H-bond formation by the H- and O-atoms of the amide planes with their adjacent moieties belonging to the C- and N-terminal residues. The intramolecular H-bond interactions that play crucial roles in deviating the amide planes from planarity and in imparting the observed conformations to the dipeptides in the gas and aqueous phases are listed in Table 6 and discussed in a subsequent section of this paper.
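The planarity test described above can be sketched in Python as follows. The snippet computes a signed dihedral angle from four Cartesian positions (such as those supplied as supplementary coordinates) and its deviation from the ideal value of 0° or 180°. It is a minimal illustration rather than the procedure used in the original calculations; the function names are hypothetical and the coordinates in the usage example are artificial.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle p0-p1-p2-p3 in degrees, within [-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer bond vectors onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def planarity_deviation(angle_deg, expected_deg):
    """Deviation (degrees) of a dihedral from its ideal value of 0 or 180."""
    return abs(abs(angle_deg) - expected_deg)


if __name__ == "__main__":
    # Artificial, perfectly planar trans arrangement (not a real geometry).
    trans = dihedral_deg((0.0, 1.0, 0.0), (0.0, 0.0, 0.0),
                         (1.0, 0.0, 0.0), (1.0, -1.0, 0.0))
    print(planarity_deviation(trans, 180.0))
```

For an amide plane, `planarity_deviation` would be evaluated with an expected value of 180° for C3–C4–N6–C7 and O11–C4–N6–H10, and 0° for C3–C4–N6–H10 and O11–C4–N6–C7.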

Table 4 also lists the –SC groups of the C-terminal residues of the dipeptides as well as the gas and aqueous phase values of the side-chain (C4–N6–C7–C9) dihedrals and the two Ramachandran backbone dihedrals. Previous gas phase studies [12, 13] have pointed out that the value of increases as the size of a given –SC group increases. However, a thorough analysis of the dipeptide structures studied here reveals that both the size and the type of functional groups present in a –SC group may influence the value and planarity of the amide plane of a given dipeptide. A large –SC group may compete for the physical space it needs to accommodate itself between the amide plane and the carboxylic group of the C-terminal residue of a given dipeptide and consequently influence the planarity of the amide plane. On the other hand, the –SC groups, depending on the type of functional groups present in them, may exert repulsive or attractive electrostatic forces on their neighboring atoms belonging to the peptide planes and the carboxylic group of the C-terminal residues of the dipeptides, which may also influence the values of as well as the planarity of the amide planes. The gas and aqueous phase values of and dihedrals of the dipeptides reveal that in the solvent phase the type of functional groups present in the –SC groups is more important than the size of the –SC groups in influencing the and values of the dipeptides. Polar solvents are known to exert a remarkable influence on the conformational properties of dipeptides by weakening the intraresidue hydrogen bonds and leading to the appearance of new energy minima [19–21, 23]. Thus, the differences between the gas and aqueous phase values of dihedrals in the Sec-Asp and Sec-Pyl systems, 32.8° and 35.5° respectively, can be justified on the basis of the type of functional groups present in the –SC groups of the Asp and Pyl residues. Similarly, the smaller values of Sec-Gln than those of Sec-Val (even though the –SC group of Gln is bigger than that of Val) and the larger values of Sec-Ser than those of the other six systems (in spite of Ser having the smallest –SC group in its C-terminal residue) can be explained by invoking the influence of the functional groups present in their respective –SC groups.

3.2. Geometry about the α-Carbon Atoms

Since protein structures usually contain thousands of amino acid residues, the geometries about the α-carbon atoms of the individual residues play an important role in deciding the overall structure of the proteins. The three bond angles considered to monitor the geometry around the C3 α-carbon atoms of the dipeptides are N5–C3–C2, N5–C3–C4, and C2–C3–C4, while N6–C7–C8, N6–C7–C9, and C9–C7–C8 are the corresponding angles for the C7 atoms. The α-carbon atoms of the amino acids are sp3 hybridized and therefore the ideal bond angle should be 109.5°; however, this ideal value is not expected owing to their stereogenic character. By monitoring the above-mentioned bond angles around each α-carbon atom of the dipeptides, one can get an idea of how a change in the identity of the C-terminal residue affects the geometries about these α-carbon atoms. This DFT study also provides the opportunity to probe the effects of solvation on the geometries of the α-carbon atoms. Table 5 lists the gas and aqueous phase data on the bond angles about the α-carbon atoms. Maximum deviations of 0.2° in the aqueous phase and 1.2° in the gas phase from their respective average values suggest that the geometries about the C3 atoms do not change much with the identity of the C-terminal residues. On the other hand, with maximum deviations of up to 3.5° in the aqueous phase and 3.6° in the gas phase from their respective average values, the bond angles around the C7 atoms change appreciably with the identity of the C-terminal residue of the dipeptides. These observations can be justified by invoking the two factors mentioned previously while discussing the planarity of the peptide planes, namely the size and the type of functional groups present in the –SC groups. The stereoelectronic effects of the varying –SC groups on the geometry of the C3 atoms are very small, as these groups reside four bonds away from these α-carbon atoms. In contrast, since the varying –SC groups are situated adjacent to the C7 atoms, the geometry around them is affected by the changing identity of the –SC groups. The solvation effects are also more prominent on the geometry of the C7 atoms (a maximum deviation of up to 2.4° is observed for the C9–C7–C8 angle in the Sec-Pyl system) than on that of the C3 atoms, where the maximum deviation predicted is 1.8° for the N5–C3–C4 angle of the Sec-Asp system.
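The two quantities used in this analysis, a bond angle about an α-carbon atom and the maximum deviation of a set of angles from their average, can be sketched in Python as below. The helper names are hypothetical, and the numbers in the usage example are placeholders rather than values from Table 5.

```python
import math


def bond_angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b as the central atom."""
    ba = [x - y for x, y in zip(a, b)]
    bc = [x - y for x, y in zip(c, b)]
    cos_t = sum(x * y for x, y in zip(ba, bc)) / (
        math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc)))
    # Guard against rounding slightly outside [-1, 1] before calling acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def max_deviation_from_mean(values):
    """Largest absolute difference between any value and the mean of all values."""
    mean = sum(values) / len(values)
    return max(abs(v - mean) for v in values)


if __name__ == "__main__":
    # Two bonds of an ideal tetrahedral centre give about 109.47 degrees.
    print(bond_angle_deg((1.0, 1.0, 1.0), (0.0, 0.0, 0.0), (1.0, -1.0, -1.0)))
    # Placeholder angles for one bond angle across several dipeptides.
    print(max_deviation_from_mean([110.0, 111.0, 112.0]))
```

Applying `max_deviation_from_mean` to the same angle across all seven dipeptides, separately for each phase, reproduces the kind of comparison made between the C3 and C7 centres.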

3.3. Intramolecular Hydrogen Bond Interactions

The different conformers of a dipeptide molecule are known to be stabilized by a delicate interplay of different types of intramolecular hydrogen bonds (H-bonds) [15]. The strength of an A–H⋯B H-bond, where A–H is the H-bond donor and B is the H-bond acceptor, depends on two factors: (a) how much shorter the H⋯B distance is than the sum of the van der Waals radii of the two atoms and (b) how close the A–H⋯B angle is to 180° [22, 23]. Table 6 lists two types of intramolecular H-bonds, namely N⋯H–N and O⋯H–C, whose interplay is crucial in producing the observed deviations of the peptide planes from planarity as well as in determining the energetics of the seven dipeptides. The gas phase intramolecular H-bond combinations of the dipeptides are similar to those in the aqueous phase. In the aqueous phase, the B⋯H distances of the N5⋯H10–N6 bonds are shortened by 0.019 to 0.077 Å, while those of the O11⋯H–C7 bonds are elongated by up to 0.204 Å. On the other hand, the gas and solvent phase data on the two H-bonds O12⋯H–C7 and O13⋯H–C7 clearly indicate the effects of the size and the type of functional groups present in the –SC groups on the conformation of the dipeptides as well as on the number and type of H-bond interactions existing in the dipeptide molecules. For example, the absence of the O13⋯H–C7 bond and the presence of the O12⋯H–C7 bond only in the Sec-Asp system can be explained on the basis of the identity of the –SC group of the Asp residue.
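The two geometric criteria stated above can be turned into a simple check, sketched here in Python. This is an illustration under explicit assumptions and not the procedure used to compile Table 6: the van der Waals radii are the commonly quoted Bondi values, the minimum angle of 110° is a hypothetical cutoff chosen only for demonstration, and the function names are invented for this example.

```python
import math

# Bondi van der Waals radii in angstrom (assumed here; not specified in the text).
BONDI_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52}


def hbond_geometry(donor, hydrogen, acceptor):
    """Return the H...B distance (angstrom) and the A-H...B angle (degrees)."""
    d_hb = math.dist(hydrogen, acceptor)
    d_ha = math.dist(hydrogen, donor)
    ha = [a - h for a, h in zip(donor, hydrogen)]
    hb = [b - h for b, h in zip(acceptor, hydrogen)]
    cos_t = sum(x * y for x, y in zip(ha, hb)) / (d_ha * d_hb)
    return d_hb, math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def is_hbond(acceptor_element, distance, angle_deg, min_angle_deg=110.0):
    """Apply both criteria: short H...B contact and a sufficiently open angle.

    min_angle_deg is a hypothetical cutoff; stronger H-bonds lie closer to 180.
    """
    limit = BONDI_RADII["H"] + BONDI_RADII[acceptor_element]
    return distance < limit and angle_deg >= min_angle_deg


if __name__ == "__main__":
    # Artificial linear arrangement: donor, hydrogen and acceptor on one axis.
    dist, angle = hbond_geometry((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0))
    print(dist, angle, is_hbond("O", dist, angle))
```

The check requires `math.dist`, which is available from Python 3.8 onward.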

3.4. Calculated Vibrational Spectra

Table 7 lists the characteristic frequency and intensity (given in brackets) values of only those vibrational modes that are sensitive to the structural changes caused by the varying C-terminal residues and by solvent effects. The theoretically predicted vibrational spectra of the seven dipeptides in both phases provide valuable information for understanding the existence and nature of the various types of intramolecular H-bonds in the dipeptides. It is evident from Table 7 that the vibrational frequencies invariably shift toward lower values where intramolecular H-bond interactions are present. The shortening of the N5⋯H10–N6 bonds in the aqueous phase structures is well reflected in the lowering of the ν(N6–H10) stretching frequencies by 17 to 36 cm−1 relative to those in the gas phase. Solvent effects also lower the frequencies of the ν(C4=O11) modes of the dipeptides by up to 52 cm−1 in the aqueous phase, which may be due to the elongation of these bonds in the solvent phase (the C4=O11 bonds are elongated by up to 0.009 Å in the aqueous phase). The variations observed in the ν(C7–H) stretching values can be attributed to the effects of the changing identity of the –SC groups of the C-terminal residues. The ν(C3–H) stretching values of the dipeptides remain relatively unchanged, since the geometry around the C3 α-carbon atoms does not change much with the identity of the –SC groups.

3.5. HOMO/LUMO Energies

Table 8 presents the DFT data on the highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO) energies and their energy gaps for the dipeptides in both phases. These data suggest that the HOMO–LUMO energy gaps of the dipeptides are larger in the presence of a solvent with a high dielectric constant than in the gas phase. This point has been well discussed in the literature [39]. In the aqueous phase, the predicted HOMO–LUMO energy gaps of the dipeptides (ranging from 5.4719 to 5.7745 eV) are always higher than those in the gas phase (ranging from 5.1617 to 5.2532 eV), which indicates that the dipeptides are more stable in the aqueous phase. Information obtained from HOMO–LUMO energy gaps has been used in elucidating the chemical activity [40–42] as well as various other interesting physicochemical properties [43] of biologically important molecular systems.
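For readers who wish to reproduce such gaps from orbital energies, a minimal Python sketch follows. It assumes that the HOMO and LUMO energies are available in hartree and converts the difference to eV with 1 hartree = 27.211386 eV; the function name and the input values are hypothetical and are not taken from Table 8.

```python
HARTREE_TO_EV = 27.211386  # conversion factor, 1 hartree in eV


def homo_lumo_gap_ev(e_homo_hartree, e_lumo_hartree):
    """HOMO-LUMO energy gap in eV from orbital energies given in hartree."""
    return (e_lumo_hartree - e_homo_hartree) * HARTREE_TO_EV


if __name__ == "__main__":
    # Placeholder orbital energies used only to demonstrate the conversion.
    print(f"Gap: {homo_lumo_gap_ev(-0.25, -0.05):.4f} eV")
```

Comparing the value returned for the gas phase and aqueous phase orbital energies of the same dipeptide gives the solvent-induced change in the gap.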

4. Conclusions

All seven dipeptide geometries exhibit large total dipole moments, ranging from 2.910 to 9.218 D in the gas phase and from 5.423 to 12.542 D in the aqueous phase, and as a consequence the aqueous phase structures are thermodynamically more stable than the gas phase structures by 12.07 to 19.60 kcal/mol. The stereo-electronic effects of the varying –SC groups influence the dihedral angle values, the planarity of the peptide planes, and the geometry around the C7 α-carbon atoms of the dipeptides, while the solvation effects are evident in the bond lengths and bond angles of the amide planes. The amide planes are not perfectly planar regardless of whether the systems are in the gas phase or in a strongly polar solvent like water, and the deviations from planarity depend primarily on two factors: (a) steric interactions of the side chain moieties of the C-terminal residues and (b) intramolecular H-bond formation by the H- and O-atoms of the amide planes with their adjacent atoms belonging to the C- and N-terminal residues. The gas phase intramolecular H-bond combinations of the dipeptides are similar to those in the aqueous phase; two types of intramolecular H-bonds (viz. N⋯H–N and O⋯H–C) play crucial roles in influencing the geometry of the peptide planes and in determining the energetics of the dipeptides. The variations in the ν(C7–H) stretching frequencies of the dipeptides reflect the effects of the changing –SC groups on the geometry around the C7 atoms, while the ν(C3–H) stretching values remain relatively unchanged, since the geometry around the C3 α-carbon atoms is not affected by the changing identity of the –SC groups. The HOMO–LUMO energy gaps of the dipeptides are larger in the aqueous phase than in the gas phase, indicating that the dipeptides are more stable in the aqueous phase.

Acknowledgments

The authors gratefully acknowledge the financial assistance from the Special Assistance Program of the University Grants Commission, New Delhi, India, to the Department of Chemistry, North Eastern Hill University. Shilpi Mandal is also grateful to the University Grants Commission, Government of India, New Delhi, for the financial assistance through a research fellowship.

Supplementary Materials

Cartesian coordinates of the optimized structures of the dipeptides at the B3LYP/6-311++G(d,p) level in the gas and aqueous phases.
