Abstract

Based on the observation that Ramachandran-type potential energy surfaces of single amino acid units in water are in good agreement with statistical structures of the corresponding amino acid residues in proteins, we recently developed a new all-atom force field called SAAP, in which the total energy function for a polypeptide is expressed basically as a sum of single amino acid potentials and electrostatic and Lennard-Jones potentials between the amino acid units. In this study, the SAAP force field (SAAPFF) parameters were improved, and classical canonical Monte Carlo (MC) simulation was carried out for short peptide models, that is, Met-enkephalin and chignolin, at 300 K in an implicit water model. Diverse structures were reasonably obtained for Met-enkephalin, while three folded structures, one of which corresponds to a native-like structure with three native hydrogen bonds, were obtained for chignolin. The results suggested that the SAAP-MC method is useful for conformational sampling for the short peptides. A protocol of SAAP-MC simulation followed by structural clustering and examination of the obtained structures by ab initio calculation or simply by the number of the hydrogen bonds (or the hardness) was demonstrated to be an effective strategy toward structure prediction for short peptide molecules.

1. Introduction

In conformational analysis of short peptides, Monte Carlo (MC) and molecular dynamics (MD) simulation techniques have been widely applied [1, 2], in which a set of potential energy functions (a so-called force field), such as ECEPP [3, 4], AMBER [58], CHARMM [9], OPLS [10, 11], and GROMOS [12], is employed to define the relation between the structure and the potential energy. However, it is still not practical to search for the possible conformers comprehensively within a limited computing time without using advanced sampling methods, such as multicanonical [13, 14] and replica-exchange [1517] methods, which have been developed to overcome the sampling problem. In recent years, various types of coarse-grained force fields, such as AMBER-UA [18], UNRES [1923], and MARTINI [24, 25], have also been developed in order to reduce time to compute the potential energy. For instance, the UNRES force field successfully simulated the aggregation process and stable structure of β-amyloid protein oligomers that are regarded as pathologic factors of Alzheimer’s disease [26].

On the other hand, we recently discovered the interesting feature of protein structures that Ramachandran-type potential energy surfaces of single amino acid units in water obtained by ab initio molecular orbital calculation are almost identical with statistical distributions of the Ramachandran plots obtained from protein databank (PDB) [27, 28]. In other words, the amino acid residues in proteins seem to statistically follow Boltzmann distributions on the single amino acid potential (SAAP) surfaces in water. Although physicochemical implications of this unexpected similarity are not yet clear, the finding strongly suggests prominent importance of SAAP as a determinant of protein structures. This point would also be supported by the previous experimental result by Dobson and coworkers [29, 30] that and dihedral angle distributions of individual amino acid residues of a polypeptide in a random-coil state can be expressed by use of the Ramachandran plots, as well as by the theoretical approaches by Sakae and Okamoto [31] and Kamiya et al. [32] to the optimization of conventional all-atom force field parameters by using the Ramachandran plots.

A discovery of the similarity between the SAAP in water and the statistical structure of the amino acid residues in folded proteins has prompted us to develop a new force field called SAAP for polypeptide molecules [33, 34]. The SAAP force field (SAAPFF) is entirely different from conventional force fields in that a polypeptide is divided to the amino acid units, not to the atomic units, solvent effects are implicitly included in the parameters, and the atomic charges are not constant but variable depending on the conformation of the amino acid units. In the SAAPFF, the total potential energy () for a polypeptide is expressed by (1) as the following: where is a sum of potential energies (SAAP) for the individual amino acid units, and are electrostatic and Lennard-Jones potentials between the amino acid units, and is the other correlation term, which is ignored in a current version of SAAPFF. Detailed description of these terms and the simulation program was given previously (see also Supplementary Material available online at http://dx.doi.org/10.1155/2013/407862) [33, 34]. Canonical MC simulation using the SAAPFF reasonably reproduced randomly fluctuating conformation of Met-enkephalin. However, for chignolin the SAAP-MC simulation did not afford the native β-hairpin structure when the simulation was started from the extended structure [34]. Thus, the previous SAAPFF parameters were a little too rough to be applied for structure prediction of the short peptide. In this study, the SAAPFF parameters have been improved in several points, and accuracy of the improved SAAP simulation suite to search for possible conformers of short peptides has been evaluated by performing the molecular simulation for Met-enkephalin and chignolin. The results were then compared with those obtained by the conventional AMBER-MD simulation method using the amber99sb force field [1] combined with the generalized Born solvent model [3537]. A protocol of SAAP-MC simulation followed by structural clustering is suggested to be an effective strategy toward structure prediction for short peptide molecules.

2. Method of Calculations

2.1. Model Peptides

Met-enkephalin and chignolin without N- and C-terminal protecting groups were employed as model short peptides. Met-enkephalin is a short peptide, which consists of five residues (Tyr–Gly–Gly–Phe–Met) as shown in Figure 1. According to previous experimental [38] and theoretical [3950] studies, the structure of this short peptide randomly fluctuates in water, while the presence of some characteristic conformations with intramolecular hydrogen bonds was indicated in vacuo [5153]. Met-enkephalin is a peptide widely used for testing performance of developed molecular simulation methodologies.

Chignolin has a unique folded β-hairpin structure in water, although it is a short peptide consisting of ten amino acid residues (Gly–Tyr–Asp–Pro–Glu–Thr–Gly–Thr–Trp–Gly) [54]. Molecular simulation for this miniprotein has been performed actively by using AMBER and OPLS force fields [5564]. As shown in Figure 2, chignolin folds around Pro4–Glu5–Thr6 to form a β-hairpin structure, which is stabilized by three intramolecular hydrogen bonds, Gly7(N)Asp3(O) (H-bond I), Thr8(N)Asp3(O) (H-bond II), and Asp3(N)Thr8(O) (H-bond III), and a hydrophobic interaction between the aromatic rings of Tyr2 and Trp9.

2.2. Improvement of the SAAPFF Parameters

The SAAPFF parameters are comprised of a potential energy, atomic coordinates, atomic charges, and Lennard-Jones potential parameters for all possible conformations of the single amino acid unit (HCO-Xaa-NH2). The conformations are defined by dihedral angles, , , , , and so forth, each with an interval of 15 degrees. In this study, the SAAPFF parameters previously developed [33, 34] were improved in the following points.

First, the parameters for main-chain, Pro, and Val in water, which were obtained in the previous version [34] by single-point calculation with the IEFPCM [6567] solvent model using the structures optimized in vacuo, were replaced with those obtained by geometry optimization in water at the HF/6-31+G(d) level using the IEFPCM model in a Gaussian03 program (rev. E.01) [68]. The similar calculation level has been widely applied for small biomolecules [69, 70]. Relaxation of the potential surfaces in the implicit water model should improve accuracy of the SAAPFF.

Second, atomic charges of the SAAPFF parameters, which are variables as a function of the structure of an amino acid unit, were switched for all amino acids from Mulliken charges to electrostatic potential (ESP) charges [71] that were obtained at the recommended higher MP2/6-31G(d,p) level [72] using the IEFPCM solvent model. The ESP charges are defined to reproduce the electrostatic potential surface and are used in conventional force fields as reliable atomic charges. Therefore, the use of ESP charges would reproduce electrostatic interactions, that is, the term, more appropriately than the Mulliken charges.

Third, the parameters of potential energies for the main-chain unit were further improved as follows. The SAAPFF parameters for most amino acids, except for Gly, Ala, Pro, and Val, are divided into the main-chain and side-chain units according to the side-chain separation approximation method [34]. In this approximation, the main-chain unit was capped with a tert-butyl group, which was fixed eclipsed to the main-chain part during geometry optimization to obtain the SAAPFF parameters. However, this method brought a statistical error that the SAAP energies thus obtained are slightly larger than those obtained without the approximation, due probably to distortion of the bond angles derived from the fixed geometry. Indeed, such a statistical error was evident for the case of Val [34]. Therefore, the energies were replaced with those calculated for the relaxed structures, which were obtained by geometry optimization with the tert-butyl group relaxed in water, while for structural parameters (i.e., atomic coordinates and atomic charges) the geometry with the tert-butyl group fixed eclipsed was still employed in order to avoid abnormally short atomic contacts between the main-chain and side-chain units after the connection.

2.3. Simulation Conditions

Conformational properties of Met-enkephalin and chignolin in water were studied by SAAP-MC and AMBER-MD [73] simulations. The SAAP-MC simulation was carried out at 300 K in water by using the improved SAAPFF parameters, in which solvation effects were implemented implicitly. The conventional Metropolis method [74, 75] and the Mersenne Twister random number generator [76] were employed for the MC simulation. At each MC step, one dihedral angle, randomly chosen from all dihedral angles including main-chain and side-chain dihedral angles, was changed randomly with a maximum displacement angle of ±32 degree. The new structure was constructed by the interpolation of the SAAPFF parameters and subsequent connection of the single amino acid units, and the potential energy was calculated by (1). The potential energy thus obtained was then compared with the previous one. Ten trajectories with total MC steps of 100 million were obtained for Met-enkephalin, while twenty trajectories with total MC steps of 2 billion were obtained for chignolin. As an initial structure of Met-enkephalin for the SAAP-MC simulation, the extended structure from PDB (PDB ID: 1PLW) [77] was employed. On the other hand, extended (defined by all main-chain and dihedral angles of −180 degree) and native folded (PDB ID: 1UAO) [54] structures were selected for chignolin as the initial structures. An Intel Xeon W3550 3.06 GHz processor with 12 GB memory was employed as a platform for calculation. The computing time of SAAP-MC simulation for Met-enkephalin was about 6 hours and for chignolin was about 11 days.

For comparison, ten trajectories of AMBER-MD [73] simulation were also calculated for Met-enkephalin and chignolin, respectively, in water at 300 K. The simulation time was 1 s with a time step of 2 fs, and a cut-off distance of nonbonded interactions was 12 Å. Solvation effects were implicitly considered by using the generalized Born model () [3537] with a dielectric constant of 78.5, following the recent study by Götz et al. [78]. The AMBER10 program [73] with the amber99sb force field [1] was used. The computing time of the AMBER-MD simulation to obtain one trajectory was about 165 hours for Met-enkephalin, and that for chignolin was about 18 days, on the same calculation platform used for the SAAP-MC simulation.

2.4. Data Analysis

The 10,000 structures of Met-enkephalin that were extracted in every 10,000 step from the SAAP-MC simulation trajectory were classified to twenty structural clusters by using a clustering algorithm called the -means method [79] based on the RMSD for the main-chain C, O, and N atoms or all heavy atoms except for the hydrogen atoms. Similarly, ten structural clusters were obtained for chignolin from the 20,000 structures extracted in every 100,000 step from the SAAP-MC simulation trajectories. On the other hand, 10,000 structures were extracted in every 100 ps from the AMBER-MD trajectories for both Met-enkephalin and chignolin. The obtained structures were classified to ten or twenty clusters by application of the -means method under the same conditions to those applied for the SAAP-MC simulation results.

For chignolin, the RMSD values with respect to the native structure were also calculated for the structures extracted from the SAAP-MC and AMBER-MD trajectories by using the amber-ptraj program [7, 8] based on the main-chain atoms or all heavy atoms. The free-energy surfaces of chignolin were further analyzed by using all trajectories obtained from the SAAP-MC simulation according to (2): where is a probability of the structures with the corresponding structural parameters, is the probability of the structure that had the lowest energy, is a difference in the free energy, is a gas constant, and is an absolute temperature. Ramachandran-type free-energy surfaces were obtained for each amino acid residue by dividing the surface into 10 × 10 degree units, while main-chain RMSD versus hydrogen bond or hydrogen bond versus hydrogen bond free-energy surfaces were obtained by dividing the surface into 0.1 × 0.1 Å units.

2.5. Ab Initio Calculation

Relative energies of the representative structures obtained for chignolin by the SAAP-MC simulation were calculated in water by the ab initio molecular orbital method. The calculation was performed at the HF/IEFPCM/6-31+G(d,p) level by using a Gaussian03 program [68]. The polarizable continuum model using the integral equation formalism variant (IEFPCM), which is equipped as the default self-consistent reaction field method, was employed for the calculation in water.

3. Results

3.1. Met-Enkephalin by SAAP-MC and AMBER-MD

Energetic trajectories of SAAP-MC and AMBER-MD simulations for Met-enkephalin were obtained in an implicit water model at 300 K (see Figure S1). In SAAPFF, the total energy is basically expressed as a sum of , , and . The values of these energetic terms were maintained stable during the SAAP-MC simulation with large fluctuation for and , while the value of was almost zero, as observed previously [34]: the large dielectric constant of water would make the electrostatic interaction between the amino acid residues ignorable. Similarly, the total energy was maintained stable for the AMBER-MD simulation. However, the distribution of the distance between the terminal Cα atoms () was significantly different for the two simulation methods. The value of dispersed in a range from 5 to 12 Å in the SAAP-MC simulation, while those converged at around 6 Å in the AMBER-MD simulation (Figure 3). The results suggested that the structures obtained from the SAAP-MC simulation contain diverse conformations and are more extended than those obtained from the AMBER-MD simulation on the average. Indeed, when the structures were classified into twenty clusters based on the RMSD values for the main-chain atoms, various types of structures were obtained from the SAAP-MC trajectory. The representative examples are shown in Figure 4. The structures are different from each other with a diverse value, and there is no hydrogen bond in the three representative structures. In contrast, the structures of Met-enkephalin obtained from the AMBER-MD trajectory seemed to be trapped in local energy minimums during the simulation (see Figure S2).

3.2. Chignolin by SAAP-MC and AMBER-MD

Twenty trajectories were obtained for chignolin by SAAP-MC simulation. The structures in each trajectory were classified to ten clusters based on the main-chain RMSD by using the -means method. According to the main-chain folds, that is, combination of the dihedral angles for the amino acid residues, the obtained clusters were further classified to three representative structures, which are called here A, B, and C (Figure 5). Populations of the three structures are summarized in Table 1. Structure A has a native-like main-chain fold stabilized by three hydrogen bonds, which are the same as those observed in the native structure (i.e., H-bonds I–III). However, the hydrophobic interaction between the aromatic side chains of Tyr2 and Trp9 is not present. Structures B and C have nonnative conformation with a common type II β-turn around Thr6 and Gly7. A weak hydrogen bond exists between Glu5(O) and Thr8(N) with a distance about 4.0 Å. However, the conformations of the C-terminals are different from each other. Other minor structures have various local and global conformations, such as a type I β-turn, a helical conformation, and other conformations.

In the trajectories from 1 to 12, the existence ratio of structure A was high (31.2 to 58.5%), whereas the ratio was much lower than that of the misfolded structure B in the remaining trajectories. The mean ratio for structure A averaged for all trajectories was 29.2%. On the other hand, when the clustering analysis was performed based on all-atom RMSD, the native-like structure (A′) was obtained from 0 to 19.5% (Table 1). The existence ratio of structure A′ averaged over all trajectories was 6.7%. Structure A′ maintains three native H-bonds as well as a hydrophobic interaction between Tyr2 and Trp9 (Figure 5). The all-atom and main-chain RMSD values of structure A′ from the native structure were 2.6 and 2.0 Å, respectively, suggesting that structure A′ keeps the native fold. The trace of the main-chain RMSD with respect to the native structure is shown in Figure 6(a) for the case of trajectory 10, where structures A and B are involved in almost equal amounts. The RMSD values fluctuated between 1 and 6 Å, and repeats of the structural transitions were observed. Similarly, in the AMBER-MD simulation, the main-chain RMSD value fluctuated between 1 and 6 Å (Figure 6(b)). The structural clustering analysis showed that the native structure with three native hydrogen bonds was involved about 30% (Figure S5).

The convergence of the SAAP-MC simulation is not clear from Table 1. However, the similar relative populations were obtained for structures A (~30% including 6% of A′) and B (~60%) when the simulation was started from the native structure. Moreover, in our preliminary results of the SAAP-MC simulation using ten trajectories, almost the same mean populations were obtained for the representative structures (data not shown). Based on these observations, we employed the twenty trajectories shown in Table 2 for further analysis.

The structure with the smallest all-atom RMSD (1.3 Å) obtained from the twenty trajectories is superimposed on the native structure in Figure 7. The main-chain and Cα atom RMSD values of this structure with respect to the native structure were 0.9 and 0.5 Å, respectively, confirming that the structure is almost identical to the native structure.

The Ramachandran-type free-energy surfaces obtained for Tyr2–Trp9 residues from all trajectories of the SAAP-MC simulation are shown in Figure 8. The profiles for Tyr2, Pro4, Glu5, and Thr6 fit well to the native structures, while the free-energy minimums for Asp3, Gly7, and Thr8 are slightly moved from the positions of the native structures. For Trp9, the stable areas corresponding to α and β regions tend to merge. This is roughly consistent with diversity in the C-terminal conformation of the native structure. The dihedral angles for structures A and A′ are located around the energy minimum points on the free-energy surfaces. In contrast, structure B has different geometry from structures A and A′ for Glu5 and Thr6: the angles are ca. + degree. Structure C has the geometry with angles of Glu5, Thr6, Thr8, and Trp9 different from those of structures A and A′. During the SAAP-MC simulation, mismatch of the main-chain dihedral angles for Glu5, Thr6, and Gly7 to the native structure occurred at high ratios.

The free-energy surface of chignolin projected on H-bonds I versus III plane and that projected on the main-chain RMSD versus H-bond II plane obtained from all trajectories are shown in Figure 9. In Figure 9(a), the most stable area was found at  Å and  Å. This energy minimum corresponds to the misfolded structure B. Another potential energy minimum was found at  Å and  Å, corresponding to structures A and A′. Similarly, two stable minimums were located in Figure 9(b);  Å and  Å, corresponding to structure B, and  Å and  Å, corresponding to structure A. The free-energy surfaces showed that structure A maintains three native hydrogen bonds, but it has larger main-chain RMSD from the native structure than misfolded and the most stable structure B. This conflict is ascribed to the N- and C-terminals of structure A, which are significantly apart due to the absence of a hydrophobic interaction between Tyr2 and Trp9 side chains. It should be noted that structure A′ is separated from structure A in Figure 9(b).

3.3. Evaluation of the Relative Energies of Structures A–C by Ab Initio Calculation

To evaluate relative stabilities of structures A, A′, B, and C more accurately, single-point ab initio calculation was carried out. Table 2 lists the SCF and relative energies calculated in the IEFPCM solvent model. Although structure B was most populated in the SAAP-MC simulation, it is significantly unstable compared to the other structures, suggesting that structure B would not be populated in practical solutions. On the other hand, structure A′ with a native fold was found to be the most stable. According to the ab initio calculation results, it is likely that the structures having more hydrogen bonds and/or a hydrophobic interaction between the aromatic rings are more stable.

4. Discussion

4.1. Improvement of SAAPFF Parameters

In this study, we have modified the SAAPFF parameters in the following points. The main-chain, Pro, and Val parameters in water were replaced with those obtained by geometry optimization in water. The atomic charges were replaced from Mulliken to ESP charges. The energies of the main-chain unit were replaced with those obtained for the geometry with the tert-butyl cap being relaxed during the optimization. The effects of these modifications were found to be significant as the SAAP-MC simulation using the new parameters reproduced the native structure of chignolin (structure A′), while it was not obtained when the simulation was performed by using the previous parameters [34].

The clustering of the structures obtained for chignolin by SAAP-MC simulation using the improved parameters showed that structures A and B are dominant in the implicit water model as shown in Table 1. Structure A (29.2%) maintains three native hydrogen bonds, indicating the native main-chain fold, while nonnative structure B (43.2%) is stabilized by a type II β-turn. In addition, the structural clustering based on all-atom RMSD demonstrated that native structure A′, which maintains not only the three native hydrogen bonds but also a hydrophobic interaction between the aromatic side chains of Tyr2 and Trp9, is involved in 6.7% of the total structures. The free-energy analysis clearly indicated that the main-chain dihedral angles of the each amino acid residue and the distances of H-bonds I–III for structures A and A′ are consistent with the native structure (see Figures 8 and 9), although their populations were less than that of the misfolded structure B (vide infra).

In the meantime, randomly fluctuating structure was obtained for Met-enkephalin by using both the previous [34] and improved SAAPFF parameters (see Figures 3 and 4). This would be due to the two Gly residues, whose SAAP parameters have been modified only for the atomic charges in this study, being involved in the mid of the amino acid sequence.

There are a number of reports in the literature on the molecular simulation for chignolin by using conventional force fields to obtain the native structure from the unfolded state [5564]. In these studies, various state-of-the-art techniques of molecular simulation have been employed, including a multicanonical MD method [55, 56], a replica exchange MD method [5760], a long-term conventional MD method by using a high-speed computer system [62], and originally developed MHOP and MSES methods [63, 64]. The values of all-atom RMSD for the obtained native structure were in a range from 1.3 to 1.8 Å with respect to the experimentally determined structure, which is smaller than that of structure A obtained as a major native-like structure by the SAAP-MC simulation in this study. Moreover, the AMBER-MD simulation carried out in this study showed that the native structure of chignolin is a little more frequently obtained by using the amber99sb force field than SAAPFF. Thus, it is obvious that further improvement of the SAAPFF parameters is required in future works.

4.2. Efficiency of SAAP-MC Simulation for Conformational Sampling

Previous studies applying an explicit water molecule model [43] or a generalized-ensemble method with RISM theory [3942] have shown that Met-enkephalin has extended conformations in water. This feature was reasonably reproduced in this study by application of SAAPFF as shown in Figures 3 and 4. Clustering of the structures obtained from the SAAP-MC simulation indicated diverse conformations of Met-enkephalin in water. On the other hand, the AMBER-MD simulation for Met-enkephalin carried out in this study produced rather uniform folded conformations with intramolecular hydrogen bonds (see Figure S2), probably due to insufficient computing time. Thus, the SAAP-MC method would be useful for sampling possible conformations for short peptide molecules in an implicit water model, although the accuracy is not satisfactory in terms of the reproducibility of the relative energies.

The efficiency for conformational sampling arises probably from a less number of variables used in the SAAPFF than that in the conventional force fields. In SAAPFF, the structure of peptides is defined only by the dihedral angles of the each amino acid unit, not by the cartesian coordinates of each atom. Acceleration of conformational sampling by reducing the number of structural parameters is a common strategy of coarse-grained force fields, such as united-atom force fields [1826, 80, 81]. Efficiency of the SAAP-MC simulation for conformational sampling can also be explained by the fact that the solvent effects are implicitly included in the parameters of SAAPFF. This enables the SAAP-MC simulation in an implicit water model to be performed in the same computing time as that in vacuo.

4.3. Prediction of Native Structures

Although the accuracy of SAAPFF is not yet satisfied, the efficiency for conformational sampling would be advantageous for prediction of the stable structures of peptides in water. Therefore, we subsequently explored the application of the SAAP-MC method to predict the native structure of chignolin.

Indeed, it was found by single-point ab initio calculation in water at the HF/IEFPCM/6-31+G(d,p) level that structure A′ is the most stable one in water among the structures shown in Figure 5 (see Table 2). The procedure should involve a large error in the energy because the structure is not optimized. However, the tendency of the relative energies would be reliable to some extent as previously demonstrated for HCO-Ala2-NH2  in vacuo [33]. Thus, a protocol of SAAP-MC simulation followed by structure clustering and examination of the obtained structures by ab initio calculation would be a useful strategy toward the prediction of the native structure for short peptides. The relative energies obtained by ab initio calculation also suggested the possibility that structure B can be produced as an artifact of the force field because the relative energy was quite large (>20 kcal/mol).

As for prediction of the native structure, another simpler method would be possible based on the hardness (i.e., the number of interamino acid interactions or the depth of the potential hole on the free-energy surfaces) of the structures. As seen in Figure 5, structure B of chignolin maintains only one hydrogen bond, while structures A and A′ hold three hydrogen bonds. In terms of the number of hydrogen bonds, it can be assumed that structures A and A′ are more native-like than structure B, although the population of structure B was larger than those of structures A and A′. Indeed, the areas around the free-energy minimums corresponding to structures A and A′ in Figures 9(a) and 9(b) are slightly narrower than those corresponding to structures B and C. Such robust structures on the conformational surface would be predicted to be more native-like than the structures with more conformational flexibility. Thus, we suggest that a protocol of SAAP-MC simulation followed by structural clustering and examination of the obtained structures by the number of hydrogen bonds (or the hardness) may be another simple strategy toward structure prediction for short peptide molecules.

5. Conclusions

The parameters of SAAPFF, which was previously developed to analyze the structures and folding of polypeptides, have been improved in several points in this study. The main-chain, Pro, and Val parameters in water were replaced with those obtained by geometry optimization in the IEFPCM solvent model. The atomic charges were replaced from Mulliken to ESP charges. (3) The energies of the main-chain unit were replaced with those obtained for the geometry with the tert-butyl cap being relaxed during the optimization. To investigate the accuracy of the improved SAAPFF parameters, the SAAP-MC simulation was carried out for Met-enkephalin and chignolin as model peptides. Analysis of the obtained structures revealed that the SAAP-MC simulation reasonably reproduced randomly fluctuating conformation for Met-enkephalin, while three folded structures, one of which corresponds to a native-like structure with three native hydrogen bonds, were obtained for chignolin. The latter result supported that the accuracy of the SAAPFF parameters has been remarkably improved from the previous ones, although the native structure of chignolin could not be assigned to the most stable structure in the conformational space defined by SAAPFF.

In the meantime, efficiency of SAAP-MC simulation for conformational sampling was demonstrated for Met-enkephalin as the SAAP-MC simulation afforded diverse structures. The feature, combined with the structural clustering analysis, was subsequently applied to the structure prediction of chignolin. Among the representative structures obtained by the clustering, structure A′ with a native fold was assigned to the most stable structure according to ab initio calculation. Thus, a protocol of SAAP-MC simulation followed by structural clustering and examination of the obtained structures by ab initio calculation would be a useful strategy toward prediction of the native structure for short peptides. Alternatively, the structure with a maximum number of hydrogen bonds or with the hardest conformation on the potential surface may be considered to be the most native-like among the clustered structures. Applications of this method to structure prediction of other peptide molecules as well as further improvement of the SAAPFF parameters are being conducted by our group.

Acknowledgment

This work was supported by Grant-in-Aid for Scientific Research on Innovative Areas (no. 2120005) from the Ministry of Education, Culture, Sports, Science and Technology. The SAAP force field parameters are available for download at http://saap.sc.u-tokai.ac.jp/.

Supplementary Materials

The SAAP force field and the simulation algorithm, the results of SAAP-MC and AMBER-MD simulations for Met-enkephalin and chignolin, the structural parameters of structures A, A’, B, and C as well as the smallest all-atom RMSD and the lowest total potential energy structures obtained for chignolin, and the results of single-point ab initio calculation for the representative structures.

  1. Supplementary Material