Abstract

Using pump-probe experiments of varying time intervals between pump and probe, the method of time-resolved crystallography has given many insights into the fast time variations of crystallized molecules as a result of photoexcitation. We show here that quantities extractable from multiple diffraction patterns of dissolved molecules in random orientations, as measured using powerful ultrashort pulses of X-rays, also contain information about structural changes of a molecule on photoexcitation.

1. Introduction

The X-ray free-electron laser (XFEL) is a new instrument which promises to revolutionize our study of the atomic architecture of matter [1, 2]. The brightness of the X-rays produced by this instrument is some 10 billion times greater than that of any other existing X-ray source (including present-day synchrotrons). This opens the possibility of measuring signals from X-rays scattered by even large single molecules, like proteins. The traditional limitation on X-ray flux for fragile biomolecules can be circumvented completely because this very bright radiation is delivered in ultrashort pulses [3–5]. Although the molecules under study will undoubtedly suffer catastrophic radiation damage, the shortness of the pulse enables a signal to be measured from the particle before its disintegration. This enables structure determination of reproducible biomolecules with essentially unlimited X-ray flux and is in one sense a complete solution of the radiation damage problem. It should also be possible to combine the ultra brightness [6] of the radiation with the ultra shortness of its duration [7] to gather information never before accessible, for example, the changes in the structure of an uncrystallized biomolecule as a result of some stimulus, such as photoexcitation, as a function of time since the photoexcitation. Since this time can be very short, the possibility then exists of experimentally following the course of rapid chemical reactions of such uncrystallized biomolecules, as with crystallized ones [7], by the technique of time-resolved diffraction [8–10]. This may allow, for the first time, the study of biochemical reactions of molecules in the aqueous solution in which they occur in nature. An idea proposed for sample delivery of hydrated molecules to an XFEL beam is to inject a continuous stream of solution containing the molecules into the sample chamber [11–13]. The incident X-rays then scatter off the protein solution.
The design specification of the Linac Coherent Light Source (LCLS) is to produce an X-ray beam of perhaps 0.1 microns in diameter at the sample. To maximize the protein scattering from such a solution stream (and to minimize scattering by the aqueous solvent), it would be best to use a solution as concentrated as possible. Photoactive yellow protein (PYP) is a popular 15 kDa protein for time-resolved structural studies, because it undergoes a significant and reproducible structural change on illumination by laser light [14–17]: a chromophore swings a significant distance outward from the center of the molecule, and the nearby ARG 52 residue moves to accommodate the new position of the chromophore. PYP can be concentrated to 150 mg/mL or higher [18], which for a 15 kDa protein corresponds to about 10 mmol/L = 10 mol/m3. The volume of solution illuminated by the XFEL beam can be estimated by multiplying the cross-sectional area of the beam, based on the design estimate of 0.1 micron diameter, by an estimate of the thickness of the continuous stream of solvent, for which we took the smallest reported diameter of a solvent droplet [19]. Multiplying our best estimate of this volume by the number of moles per m3, and then by Avogadro's number, ~6 × 10^23 molecules per mole, suggests that a typical illuminated volume of solvent probably contains ~10,000 molecules of PYP at their optimum concentration. Even with a reduced concentration of 10 mg/mL, as used in small-angle X-ray scattering (SAXS) work, and a 500 kDa protein, for which this number is reduced by a factor of 500, the number of molecules in a typical illuminated volume would still be expected to be about 20. This is still a far cry from the illumination of a single molecule per measured diffraction pattern, as assumed by many algorithms for the reconstruction of the real-space structure of a single molecule.
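The order-of-magnitude estimate above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' calculation: the function names are illustrative, and the stream thickness of 0.2 microns is an assumed value chosen only because it is consistent with the quoted ~10,000 molecules; the text does not state the thickness used.

```python
import math

AVOGADRO = 6.022e23  # molecules per mole


def molar_concentration(mass_conc_mg_per_ml, molar_mass_da):
    """Concentration in mol/m^3 from a mass concentration (mg/mL) and molar mass (Da = g/mol)."""
    grams_per_m3 = mass_conc_mg_per_ml * 1.0e3  # 1 mg/mL = 1000 g/m^3
    return grams_per_m3 / molar_mass_da


def molecules_in_beam(mass_conc_mg_per_ml, molar_mass_da, beam_diameter_m, thickness_m):
    """Number of molecules in a cylinder of solution lit by the beam."""
    area = math.pi * (beam_diameter_m / 2.0) ** 2
    volume = area * thickness_m
    return molar_concentration(mass_conc_mg_per_ml, molar_mass_da) * volume * AVOGADRO


# PYP at 150 mg/mL, 15 kDa: about 10 mol/m^3, as stated in the text.
pyp_conc = molar_concentration(150.0, 15_000.0)

# A 500 kDa protein at 10 mg/mL: the molar concentration drops 500-fold.
reduction = pyp_conc / molar_concentration(10.0, 500_000.0)

# ASSUMPTION: 0.2 micron stream thickness (not given in the text).
n_pyp = molecules_in_beam(150.0, 15_000.0, 0.1e-6, 0.2e-6)

print(f"PYP concentration: {pyp_conc:.1f} mol/m^3")
print(f"Reduction factor:  {reduction:.0f}")
print(f"Molecules in beam: {n_pyp:.0f} (PYP), {n_pyp / reduction:.0f} (dilute 500 kDa protein)")
```

Under these assumptions the count stays well above one molecule per shot even for the dilute, large-protein case, which is the point the paragraph makes.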
Of course, methods have been developed [20] for automatically selecting single-particle X-ray diffraction snapshots from the entire collection of measured diffraction patterns for analysis by such techniques, which may be of particular use for large molecular ensembles or viruses. However, it should be pointed out that such methods depend on the rejection of diffraction patterns from multiple particles. The rejection of data from multiple particles reduces the particle/solvent ratio and thereby presumably reduces the signal-to-noise ratio.

2. Structural Information from Disordered Ensembles of Molecules

The possibility of extracting structural information about molecules in a disordered ensemble, as in solution, from scattered X-ray signals was recognized as far back as the late 1970s [21]. The idea proposed was that if the X-ray pulses were shorter than the rotational diffusion time of the molecules, the signal measured from a single X-ray pulse is that of the ensemble of particles frozen in space and time. It was shown that, in this case, the average angular correlation functions of the scattered intensities (as defined below) are characteristic of the 3D structures of the individual biomolecules. This gives rise to the exciting possibility of recovering the structure of a biomolecule from a disordered ensemble of molecules as expected in their functional state in nature, rather than from an ordered collection of the molecules as found in a crystal. A realistic possibility of advancing this idea is provided by the advent of the XFEL [22]. The average over a set of diffraction patterns of the angular pair correlation function between the intensities $I(q, \phi)$ on a pair of resolution rings $q$ and $q'$ of diffraction patterns from such a disordered ensemble is defined by

$$C_2(q, q'; \Delta\phi) = \langle I(q, \phi)\, I(q', \phi + \Delta\phi) \rangle_{DP}, \tag{1}$$

where $\langle \cdots \rangle_{DP}$ represents an average over all the diffraction patterns. Note that the orientational averaging of the particles implied by the reasonable assumption that all molecular orientations are equally likely suggests that the LHS of the above equation will be independent of the value of $\phi$ chosen on the RHS. If one makes the other reasonable assumption of complete translational disorder (as expected for a dilute ensemble of particles), it is possible to show [22] that

$$C_2(q, q'; \Delta\phi) = \sum_l F_l(q, q'; \Delta\phi)\, B_l(q, q'), \tag{2}$$

where

$$F_l(q, q'; \Delta\phi) = \frac{1}{4\pi} P_l\left[\cos\theta(q)\cos\theta(q') + \sin\theta(q)\sin\theta(q')\cos(\Delta\phi)\right], \qquad \theta(q) = \frac{\pi}{2} - \sin^{-1}\left(\frac{q}{2\kappa}\right), \tag{3}$$

and $\kappa$ is the wavenumber of the incident radiation. The angle $\theta(q)$ takes account of the curvature of the Ewald sphere and is almost equal to $\pi/2$ radians for a small resolution ring or a flat Ewald sphere.

In this expression, also

$$B_l(q, q') = \sum_m I_{lm}(q)\, I^{*}_{lm}(q'), \tag{4}$$

where the $I_{lm}(q)$ are a set of spherical harmonic expansion coefficients of the 3D diffraction volume of a single particle and $P_l$ is a Legendre function. Recent developments of iterative phasing algorithms suggest that if an oversampled 3D diffraction volume were found via the coefficients $I_{lm}(q)$, the real-space structure of the particle could be recovered by means of an iterative phasing algorithm.
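The angular pair correlation averaged over diffraction patterns can be estimated numerically from intensities sampled on resolution rings. The sketch below is illustrative only: it assumes each ring has already been interpolated onto a uniform azimuthal grid, uses the FFT correlation theorem for speed, and the function names `ring_correlation` and `average_c2` are hypothetical.

```python
import numpy as np


def ring_correlation(ring_q, ring_qp):
    """Circular cross-correlation of two intensity rings on the same uniform phi grid.

    Returns c[k] = mean over j of ring_q[j] * ring_qp[(j + k) % n],
    i.e. the correlation at angular offset k * (2*pi / n).
    """
    fa = np.fft.fft(ring_q)
    fb = np.fft.fft(ring_qp)
    # Correlation theorem: corr(a, b) <-> conj(FFT(a)) * FFT(b)
    return np.fft.ifft(np.conj(fa) * fb).real / len(ring_q)


def average_c2(patterns_q, patterns_qp):
    """Average the ring correlation over diffraction patterns.

    patterns_q, patterns_qp: arrays of shape (n_patterns, n_phi) holding the
    intensities on rings q and q' for each measured pattern.
    """
    per_pattern = [ring_correlation(a, b) for a, b in zip(patterns_q, patterns_qp)]
    return np.mean(per_pattern, axis=0)
```

Averaging over the starting angle within each pattern, as done here, is justified only to the extent that the correlation is independent of that angle, which is the orientational-averaging assumption stated in the text.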

Although the quantities $B_l(q, q')$ may be found quite straightforwardly by inverting (2), finding the coefficients $I_{lm}(q)$ in general from these quantities is far from easy [23]. However, where the particle has a known symmetry, for example, in the case of an icosahedral [24] or helical [25] virus, this may be possible by exploitation of that symmetry.
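The inversion mentioned above is a linear problem for each pair of resolution rings: the measured correlation at a set of angular offsets is a Legendre-weighted sum of the unknown quantities for each angular momentum index. A minimal least-squares sketch follows. It assumes a kernel of the form $P_l[\cos\theta(q)\cos\theta(q') + \sin\theta(q)\sin\theta(q')\cos\Delta\phi]/4\pi$ with $\theta(q) = \pi/2 - \sin^{-1}(q/2\kappa)$; the function names and the choice of an unregularized least-squares solver are illustrative, not taken from the paper.

```python
import numpy as np
from numpy.polynomial import legendre


def kernel_matrix(q, qp, kappa, delta_phi, l_max):
    """Matrix F[i, l] = P_l(x_i) / (4*pi) for angular offsets delta_phi[i].

    x_i is the cosine of the angle between the two scattering vectors,
    including the Ewald-sphere curvature through theta(q).
    """
    theta_q = np.pi / 2 - np.arcsin(q / (2 * kappa))
    theta_qp = np.pi / 2 - np.arcsin(qp / (2 * kappa))
    x = (np.cos(theta_q) * np.cos(theta_qp)
         + np.sin(theta_q) * np.sin(theta_qp) * np.cos(delta_phi))
    # legvander returns columns P_0(x), P_1(x), ..., P_l_max(x)
    return legendre.legvander(x, l_max) / (4 * np.pi)


def solve_b_l(c2, q, qp, kappa, delta_phi, l_max):
    """Least-squares estimate of B_l(q, q') for l = 0..l_max from C2 samples."""
    f = kernel_matrix(q, qp, kappa, delta_phi, l_max)
    b_l, *_ = np.linalg.lstsq(f, c2, rcond=None)
    return b_l


# Round-trip check on synthetic data (values are arbitrary).
delta_phi = np.linspace(0.0, np.pi, 90)
b_true = np.array([5.0, 0.0, 2.0, 0.0, 1.0, 0.0, 0.5])
c2_synth = kernel_matrix(0.1, 0.1, 1.0, delta_phi, 6) @ b_true
b_rec = solve_b_l(c2_synth, 0.1, 0.1, 1.0, delta_phi, 6)
```

With noisy data and larger maximum angular momentum the system becomes less well conditioned, so some form of regularization or truncation would likely be needed in practice.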

Another circumstance in which a solution may be possible is when trying to recover a small change in a structure from a known one, as in the case of pump-probe experiments on photoexcited biomolecules in studies of time-resolved structural changes. Such an experiment is of great significance not only because it allows the study of the course of fast chemical reactions, but also because it truly exploits both the extreme brightness and the fast time structure of XFEL pulses.

A full calculation that shows this capability will be reported in a future paper. We illustrate here the sensitivity of the quantities $B_l(q, q')$ to the torsional angle changes expected on photoexcitation. In the case of PYP, earlier time-resolved studies have established that by 2 ms after photoexcitation the primary change in the structure of PYP may be regarded as a cis-trans isomerization of its chromophore about its C2-C3 axis, together with a change of the torsional angle of the side chain of the ARG 52 residue to make room for the structural changes due to the chromophore isomerization.

3. Proposed Time-Resolved Experiment on Dissolved Molecules in Random Orientations

Imagine an experiment where a continuous stream of a solution of photoexcitable molecules is injected into an XFEL sample chamber in the usual manner of the so-called “diffract and destroy” experiments [1], but where, a short distance $d$ before the intersection with the X-rays from the XFEL, the molecules are photoexcited by a powerful laser (see Figure 1). If $v$ is the speed of the solvent stream (typically 10 m/s), the molecules will be illuminated by the laser at a time $t = d/v$ before they are interrogated by the X-ray beam. Since the time $t$ is controllable by varying the distance $d$, this pump-probe experiment allows exquisite control of the time delay after photoexcitation. It is envisaged that a large number of diffraction patterns may be measured from different regions of the continuous solvent stream for a given time delay. Structural information about the photoexcited molecules in the solution resides in the ensemble of such diffraction patterns.

4. Model Calculation

Previous studies by time-resolved crystallography have established that the primary structural change of PYP 2 ms after photoexcitation is a cis-trans isomerization of the chromophore, which results from a rotation of the head of the chromophore about its C2-C3 bond, together with a 77 degree change of the torsional angle of the side chain of the ARG 52 residue of PYP [14–17]. Ignoring further small relaxations of other nearby atoms of the structure, such structural changes can be parametrized to a good approximation by just two torsional angles: that of the chromophore about its C2-C3 bond and that of the ARG 52 side chain. Figures 2(a) and 2(b) illustrate the dark and photoexcited structures in the part of the molecule undergoing structural changes.

Starting from the atomic coordinate data for the dark structure of PYP from the Protein Data Bank entry 2PHY, we calculated a range of hypothetical excited state structures corresponding to values of the two torsional angles at 5-degree intervals from their dark structure values. For each of the hypothetical photoexcited structures, we calculated its structure factors from the formula

$$A(\mathbf{q}) = \sum_j f_j(q) \exp(i \mathbf{q} \cdot \mathbf{r}_j),$$

where $f_j(q)$ is the form factor and $\mathbf{r}_j$ the position of atom $j$, and hence its expected three-dimensional diffraction volume by evaluating $I(\mathbf{q}) = |A(\mathbf{q})|^2$. Then, using Gaussian quadrature on shells of radius $q$, we calculated the spherical harmonic expansion coefficients $I_{lm}(q)$ of this 3D diffraction volume. These were used to calculate the experimentally accessible quantities $B_l(q, q')$ via (4) for $l$ ranging from 0 to 25, and for $q$ and $q'$ sampled at equal intervals from 0 up to the value corresponding to 5 Å resolution, for each of these hypothetical structures. These quantities were compared with the values of the same quantities for a reference structure (assumed to be the correct structure 2 ms after photoexcitation) via a reliability factor (or R-factor), evaluated as a function of the two torsional angles described above: that of the chromophore and that of the side chain of the ARG 52 residue. The resulting 2D contour map of the R-factor as a function of these two angles is shown in Figure 3.
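The building blocks of such a model calculation can be sketched compactly. The code below is a simplified illustration under stated assumptions, not the authors' implementation: atoms are moved by a rigid rotation about a bond axis (Rodrigues' formula), form factors are taken as constants independent of the scattering vector, and the R-factor is assumed to take the common form of a sum of absolute differences normalized by the sum of absolute reference values. All function names are hypothetical.

```python
import numpy as np


def rotate_about_bond(coords, moving, axis_start, axis_end, angle_deg):
    """Rotate the atoms indexed by `moving` about the bond axis_start -> axis_end.

    coords: float array of shape (n_atoms, 3). Returns a new array; bond
    lengths within the moving group and to the axis are preserved.
    """
    k = axis_end - axis_start
    k = k / np.linalg.norm(k)
    ang = np.radians(angle_deg)
    v = coords[moving] - axis_start
    # Rodrigues' rotation formula applied row-wise
    rotated = (v * np.cos(ang)
               + np.cross(k, v) * np.sin(ang)
               + np.outer(v @ k, k) * (1.0 - np.cos(ang)))
    out = coords.copy()
    out[moving] = rotated + axis_start
    return out


def intensities(coords, form_factors, q_vectors):
    """|sum_j f_j exp(i q.r_j)|^2 at each scattering vector (constant f_j assumed)."""
    phases = np.exp(1j * (q_vectors @ coords.T))  # shape (n_q, n_atoms)
    amplitudes = phases @ form_factors
    return np.abs(amplitudes) ** 2


def r_factor(trial, reference):
    """ASSUMED form: sum |trial - reference| / sum |reference|."""
    trial = np.asarray(trial, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return np.sum(np.abs(trial - reference)) / np.sum(np.abs(reference))
```

In a fuller calculation the intensities would be expanded in spherical harmonics on each shell and combined into the rotationally invariant quantities before the R-factor is evaluated; comparing raw intensities, as this sketch allows, would only be meaningful for a single fixed orientation.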

5. Discussion and Conclusions

The R-factor contour map reveals an unambiguous minimum close to the assumed 2 ms structure. Although there are about 15 atoms whose positions differ between the dark and 2 ms photoexcited structures, the principle that bond lengths are unlikely to change on photoexcitation suggests that structural changes may be well parametrized by changes in torsional angles only. This parametrization allows the efficient determination of the changes in the positions of some 15 atoms between the dark and photoexcited structures. In the present case of just two varied structural parameters, the R-factor contour map is seen to be simple enough that even a simple gradient descent algorithm that starts at the dark structure is likely to find the correct photoexcited structure. It should be noted that such a parametrization may even be of use in established methods of time-resolved crystallography, in order to determine changes in atomic positions of a photoexcited structure directly rather than via the fitting of a deduced difference electron density map with an atomic model. Indeed, the input to our calculation is the PDB file of the dark structure and the output a PDB file of the excited structure. Of course, for photoexcited structures that differ from a dark structure by more structural parameters, a more sophisticated search algorithm like simulated annealing [26] may be necessary to find a global minimum.
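To make the remark about gradient descent concrete, the sketch below minimizes a two-parameter surface using finite-difference gradients. The surface `toy_r_factor` is a hypothetical smooth bowl standing in for the real R-factor map, and its minimum at (90, 77) degrees is an arbitrary illustrative choice; only the search procedure is the point here.

```python
import numpy as np


def numerical_gradient(f, x, step=1e-3):
    """Central-difference gradient of a scalar function f at point x."""
    grad = np.zeros_like(x)
    for i in range(len(x)):
        dx = np.zeros_like(x)
        dx[i] = step
        grad[i] = (f(x + dx) - f(x - dx)) / (2.0 * step)
    return grad


def gradient_descent(f, x0, rate=2000.0, n_steps=500, tol=1e-10):
    """Plain fixed-rate gradient descent; stops when the gradient norm falls below tol."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        g = numerical_gradient(f, x)
        if np.linalg.norm(g) < tol:
            break
        x = x - rate * g
    return x


def toy_r_factor(angles, target=(90.0, 77.0)):
    """HYPOTHETICAL stand-in for the R-factor: a single-minimum quadratic bowl."""
    d0 = angles[0] - target[0]
    d1 = angles[1] - target[1]
    return 1.0e-4 * (d0 ** 2 + 0.5 * d1 ** 2)


# Start from the dark structure, taken as zero change in both torsional angles.
best = gradient_descent(toy_r_factor, [0.0, 0.0])
print(best)
```

The step size here is tuned to the toy surface; a real R-factor map would call for a line search or adaptive rate, and for surfaces with several minima a global method such as simulated annealing would be the safer choice.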

The primary conclusion from this work is that the determination of a fast structural change of a molecule on photoexcitation may not require the formation of a crystal. What we have demonstrated is that quantities measured in a pump-probe experiment using an X-ray free-electron laser on dark and photoexcited biomolecules in a disordered ensemble in solution contain information from which small changes in a structure can be deduced. The initial demonstration is only for a case that can be characterized by changes in just two structural parameters, namely, two torsional angles of a protein. A more complete demonstration of the method will have to await the development of techniques for extracting, from the same measurable data, possible structural changes in any part of the molecule. Nevertheless, we feel the present result is important in showing that quantities measured from disordered ensembles of biomolecules in solution may hold the key to unlocking such fast structural changes in uncrystallized biomolecules.

Acknowledgments

M. Schmidt and D. K. Saldin acknowledge the support for this work from the National Science Foundation (Grant no. MCB-1158138), and D. K. Saldin thanks the Research Growth Initiative (RGI) of the University of Wisconsin-Milwaukee.