#### Abstract

New global potential energy surface for the ground electronic state of ozone is constructed at the complete basis set level of the multireference configuration interaction theory. A method of fitting the data points by analytical permutationally invariant polynomial function is adopted. A small set of 500 points is preoptimized using the old surface of ozone. In this procedure the positions of points in the configuration space are chosen such that the RMS deviation of the fit is minimized. New *ab initio* calculations are carried out at these points and are used to build new surface. Additional points are added to the vicinity of the minimum energy path in order to improve accuracy of the fit, particularly in the region where the surface of ozone exhibits a shallow van der Waals well. New surface can be used to study formation of ozone at thermal energies and its spectroscopy near the dissociation threshold.

#### 1. Introduction

The global potential energy surface (PES) for the ground electronic state of ozone,
is now 10 years old. The old PES was based on the *ab initio* calculations of the electronic structure at the icMRCI+Q/cc-pVQZ level of theory using CASSCF(12,9) active space and was represented by a 3D-spline interpolation of the data on a structured rectangular grid [1, 2].

The *ab initio *data obtained at that level of theory contained two serious deficiencies. First, the computed dissociation energy was too low—about lower than the experimental value available at that time (see below). Second, the surface exhibited an artificial barrier on its way to dissociation—about above the threshold. In the improved, most popular version of that surface [3] an analytic correction function was added to the spline in order to (i) eliminate the barrier by taking into account the results of more accurate *ab initio* calculations [4, 5] carried out along the one-dimensional minimum energy path to dissociation (MEP) and (ii) make the surface deeper in order to reproduce experimental value of the dissociation energy available at that time [6], .

More accurate *ab initio* calculations of the electronic structure of O_{3} were impossible 10 years ago but have become quite feasible nowadays. Furthermore, according to new experimental information [7, 8], the surface must be somewhat deeper, . Clearly, there is an opportunity and a need of improving the existing PES of ozone.

During the last decade several attempts have been made [8–10] to assess the level of theory needed to construct an accurate PES of O_{3} and obtain correct value of the dissociation energy without any empirical corrections, directly from the *ab initio* calculations. The main conclusion from the previous work was that the most important effect is due to the basis set size. It was also suggested that extrapolation to the complete basis set (CBS) limit might be necessary. In this paper we use new *ab initio* results obtained at the icMRCI+Q/CBS level of theory using CASSCF(12,9). Extrapolation to the CBS limit is done based on calculations with aug-cc-pVQZ and aug-cc-pV5Z basis sets at every point of the PES. Details of the electronic structure calculations will be reported elsewhere [11]; here the focus is on construction of the accurate global analytic fit of the *ab initio* data points.

The problem is twofold. First, the *ab initio* points are still very expensive computationally. At the level of theory indicated above, the CPU time per point is about 5 hours. If the old method is employed (which is building a spline of data on a structured 3D grid), the number of points required is on order of 6,000. Simple math tells us that such calculations are still quite expensive. Furthermore, incorporation of larger active space and inclusion of spin-orbit correction would further increase the computational cost, quite dramatically. Thus, reduction of the number of points computed *ab initio* is highly desirable. Second, the potential energy surface of ozone is rather complicated. In addition to the deep covalent well, it contains a shallow van der Waals (vdW) well in the asymptotic region of the PES. The vdW well is separated from the main well by a reef-like structure with top of the reef slightly below the dissociation threshold. Accurate reproduction of these features by the fitting function is highly desirable, because they are expected to affect the dynamics and spectroscopy of ozone.

The ozone molecule contains three electronically identical atoms, and its PES possesses the corresponding symmetry. It is wise to utilize this symmetry in order to construct an efficient and accurate analytic representation of the PES. We found that the fitting approach of Braams and Bowman [12] allows obtaining accurate representation of ozone PES with as few as 500–600 data points. If such small number of points is indeed sufficient, the electronic structure calculations can be carried out at a very high level of theory, which offers a very attractive method of building accurate PES of ozone and other highly symmetric molecules.

Note that the fitting approach of Braams and Bowman [12] does not require a structured grid. The data points can be anywhere on the PES. In this work we propose a simple and general method for generating a small set of optimally placed points. Two reasonable criteria for choosing good points are (a) minimizing RMS of the global fit and (b) emphasizing most important parts of the PES (e.g., the minimum of the well, or the transition state).

The paper is organized as follows. In Section 2 we give details of the fitting approach. Section 3 describes the procedure we used for generating a small set of pre-optimized data points for the following *ab initio* calculations. Results of the calculations are presented and discussed in Section 4. Conclusions are given in Section 5.

#### 2. The Fitting Method

The fitting method used in this study has been introduced by Braams and Bowman [12] for treatment of larger polyatomic molecules, with up to 10 atoms. Their approach uses basis functions that are invariant with respect to permutations of identical atoms. To the best of our knowledge it has not yet been applied to fit the PES of a homonuclear triatomic molecule which represents, at the same time, the smallest and the most symmetric polyatomic molecule. In what follows we give details of this approach as applied to molecules involving three identical atoms, such as ozone molecule.

The first step is transformation from the set of three internuclear distances to the set of three *Morse variables * defined as
Here is the Morse variable for the pair of th and th atoms; is a fixed parameter that possesses units of length and depends on the system. The choice of this parameter for ozone is described in the next section.

Using the Morse variables, the fitting function is expressed by the following expansion: where is order of the polynomial fitting function. The explicit forms for the first five terms of this expansion are: Here is the zero-order coefficient, is the first-order coefficient, the triplet represents a set of second-order coefficients, and so on. The functions are called monomials of th order, because each of them contains only the terms of order .

Equations (4) above show that the PES is expressed using various products of three permutationally invariant terms, namely, the first-order term, the second order term, and the third order term, All possible products of these permutationally invariant terms are included in Equations (4) to build the permutationally invariant function for fitting the PES. Using notations of (5)–(7) this function can be re-expressed in the following useful form: The way in which the terms of (8) are combined is quite clear—according to their order, which also gives a natural method for truncating the expansion. The terms, , and are, of course, equal to unity but are included in (8) explicitly in order to emphasize the general structure of this expression. Namely, the term of each order contains all possible monomials of this order that can be formed from , , and . The relationship between the polynomial order and the number of adjustable coefficients in the overall expansion, , is given in Table 1.

Assume that the number of molecular geometries (the *ab initio* data points) used to construct the fit is . Vector of length is introduced that contains the values of *ab intio* energies at these points. Vector of length is introduced that contains the values of fit at these points. We also introduce vector of length that contains a set of fitting coefficients for the polynomial of order . Finally, the matrix of the size -by- is introduced that allows obtaining values of the fit at the data points from the matrix-vector product:
The structure of matrix follows directly from (8) above. Namely, the elements of matrix can be expressed as:
where index labels data points and defines, , and according to (5)–(7), while index labels the terms of expansion in (8) and defines a set of powers , , and . As introduced above, and . For each given order of expansion , a set of three powers in (10) has to satisfy the following monomial condition:
As introduced above, . This relation is illustrated in Table 2, where we listed all allowed sets of powers for orders up to the 5th. Using this approach the elements of matrix can be easily generated for any set of data points and any order of the fitting polynomial.

Next, the deviation between the fitting function and the *ab initio* energies at the data points is minimized,
by finding the appropriate values of . This can be formulated as a linear least squares fitting problem. In our numerical implementation, this problem is solved using subroutine DGELS of the LAPACK library [13]. Accuracy of the fit is measured by the root mean square (RMS) deviation defined as
with being the *ab initio* energy at the point and being the value of the fit at the same point, obtained from (8). This RMS deviation characterizes global quality of the fit. The local quality of the fit for each point is given by the absolute deviation:
A FORTRAN subroutine has been developed to determine a set of adjustable coefficients for any given set of data points and any given order of the fitting function. This subroutine is general and can be used for any homonuclear triatomic molecule.

In addition to the internuclear distances and the Morse variables , we extensively use one more set of the internal coordinates for the triatomic molecule: —the adiabatically adjusting principal axes hyperspherical (APH) coordinates [14]. These coordinates are not used in the fitting procedure, but they are very handy for graphical representation of the surface. Thus, the hyperradius is used as an abscissa in Figures 1, 3 and 4 below. The full set of andis used in Figure 5. Working with the APH coordinates, it is useful to remember that the small-amplitude motion along the hyper-radius correlates with the symmetric stretching normal mode, while the large-amplitude motion along describes dissociation, process (1). Small-amplitude motion along correlates with the asymmetric stretching normal mode, while the large-amplitude motion along describes pseudorotation [14]. The hyperangle correlates with the bending normal mode.

**(a)**

**(b)**

#### 3. Generating Small Set of Preoptimized Points

Quite often, even before we start the computationally expensive *ab initio* calculations, we do already have some preliminary information about the PES. This information may come from the experiment (e.g., equilibrium structure, harmonic frequencies, dissociation energy), from the empirical forcefield or from the computationally inexpensive electronic structure calculations (e.g., CASPT2). In the case of ozone, we have an old surface [3], and we want to take full advantage of this information in order to minimize the number of points needed to construct an accurate fit.

Using the old surface of ozone [3], we experimented with polynomials of different orders, with different values of , and with different number of points in the data set. The polynomial of 16th order with Bohr was chosen. Such polynomial contains adjustable parameters. We found that, when the number of data points is around , the polynomial of 16th order has enough flexibility to reproduce all features of the complicated ozone PES, without showing artificial oscillations. Polynomials of lower-orders were not flexible enough and exhibited higher RMS deviations. Polynomials of higher-orders were more successful in fitting the data, but they showed artificial oscillations in the regions of sparse points. We searched for solution of this problem and found that the undesirable oscillations of the high order polynomials could be reduced by adding more data points. Finally, we have chosen to proceed with a small set of points, about 500, and the polynomial function of 16th order. The choice of number 500 was, of course, somewhat arbitrary. In what follows we will call these points the *fitting set*.

Next, we tried to find a way of choosing positions of points of the fitting set in a more or less optimal way with the purpose of obtaining the most accurate representation, by the polynomial of 16th order, of the relevant part of the PES. For our purposes (reaction dynamics at thermal energies) most relevant is the low energy part of the PES. Using the old surface we generated a very dense 3D grid of points in the configuration space. Points at energies higher than 0.2 eV above the dissociation threshold were removed as irrelevant, leaving approximately half million points in what we will call the *reference set*. The reference set of points is discrete, but, since it is very dense, it approximates reasonably well the continuous configuration space. At the first step the 500 points of the fitting set were placed randomly in the relevant part of the configuration space. Then we moved points of the fitting set, literally one by one, to new positions making sure that each such move maximizes accuracy of the surface representation. Accuracy of the surface representation was characterized by the RMS deviation between the fit and the old surface (computed using the reference set of points). Each possible position of the fitting point was evaluated by refitting the corresponding set of 500 fitting points and calculating the corresponding RMS deviation of the fit. The position that gives smallest RMS deviation was chosen, and the point was moved into that new position, after which the procedure was applied to the next point of the fitting set, and the next.

This procedure does not have to stop after all 500 points of the fitting set have been moved to new better positions. We tried to reiterate the whole set again and again, until the “memory” of the initial (arbitrary) placement of the fitting points has been lost.

Figures 1 and 2 illustrate this procedure. Figure 1(a) shows initial positions of the 500 fitting points in the configuration space. In this example the initial placement of points was intentionally done in the worst possible way. All points sit very close to each other in one remote region of the PES. Coverage of the configuration space is minimal, and the RMS deviation of the fit is huge, close to . Figure 2(a) shows that as points are moved one by one to their new better positions, the RMS deviation of the fit drops very quickly, roughly exponentially. After 200 points have been moved, the exponential fall regime transforms into an almost horizontal line regime, where the RMS deviation of the fit fluctuates in the range between 33 and 50 cm^{−1}. Figure 2(b) shows that further iterations do not really improve the RMS deviation of the fit, but they promote the loss of “memory” of the initial placement of points. Figure 1(b) shows the final distribution of fitting points in the configuration space. Note that the final distribution of points in Figure 1(b) is not uniform. Some emphasis on the high energy part of the PES is clearly present, which means that this part of the PES is harder to fit accurately, compared to the well region at low energies. Some emphasis of the reef region is also seen. This final fitting set is characterized by the RMS deviation of 36.0 cm^{−1}.

**(a)**

**(b)**

**(a)**

**(b)**

**(a)**

**(b)**

**(c)**

In order to characterize quality of the analytic fit itself, we computed the RMS deviation between the fitting function and the 500 data points of the fitting set. This value was only 33.0 cm^{−1}.

One interesting observation that follows from the numerical experiments described above is that moving only a small number of points is sufficient in order to reduce the RMS deviation to a relatively low value. Further moves do not necessarily improve the RMS deviation. For example, in the worst case scenario presented in Figures 1 and 2, it was sufficient to move the first 200 points. In the less extreme tests we conducted, when the initial coverage was close to the uniform, we saw that moving about 100 points was enough.

One can ask the question: would it be possible to use just those first 100 points and construct an accurate fit of the PES? The answer is no. We need more points than fitting coefficients, and our tests showed that the fitting function of 16th order is smooth only if we use about 500 points or more.

Detailed inspection of the final fit showed that although the covalent well was reproduced reasonably well, the fit of the MEP in the region of the vdW well was not really acceptable. Indeed, from Figure 1(b), we see that only a small number of fitting points ended up in the vdW region of the PES. In order to describe this part of the PES more accurately, one has to force the points to move into this part of PES, and cover it thoroughly, even if the RMS deviation of the global fit is not improved by such placement of points. This can be done, for example, by associating different fitting weights with different regions of the PES and we tried this approach. Another approach is to manually cover the asymptotic part of the PES with fitting points and forbid those points to move out. We found that this second method is easier to implement and ended up with the algorithm described in the next paragraph.

Based on the fact that moving just 100 points allows minimizing the RMS deviation of the fit, we spilt the set of 500 points onto two groups. The first 400 points were placed throughout the configuration space quasi-randomly (using a very sparse grid in the hyperspherical coordinates) and were not allowed to move. In Figure 3(a), these points are shown in black. About 80% of these points belong to the vdW region, while the 20% of them describe the main well. Positions of the remaining 100 points were optimized, using the method described above, in order to reduce the RMS deviation of the fit. Their final positions are shown in Figure 3(a) in red. The resultant RMS deviation between the fit and the reference set was 36.4 cm^{−1}. The RMS deviation between the fit and the fitting set of 500 points was only 36.0 cm^{−1}. These numbers are slightly higher (~9%) compared to the case when all 500 points were optimized, but sacrificing slightly the global RMS deviation of the fit allows reproducing much better the vdW well region of the PES.

We would like to call a set of points obtained in this way a *preoptimized* set of points, because the points optimized using preliminary version of PES may not be equally optimal for the new accurate PES. However, it is obvious that pre-optimizing a set of points is better than using an arbitrarily chosen set of points, especially if the *ab initio* calculations are computationally expensive.

We have to stress again that our pre-optimization procedure was carried out using the old surface of O_{3}, without doing any new *ab initio* calculations. The data points from the old surface are basically free, which allowed us to use them so liberally (i.e., repeatedly move points to new positions disregarding the old ones). We have to admit that if the old PES would not exist (which may be the case for new molecules), one option would be to actually compute points during pre-optimization. This would make the pre-optimization procedure more demanding computationally and would probably require employing more elegant approach to optimization, such as the steepest descend method.

Finally, the preoptimized set of points was used to carry out new *ab initio* calculations and construct new PES of ozone.

#### 4. Results and Discussion

All *ab initio* calculations were carried out using, MOLPRO suite of the electronic structure codes [15]. The level of theory was icMR-CISD+Q using the CASSCF(12,9) active space. At each point of the PES, we carried out two independent calculations: first with aug-cc-pVQZ and second with aug-cc-pV5Z basis sets. Then, for each point of the PES, we made extrapolation to the CBS limit using the following extrapolation formula [10]:
The resultant value of the dissociation energy at the CBS limit is , which is higher than the dissociation energy of the old surface [1, 2]. This represents a dramatic improvement. New theoretical value of the dissociation energy is only () lower compared to the recent experimental value [7, 8].

We also tried calculations with aug-cc-pV6Z basis sets and made extrapolation based on the aug-cc-pV5Z and the aug-cc-pV6Z data. We found, however, that this adds only (about ) to the dissociation energy of ozone and decided not to follow this path. Calculations with larger active space CASSCF(18,12) and with incorporation of the spin-orbit correction are ongoing and will be reported elsewhere [11].

New data points at the CBS limit are shown in Figure 3(b). By comparing energies of the pre-optimized set of points (old PES, Figure 3(a)) to the set of new points computed in this work (Figure 3(b)), one can easily see that the two PESs are somewhat different, particularly in the regions above the well and in the vicinity of the transition state.

The CBS data points were fitted by the permutationally invariant polynomial of 16th order as explained in Section 2. The RMS deviation of the resultant fit was 26 cm^{−1} which is, in fact, slightly lower than the RMS deviation seen at the pre-optimization step (Section 3). This result is quite encouraging and supports the idea (and efficiency) of pre-optimizing a set of data points prior to the massive *ab initio* calculations.

For the new analytic PES, we determined the MEP using the Newton-Raphson minimization method [16]. We inspected the MEP and found that, in its vicinity, the typical deviation of the fit from the data points was about 25 cm^{−1}, close to the global RMS deviation. We concluded that new surface can benefit from adding more points to the vicinity of the MEP. Note that the fitting method of Braams and Bowman [12] allows adding points to the fitting set *a posteriori*. Refitting an upgraded set of the data points is straightforward. We tried to add 4 points in the range of small values of , 30 points close to the bottom of the covalent well, 14 points near the reef, and 22 points along the bottom of the vdW well—about 70 new points total. These additional points are shown in Figure 3(b) in green. Refitting the whole set of points gave even better global RMS deviation of 22 cm^{−1} and resulted in a much better description of the PES along the MEP. The final MEP is shown in Figure 3(b).

Detailed views of three critical regions of new PES are given in Figure 4. Figure 4(a) shows the bottom of the covalent well, while Figure 4(b) shows the reef and Figure 4(c) shows the vdW well. In these figures the minimum energy path is shown by the solid line, the values of *ab initio* energies at the data points are shown by filled symbols, while the values of analytic fit at the data points are shown using plus-signs. Deviations of the fit from the data are about 3 cm^{−1} near the bottom of the covalent well, only 2 cm^{−1} near the top of the reef, and about 7 cm^{−1} near the bottom of the vdW well. These numbers demonstrate excellent accuracy of the analytic representation of the PES. On this new PES, the top of the reef is 102 cm^{−1} below the dissociation threshold; the vdW well is 215 cm^{−1} deep.

In Figure 5 we show an isoenergy surface for the PES of ozone in three dimensions. In this approach to PES visualization, the configuration space where the potential energy of the molecule is higher than a given cut-off value (here 0.2 eV above the dissociation threshold) is made transparent, while the configuration space classically accessible to the motion of nuclei at this energy is made visible as a 3D structure. The APH hyperspherical coordinates are used [14] in order to emphasize symmetry of the PES. The range of hyper-radius in Figure 5 is the same as in Figure 3, namely, 3.5 < < 11 Bohr. Recall that the hyperangles and correlate with bending and asymmetric stretching motions, respectively. Note that, due to the permutation symmetry of ozone, there are three identical covalent wells in the entire configuration space, which can be reached by changing the value of . Those are seen as three ellipsoidal-like lobes in the lower part of Figure 5. They are connected to three identical dissociation channels seen in the upper part of Figure 5. The reef structures discussed above (the transition states) are seen on this picture as narrow bottlenecks connecting each covalent well to two dissociation channels. Figure 5 gives global view of the PES and permits seeing all its features simultaneously. We also see that fitting by the permutationally invariant polynomial function takes full advantage of symmetry of the molecule.

#### 5. Conclusions

The approach of fitting the *ab initio* data by the permutationally invariant polynomial functions allows constructing accurate global PESs of symmetric molecules using relatively small number of points. Several features of the method contribute to reduction of the number of data points. First of all, analytic form of the fitting function takes the full advantage of symmetry and uses most efficiently every point computed *ab initio*. Redundant coverage of the configuration space is avoided. Second, this approach allows focusing on the most important part of the PES (here low energies) and avoids placing points into the irrelevant regions. Indeed, the high-energy regions of PESs are not only irrelevant to dynamics or spectroscopy but also harder to characterize *ab initio*. Third, the approach allows adding *ab initio* points one by one to the most important regions of the PES in order to improve representation of the surface there. This is very useful since in chemistry the transition state point is, usually, the most important. Finally, positions of the data points in the configuration space can be efficiently pre-optimized using available theoretical or experimental information. In our case an older PES of ozone was extremely useful.

It is true that such complete information as an older PES may not be available for every molecule in the nature, but for many important molecules the older PESs do exist. If unavailable, an approximate PES can be constructed using experimental data or the *ab initio* calculations at a reduced level of theory. In fact, the permutationally invariant polynomials of low order can be employed to construct an approximate PES quickly, since they contain very small number of fitting parameters (see Table 1) and require very few data points.

#### Acknowledgments

Joel Bowman and Bas Braams at Emory University are gratefully acknowledged for their invaluable advice on adopting the fitting approach. D. Babikov acknowledges support as Visiting Fellow at Emerson Center for Scientific Computation, Emory University. Qadir Timerghazin at Marquette University is acknowledged for his help in setting up the electronic structure calculations. This research was partially supported by the NSF Atmospheric Chemistry Program, grant number 0842530, and partially by the Air Force Office of Scientific Research, grant number FA9550-09-1-0604. We used resources of the National Energy Research Scientific Computing Center, which is supported by the Office of Science of the US Department of Energy under Contract no. DE-AC02-05CH11231.