Research Article  Open Access
Jose Fayos, "Space Group Approximation of a Molecular Crystal by Classifying Molecules for Their Electric Potentials and Roughness on Their Inertial Ellipsoid Surface", Advances in Chemistry, vol. 2014, Article ID 737480, 9 pages, 2014. https://doi.org/10.1155/2014/737480
Space Group Approximation of a Molecular Crystal by Classifying Molecules for Their Electric Potentials and Roughness on Their Inertial Ellipsoid Surface
Abstract
In order to predict the most probable space group in which a molecule crystallizes, it is assumed that the molecular shape and the electric potential distribution on the molecular surface are the main factors, or predictors. However, comparing and classifying molecules by these two factors is difficult, because molecules are in general very different objects. Thus, in order to compare molecules, each one is reduced to its inertial ellipsoid, on whose surface 26 equally spaced points are chosen; at each point a roughness factor and the electric potential due to all atomic charges of the whole molecule are calculated. By this procedure, different molecules encoded by these two predictor vectors can be compared and classified, showing that molecules that crystallize in the same space group have more similar predictor vectors. This result opens the possibility of predicting the most probable space group associated with a molecule.
1. Introduction
The first hypothesis considered for crystal packing prediction (CPP) of organic molecules is that the isolated molecule contains the information of its future crystal [1]. This is so for the so-called “blind tests,” the last of which was published in 2011 [2], where several laboratories compete to find the crystal structure of a molecule by diverse calculations, and where it is assumed that 95% of the molecules present no polymorphs and prefer a cell with a given space group (SG). Some earlier approaches obtained molecular crystal structure information by data mining [3, 4]. In the present study it is further assumed that the space group of a molecular crystal is mainly predetermined by the molecular form, or roughness, and by the electric potential distribution on the molecular surface, both factors being the best predictors for crystal packing, including the formation of hydrogen bonds. In fact, electrostatic forces determine molecular reactions, as can be observed by X-ray in the electron density distributions of crystalline molecules [5, 6], where interactions by electrostatic forces (including H-bonds) between equal molecules in a crystal would determine its crystal packing. The purpose of this work is to classify the molecules into groups by similarity of the above predictors and to check whether these assumed types of packing are correlated with their space group SG or, conversely, to verify that molecules crystallizing in the same SG have similar aggregation predictors. However, comparing these two predictors between usually dissimilar molecules is not simple unless the molecules are first reduced to more comparable objects. In a previous work [7], some molecular crystal descriptors, such as the cell axes and the presence of some symmetry elements, but not the SG, were predicted by reducing each molecule to its inertial ellipsoid (adding to each axis the hydrogen VDW radius of 1.17 Å).
The same reduction of the molecule to its inertial ellipsoid is used here. The ellipsoid is defined by its three axes, Large, Medium, and Small (L, M, S), and 26 points are placed on its surface: 6 at the ends of the ellipsoid axes, 12 at the edge centers, and 8 at the face centers, all points being approximately equidistant from each other. Figure 1(a) shows the 26 numbered points with their sequence of coordinates on the ellipsoid surface; for clarity, the ellipsoid has been deformed to a cube in this figure.
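Figure 1: (a) The 26 numbered points with their sequence of coordinates on the ellipsoid surface, with the ellipsoid deformed to a cube for clarity. (b) The molecule HEPJER in a schematic inertial ellipsoid. (c) Sequence of concavities and potentials along the 26 points on the inertial ellipsoid of HEPJER.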
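The placement of the 26 points can be sketched as follows. This is only an illustration: the text does not state how a cube direction such as (110) or (111) is projected onto the ellipsoid, so the mapping used here (normalize the direction code, then stretch by the semi-axes) is an assumption, and the function name and the semi-axis values are hypothetical.

```python
import itertools
import math


def ellipsoid_points(a, b, c):
    """Return 26 surface points of an ellipsoid with semi-axes a, b, c.

    Keys are direction codes (l, m, s) with entries in {-1, 0, 1}, read
    like the labels (100), (110), (111) used in the text.
    """
    points = {}
    for code in itertools.product((1, 0, -1), repeat=3):
        if code == (0, 0, 0):
            continue  # the centre is not a surface point
        l, m, s = code
        # Assumed mapping: normalise the direction code, then stretch by
        # the semi-axes, so that (x/a)^2 + (y/b)^2 + (z/c)^2 = 1.
        norm = math.sqrt(l * l + m * m + s * s)
        points[code] = (a * l / norm, b * m / norm, c * s / norm)
    return points


# Hypothetical semi-axes along L, M, S (arbitrary illustrative values).
pts = ellipsoid_points(5.0, 3.0, 2.0)
n_axis = sum(1 for code in pts if sum(map(abs, code)) == 1)
n_edge = sum(1 for code in pts if sum(map(abs, code)) == 2)
n_corner = sum(1 for code in pts if sum(map(abs, code)) == 3)
print(len(pts), n_axis, n_edge, n_corner)
```

Keying the points by their direction code keeps the labels of Figure 1(a) attached to the coordinates, which is convenient when the molecule is later reflected across the planes L = 0, M = 0, or S = 0.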
The classical charges for the atoms in every molecule were calculated by Chem3D Pro [8]; then the electrostatic potential $V_P$ at each of those 26 points P of every ellipsoid surface, due to the charges $q_i$ of all atoms $i$ in the molecule, was calculated by $V_P = \sum_i (q_i / r_{iP})$, where $r_{iP}$ are the distances from point P to the atoms.
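As a minimal sketch of this sum, assuming the atoms are given as (charge, position) pairs and that no unit conversion or Coulomb constant is applied (the text gives only the bare sum of charge over distance), the potential at one surface point could be computed as below. The function name and the toy charges are hypothetical.

```python
import math


def potential_at(point, atoms):
    """Sum charge / distance over all atoms for one surface point.

    atoms is a list of (charge, (x, y, z)) pairs; no physical constants
    or unit conversions are applied (assumption).
    """
    return sum(q / math.dist(point, xyz) for q, xyz in atoms)


# Toy two-atom example with made-up charges and positions.
toy_atoms = [(0.4, (0.0, 0.0, 0.0)), (-0.4, (1.0, 0.0, 0.0))]
v = potential_at((2.0, 0.0, 0.0), toy_atoms)
print(v)
```

Applying the function to each of the 26 surface points of a molecule gives its potential vector.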
In addition, a roughness factor at those 26 points of every molecular ellipsoid surface, from now on called the concavity factor C, was calculated as the average of the distances from each point to its four closest atoms in the molecule. In total, two vectors per molecule, 26V for potentials and 26C for concavities, each of 26 components, are the space group predictors in this work. In fact, the main differences from the previous work [7] are the addition of this concavity vector, the absence of scaling of the molecular potential vectors, and a simpler space group classification procedure.
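The concavity factor described above can be sketched in a few lines. The function name and the toy coordinates are hypothetical, and the treatment of molecules with fewer than four atoms (averaging over whatever atoms exist) is an assumption not covered by the text.

```python
import math


def concavity_at(point, atom_positions, n_closest=4):
    """Average distance from a surface point to its n_closest atoms."""
    distances = sorted(math.dist(point, xyz) for xyz in atom_positions)
    nearest = distances[:n_closest]  # fewer atoms than n_closest: use all
    return sum(nearest) / len(nearest)


# Toy example: five atoms on a line at distances 1, 2, 3, 4 and 10.
toy_positions = [(d, 0.0, 0.0) for d in (1.0, 2.0, 3.0, 4.0, 10.0)]
c = concavity_at((0.0, 0.0, 0.0), toy_positions)
print(c)
```

A larger value means the nearest atoms lie farther from the surface point, that is, the molecule is more concave there.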
Table 1 shows global molecular form parameters such as the average molecular concavity ⟨C⟩ and its standard deviation desv, the ratios between the principal inertial axes (M/L, S/L, S/M), and other parameters of internal symmetry of the molecule, some of them used in the previous work [7].
 
Notes to Table 1: ⟨C⟩ = average of C over the 26P; desv = standard deviation of C. M/L, S/L, and S/M refer to the inertial ellipsoid. PL = 2: planar molecule (H atoms not considered); PL = 1: pseudoplanar; PL = 0: not planar. mS, mM, mL = 2: molecular symmetry plane m perpendicular to S, M, L; mS, mM, mL = 1: pseudo m perpendicular to S, M, L; mS, mM, mL = 0: no m.
Table 2 shows the averages of those parameters per space group SG. Although all these form parameters somehow condition the SG of the crystal, it is not easy to find in these tables any similarities between the global form parameters of molecules of the same SG, which justifies extending that global form information over the 26P on the inertial ellipsoid.

2. Molecular Space Group Predictors
Because of the relation between the present and previous work [7], the same 31 molecules are chosen here. Table 3 shows those molecules, all having an azole group with three different substituents: a group of 17 molecules, 17P21/c, with space group SG P21/c (also P21/n or P21/a), six of them, 6FAQP21/c, also sharing a COOEt group in the same substituent; eleven molecules distributed among several SG (3 in Pbca, 3 in P212121, 3 in Pna21, and 2 in P1); and three other molecules crystallizing in different SG.

Finally, in order to compare properly both molecular vectors, 26V for potentials and 26C for concavities, between different molecules, each molecule was reoriented within its inertial ellipsoid IE by using the symmetry planes L = 0, M = 0, or S = 0, so that, among the eight octants, the one with the largest weighted potential V_weighted lies in the IE octant (+L+M+S). Here V_weighted is calculated from the potentials at the seven points of the octant in the proportion $V_{\text{weighted}} = 3V(111) + 2[V(110) + V(101) + V(011)] + [V(100) + V(010) + V(001)]$. The Supplementary Table (available online at http://dx.doi.org/10.1155/2014/737480) shows the final comparable values of 26V and 26C for the 31 reoriented molecules. It is important to see in Tables 1 and 3 that two molecules can share the same space group SG not only when their structures differ little, as for FAQROE and FAQSAR, but also when they differ greatly, as for FAQROE and KOSFUT, suggesting that the sequences of 26 potentials and 26 concavities around the inertial ellipsoid will determine the space group better.
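The reorientation step can be sketched as follows, assuming the 26 potentials are stored in a dictionary keyed by direction codes (l, m, s) with entries in {-1, 0, 1}, and assuming that "largest" means the largest signed value. Both function names are hypothetical; the weights 3, 2, and 1 are those given in the text.

```python
import itertools


def v_weighted(values, octant):
    """Weighted potential of one octant, with weights 3 : 2 : 1."""
    sl, sm, ss = octant
    corner = values[(sl, sm, ss)]
    edges = values[(sl, sm, 0)] + values[(sl, 0, ss)] + values[(0, sm, ss)]
    axes = values[(sl, 0, 0)] + values[(0, sm, 0)] + values[(0, 0, ss)]
    return 3 * corner + 2 * edges + axes


def reorient(values):
    """Reflect across L = 0, M = 0, S = 0 so that the octant with the
    largest weighted potential ends up in (+L+M+S)."""
    octants = list(itertools.product((1, -1), repeat=3))
    sl, sm, ss = max(octants, key=lambda o: v_weighted(values, o))
    return {(l, m, s): values[(l * sl, m * sm, s * ss)]
            for (l, m, s) in values}


# Toy vector: zero everywhere except one corner (made-up values).
codes = [c for c in itertools.product((1, 0, -1), repeat=3) if c != (0, 0, 0)]
toy = {code: 0.0 for code in codes}
toy[(-1, 1, -1)] = 1.0
oriented = reorient(toy)
print(oriented[(1, 1, 1)], v_weighted(oriented, (1, 1, 1)))
```

The same sign flips would have to be applied to the concavity vector of the molecule so that both vectors refer to the same orientation.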
It is interesting to consider here that all the molecules could also be reoriented to have the lowest, instead of the largest, V_weighted octant in (+L+M+S). Although the relative orientation between the largest and lowest V_weighted octants is not the same for all molecules, the average distribution on the inertial ellipsoid IE surface of the 26V and −26V, respectively, is similar for both orientations of the 31 molecules, which reinforces this analysis. In the end, the largest V_weighted octant was taken for molecular reorientation. Figure 2(a) shows first the distribution along the IE of the 26 potentials averaged over the 31 azole molecules, 26⟨V⟩, together with the 26 standard deviations of these averages, 26std, showing two sawtooth peaks, around (110)_(111) and (010)_(011), as expected (see Figure 1(a)) for molecules oriented with the largest V_weighted octant in (+L+M+S).
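Figure 2: (a) Distribution along the inertial ellipsoid of the averaged potentials and their standard deviations per group of molecules. (b) Distribution of the averaged concavities and their standard deviations per group of molecules.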
The centrosymmetric molecule HEPJER, almost planar in (L, M, 0), is shown in Figure 1(b) inside a schematic inertial ellipsoid IE, because its simplicity makes it suitable for understanding the sequence of concavities and potentials along the 26 points on its IE shown in Figure 1(c). The concavities are maximal at the 8 molecular LM0 contour points, the remaining concavities on either side of the plane S = 0 being lower. The symmetrical potentials have lower values at the 0MS points away from the molecule and higher peaks especially at the LM0 points.
3. Analysis of Averaged Potentials and Concavities by Space Group
Figure 2(a) shows the distribution from (1−1−1) to (−111) along the inertial ellipsoid of the 26 potentials averaged over the molecules, 26⟨V⟩, for different groups: the total of 31az, the 14NotP21/c, the 17P21/c, and the 6FAQP21/c. It also shows the distribution of the 26 standard deviations of those averages, des(26⟨V⟩), per group, which indicate the similarity between the potentials of the molecules at each of the 26 points of their inertial ellipsoid. In general the des(26⟨V⟩) are largest at singular points with maximum or minimum values of the potential, and in particular they are clearly larger for the first two groups, of mixed space groups, than for the last two P21/c groups, as also shown in Table 4 by the total averages under the column ⟨des⟨V⟩⟩ (the average of the 26 des values). Figure 2(a) also shows that the distributions of 26⟨V⟩ for 31az and 14NotP21/c are similar, with two sawtooth peaks, although the second peak is less pronounced for 31az. The distributions of 26⟨V⟩ for the single-space-group sets 17P21/c and 6FAQP21/c differ from the previous ones by the loss of the second sawtooth peak and an increased negative potential at (0−10).
 
Notes to Table 4: des⟨V⟩ shows the similarity of V for the molecules at the same point of IE. des⟨C⟩ shows the similarity of C for the molecules at the same point of IE. des shows the similarity between the .
Therefore, although the overall molecular shape factors of Tables 1 and 2 and the contents of the molecules in Table 3 do not seem to indicate similarity between molecules within each space group SG, the distributions of the 26 potentials along the 26 points of the inertial ellipsoid appear to have some similarity between the molecules of the same SG. In fact, the analysis of Figure 2(a) and the values of ⟨des⟨V⟩⟩ in Table 4 suggest that there is an association between the potential predictor vector 26V of a molecule and the space group SG of its molecular crystal.
Figure 2(b) shows the average distribution of the 26 concavities, 26⟨C⟩, and of the corresponding standard deviations des(26⟨C⟩) for the four groups of molecules: 31az, 14NotP21/c, 17P21/c, and 6FAQP21/c. While for 14NotP21/c the distribution of 26⟨C⟩ is pseudosymmetrical around the center (00 ± 1), that symmetry tends to break for the 17P21/c molecules and more so for the 6FAQP21/c molecules, especially in the area where their 26 potentials lose the second sawtooth peak. The distribution of 26⟨C⟩ for 31az, which involves molecules of equal and different space groups, is intermediate, as might be expected. Table 4 also shows the quantitative differences between the average distributions of the 26 concavities for the different groups of molecules. The average values under ⟨des⟨C⟩⟩ show more similar distributions of the 26 concavities between the molecules of the groups 6FAQP21/c and 17P21/c than between the molecules of the group 14NotP21/c, with an intermediate similarity for the total of 31az. This shows the association of the predictor vector 26C of molecular concavities with the space group SG of the molecular crystal, parallel to the association observed above between the predictor vector 26V of potentials and the space group SG.
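The per-group statistics discussed above (the average at each point, the standard deviation of that average, and the mean of the 26 standard deviations) can be sketched as below. The function name is hypothetical, the toy vectors have only three components for brevity, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics


def group_profile(vectors):
    """Per-point mean and standard deviation for a group of molecules.

    vectors holds one equal-length list per molecule (26 components in
    the article). Returns (means, deviations, mean of the deviations).
    """
    per_point = list(zip(*vectors))  # one column of values per point
    means = [statistics.fmean(col) for col in per_point]
    # Sample standard deviation (assumption); needs at least 2 molecules.
    deviations = [statistics.stdev(col) for col in per_point]
    return means, deviations, statistics.fmean(deviations)


# Toy group of two molecules with three-component vectors (made up).
means, deviations, mean_dev = group_profile([[1.0, 2.0, 3.0],
                                             [3.0, 2.0, 1.0]])
print(means, deviations, mean_dev)
```

A smaller mean deviation indicates that the molecules of the group have more similar values point by point, which is the sense in which Table 4 is read in the text.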
Figure 3 shows the molecular distributions of the 26 potentials and 26 concavities for the four minority space groups: Pbca, P212121, Pna21, and P1. Although this analysis is less significant with only 3, 3, 3, and 2 molecules each, it also reveals some similarity between the 26V distributions and between the 26C distributions of molecules with the same space group SG, except for P1 and BIWWEJ (the only molecule in Table 2 with its 26V “calculated” positive). Furthermore, Table 4 shows that the average standard deviation values ⟨des⟨V⟩⟩ and ⟨des⟨C⟩⟩ for these four space groups do not deviate much from those of groups 6FAQP21/c and 17P21/c, with the exceptions described above.
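Figure 3: (a), (b) Distributions of the 26 potentials and 26 concavities for the molecules of the four minority space groups Pbca, P212121, Pna21, and P1.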
Finally, Figure 2(b) also shows the differences between the average potential and average concavity distributions along the inertial ellipsoid surface for the 17P21/c and 14NotP21/c molecular groups. While for L = 1 there is considerable similarity between the ⟨V⟩ values and between the ⟨C⟩ values of both groups, for L = 0 the disappearance of the second sawtooth in the ⟨V⟩ values is accompanied by some variations in the ⟨C⟩ distributions between both groups, and for L = −1 the distributions of both ⟨V⟩ and ⟨C⟩ differ strongly between the two molecular groups.
4. Conclusion
Assuming that the molecular form and the potential distribution on the molecular surface are the major predictors for crystal packing, it is still not easy to compare these properties between different molecules in order to classify them by their space group. To enable this comparison, molecules are reduced to their inertial ellipsoids IE with 26 singular points equally spaced on the surface, at which 26 potentials and 26 concavity factors are calculated. These two molecular vectors, 26V and 26C, are taken as molecular packing or space group predictors for the molecular crystal (assuming no polymorphism). Comparing both predictors among 31 molecules, there is more similarity between molecules crystallizing in the same space group SG than between molecules with different SG. This suggests that each space group would have its own mean distribution of 26 potentials and 26 concavities on a virtual inertial ellipsoid, which would enable predicting the probable space group of a molecular crystal by calculating the 26V and 26C distributions on its molecular inertial ellipsoid. Foreknowledge of the probable space group associated with a molecule would facilitate the total crystal packing prediction CPP and other crystal engineering calculations. For example, if the predicted space group SG (associated with a crystalline form) were not convenient for pharmaceutical processes [1], a simulated molecular modification could be tried in order to change the SG and avoid that molecular aggregation.
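One way such a prediction could be set up is a nearest-mean comparison, sketched below. This is an illustration only, not a procedure described in the article: the Euclidean distance, the function name, and the toy three-component mean vectors are all assumptions.

```python
import math


def nearest_group(vector, group_means):
    """Return the label of the group whose mean vector is closest.

    group_means maps a space group label to its mean predictor vector;
    closeness is plain Euclidean distance (assumption).
    """
    return min(group_means,
               key=lambda sg: math.dist(vector, group_means[sg]))


# Made-up mean vectors, shortened to three components for illustration.
toy_means = {"P21/c": [1.0, 0.0, 0.0], "Pbca": [0.0, 1.0, 0.0]}
guess = nearest_group([0.9, 0.1, 0.0], toy_means)
print(guess)
```

With real data the potential and concavity vectors have different scales, so they would need to be compared separately or rescaled before being combined.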
Summary of Symbols
IE:  Inertial ellipsoid of the molecule 
SG:  Crystal space group 
L, M, S:  Large, medium, small axes of IE 
P:  One of 26 points on the IE surface 
V:  Electric potential at one P 
C:  Concavity at one P 
:  Number of molecules in a group 
des:  Standard deviation of an average 
26V:  Vector with the 26 V on IE surface 
26C:  Vector with the 26 C on IE surface 
⟨V⟩:  Average V on a P of a molecular group 
⟨C⟩:  Average C on a P of a molecular group 
des⟨V⟩:  Standard deviation of average ⟨V⟩ 
des⟨C⟩:  Standard deviation of average ⟨C⟩ 
⟨26V⟩:  Average of the total 26V of a group 
⟨26C⟩:  Average of the total 26C of a group 
⟨des⟨V⟩⟩:  Average of the 26(des⟨V⟩) 
⟨des⟨C⟩⟩:  Average of the 26(des⟨C⟩). 
Conflict of Interests
The author declares that there is no conflict of interests regarding the publication of this paper.
Supplementary Materials
Distribution of potentials V and concavities C among the 26 points of the inertial ellipsoid, for the 31 molecules oriented with their largest V_weighted zone in the octant (+L+M+S).
References
[1] S. L. Price, “Predicting crystal structures of organic compounds,” Chemical Society Reviews, vol. 43, pp. 2098–2111, 2014.
[2] D. A. Bardwell, C. S. Adjiman, Y. A. Arnautova et al., “Towards crystal structure prediction of complex organic compounds – a report on the fifth blind test,” Acta Crystallographica Section B: Structural Science, vol. 67, part 6, pp. 535–551, 2011.
[3] J. Fayos and F. H. Cano, “Crystal-packing prediction by neural networks,” Crystal Growth and Design, vol. 2, no. 6, pp. 591–599, 2002.
[4] J. Fayos, L. Infantes, and F. H. Cano, “Neural network prediction of secondary structure in crystals: hydrogen-bond systems in pyrazole derivatives,” Crystal Growth and Design, vol. 5, no. 1, pp. 191–200, 2005.
[5] H. Nakatsuji, S. Kanayama, S. Harada, and T. Yonezawa, “Electrostatic force theory for a molecule and interacting molecules. 7. Ab initio verification of the force concepts based on the floating wave functions of ammonia, methyl(1+) ion, and ammonia(1+) ion,” Journal of the American Chemical Society, vol. 100, no. 24, pp. 7528–7534, 1978.
[6] Y. Honda and H. Nakatsuji, “Force concept for predicting the geometries of molecules in an external electric field,” Chemical Physics Letters, vol. 293, no. 3-4, pp. 230–238, 1998.
[7] J. Fayos, “Molecular crystal prediction approach by molecular similarity: data mining on molecular aggregation predictors and crystal descriptors,” Crystal Growth and Design, vol. 9, no. 7, pp. 3142–3153, 2009.
[8] F. H. Allen, “The Cambridge structural database: a quarter of a million crystal structures and rising,” Acta Crystallographica B, vol. 58, no. 1, part 3, pp. 380–388, 2002.
Copyright
Copyright © 2014 Jose Fayos. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.