Research Article  Open Access
Nano-Quantitative Structure-Property Relationship Modeling on C_{42} Fullerene Isomers
Abstract
Scientific interest in nanostructures has increased in recent years, and proper methods for their assessment are needed. In silico methods have proved useful as replacements for experimental evaluation and are successfully used as efficient alternatives for the estimation and prediction of compounds' properties or activities. This paper shows that a Quantitative Structure-Property Relationship method can also be applied to nanostructures. Based on a computational experiment, several models describing the total strain energy of C_{42} fullerene isomers were obtained and their characteristics are presented. Furthermore, the best performing model obtained on C_{42} fullerene isomers was validated on C_{40} fullerene isomers.
This paper is dedicated to Professor Mircea V. Diudea on the occasion of his 65th birthday.
1. Introduction
Since their discovery in 1985 [1], fullerenes have attracted interest in different fields of science, including the medical field (e.g., for potential use as antibiotics [2–4], as inhibitors of erythroid cells in the case of fullerenol [5], as drug delivery systems [6], or as inhibitors of inflammatory mediators [7]). Fullerene molecules are constructed from carbon atoms and take the shape of a sphere (also known as buckyballs), an ellipsoid, or a tube [8]. The first spherical fullerene, C_{60}, was discovered in 1985 [1]. Fullerenes have different properties and different numbers of associated isomers (Table 1) [9]. The smallest fullerene (C_{28}) was stabilized by metal encapsulation (with Ti, Zr, and U) by Dunk et al. [10]. Chen et al. showed that the C_{32} fullerene has stronger aromaticity than C_{30} and C_{34} [11]. Fifteen distinct isomers with different energies were reported by Manna and Ghanty, who encapsulated U into various C_{36} cages [12]. Muhammad et al. showed that C_{20} is a closed-shell fullerene and that fullerenes C_{26} and C_{30} are pure open-shell compounds, whereas C_{36}, C_{40}, and C_{42} are intermediate open-shell compounds [13].
 
Table 1 source: http://www.nanotube.msu.edu/fullerene/fullereneisomers.html (accessed June 7, 2015).
The C_{42} fullerenes are small, not necessarily spherical cages with a high pentagon/hexagon ratio [14]. Fullerene C_{42}, along with C_{60}, showed the highest values of the main peak in Matrix-Assisted Laser Desorption Ionization Time-of-Flight (MALDI-TOF) mass spectrometric measurements [15].
Some activities of fullerenes have been modeled using quantitative structure-activity relationship (QSAR) approaches (such as anti-HIV protease inhibition activity [16], antiviral activity [17], and drug delivery systems [18]). However, C_{60} has received most of the attention, while other fullerenes have been neglected with regard to QSAR/QSPR (Quantitative Structure-Property Relationship) modeling. The aim of our research was to model the total strain energy of the isomers of C_{42} fullerene using structural information.
2. Materials and Methods
All C_{42} fullerene isomers were included in the analysis. Data related to continuum elasticity, expressed as total strain energy (TSE, in eV), and the structure files of the C_{42} fullerene isomers were taken from [19] (Table 2).

The analysis was conducted on the downloaded files of the C_{42} isomers without any modification of the available geometry. According to [19], the fullerene geometries were based on the geometry of the structures in Yoshida’s Fullerene Library (UNIX files) and reoptimized using a Dreiding-like force field [20]. The geometry obtained in this way is used here.
The steps applied in the analysis are depicted in Scheme 1.
In the first step of the analysis, the downloaded files were converted to another file format with Spartan software (https://www.wavefun.com/products/spartan.html). In the second step, the resulting files were converted again using Babel software (http://openbabel.org). The partial charges were calculated in the third step using HyperChem software (http://www.hyper.com/) by applying PM3 (Parameterized Model number 3 [21]) single-point (energy) semiempirical calculations. The structural features of the investigated nanoclass of compounds were extracted in the fourth step using the unsymmetrical Szeged set, an extension of the corresponding Szeged Matrix [22]. In the fifth step, the calculated values of the structural descriptors and the collected values of total strain energy were included in nano-QSPR modeling, and the models with the highest goodness-of-fit (defined as the highest correlation coefficients) were analyzed and validated in leave-one-out and leave-many-out analyses [23, 24].
A model is considered valid in leave-one-out analysis if its determination coefficient takes values higher than 0.5. Leave-many-out analysis was conducted for the models with the highest estimation abilities, expressed as the highest value of the correlation coefficient. The set was split using a simple random technique [25] into training and test sets, with 2/3 of the compounds in the training set. The models obtained on the training sets were used to predict the TSE in the test sets. The leave-many-out analysis was run five times for the equations identified as having the highest estimation and internal prediction abilities in order to assess their prediction abilities.
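The random training/test splitting described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the seeding scheme, and the dataset size of 45 are hypothetical and are not taken from the procedure in [25].

```python
import random


def random_split(n_compounds, train_fraction=2 / 3, seed=None):
    """Return (train_idx, test_idx) from one simple random split."""
    rng = random.Random(seed)
    indices = list(range(n_compounds))
    rng.shuffle(indices)
    n_train = round(n_compounds * train_fraction)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


def leave_many_out_splits(n_compounds, n_runs=5, seed=0):
    """Generate one random split per run (a different seed for each run)."""
    return [random_split(n_compounds, seed=seed + run) for run in range(n_runs)]


# Hypothetical dataset size, used for illustration only.
for run, (train, test) in enumerate(leave_many_out_splits(45), start=1):
    print(f"run {run}: {len(train)} training compounds, {len(test)} test compounds")
```

In each run, a model would be fitted on the training indices and then used to predict the TSE for the test indices.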
The prediction ability was assessed on an external dataset represented by the C_{40} isomers, considering the same property. The TSE values and the structures for external validation were taken from the same source as the C_{42} isomers: http://nanotube.msu.edu/fullerene/fullerene.php?C=40 (accessed December 20, 2015). Several metrics were used to assess the prediction ability of the model [23, 24]: determination coefficient on the external set, predictive square correlation coefficient on the external set [26], external prediction ability, root mean square error of prediction (RMSEP), mean absolute error of prediction (MAEP), percentage predictive error (%PredErr), and concordance correlation coefficient (CCC [27]).
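Some of the external validation metrics listed above can be computed as in the sketch below. It rests on assumptions: the determination coefficient is taken as the squared Pearson correlation, the predictive square correlation coefficient is taken in the variant centered on the mean of the external set, and CCC follows Lin's definition. The exact variants used in the study are not stated in the text, so the function and key names here are illustrative only, and the sample values are hypothetical.

```python
import math


def external_metrics(observed, predicted):
    """Return a dict of external-validation metrics for paired values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    s_op = sum((o - mean_obs) * (p - mean_pred)
               for o, p in zip(observed, predicted))
    return {
        # Squared Pearson correlation between observed and predicted values.
        "r2_ext": s_op ** 2 / (ss_obs * ss_pred),
        # Assumed variant: 1 - SS_res / SS around the external-set mean.
        "q2_ext": 1 - ss_res / ss_obs,
        "rmsep": math.sqrt(ss_res / n),
        "maep": sum(abs(r) for r in residuals) / n,
        # Lin's concordance correlation coefficient.
        "ccc": 2 * s_op / (ss_obs + ss_pred + n * (mean_obs - mean_pred) ** 2),
    }


# Hypothetical values, not data from the study.
metrics = external_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
for name, value in metrics.items():
    print(name, round(value, 4))
```

A constant offset between observed and predicted values, as in the example, leaves the squared correlation at 1 while lowering CCC, which is why several metrics are inspected together.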
3. Results and Discussion
Structural information on the investigated C_{42} isomers was obtained by calculating the pool of descriptors given by the Szeged Matrix Property Indices (SMPI) method [28]. The best performing models in terms of goodness-of-fit (highest correlation coefficient) with 1, 2, 3, and 4 SMPI descriptors were obtained and are given in (1)–(4), where the dependent variable is the total strain energy estimated by the model, and IJUGE, IIUGF, IFEGE, IFETB, and IFUGB are SMPI descriptors. Two descriptors (IFETB and IFUGB) account for the atomic number as atomic property, two others account for electronegativity (IJUGE and IFEGE), and one accounts for the first ionization energy (IIUGF). The investigated property is related to the geometry of the compounds (fourth letter “G” in the name of the descriptors), with one exception where it is related to topology (fourth letter “T” in the IFETB descriptor). The other letters reflect the linearization operator (first letter), matrix operation (second letter), and interaction descriptor (third letter).
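The naming convention of the descriptors can be illustrated with a small decoder. It is a sketch that assumes the fifth letter encodes the atomic property, as the five descriptor names above suggest (B for atomic number, E for electronegativity, F for first ionization energy). Only the letters that appear in this paper are mapped, and the names `SmpiName` and `decode_smpi` are hypothetical and not part of the SMPI software [28].

```python
from typing import NamedTuple

# Mappings inferred from the five descriptor names discussed in the text.
METRIC = {"G": "geometry", "T": "topology"}
ATOMIC_PROPERTY = {
    "B": "atomic number",
    "E": "electronegativity",
    "F": "first ionization energy",
}


class SmpiName(NamedTuple):
    linearization_operator: str   # first letter (code kept as-is)
    matrix_operation: str         # second letter (code kept as-is)
    interaction_descriptor: str   # third letter (code kept as-is)
    metric: str                   # fourth letter, decoded
    atomic_property: str          # fifth letter, decoded (assumption)


def decode_smpi(name):
    """Split a five-letter SMPI descriptor name into its components."""
    if len(name) != 5:
        raise ValueError(f"expected a five-letter name, got {name!r}")
    first, second, third, fourth, fifth = name.upper()
    return SmpiName(
        first,
        second,
        third,
        METRIC.get(fourth, "unknown"),
        ATOMIC_PROPERTY.get(fifth, "unknown"),
    )


for descriptor in ("IJUGE", "IIUGF", "IFEGE", "IFETB", "IFUGB"):
    print(descriptor, decode_smpi(descriptor))
```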
As expected, the determination coefficient increases as the number of descriptors in the models increases, while the standard error of the estimate decreases (Table 3).
 
Table 3 notes. For each model, the table reports the determination coefficient, the adjusted determination coefficient, the standard error of estimate (se), Fisher’s statistic (value), the minimum of the absolute statistic associated with the intercept and coefficients of the model, the percentage prediction error (%PredErr), and the determination coefficient in leave-one-out (loo) analysis.
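For orientation, the textbook definitions of three of the goodness-of-fit statistics named above are sketched below for a linear model with an intercept. Whether the study used exactly these forms is an assumption, and the function names and numerical inputs are hypothetical.

```python
import math


def adjusted_r2(r2, n_compounds, n_descriptors):
    """Adjusted determination coefficient for a model with an intercept."""
    return 1 - (1 - r2) * (n_compounds - 1) / (n_compounds - n_descriptors - 1)


def standard_error_of_estimate(ss_residual, n_compounds, n_descriptors):
    """Square root of the residual sum of squares over residual degrees of freedom."""
    return math.sqrt(ss_residual / (n_compounds - n_descriptors - 1))


def fisher_f(r2, n_compounds, n_descriptors):
    """Overall F statistic of the regression, computed from r2."""
    return (r2 / n_descriptors) / ((1 - r2) / (n_compounds - n_descriptors - 1))


# Hypothetical inputs, not values from Table 3.
print(adjusted_r2(0.9, n_compounds=45, n_descriptors=3))
print(standard_error_of_estimate(41.0, n_compounds=45, n_descriptors=3))
print(fisher_f(0.9, n_compounds=45, n_descriptors=3))
```

The adjusted coefficient penalizes each additional descriptor, which makes it a fairer basis than the raw determination coefficient when models of different sizes are compared.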
The difference between the determination coefficient of the model and the determination coefficient obtained in leave-one-out analysis varied from 0.0027 to 0.0227, the smallest difference being obtained by (3) (Table 3). The smallest difference between standard errors (estimation model and leave-one-out model) is also obtained by model (3).
The analysis of the results presented in Table 3 showed that the model with four descriptors is the one with the smallest percentage of prediction error. Furthermore, the points in the scatter plot lie closest to the straight line for the model given by (4) (Figure 1). Figure 1 shows no marked difference between the models given by (3) and (4), although the dispersion of the points around the line is smallest for the model given by (4).
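The comparison between the fitted and the leave-one-out determination coefficients can be sketched as follows. To keep the example short it uses a single hypothetical descriptor and invented data, whereas the models in this study use up to four descriptors. The leave-one-out coefficient is taken here as one minus the ratio of the predictive residual sum of squares to the total sum of squares, which is an assumption about the definition used.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def determination_coefficient(observed, predicted):
    """1 - SS_res / SS_tot, with SS_tot taken around the observed mean."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def leave_one_out_predictions(x, y):
    """Predict each compound from a model fitted on all the others."""
    predictions = []
    for i in range(len(x)):
        intercept, slope = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        predictions.append(intercept + slope * x[i])
    return predictions


# Hypothetical descriptor values and property values.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]

intercept, slope = fit_line(x, y)
r2_fit = determination_coefficient(y, [intercept + slope * xi for xi in x])
r2_loo = determination_coefficient(y, leave_one_out_predictions(x, y))
print(r2_fit, r2_loo, r2_fit - r2_loo)
```

A small gap between the two coefficients, as reported for (3), indicates that no single compound dominates the fit.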
Figure 1: scatter plots, panels (a)-(d).
The main characteristics of the models given by (3) and (4) obtained in leave-many-out analysis (training versus test analysis; 2/3 of the compounds in the training set; five runs) are presented in Table 4.

The results presented in Table 4 show the stability of the models, with internal prediction power (defined as the determination coefficient in the test sets) close to the estimation power (the determination coefficient in the training sets) for both investigated models. The results obtained in the training sets closely follow the results on the whole sample for (3), with the determination coefficient in the same range when two decimals are considered. The determination coefficient obtained in the test set was equal to 0.99 in all five runs of the leave-many-out analysis, slightly higher than that obtained in the training sets (0.98). In three cases out of five, the determination coefficient in the training sets for (4) was in the same range, to two decimals, as the value given in Table 3. However, without exception, the determination coefficient in the test sets was smaller than that in the training sets for (4), with differences that varied from 0.0005 (id 7 in Table 4) to 0.0264 (id 6 in Table 4). These results show that (3) performs slightly better in terms of determination coefficients in leave-many-out analysis.
The plots of the models obtained in the fourth run for (3) and fifth run for (4), as examples, are given in Figure 2.
Figure 2: panels (a) and (b).
The equations identified as having estimation power and internal prediction abilities, namely, (3) and (4), were further applied to the C_{40} isomers to test their external prediction abilities. The prediction power of (4) proved to be better than that of (3) (see Figure 3 and Table 5).
 
Table 5 notes. The table reports the determination coefficient on the external set, the predictive square correlation coefficient on the external set, the external prediction ability, the root mean square error of prediction (RMSEP), the mean absolute error of prediction (MAEP), and the percentage predictive error (%PredErr). NR: not reliable value.
Figure 3: panels (a) and (b).
Although the predictive square correlation coefficient on the external set is higher for (3) than for (4), all other calculated metrics support the conclusion that the model given by (4) has better prediction abilities (highest determination coefficient on the external set, lowest mean absolute error of prediction, and lowest percentage of predictive error; see Table 5). Furthermore, the overall spread of the points in the scatterplot leads to the conclusion that (4) had better prediction abilities than (3). Nevertheless, the mean of the residuals proved to be significantly different from the expected value (zero). It can be concluded that the model given by (4) fits the data on which it was constructed better than all other models. Are the structural features extracted by SMPI descriptors on C_{42} isomers, however, able to predict the TSE on C_{40} isomers?
The SMPI descriptors used by (3) and (4), respectively, were used to predict the TSE on C_{40} isomers. One of the three descriptors from (3) proved to have a slope not significantly different from zero and was not included in further analysis. The models identified on C_{40} isomers are given in (5) and (6), where the dependent variable is the total strain energy estimated by the model, and IJUGE, IIUGF, IFETB, and IFUGB are SMPI descriptors. Two descriptors (IFETB and IFUGB) account for the atomic number as atomic property, one descriptor accounts for electronegativity (IJUGE), and one accounts for the first ionization energy (IIUGF). The investigated property is related to the geometry of the compounds (fourth letter “G” in the name of the descriptors), with one exception that is related to the topology of the compounds (IFETB descriptor). The other letters reflect the linearization operator (first letter), matrix operation (second letter), and interaction descriptor (third letter). Note that for both models the mean of the residuals is not significantly different from zero.
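A check of whether the mean of the residuals differs from zero can be sketched with a one-sample t statistic. The text does not name the test that was applied, so the choice of a t-test is an assumption, and the function name and sample values below are hypothetical.

```python
import math
import statistics


def mean_residual_t(observed, predicted):
    """Return (mean residual, t statistic, degrees of freedom) for H0: mean = 0."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    n = len(residuals)
    mean = statistics.fmean(residuals)
    sd = statistics.stdev(residuals)  # sample standard deviation (n - 1)
    return mean, mean / (sd / math.sqrt(n)), n - 1


# Hypothetical observed and predicted values with a systematic offset.
mean, t, df = mean_residual_t([2.1, 2.9, 4.0, 5.0], [1.0, 2.0, 3.0, 4.0])
print(mean, t, df)
```

The absolute t value would then be compared with the critical value of Student's t distribution for the returned degrees of freedom at the chosen significance level. A systematic offset between observed and predicted values, as in the example, yields a large t.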
The analysis of the metrics associated with (5) and (6) leads to the conclusion that the model given by (6) performs better than the model given by (5). The same conclusion is reached by analyzing the plots of observed versus predicted TSE (Figure 4).
Figure 4: observed versus predicted TSE, panels (a) and (b).
The results of our study show that the identified nano-QSPR models fit the data on which they were identified (C_{42} isomers) but could also be used to select structural descriptors with fair prediction abilities on an external dataset (C_{40} isomers). To sum up, equations relating electronegativity, ionization potential, and energy have been identified on C_{42} isomers and proved to work also on C_{40} isomers. Note that electronegativity and ionization potential are atomic properties and, since the investigated set contains just C and H atoms, the identified relation between the three properties could also be attributed to the topology and geometry of the investigated compounds.
To the best of our knowledge, structure-property relationship approaches have not previously been applied to C_{42} or C_{40} fullerene isomers. The small-diameter fullerenes (C_{20}, C_{34}, C_{42}, and C_{60}) have mainly been investigated with regard to their properties (such as adsorption [29], distribution of C-C distances [14], and Schlegel diagrams of molecular structures [30]). Therefore, this is the first report of a quantitative relationship between structure and property of the C_{42} fullerene. Undoubtedly, the advancement from theoretical to experimental studies is desired.
4. Conclusions
The C_{42} fullerene isomers were successfully modeled, and the total strain energy was characterized as a function of information extracted from the structure of the compounds. The model with the best goodness-of-fit in leave-one-out and leave-many-out analyses, which also proved to have prediction power, is the one with four descriptors. The total strain energy proved to be a function of electronegativity and first ionization energy, in relation to the geometry of the compounds. The structural descriptors able to fairly explain the total strain energy of C_{42} isomers also proved able to explain the same property for C_{40} fullerene isomers.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
References
[1] H. W. Kroto, J. R. Heath, S. C. O'Brien, R. F. Curl, and R. E. Smalley, “C_{60}: Buckminsterfullerene,” Nature, vol. 318, no. 6042, pp. 162–163, 1985.
[2] R. Dinesh, M. Anandaraj, V. Srinivasan, and S. Hamza, “Engineered nanoparticles in the soil and their potential implications to microbial activity,” Geoderma, vol. 173-174, pp. 19–27, 2012.
[3] A. J. Huh and Y. J. Kwon, “Nanoantibiotics: a new paradigm for treating infectious diseases using nanomaterials in the antibiotics resistant era,” Journal of Controlled Release, vol. 156, no. 2, pp. 128–145, 2011.
[4] Y. S. Zhang, T. H. Dai, M. Wang, D. Vecchio, L. Y. Chiang, and M. R. Hamblin, “Potentiation of antimicrobial photodynamic inactivation mediated by a cationic fullerene by added iodide: in vitro and in vivo studies,” Nanomedicine, vol. 10, no. 4, pp. 603–614, 2015.
[5] N. V. Tishevskaya, Yu. M. Zakharov, E. V. Golubotovskii et al., “Effects of fullerenol C_{60}(OH)_{24} on erythropoiesis in vitro,” Bulletin of Experimental Biology and Medicine, vol. 157, no. 1, pp. 49–51, 2014.
[6] S. Pacor, A. Grillo, L. Đorđević et al., “Effects of two fullerene derivatives on monocytes and macrophages,” BioMed Research International, vol. 2015, Article ID 915130, 13 pages, 2015.
[7] A. L. Dellinger, Z. Zhou, and C. L. Kepley, “A steroid-mimicking nanomaterial that mediates inhibition of human lung mast cell responses,” Nanomedicine: Nanotechnology, Biology, and Medicine, vol. 10, no. 6, pp. 1185–1193, 2014.
[8] A. Hirsch and M. Brettreich, Fullerenes: Chemistry and Reactions, John Wiley & Sons, New York, NY, USA, 2005.
[9] D. Tománek, Guide Through the Nanocarbon Jungle, Morgan & Claypool, San Rafael, Calif, USA, 2014.
[10] P. W. Dunk, N. K. Kaiser, M. Mulet-Gas et al., “The smallest stable fullerene, M@C_{28} (M = Ti, Zr, U): stabilization and growth from carbon vapor,” Journal of the American Chemical Society, vol. 134, no. 22, pp. 9380–9389, 2012.
[11] Y.-M. Chen, J. Shi, L. Rui, and Q.-X. Guo, “Theoretical study on C_{32} fullerenes and their endohedral complexes with noble gas atoms,” Journal of Molecular Structure: THEOCHEM, vol. 907, no. 1–3, pp. 104–108, 2009.
[12] D. Manna and T. K. Ghanty, “Enhancement in the stability of 36-atom fullerene through encapsulation of a uranium atom,” Journal of Physical Chemistry C, vol. 117, no. 34, pp. 17859–17869, 2013.
[13] S. Muhammad, K. Fukuda, T. Minami, R. Kishi, Y. Shigeta, and M. Nakano, “Interplay between the diradical character and third-order nonlinear optical properties in fullerene systems,” Chemistry - A European Journal, vol. 19, no. 5, pp. 1677–1685, 2013.
[14] E. Małolepsza, Y.-P. Lee, H. A. Witek, S. Irle, C.-F. Lin, and H.-M. Hsieh, “Comparison of geometric, electronic, and vibrational properties for all pentagon/hexagon-bearing isomers of fullerenes C_{38}, C_{40}, and C_{42},” International Journal of Quantum Chemistry, vol. 109, no. 9, pp. 1999–2011, 2009.
[15] E. I. Kauppine, “Carbon Nanotubes and NanoBuds - Synthesis, Structure, Functionalisation and Dry Deposition for TCE and TFT Applications,” July 2015, http://www.jst.go.jp/sicp/ws2009_finland/abstract/wg2_02kau.pdf.
[16] M. Ibrahim, N. A. Saleh, W. M. Elshemey, and A. A. Elsayed, “Fullerene derivative as anti-HIV protease inhibitor: molecular modeling and QSAR approaches,” Mini-Reviews in Medicinal Chemistry, vol. 12, no. 6, pp. 447–451, 2012.
[17] L. Ahmed, B. Rasulev, M. Turabekova, D. Leszczynska, and J. Leszczynski, “Receptor- and ligand-based study of fullerene analogues: comprehensive computational approach including quantum-chemical, QSAR and molecular docking simulations,” Organic & Biomolecular Chemistry, vol. 11, no. 35, pp. 5798–5808, 2013.
[18] A. Trpkovic, B. Todorovic-Markovic, and V. Trajkovic, “Toxicity of pristine versus functionalized fullerenes: mechanisms of cell damage and the role of oxidative stress,” Archives of Toxicology, vol. 86, no. 12, pp. 1809–1827, 2012.
[19] D. Tománek, C42 Isomers. In: Guide through the Nanocarbon Jungle: Buckyballs, Nanotubes, Graphene, and Beyond, 2015, http://www.nanotube.msu.edu/fullerene/fullerene.php?C=42.
[20] S. L. Mayo, B. D. Olafson, and W. A. Goddard, “DREIDING: a generic force field for molecular simulations,” Journal of Physical Chemistry, vol. 94, no. 26, pp. 8897–8909, 1990.
[21] J. J. P. Stewart, “PM3,” in Encyclopedia of Computational Chemistry, P. von R. Schleyer, Ed., John Wiley & Sons, New York, NY, USA, 1998.
[22] M. V. Diudea, O. M. Minailiuc, G. Katona, and I. Gutman, “Szeged matrices and related numbers,” MATCH Communications in Mathematical and in Computer Chemistry, vol. 35, pp. 129–143, 1997.
[23] S. D. Bolboacă and L. Jäntschi, “Quantitative structure-activity relationships: linear regression modelling and validation strategies by example,” BIOMATH, vol. 2, no. 1, Article ID 1309089, 11 pages, 2013.
[24] S. D. Bolboacă, L. Jäntschi, and M. V. Diudea, “Molecular design and QSARs/QSPRs with molecular descriptors family,” Current Computer-Aided Drug Design, vol. 9, no. 2, pp. 195–205, 2013.
[25] S. D. Bolboacă, “Assessment of random assignment in training and test sets using generalized cluster analysis technique,” Applied Medical Informatics, vol. 28, no. 2, pp. 9–14, 2010.
[26] N. Chirico and P. Gramatica, “Real external predictivity of QSAR models. Part 2. New intercomparable thresholds for different validation criteria and the need for scatter plot inspection,” Journal of Chemical Information and Modeling, vol. 52, no. 8, pp. 2044–2058, 2012.
[27] Lin's Concordance, December 2015, http://services.niwa.co.nz/services/statistical/concordance.
[28] L. Jäntschi, “Szeged Matrix Property Indices,” 2014, http://l.academicdirect.org/Chemistry/SARs/SMPI.
[29] X. Liu, Y. Wen, Z. Chen et al., “Modulation of Dirac points and band-gaps in graphene via periodic fullerene adsorption,” AIP Advances, vol. 3, no. 5, Article ID 052126, 2013.
[30] Y.-N. Chiu, J. Xiao, C. D. Merritt et al., “Special geminals and Schlegel diagrams of molecular structures of fullerenes and metallofullerenes,” Journal of Molecular Structure: THEOCHEM, vol. 530, no. 1-2, pp. 67–83, 2000.
Copyright
Copyright © 2016 Sorana D. Bolboacă and Lorentz Jäntschi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.