Journal of Chemistry


Research Article | Open Access

Volume 2013 |Article ID 908586 | https://doi.org/10.1155/2013/908586

Elaheh Konoz, Amir H. M. Sarrafi, Alireza Feizbakhsh, Zahra Dashtbozorgi, "Prediction of Gas Chromatography-Mass Spectrometry Retention Times of Pesticide Residues by Chemometrics Methods", Journal of Chemistry, vol. 2013, Article ID 908586, 13 pages, 2013. https://doi.org/10.1155/2013/908586

Prediction of Gas Chromatography-Mass Spectrometry Retention Times of Pesticide Residues by Chemometrics Methods

Academic Editor: Yenamandra S. Prabhakar
Received 14 Jan 2012
Accepted 30 Apr 2012
Published 14 Jun 2012

Abstract

A quantitative structure-retention relationship (QSRR) method is employed to predict the retention times of 300 pesticide residues in animal tissues separated by gas chromatography-mass spectrometry (GC-MS). First, a six-parameter QSRR model was developed by means of multiple linear regression (MLR). The six molecular descriptors considered to account for the effect of molecular structure on the retention time are the number of nitrogen atoms, solvation connectivity index-chi 1, Balaban Y index, Moran autocorrelation-lag 2/weighted by atomic Sanderson electronegativity, total absolute charge, and radial distribution function-6.0/unweighted. A 6-7-1 back-propagation artificial neural network (ANN) was then used to improve the accuracy of the constructed model. The standard errors of the ANN model for the training, test, and validation sets are 1.559, 1.517, and 1.249, respectively, which are lower than those obtained with the multiple linear regression model (2.402, 1.858, and 2.036, resp.). The results demonstrate the reliability and good predictive ability of the nonlinear QSRR model for predicting the retention times of pesticides.

1. Introduction

Pesticides are used on a large scale for agricultural purposes, and their adverse effects on both human health and the environment are a matter of public concern. Thus, both the actual state and the transition of pesticide residues in various matrices, including water, soil, and agricultural products, should be extensively monitored. Because pesticides, as well as the fields in which they are studied, range over a broad spectrum, this monitoring calls for an efficient, labor-saving, and cost-effective analytical system. Conventional sample preparation methods used to analyze pesticide residues in various matrices require expensive instrumentation and an expert analyst [1–4]. In addition, the experimental determination of the chromatographic retention parameters of pesticides is time consuming and expensive. Alternatively, the quantitative structure-retention relationship (QSRR) approach provides a promising method for estimating retention times from descriptors that are derived solely from the molecular structure and fitted to experimental data. The advantage of this approach is that it requires only knowledge of the chemical structure and does not depend on any experimentally determined properties.

QSRR studies [5, 6] start with the calculation and selection of descriptors, proceed to relating these descriptors to retention times, and end with the derivation of mathematical models of these multivariate data that can be used for predictive purposes in a chromatographic system. Multivariate data consist of the results of observations of many different variables (molecular descriptors) for a number of individuals (molecules). Established methods for this purpose include multiple regression analysis, experimental design techniques, and nonlinear regression. A drawback of these very popular techniques is that they sometimes fail to give highly predictive models, either because of hidden nonlinearity in the data or because the mathematical model must be specified before the data are fitted. There is therefore a need to improve such models further in order to obtain the most accurate predictions. To this end, artificial neural networks (ANNs) can be used successfully in QSRR studies and can provide better results than conventional regression models.

Artificial neural networks [7–10] have become an important modeling technique for QSAR and QSPR and have been applied in numerous areas of chemistry and pharmacy [11–16]. Their mathematical adaptability makes them a powerful tool for pattern classification and for building predictive models. A particular advantage of ANNs is their inherent ability to incorporate nonlinear dependencies between the dependent and independent variables without using an explicit mathematical function. There are few reports on the application of QSRR in chromatographic studies of pesticides. D’Archivio et al. modeled the combined effect of solute structure and eluent composition on the retention behavior of 26 pesticides in isocratic reversed-phase high-performance liquid chromatography using multilinear regression and artificial neural networks [17]. In another study, they applied a six-parameter nonlinear QSRR model to predict the retention behavior of 26 pesticides, including commonly used insecticides, herbicides, and fungicides as well as some metabolites, in reversed-phase high-performance liquid chromatography [18]. Ghasemi and coworkers applied multiple linear regression and partial least squares regression to a QSRR study of the gas chromatographic retention times of 38 diverse chlorinated pesticides, herbicides, and organohalides using molecular descriptors [19]. In the present study, an ANN is applied to predict accurately the retention times of 300 pesticides, in four groups, with different molecular structures [20].

2. Methods

2.1. Dataset

The multiple linear regression and artificial neural network models developed in the present work rely on a data set taken from [20]. This dataset (Table 1) consists of 300 pesticides in animal tissues such as beef, mutton, pork, chicken, and rabbit, with retention times ranging from 5.62 to 35.77 min. The 300 pesticides were divided into four groups, depending on the properties and retention time of each pesticide. Each group contains different kinds of pesticides, such as acaricides, insecticides, and fungicides. For the ANN modeling, the dataset was randomly divided into training, test, and validation sets consisting of 256, 22, and 22 pesticides, respectively. The training set was used for model generation. The test set plays a different role for the MLR and the ANN models. For the ANN model, it was used for early stopping, that is, to optimize the number of learning iterations and avoid overtraining, and the validation set was used to assess the accuracy of the ANN predictions. For the MLR model, on the other hand, both the test set and the validation set were used to evaluate the model. As can be seen from Table 1, the pesticides in the test and validation sets were chosen so that they adequately represent the training set in terms of retention time.
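The 256/22/22 partition described above can be sketched as follows. This is a minimal illustration, assuming a plain random permutation of compound indices; the function name `split_indices` and the seed are hypothetical, and the authors' actual selection procedure is not specified beyond the text.

```python
import numpy as np

def split_indices(n_total=300, n_train=256, n_test=22, n_valid=22, seed=0):
    """Randomly partition compound indices 0..n_total-1 into three disjoint sets."""
    assert n_train + n_test + n_valid == n_total
    rng = np.random.default_rng(seed)   # the seed is an arbitrary choice for reproducibility
    order = rng.permutation(n_total)    # shuffled compound indices
    train = order[:n_train]
    test = order[n_train:n_train + n_test]
    valid = order[n_train + n_test:]
    return train, test, valid

train_idx, test_idx, valid_idx = split_indices()
print(len(train_idx), len(test_idx), len(valid_idx))
```

A purely random split does not guarantee that the small sets span the retention-time range of the training set, so in practice one would inspect the resulting sets, as the authors do with Table 1.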


Table 1: Experimental retention times of the pesticides and the values calculated by the MLR and ANN models (all values in min).

Number | Pesticide | tR (EXP) | tR (MLR) | tR (ANN) | tR (EXP) − tR (ANN)

Training set
1 | Allidochlor | 8.78 | 10.22 | 9.83 | −1.05
2 | Dichlormid | 9.74 | 10.27 | 9.88 | −0.14
3 | Etridiazol | 10.42 | 16.11 | 12.9 | −2.48
4 | Chlormephos | 10.53 | 11.88 | 14 | −3.47
5 | Propham | 11.36 | 14.17 | 12.13 | −0.77
6 | Cycloate | 13.56 | 16.32 | 16.4 | −2.84
7 | Diphenylamine | 14.55 | 18.13 | 15.58 | −1.03
8 | Chlordimeform | 14.93 | 17.44 | 13.69 | 1.24
9 | Ethalfluralin | 15.00 | 16.17 | 14.56 | 0.9
10 | Thiometon | 16.20 | 18.1 | 17.82 | −1.62
11 | Atrazine-desethyl | 16.76 | 13.96 | 17.25 | −0.49
12 | Clomazone | 17.00 | 20.95 | 18.63 | −1.63
13 | Fonofos | 17.31 | 20.48 | 16.24 | 1.07
14 | Simazine | 17.85 | 16.99 | 18.93 | −1.08
15 | Propetamphos | 17.97 | 19.72 | 18.93 | −0.96
16 | Secbumeton | 18.36 | 20.2 | 19.21 | −0.85
17 | Dichlofenthion | 18.80 | 21.02 | 17.38 | 1.42
18 | Pronamide | 18.72 | 19.61 | 21.15 | −2.43
19 | Mexacarbate | 18.83 | 17.05 | 18.64 | 0.2
20 | Aldrin | 19.67 | 22.92 | 21.33 | −1.66
21 | Dinitramine | 19.35 | 18.57 | 16.45 | 2.9
22 | Ronnel | 19.80 | 22.63 | 21.2 | −1.4
23 | Cyprazine | 20.18 | 19.82 | 18.43 | 1.75
24 | Beta-HCH | 20.31 | 16.11 | 16.93 | 3.38
25 | Metalaxyl | 20.67 | 20.19 | 21.92 | −1.25
26 | Chlorpyrifos (-ethyl) | 20.96 | 25.26 | 22.59 | −1.63
27 | Methyl-parathion | 20.82 | 22.85 | 21.44 | −0.62
28 | Malathion | 21.54 | 21.2 | 22.9 | −1.36
29 | Fenitrothion | 21.62 | 23.17 | 21.19 | 0.43
30 | Paraoxon-ethyl | 21.57 | 22.19 | 21.39 | 0.18
31 | Triadimefon | 22.22 | 25.41 | 25.62 | −3.4
32 | Parathion | 22.32 | 24 | 23.28 | −0.96
33 | Pendimethalin | 22.59 | 20.84 | 19.77 | 2.82
34 | Linuron | 22.44 | 21.13 | 19.94 | 2.5
35 | Bromophos-ethyl | 23.06 | 23.18 | 21.51 | 1.55
36 | trans-chlordane | 23.29 | 23.4 | 22.61 | 0.68
37 | Phenthoate | 23.30 | 24.91 | 25.59 | −2.29
38 | Fenothiocarb | 23.79 | 23.76 | 24.85 | −1.06
39 | Prothiophos | 24.04 | 24.99 | 23.69 | 0.35
40 | Dieldrin | 24.43 | 24.77 | 24.55 | −0.12
41 | Procymidone | 24.36 | 22.91 | 26.47 | −2.11
42 | Methidathion | 24.49 | 27.18 | 26.96 | −2.47
43 | Napropamide | 24.84 | 24.35 | 23.12 | 1.72
44 | Fenamiphos | 25.29 | 23.4 | 26.85 | −1.56
45 | Aramite | 25.60 | 26.63 | 25.36 | 0.24
46 | Bupirimate | 26.00 | 25.11 | 24.63 | 1.37
47 | Carboxin | 26.25 | 23.23 | 25.54 | 0.71
48 | Flutolanil | 26.23 | 22.66 | 27.39 | −1.16
49 | Ethion | 26.69 | 26.7 | 26.6 | 0.09
50 | Sulprofos | 26.87 | 25.6 | 25.14 | 1.73
51 | Etaconazole-2 | 26.89 | 22.06 | 26.1 | 0.79
52 | Myclobutanil | 27.19 | 26.34 | 26.45 | 0.74
53 | Diclofop-methyl | 28.08 | 28.79 | 27.35 | 0.73
54 | Propiconazole | 28.15 | 30.15 | 28.96 | −0.81
55 | Fensulfothion | 27.94 | 25.84 | 26.94 | 1
56 | Bifenthrin | 28.57 | 27.56 | 29.69 | −1.12
57 | Mirex | 28.72 | 28.48 | 28.33 | 0.39
58 | Benodanil | 29.14 | 22.77 | 27.08 | 2.06
59 | Nuarimol | 28.90 | 26.9 | 26.93 | 1.97
60 | Oxadixyl | 29.59 | 23.49 | 25.44 | 4.06
61 | Tetramethrin | 29.59 | 28.2 | 28.88 | 0.71
62 | Phosmet | 30.46 | 29.34 | 29.94 | 0.52
63 | Oxycarboxin | 31.00 | 25.76 | 28.63 | 2.37
64 | cis-Permethrin | 31.42 | 31.5 | 32.3 | −0.9
65 | trans-Permethrin | 31.68 | 32.74 | 32.6 | −0.92
66 | Pyrazophos | 31.60 | 32.87 | 29.97 | 1.63
67 | Cypermethrin | 33.19 | 33.64 | 33.73 | −0.54
68 | Fenvalerate | 34.45 | 35.59 | 34.07 | 0.38
69 | Deltamethrin | 35.77 | 32.69 | 33.93 | 1.84
70 | EPTC | 8.54 | 9.64 | 11.15 | −2.61
71 | Butylate | 9.49 | 10.01 | 10.12 | −0.63
72 | Dichlobenil | 9.75 | 11.27 | 11.75 | −2
73 | Pebulate | 10.18 | 11.32 | 11.54 | −1.36
74 | Nitrapyrin | 10.89 | 10.95 | 12.12 | −1.23
75 | Chloroneb | 11.85 | 15.37 | 13.01 | −1.16
76 | Tecnazene | 13.54 | 15.8 | 16.61 | −3.07
77 | Heptenophos | 13.78 | 18.84 | 15.01 | −1.23
78 | Hexachlorobenzene | 14.69 | 16.1 | 18.31 | −3.62
79 | Ethoprophos | 14.40 | 13.64 | 12.5 | 1.9
80 | cis-Diallate | 14.75 | 14.67 | 15.7 | −0.95
81 | Propachlor | 14.73 | 16.12 | 13.97 | 0.76
82 | Trifluralin | 15.23 | 19.7 | 14.95 | 0.28
83 | Chlorpropham | 15.49 | 16.41 | 16.17 | −0.68
84 | Sulfallate | 15.75 | 12.79 | 12.78 | 2.97
85 | Alpha-HCH | 16.06 | 15.99 | 18.04 | −1.98
86 | Terbufos | 16.83 | 16.38 | 17.13 | −0.3
87 | 4,4-DDE | 23.92 | 24.19 | 24.68 | −0.76
88 | Chlorbufam | 17.85 | 18.15 | 18.66 | −0.81
89 | Fluotrimazole | 28.39 | 26.67 | 28.64 | −0.25
90 | Terbuthylazine | 18.07 | 18.42 | 19.7 | −1.63
91 | Monolinuron | 18.15 | 19.36 | 20.27 | −2.12
92 | Cyanophos | 18.73 | 20.34 | 19.53 | −0.8
93 | Chlorpyrifos-methyl | 19.38 | 23.86 | 21.45 | −2.07
94 | Desmetryn | 19.64 | 17.4 | 18.43 | 1.21
95 | Alachlor | 20.03 | 19.24 | 19.27 | 0.76
96 | Terbutryn | 20.61 | 20.03 | 19.37 | 1.24
97 | Thiobencarb | 20.63 | 20.77 | 19.73 | 0.9
98 | Dicofol | 21.33 | 23.76 | 23.45 | −2.12
99 | Metolachlor | 21.34 | 21.05 | 21.68 | −0.34
100 | Methoprene | 21.71 | 23.08 | 22.07 | −0.36
101 | Bromofos | 21.75 | 23.72 | 22.31 | −0.56
102 | Ethofumesate | 21.84 | 24.27 | 23.8 | −1.96
103 | Isopropalin | 22.10 | 26.6 | 24.73 | −2.63
104 | Propanil | 22.68 | 16.94 | 19.79 | 2.89
105 | Crufomate | 22.93 | 21.31 | 19.59 | 3.34
106 | Chlorfenvinphos | 23.19 | 23.53 | 23.32 | −0.13
107 | cis-Chlordane | 23.55 | 23.4 | 21.71 | 1.84
108 | Tolylfluanide | 23.45 | 22.76 | 23.52 | −0.07
109 | Butachlor | 23.82 | 22.55 | 24.27 | −0.45
110 | Chlozolinate | 23.83 | 25.54 | 23.06 | 0.77
111 | Crotoxyphos | 23.94 | 24.26 | 24.8 | −0.86
112 | Iodofenphos | 24.33 | 23.85 | 21.87 | 2.46
113 | Chlorbromuron | 24.37 | 21.89 | 20.71 | 3.66
114 | Profenofos | 24.65 | 22.73 | 23.34 | 1.31
115 | Buprofezin | 24.87 | 24.53 | 23.88 | 0.99
116 | 2,4′-DDD | 24.94 | 23.16 | 23.56 | 1.38
117 | Endrin | 25.15 | 24.92 | 25.1 | 0.05
118 | Hexaconazole | 24.92 | 26.84 | 26.97 | −2.05
119 | 2,4-DDT | 25.56 | 22.31 | 23.68 | 1.88
120 | Methoprotryne | 25.63 | 26.87 | 25.6 | 0.03
121 | Erbon | 25.68 | 25.93 | 26.83 | −1.15
122 | Chloropropylate | 25.85 | 25.43 | 25.32 | 0.53
123 | Nitrofen | 26.12 | 23.08 | 28.85 | −2.73
124 | Oxyfluorfen | 26.13 | 25.59 | 29.37 | −3.24
125 | Chlorthiophos | 26.52 | 26.26 | 25.96 | 0.56
126 | Endosulfan I | 26.72 | 23.96 | 26.39 | 0.33
127 | 4,4-DDT | 27.22 | 22.9 | 25.15 | 2.07
128 | Carbofenothion | 27.19 | 27 | 26.87 | 0.32
129 | Benalaxyl | 27.54 | 27.7 | 27.98 | −0.44
130 | Edifenphos | 27.94 | 25.87 | 25.96 | 1.98
131 | Triazophos | 28.23 | 27.84 | 26.84 | 1.39
132 | Chlorbenside sulfone | 28.88 | 26.15 | 27.85 | 1.03
133 | Endosulfan-sulfate | 29.05 | 28.31 | 28.93 | 0.12
134 | Bromopropylate | 29.30 | 25.93 | 27.74 | 1.56
135 | Benzoylprop-ethyl | 29.40 | 28.81 | 28.84 | 0.56
136 | Fenpropathrin | 29.56 | 28.79 | 29.79 | −0.23
137 | Phosalone | 31.22 | 31.11 | 31.56 | −0.34
138 | Azinphos-methyl | 31.41 | 31.36 | 30.37 | 1.04
139 | Fenarimol | 31.65 | 29.56 | 28.81 | 2.84
140 | Azinphos-ethyl | 32.01 | 32.91 | 31.79 | 0.22
141 | Prochloraz | 33.07 | 33.03 | 31.52 | 1.55
142 | Coumaphos | 33.22 | 28.77 | 30.86 | 2.36
143 | Cyfluthrin | 32.94 | 34.18 | 33.61 | −0.67
144 | Dichlorvos | 7.80 | 9.72 | 7.98 | −0.18
145 | Biphenyl | 9.00 | 16.62 | 11.96 | −2.96
146 | Vernolate | 9.82 | 11.57 | 10.9 | −1.08
147 | 3,5-Dichloroaniline | 11.20 | 10.42 | 10.55 | 0.65
148 | Molinate | 11.92 | 14.07 | 13.92 | −2
149 | Methacrifos | 11.86 | 15.7 | 13.49 | −1.63
150 | 2-Phenylphenol | 12.47 | 16.32 | 12.14 | 0.33
151 | cis-1,2,3,6-tetrahydrophthalimide | 13.39 | 12.35 | 11.96 | 1.43
152 | Fenobucarb | 14.60 | 14.42 | 13.7 | 0.9
153 | Prometon | 16.66 | 18.42 | 16.82 | −0.16
154 | Triallate | 17.12 | 15.55 | 13.9 | 3.22
155 | Pyrimethanil | 17.28 | 18.45 | 18.46 | −1.18
156 | Gamma-HCH | 17.48 | 16.12 | 18.09 | −0.61
157 | Disulfoton | 17.61 | 16.97 | 16.66 | 0.95
158 | Heptachlor | 18.49 | 21.61 | 19.83 | −1.34
159 | Isazofos | 18.54 | 22.62 | 20.07 | −1.53
160 | Fenpropimorph | 19.22 | 25 | 21.3 | −2.08
161 | Transfluthrin | 19.04 | 23.96 | 22.15 | −3.11
162 | Tolclofos-methyl | 19.69 | 21.11 | 18.51 | 1.18
163 | Metobromuron | 20.07 | 19.92 | 20.38 | −0.31
164 | HCH, epsilon- | 20.78 | 16.05 | 17.54 | 3.24
165 | Dipropetryn | 20.82 | 20.32 | 21.8 | −0.98
166 | Formothion | 21.42 | 17.67 | 18.96 | 2.46
167 | Diethofencarb | 21.43 | 20.66 | 19.42 | 2.01
168 | Dimepiperate | 22.28 | 21.74 | 21.92 | 0.36
169 | Bioallethrin-1 | 22.29 | 22.9 | 21.57 | 0.72
170 | 2,4-DDE | 22.64 | 22.46 | 24.45 | −1.81
171 | Fenson | 22.54 | 23.65 | 21.87 | 0.67
172 | Chlorthion | 22.86 | 24.7 | 22.65 | 0.21
173 | Prallethrin | 23.11 | 23.96 | 22.59 | 0.52
174 | Mecarbam | 23.46 | 22 | 25.24 | −1.78
175 | Flumetralin | 24.10 | 27.85 | 24.44 | −0.34
176 | Triadimenol | 24.22 | 25.2 | 24.81 | −0.59
177 | Pretilachlor | 24.67 | 22.65 | 24.59 | 0.08
178 | Uniconazole | 26.15 | 25.53 | 26 | 0.15
179 | Flusilazole | 26.19 | 29.21 | 27.8 | −1.61
180 | Fluorodifen | 26.59 | 25.19 | 25.07 | 1.52
181 | Diniconazole | 27.03 | 26.95 | 27.24 | −0.21
182 | Piperonyl butoxide | 27.46 | 28.93 | 26.8 | 0.66
183 | Mepronil | 27.91 | 25.31 | 27.43 | 0.48
184 | Fenazaquin | 28.97 | 29.38 | 30.6 | −1.63
185 | Fenoxycarb | 29.57 | 29.01 | 30.52 | −0.95
186 | Sethoxydim | 29.63 | 24.8 | 27.28 | 2.35
187 | Anilofos | 30.68 | 29.82 | 28.59 | 2.09
188 | Permethrin | 31.57 | 30.27 | 32.38 | −0.81
189 | Pyridaben | 31.86 | 31.57 | 30.56 | 1.3
190 | Fluoroglycofen-ethyl | 32.01 | 34.85 | 33.64 | −1.63
191 | Bitertanol | 32.25 | 32.05 | 30.82 | 1.43
192 | Etofenprox | 32.75 | 32.57 | 32.87 | −0.12
193 | Cycloxydim | 33.05 | 28.55 | 30.36 | 2.69
194 | Alpha-cypermethrin | 33.35 | 28.9 | 31.06 | 2.29
195 | Esfenvalerate | 34.65 | 36.73 | 34 | 0.65
196 | Difenoconazole | 35.40 | 36.14 | 33.03 | 2.37
197 | Flumioxazin | 35.50 | 32.67 | 32.45 | 3.05
198 | Dimefox | 5.62 | −0.46 | 7.13 | −1.51
199 | Tri-iso-butyl phosphate | 11.65 | 10.47 | 12.32 | −0.67
200 | Crimidine | 13.13 | 12.37 | 12.09 | 1.04
201 | Chlorfenprop-methyl | 13.57 | 17.75 | 13.6 | −0.03
202 | 2,3,5,6-Tetrachloroaniline | 14.22 | 13.5 | 12.71 | 1.51
203 | Tri-n-butyl phosphate | 14.33 | 16.15 | 14.42 | −0.09
204 | 2,3,4,5-Tetrachloroanisole | 14.66 | 15.91 | 14.51 | 0.15
205 | Tebutam | 15.30 | 16.51 | 17.61 | −2.31
206 | Dioxabenzofos | 16.14 | 17.74 | 15.19 | 0.95
207 | Simetone | 16.69 | 17.79 | 16.84 | −0.15
208 | Atratone | 16.70 | 18.87 | 18.68 | −1.98
209 | Bromocylen | 17.43 | 18.53 | 17.24 | 0.19
210 | Cycluron | 17.95 | 17.22 | 19.68 | −1.73
211 | Musk ambrette | 18.62 | 18.86 | 17.77 | −1.15
212 | Musk xylene | 18.66 | 20.74 | 18.81 | −0.15
213 | Pentachloroaniline | 18.91 | 15.38 | 16.2 | 2.71
214 | Aziprotryne | 19.11 | 21.05 | 19.07 | 0.04
215 | Sebutylazine | 19.26 | 19.07 | 19.03 | 0.23
216 | Isocarbamid | 19.24 | 16.72 | 17.15 | 2.09
217 | Musk moskene | 19.46 | 21.79 | 21.16 | −1.7
218 | Dimethenamid | 19.55 | 20.54 | 20.39 | −0.84
219 | Fenchlorphos oxon | 19.72 | 21.04 | 18.44 | 1.28
220 | BDMC-2 | 19.74 | 15.49 | 19.3 | 0.44
221 | Paraoxon-methyl | 19.83 | 20.87 | 20.76 | −0.93
222 | Monalide | 20.02 | 19.22 | 21.33 | −1.31
223 | Isobenzan | 20.55 | 22.18 | 21.66 | −1.11
224 | Pyrimitate | 20.59 | 22.79 | 18.97 | 1.62
225 | Isodrin | 21.01 | 21.88 | 22.55 | −1.54
226 | Isomethiozin | 21.06 | 22.14 | 19.89 | 1.17
227 | Dacthal | 21.25 | 22.38 | 21.38 | −0.13
228 | 4,4-Dichlorobenzophenone | 21.29 | 21.45 | 19.45 | 1.84
229 | Nitrothal-isopropyl | 21.69 | 21.85 | 21.6 | 0.09
230 | Rabenzazole | 21.73 | 21.17 | 24.36 | −2.63
231 | Fuberidazole | 22.10 | 20.67 | 21.26 | 0.84
232 | Isofenphos oxon | 22.04 | 23.32 | 21.78 | 0.26
233 | Dicapthon | 22.44 | 22.74 | 22.2 | 0.24
234 | Isocarbophos | 22.87 | 24.01 | 22.75 | 0.12
235 | Phorate sulfone | 23.15 | 19.74 | 25.53 | −2.38
236 | Chlorfenethol | 23.29 | 21.98 | 22.52 | 0.77
237 | trans-Nonachlor | 23.62 | 25.1 | 24.51 | −0.89
238 | Dinobuton | 23.88 | 22.71 | 22.01 | 1.87
239 | DEF | 24.08 | 21.24 | 23.61 | 0.47
240 | Flurochloridone | 24.31 | 18.15 | 21.83 | 2.48
241 | Bromfenvinfos | 24.62 | 25.03 | 24.94 | −0.32
242 | Ditalimfos | 24.82 | 25.28 | 25.31 | −0.49
243 | 4,4-Dibromobenzophenone | 25.30 | 23.06 | 24.62 | 0.68
244 | Disulfoton sulfone | 26.16 | 20.66 | 24.24 | 1.92
245 | Cyproconazole | 27.23 | 26.87 | 26.41 | 0.82
246 | Phthalic acid, benzyl butyl ester | 27.56 | 27.11 | 26.22 | 1.34
247 | Clodinafop-propargyl | 27.74 | 31.2 | 29.61 | −1.87
248 | Fenthion sulfone | 28.55 | 27.46 | 29.4 | −0.85
249 | Metamitron | 28.63 | 21.78 | 27.39 | 1.24
250 | Tebufenpyrad | 29.06 | 30.01 | 29.27 | −0.21
251 | Cloquintocet-mexyl | 29.32 | 29.62 | 29.85 | −0.53
252 | Lenacil | 29.70 | 21.63 | 28.28 | 1.42
253 | Bromuconazole | 29.90 | 28.32 | 29.66 | 0.24
254 | Fenamiphos sulfone | 31.34 | 29.08 | 30.17 | 1.17
255 | Fluquinconazole | 32.62 | 32.22 | 30.22 | 2.4
256 | Fenbuconazole | 34.02 | 33.6 | 31.53 | 2.49

Test set
257 | Tetradifon | 30.70 | 29.14 | 30.26 | 0.44
258 | Fluorochloridone | 25.14 | 19.72 | 24.4 | 0.74
259 | Cyanofenphos | 28.43 | 27.78 | 29.74 | −1.31
260 | EPN | 30.06 | 30.42 | 31.71 | −1.65
261 | Benfluralin | 15.23 | 16.21 | 14.16 | 1.07
262 | Atrazine | 17.64 | 16.45 | 19.4 | −1.76
263 | Simetryn | 20.18 | 18.85 | 19.71 | 0.47
264 | Metribuzin | 20.33 | 17.61 | 17.78 | 2.55
265 | Bioallethrin-2 | 22.34 | 23.58 | 21.89 | 0.45
266 | Kresoxim-methyl | 25.04 | 27.47 | 27.78 | −2.57
267 | Propargite | 27.87 | 26.85 | 30.06 | −2.19
268 | Amitraz | 30.29 | 28.83 | 29.76 | 0.53
269 | Trietazine | 17.53 | 18.78 | 19.76 | −2.23
270 | Prosulfocarb | 19.51 | 21.08 | 21.51 | −2
271 | Octachlorostyrene | 20.60 | 20.81 | 23.38 | −2.78
272 | Methfuroxam | 22.45 | 21.15 | 24.43 | −1.98
273 | Flutriafol | 25.31 | 26.82 | 27.15 | −1.84
274 | Diclobutrazole | 25.95 | 27.22 | 27.61 | −1.66
275 | Triphenyl phosphate | 28.65 | 26.88 | 30.41 | −1.76
276 | Desbrom-leptophos | 30.15 | 27.97 | 28.32 | 1.83
277 | Propisochlor | 19.89 | 16.96 | 19.98 | −0.09
278 | Ametryn | 20.11 | 18.98 | 19.4 | 0.71

Validation set
279 | Quintozene | 16.75 | 18.1 | 18.4 | −1.65
280 | Prometryne | 20.13 | 19.82 | 19.71 | 0.42
281 | Chlorbenside | 22.96 | 23.73 | 25.4 | −2.44
282 | Oxadiazone | 25.06 | 26.67 | 26.72 | −1.66
283 | Tetrasul | 25.85 | 25.48 | 26.27 | −0.42
284 | Etaconazole-1 | 26.81 | 28.41 | 27.98 | −1.17
285 | Pyridaphenthion | 30.17 | 29.14 | 29.64 | 0.53
286 | trans-Diallate | 15.29 | 14.82 | 15.52 | −0.23
287 | Propazine | 17.67 | 18.15 | 19.5 | −1.83
288 | Pirimiphos-methyl | 20.30 | 23.66 | 22.81 | −2.51
289 | Dichlofluanid | 21.68 | 22.03 | 22.53 | −0.85
290 | Profluralin | 17.36 | 22.64 | 17.16 | 0.2
291 | Tetrachlorvinphos | 24.36 | 22.75 | 24.14 | 0.22
292 | Chlorfenson | 25.05 | 26.07 | 23.34 | 1.71
293 | 2,4-DDD | 24.94 | 26.13 | 27.65 | −1.75
294 | Leptophos | 30.19 | 30.5 | 29.65 | 0.54
295 | Nitralin | 30.92 | 29.65 | 29.37 | 1.55
296 | Fenamiphos sulfoxide | 31.03 | 26.93 | 30.38 | 0.65
297 | Dicloran | 17.89 | 26.94 | 17.14 | 0.75
298 | Perthane | 24.81 | 15.1 | 26.01 | −1.2
299 | Cyprodinil | 21.94 | 22.19 | 22.63 | −0.69
300 | Mefenacet | 31.29 | 29.75 | 30.27 | 1.02

2.2. Molecular Descriptors

All structures were drawn with HyperChem (version 7) [21] and pre-optimized with the MM+ molecular mechanics force field included in that program. The molecular geometries were then optimized with the semiempirical Austin Model 1 (AM1) method [22], and the molecular descriptors were calculated with the Dragon 3.0 software [23]. Overall, more than 1400 theoretical descriptors were calculated for each molecule. These descriptors can be classified into several groups: 0D: constitutional descriptors; 1D: functional groups, atom-centered fragments, empirical descriptors, and molecular properties; 2D: topological descriptors, molecular walk counts, BCUT descriptors, Galvez topological charge indices, and 2D autocorrelations; 3D: aromaticity indices, Randic molecular profiles from the geometry matrix, and geometrical, RDF, 3D-MoRSE, WHIM, and GETAWAY descriptors. The meanings of the molecular descriptors and the procedures for calculating them are explained in the Handbook of Molecular Descriptors by Todeschini. Descriptors of these different kinds were used to describe the chemical diversity of the compounds.

2.3. Regression Analysis

The main goal of generating the MLR model was to choose a set of suitable descriptors that can be used as inputs for the construction of the ANN model. Linear models were built by stepwise selection of the important descriptors followed by MLR model construction [24]. The best MLR model is the one that has a high correlation coefficient and F-value, a low standard error, and high predictive power. The statistics of the constructed MLR model are presented in Table 2.
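As an illustration of stepwise descriptor selection, the sketch below implements one common variant, greedy forward selection by residual sum of squares. The source does not state which stepwise criterion was used, so this is an assumption; the function names and the synthetic data are hypothetical and stand in for the Dragon descriptor matrix.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns coefficients and the RSS."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ coef) ** 2))
    return coef, rss

def forward_stepwise(X, y, max_terms=6):
    """Greedy forward selection: at each step add the column that lowers the RSS most."""
    selected = []
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_terms:
        _, best_j = min((fit_mlr(X[:, selected + [j]], y)[1], j) for j in remaining)
        selected.append(best_j)
        remaining.remove(best_j)
    return selected

# Synthetic stand-in data (NOT the paper's descriptors): 256 samples, 20 candidate
# columns, of which only columns 3 and 7 actually drive the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 20))
y = 2.0 * X[:, 3] - 1.5 * X[:, 7] + 0.1 * rng.normal(size=256)

chosen = forward_stepwise(X, y, max_terms=2)
print(sorted(chosen))
```

A production procedure would normally add a stopping rule (for example, an F-to-enter threshold) and a check for collinearity among the selected descriptors rather than a fixed number of terms.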


Table 2: Statistics of the constructed MLR model.

Descriptor | Notation | Coefficient | Mean effect
Number of nitrogen atoms | nN | 0.980 (±0.123) | 0.091
Solvation connectivity index-chi 1 | χ1sol | 2.191 (±0.125) | 1.467
Balaban Y index | Y index | −4.639 (±0.401) | −0.421
Moran autocorrelation-lag 2/weighted by atomic Sanderson electronegativity | MATS2e | −7.386 (±1.022) | −0.104
Total absolute charge (electronic charge index, ECI) | Qtot | 0.499 (±0.084) | 0.211
Radial distribution function-6.0/unweighted | RDF060u | −0.156 (±0.025) | −0.140
Constant | | 7.045 (±1.346) |

N = 256, F = 240, R = 0.925, SE = 2.456.

2.4. Artificial Neural Network

The theory behind artificial neural networks has been described in detail in several publications [25–31]. An ANN program was written in MATLAB 7 in our laboratory. The network was a fully connected feed-forward network with three layers and a sigmoidal transfer function. The descriptors appearing in the MLR model were used as the inputs of the network, and the signal of the output node represents the retention time of the compound of interest. Thus, the network has six nodes in the input layer and one node in the output layer. Each input value was divided by the mean value of that input to bring it into the dynamic range of the sigmoid transfer function. The back-propagation algorithm was used to train the network [32, 33]. Before training, the number of nodes in the hidden layer, the learning rates of the weights and biases, and the momentum were optimized [34–36]. The optimized network has 6 nodes in the input layer, 7 nodes in the hidden layer, and 1 node in the output layer, a weight learning rate of 0.1, a bias learning rate of 0.4, a momentum of 0.5, and a sigmoid transfer function. The ANN-calculated retention times for the training, test, and validation sets are shown in Table 1.
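The architecture and training settings listed above can be sketched in NumPy as follows. This is not the authors' MATLAB program: the weight initialisation, the per-sample update order, the min-max scaling of the target to (0, 1), and the synthetic data are all assumptions made for illustration, and early stopping on the test set is omitted for brevity. Only the 6-7-1 layout, the sigmoid units, the division of inputs by their means, and the learning rates and momentum come from the text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class Net671:
    """6-7-1 feed-forward network with sigmoid units, trained by back-propagation."""

    def __init__(self, n_in=6, n_hidden=7, n_out=1, seed=0):
        rng = np.random.default_rng(seed)
        # Small random initial weights; the paper does not state its initialisation.
        self.params = [
            rng.uniform(-0.5, 0.5, (n_in, n_hidden)),   # W1
            np.zeros(n_hidden),                         # b1
            rng.uniform(-0.5, 0.5, (n_hidden, n_out)),  # W2
            np.zeros(n_out),                            # b2
        ]
        self.velocity = [np.zeros_like(p) for p in self.params]

    def forward(self, X):
        W1, b1, W2, b2 = self.params
        hidden = sigmoid(X @ W1 + b1)
        return hidden, sigmoid(hidden @ W2 + b2)

    def train_epoch(self, X, y, lr_w=0.1, lr_b=0.4, momentum=0.5):
        """One pass of per-sample gradient updates on the squared error."""
        rates = [lr_w, lr_b, lr_w, lr_b]
        for x, target in zip(X, y):
            x = x[None, :]                                # shape (1, 6)
            hidden, out = self.forward(x)
            d_out = (out - target) * out * (1.0 - out)    # output-layer delta
            d_hid = (d_out @ self.params[2].T) * hidden * (1.0 - hidden)
            grads = [x.T @ d_hid, d_hid[0], hidden.T @ d_out, d_out[0]]
            for k in range(4):
                self.velocity[k] = momentum * self.velocity[k] - rates[k] * grads[k]
                self.params[k] += self.velocity[k]

    def mse(self, X, y):
        return float(np.mean((self.forward(X)[1][:, 0] - y) ** 2))

# Synthetic stand-in data (NOT the paper's descriptors or retention times).
rng = np.random.default_rng(1)
X_raw = rng.uniform(0.5, 10.0, (200, 6))
X = X_raw / X_raw.mean(axis=0)                            # divide each input by its mean
t_raw = 5.0 + 2.0 * X_raw[:, 1] + 0.3 * X_raw[:, 1] * X_raw[:, 4]
y = (t_raw - t_raw.min()) / (t_raw.max() - t_raw.min())   # assumed target scaling to (0, 1)

net = Net671()
mse_before = net.mse(X, y)
for _ in range(50):
    net.train_epoch(X, y)
mse_after = net.mse(X, y)
print(mse_before, mse_after)
```

To follow the early-stopping scheme described in Section 2.1, one would evaluate `mse` on the held-out test set after each epoch and keep the weights from the epoch with the lowest test error.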

3. Results and Discussion

3.1. Analysis by Multiple Linear Regressions

The best MLR model (Table 2) for the training set includes six descriptors: the number of nitrogen atoms, solvation connectivity index-chi 1, Balaban Y index, Moran autocorrelation-lag 2/weighted by atomic Sanderson electronegativity, total absolute charge, and radial distribution function-6.0/unweighted.
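Written out, the MLR model is the linear equation formed by the constant and the six coefficients reported in Table 2. The sketch below evaluates it; the dictionary keys and the function name are hypothetical, and the descriptor values in the example are invented purely to exercise the function and do not belong to any compound in the dataset.

```python
# Coefficients and constant as reported in Table 2 (training set, N = 256).
MLR_COEFFS = {
    "nN": 0.980,
    "chi1sol": 2.191,
    "Yindex": -4.639,
    "MATS2e": -7.386,
    "Qtot": 0.499,
    "RDF060u": -0.156,
}
MLR_CONSTANT = 7.045

def predict_tr_mlr(descriptors):
    """Retention time (min) predicted by the six-parameter linear model."""
    return MLR_CONSTANT + sum(MLR_COEFFS[name] * descriptors[name] for name in MLR_COEFFS)

# Hypothetical descriptor values, chosen only for illustration.
example = {"nN": 1, "chi1sol": 6.0, "Yindex": 2.0, "MATS2e": 0.1, "Qtot": 3.0, "RDF060u": 10.0}
print(round(predict_tr_mlr(example), 4))
```

Because the model is linear, each term can be inspected separately to see which descriptor dominates a given prediction.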

The correlation between these descriptors is shown in Table 3. As shown in this table, there are no significant correlations between these descriptors.


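Table 3: Correlation matrix of the six descriptors in the MLR model.

Descriptor | nN | χ1sol | Y index | MATS2e | Qtot | RDF060u
nN | 1
χ1sol | −0.162 | 1
Y index | −0.120 | 0.540 | 1
MATS2e | 0.315 | −0.240 | 0.065 | 1
Qtot | −0.082 | 0.427 | 0.034 | −0.535 | 1
RDF060u | 0.084 | 0.634 | −0.363 | 0.162 | 0.433 | 1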
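A quick way to screen such a matrix is to look for the largest off-diagonal entry. The short sketch below does this for the values in Table 3; the ASCII descriptor labels are stand-ins for the notation in the table, and no particular threshold for "significant" correlation is implied, since the source does not state one.

```python
names = ["nN", "chi1sol", "Yindex", "MATS2e", "Qtot", "RDF060u"]

# Lower triangle of the correlation matrix, copied from Table 3.
lower = [
    [1],
    [-0.162, 1],
    [-0.120, 0.540, 1],
    [0.315, -0.240, 0.065, 1],
    [-0.082, 0.427, 0.034, -0.535, 1],
    [0.084, 0.634, -0.363, 0.162, 0.433, 1],
]

# Collect (|r|, descriptor_i, descriptor_j) for every off-diagonal pair.
pairs = [(abs(lower[i][j]), names[i], names[j]) for i in range(len(names)) for j in range(i)]
largest = max(pairs)
print(largest)
```

The largest pairwise correlation in the table is the one between RDF060u and χ1sol.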

The descriptor with the largest mean effect is the solvation connectivity index-chi 1 (χ1sol), which was defined in order to model solvation entropy and to describe dispersion interactions in solution. Solvation connectivity indices take the characteristic dimension of the molecules into account through atomic parameters and are defined as

$$ {}^{m}\chi^{s} = \frac{1}{2^{m+1}} \sum_{k=1}^{K} \frac{\left(\prod_{a=1}^{n} L_a\right)_k}{\left(\prod_{a=1}^{n} \delta_a\right)_k^{1/2}}, \tag{1} $$

where $L_a$ is the principal quantum number (2 for C, N, O atoms; 3 for Si, S, Cl, …) of the $a$th atom in the $k$th subgraph and $\delta_a$ is the corresponding vertex degree; $K$ is the total number of $m$th-order subgraphs; $n$ is the number of vertices in the subgraph. The normalization factor $1/2^{m+1}$ is defined in such a way that the indices ${}^{m}\chi$ and ${}^{m}\chi^{s}$ coincide for compounds containing only second-row atoms. The first-order solvation connectivity index is

$$ {}^{1}\chi^{s} = \frac{1}{4} \sum_{b=1}^{B} \frac{(L_i \cdot L_j)_b}{(\delta_i \cdot \delta_j)_b^{1/2}}, \tag{2} $$

where $b$ runs over all the bonds, and $L_i$ and $L_j$ are the principal quantum numbers of the two vertices incident to the considered bond. This index coincides with the Randic connectivity index ${}^{1}\chi$ for hydrocarbons, for which $L = 2$ for all the atoms. These molecular descriptors are defined for an H-depleted molecular graph [37].
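The first-order index in (2) is simple enough to compute by hand for small molecules. The sketch below evaluates it from a bond list and the principal quantum numbers of the atoms in an H-depleted graph; the function name and the input format are hypothetical, and the two molecules are textbook examples rather than compounds from the dataset.

```python
import math

def chi1_solvation(bonds, quantum_numbers):
    """First-order solvation connectivity index of an H-depleted molecular graph.

    bonds: list of (i, j) atom-index pairs; quantum_numbers: principal quantum
    number L of each atom (2 for C, N, O; 3 for Si, S, Cl).
    """
    degree = [0] * len(quantum_numbers)
    for i, j in bonds:
        degree[i] += 1
        degree[j] += 1
    total = 0.0
    for i, j in bonds:
        total += (quantum_numbers[i] * quantum_numbers[j]) / math.sqrt(degree[i] * degree[j])
    return total / 4.0

chain = [(0, 1), (1, 2), (2, 3)]                       # four heavy atoms in a row
butane = chi1_solvation(chain, [2, 2, 2, 2])           # C-C-C-C
chloropropane = chi1_solvation(chain, [3, 2, 2, 2])    # Cl-C-C-C
print(round(butane, 4), round(chloropropane, 4))
```

For n-butane the result equals the Randic index of the same graph, as stated above for hydrocarbons, while replacing a terminal carbon by chlorine raises the index because of the larger principal quantum number.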

The positive sign of the mean effect of this descriptor indicates that molecules with a higher value of χ1sol have a longer retention time. For example, HCH-epsilon (compound 164), with six chlorine atoms in its structure, has a larger χ1sol (7.75) than Allidochlor (compound 1), with one chlorine atom (5.61), and also has a longer retention time (20.78 and 8.78 min, resp., Figure 1). Compounds that have more atoms in their structure have a larger value of χ1sol and therefore a longer retention time. For example, compound 194 (Alpha-cypermethrin) has a χ1sol value of 13.32 and a retention time of 33.35 min, whereas compound 4 (Chlormephos), which has fewer atoms and a shorter structure than Alpha-cypermethrin, has a χ1sol of 7.21 and a retention time of 10.53 min (Figure 2).

The second important descriptor is the Balaban Y index, which is calculated with the same formula as the Balaban distance connectivity index $J$, but using atomic information indices instead of vertex distance degrees [38]. The Y index is based on the atomic information indices $y_i$, calculated for the vertices of an H-depleted molecular graph as follows:

$$ y_i = \sum_{g=1}^{\eta_i} {}^{g}f_i \cdot g \cdot \log_2 g, \tag{3} $$

where $g$ runs over all of the different topological distances from the $i$th vertex, ${}^{g}f_i$ is the number of distances from the $i$th vertex equal to $g$, and $\eta_i$ is the $i$th atom eccentricity (i.e., the maximum topological distance from the considered atom) [38]. The Balaban Y index is then calculated as

$$ Y\,\text{index} = \frac{B}{C+1} \sum_{\text{edge}(i,j)} (y_i \cdot y_j)^{-1/2}, \tag{4} $$

where $B$ is the number of bonds and $C$ is the cyclomatic number. As the number of cycles and the degree of branching in a molecular structure increase (so that $y_i$ in (3) increases), the value of the Y index decreases. For example, compounds 70 (EPTC), 80 (cis-Diallate), and 97 (Thiobencarb), which have an amine group in their structures and retention times of 8.54, 14.75, and 20.63 min, have Y index values of 2.27, 2.15, and 1.02, respectively. This descriptor has a negative sign for its mean effect; therefore, as its value increases, the retention time decreases. For example, for compounds 149 (Methacrifos), 27 (Methyl-parathion), and 131 (Triazophos), which are organophosphorus compounds, the Y index values are 2.08, 1.18, and 0.81, and the retention times are 11.86, 20.82, and 28.23 min, respectively.

The total absolute charge (Qtot), also known as the electronic charge index (ECI), is the sum of the absolute charges over all atoms in a molecule and is a measure of molecular polarity [37]. Therefore, compounds with polar bonds have a larger value of Qtot and are retained longer on a polar stationary phase than others. For example, compound 288 (Pirimiphos-methyl) has a longer retention time and a higher polarity (20.30 min and 10.14, resp.) than Dichlobenil (compound 72), which has a Qtot of 1.02 and a retention time of 9.75 min (Figure 3). Likewise, compounds 9 (Ethalfluralin), 21 (Dinitramine), and 175 (Flumetralin), which contain a CF3 group and two nitro groups in their structure, have retention times of 15.00, 19.35, and 24.10 min and Qtot values of 6.66, 6.86, and 7.33, respectively.

The other molecular descriptors, with lower mean effects, are the number of nitrogen atoms, Moran autocorrelation-lag 2/weighted by atomic Sanderson electronegativity, and radial distribution function-6.0/unweighted. As the number of nitrogen atoms in a molecule increases, the interaction of the molecule with the polar stationary phase increases, and so does the retention time. For example, etridiazol, atrazine-desethyl, and fluquinconazole (compounds 3, 11, and 255, resp.), which have 2, 5, and 6 nitrogen atoms in their structure, have retention times of 10.42, 16.76, and 32.62 min, respectively. A Moran coefficient is a general index of spatial autocorrelation that, if applied to a molecular graph, can be defined as

$$ I(d) = \frac{\dfrac{1}{\Delta} \displaystyle\sum_{i=1}^{A} \sum_{j=1}^{A} \delta_{ij} \cdot (w_i - \bar{w}) \cdot (w_j - \bar{w})}{\dfrac{1}{A} \displaystyle\sum_{i=1}^{A} (w_i - \bar{w})^2}, \tag{5} $$

where $w_i$ is any atomic property (here the Sanderson electronegativity), $\bar{w}$ is its average value over the molecule, $A$ is the number of atoms, and $\delta_{ij}$ is a Kronecker delta ($\delta_{ij} = 1$ if $d_{ij} = d$, zero otherwise). $\Delta$ is the sum of the Kronecker deltas, that is, the number of vertex pairs at a distance equal to $d$ [39].
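The Moran coefficient in (5) can be sketched for a small molecular graph as follows. Topological distances are obtained by breadth-first search; the function names are hypothetical, and the atomic property values in the example are arbitrary numbers rather than Sanderson electronegativities. Pairs are counted as ordered pairs in both the numerator and $\Delta$, so the ratio does not depend on that convention.

```python
from collections import deque

def topological_distances(n_atoms, bonds):
    """All-pairs shortest path lengths (in bonds) of a connected molecular graph."""
    adjacency = [[] for _ in range(n_atoms)]
    for i, j in bonds:
        adjacency[i].append(j)
        adjacency[j].append(i)
    dist = [[None] * n_atoms for _ in range(n_atoms)]
    for start in range(n_atoms):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if dist[start][v] is None:
                    dist[start][v] = dist[start][u] + 1
                    queue.append(v)
    return dist

def moran(weights, bonds, lag):
    """Moran autocorrelation of an atomic property at topological distance `lag` (lag >= 1)."""
    n = len(weights)
    dist = topological_distances(n, bonds)
    mean = sum(weights) / n
    dev = [w - mean for w in weights]
    pairs = [(i, j) for i in range(n) for j in range(n) if dist[i][j] == lag]
    if not pairs:
        raise ValueError("no atom pairs at the requested distance")
    numerator = sum(dev[i] * dev[j] for i, j in pairs) / len(pairs)
    denominator = sum(d * d for d in dev) / n   # zero if all weights are equal
    return numerator / denominator

# Four atoms in a chain with arbitrary property values (illustration only).
value = moran([1.0, 2.0, 3.0, 4.0], [(0, 1), (1, 2), (2, 3)], lag=2)
print(round(value, 6))
```

With `lag=2` and atomic Sanderson electronegativities as the weights, this quantity corresponds to the descriptor the paper calls MATS2e.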

Thus, the Moran autocorrelation-lag 2/weighted by atomic Sanderson electronegativity can be regarded as a measure of the electronegativity of molecules. The last descriptor is radial distribution function-6.0/unweighted. The 3D coordinates of the atoms of molecules can be transformed into a structure code that has a fixed number of descriptors irrespective of the size of a molecule. This task is performed by a structure coding technique referred to as the radial distribution function code (RDF code) [40]. In general, there are some prerequisites for a structure code: independence from the number of atoms, that is, the size of a molecule; unambiguity regarding the three-dimensional arrangement of the atoms; and invariance against translation and rotation of the entire molecule.

Formally, the radial distribution function of an ensemble of $N$ atoms can be interpreted as the probability distribution of finding an atom in a spherical volume of radius $R$ [41]. The following equation represents the radial distribution function code as it is used in this investigation:

$$ g(R) = f \cdot \sum_{i}^{N-1} \sum_{j>i}^{N} A_i \cdot A_j \cdot e^{-B (R - r_{ij})^2}, \tag{6} $$

where $f$ is a scaling factor and $N$ is the number of atoms. By including characteristic atomic properties $A$ of the atoms $i$ and $j$, the RDF codes can be used in different tasks to fit the requirements of the information to be represented. The exponential term contains the distance $r_{ij}$ between the atoms $i$ and $j$ and the smoothing parameter $B$, which defines the probability distribution of the individual distances. $g(R)$ was calculated at a number of discrete points with defined intervals.
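Equation (6) translates directly into a few lines of NumPy. In the sketch below the function name, the default smoothing parameter, and the toy coordinates are assumptions for illustration; setting all atomic properties to 1 gives the unweighted code, and evaluating it at R = 6.0 Å would correspond to the kind of quantity named RDF060u in the text.

```python
import numpy as np

def rdf_code(coords, radii, smoothing=100.0, scale=1.0, weights=None):
    """Radial distribution function code g(R) evaluated at each radius in `radii`.

    coords: (N, 3) atomic coordinates; weights: atomic properties A_i
    (all ones if None, i.e. the unweighted code); smoothing: parameter B.
    """
    coords = np.asarray(coords, dtype=float)
    radii = np.asarray(radii, dtype=float)
    n_atoms = len(coords)
    a = np.ones(n_atoms) if weights is None else np.asarray(weights, dtype=float)
    i, j = np.triu_indices(n_atoms, k=1)               # all atom pairs with j > i
    r_ij = np.linalg.norm(coords[i] - coords[j], axis=1)
    a_ij = a[i] * a[j]
    terms = a_ij[None, :] * np.exp(-smoothing * (radii[:, None] - r_ij[None, :]) ** 2)
    return scale * terms.sum(axis=1)

# Two atoms 1.0 apart (toy geometry); B = 1.0 is chosen only to keep the numbers simple.
g = rdf_code([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]], radii=[1.0, 2.0], smoothing=1.0)
print(g)
```

Because only interatomic distances enter the sum, the code is unchanged by translating or rotating the molecule, which is one of the prerequisites listed above.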

The atomic properties $A_i$ and $A_j$ used in this equation enable the discrimination of the atoms of a molecule for almost any property that can be attributed to an atom. Besides information about the interatomic distances in a whole molecule, such a distribution function gives access to other valuable information, for example, bond distances, ring types, planar and nonplanar systems, and atom types. This fact is a most valuable consideration for a computer-assisted code elucidation. The radial distribution function in this form meets all the requirements mentioned above, especially invariance against linear translations. RDF060u has a negative sign for its mean effect, and as molecules become larger their RDF value at a radius of 6 Å decreases. For example, Tebufenpyrad (compound 250), which has a larger structure than Pyrimitate (compound 224), has a lower RDF value (18.79) than Pyrimitate (29.76), so the retention time of Tebufenpyrad (29.06 min) is higher than that of Pyrimitate (20.59 min).

From the above discussion, it can be seen that all descriptors involved in the QSRR model have a physical meaning and that these descriptors can account for the structural features that affect the retention time of the studied pesticides.

3.2. Comparison of Neural Network and MLR Models

A graphical comparison of the ANN and MLR analyses is given in Figure 4, where the retention times calculated by the respective models are plotted against the experimental values. The statistical parameters obtained with the ANN and MLR models for the three sets are shown in Table 4. The standard errors of the training, test, and validation sets for the MLR model are 2.402, 1.858, and 2.036, respectively, which should be compared with the values of 1.559, 1.517, and 1.249, respectively, for the ANN model. Comparison of these values and the other statistical parameters in Table 4 reveals the superiority of the ANN model over the MLR model. Figure 5 shows the residuals of the ANN model plotted against the experimental retention times. Since the residuals are distributed on both sides of the zero line, there is no systematic error in the ANN model.
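The kind of comparison described above can be sketched with two simple error measures. The example uses the first five rows of the test set in Table 1 (compounds 257 to 261) only to show the calculation; five compounds cannot reproduce the statistics in Table 4, and the root-mean-square error used here is an assumption, since the exact formula behind the reported standard errors is not stated.

```python
import numpy as np

# Compounds 257-261 from Table 1 (retention times in min).
t_exp = np.array([30.70, 25.14, 28.43, 30.06, 15.23])
t_mlr = np.array([29.14, 19.72, 27.78, 30.42, 16.21])
t_ann = np.array([30.26, 24.40, 29.74, 31.71, 14.16])

def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    return float(np.corrcoef(observed, predicted)[0, 1])

residuals_ann = np.round(t_exp - t_ann, 2)   # should match the last column of Table 1
print(residuals_ann)
print(rmse(t_exp, t_mlr), rmse(t_exp, t_ann))
print(pearson_r(t_exp, t_mlr), pearson_r(t_exp, t_ann))
```

Plotting `t_exp - t_ann` against `t_exp` for a whole set gives the kind of residual plot described for Figure 5.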


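Table 4: Statistical parameters obtained with the ANN and MLR models.

Model | SE_tr | SE_t | SE_v | R_tr | R_t | R_v | F_tr | F_t | F_v
ANN | 1.559 | 1.517 | 1.249 | 0.969 | 0.951 | 0.971 | 3928 | 188 | 334
MLR | 2.402 | 1.858 | 2.036 | 0.925 | 0.924 | 0.922 | 1508 | 118 | 113

tr: training, t: test, v: validation.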

4. Conclusions

Few structure-activity relationships involving pesticides have been published. In this study, we used MLR and ANN to predict the retention times of 300 pesticides with different molecular structures. The results demonstrate that the QSRR method using ANN techniques can generate a suitable model for the prediction of the gas chromatographic retention of pesticides.

The results obtained in this work also indicate that both the regression and the ANN models exhibit reasonable prediction capabilities. The descriptors that appear in the QSRR models reveal that electronic interactions as well as steric parameters can affect the gas chromatographic retention time of pesticides.

References

  1. M. Sakamoto and T. Tsutsumi, “Applicability of headspace solid-phase microextraction to the determination of multi-class pesticides in waters,” Journal of Chromatography A, vol. 1028, no. 1, pp. 63–74, 2004.
  2. F. Hernández, O. J. Pozo, J. V. Sancho, L. Bijlsma, M. Barreda, and E. Pitarch, “Multiresidue liquid chromatography tandem mass spectrometry determination of 52 non gas chromatography-amenable pesticides and metabolites in different food commodities,” Journal of Chromatography A, vol. 1109, no. 2, pp. 242–252, 2006.
  3. C. Gonçalves and M. F. Alpendurada, “Solid-phase micro-extraction—gas chromatography—(tandem) mass spectrometry as a tool for pesticide residue analysis in water samples at high sensitivity and selectivity with confirmation capabilities,” Journal of Chromatography A, vol. 1026, no. 1-2, pp. 239–250, 2004.
  4. L. Alder, S. Lüderitz, K. Lindtner, and H. J. Stan, “The ECHO technique—the more effective way of data evaluation in liquid chromatography-tandem mass spectrometry analysis,” Journal of Chromatography A, vol. 1058, no. 1-2, pp. 67–79, 2004.
  5. A. R. Katritzky, V. S. Lobanov, and M. Karelson, “QSPR: the correlation and quantitative prediction of chemical and physical properties from structure,” Chemical Society Reviews, vol. 24, no. 4, pp. 279–287, 1995.
  6. R. Kaliszan, Structure and Retention in Chromatography: A Chemometric Approach, Harwood Academic, Amsterdam, The Netherlands, 1997.
  7. C. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, Oxford, UK, 1995.
  8. L. Fausett, Fundamentals of Neural Networks, Prentice Hall, New York, NY, USA, 1994.
  9. G. C. Looney, Pattern Recognition Using Neural Networks, Oxford University Press, New York, NY, USA, 1997.
  10. D. E. Rumelhart and J. L. McClelland, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, MIT Press, Cambridge, Mass, USA, 1986.
  11. S. Goll and P. Jurs, “Prediction of vapor pressures of hydrocarbons and halohydrocarbons from molecular structure with a computational neural network model,” Journal of Chemical Information and Computer Sciences, vol. 39, no. 6, pp. 1081–1089, 1999.
  12. J. Tetteh, T. Suzuki, E. Metcalfe, and S. Howells, “Quantitative structure-property relationships for the estimation of boiling point and flash point using a radial basis function neural network,” Journal of Chemical Information and Computer Sciences, vol. 39, no. 3, pp. 491–507, 1999.
  13. J. Zupan and J. Gasteiger, Neural Networks for Chemists: An Introduction, Wiley-VCH, Weinheim, Germany, 1993.
  14. J. A. Burns and G. Whitesides, “Feed-forward neural networks in chemistry: mathematical systems for classification and pattern recognition,” Chemical Reviews, vol. 93, no. 8, pp. 2583–2601, 1993.
  15. D. Svozil, V. Kvasnicka, and J. Pospichal, “Introduction to multi-layer feed-forward neural networks,” Chemometrics and Intelligent Laboratory Systems, vol. 39, no. 1, pp. 43–62, 1997.
  16. S. Agatonovic-Kustrin, L. H. Ling, S. Y. Tham, and R. G. Alany, “Molecular descriptors that influence the amount of drugs transfer into human breast milk,” Journal of Pharmaceutical and Biomedical Analysis, vol. 29, no. 1-2, pp. 103–119, 2002.
  17. A. A. D'Archivio, F. Ruggieri, P. Mazzeo, and E. Tettamanti, “Modelling of retention of pesticides in reversed-phase high-performance liquid chromatography: quantitative structure-retention relationships based on solute quantum-chemical descriptors and experimental (solvatochromic and spin-probe) mobile phase descriptors,” Analytica Chimica Acta, vol. 593, no. 2, pp. 140–151, 2007.
  18. M. Aschi, A. A. D'Archivio, M. A. Maggi, P. Mazzeo, and F. Ruggieri, “Quantitative structure-retention relationships of pesticides in reversed-phase high-performance liquid chromatography,” Analytica Chimica Acta, vol. 582, no. 2, pp. 235–242, 2007.
  19. J. Ghasemi, S. Asadpour, and A. Abdolmaleki, “Prediction of gas chromatography/electron capture detector retention times of chlorinated pesticides, herbicides, and organohalides by multivariate chemometrics methods,” Analytica Chimica Acta, vol. 588, no. 2, pp. 200–206, 2007.
  20. G. Pang, Y. Cao, J. Zhang et al., “Validation study on 660 pesticide residues in animal tissues by gel permeation chromatography cleanup/gas chromatography—mass spectrometry and liquid chromatography—tandem mass spectrometry,” Journal of Chromatography A, vol. 1125, no. 1, pp. 1–30, 2006.
  21. HyperChem and Autodesk, “Release 3 for windows,” 1993.
  22. M. J. S. Dewar, E. G. Zoebisch, E. F. Healy, and J. J. Stewart, “Development and use of quantum mechanical molecular models. 76. AM1: a new general purpose quantum mechanical molecular model,” Journal of the American Chemical Society, vol. 107, no. 13, pp. 3902–3909, 1985.
  23. R. Todeschini, V. Consonni, and M. Pavan, Dragon software version 3.0, Via V. Pisani 13, 20124 Milano, Italy.
  24. N. Draper and H. Smith, Applied Regression Analysis, Wiley Interscience, New York, NY, USA, 2nd edition, 1981.
  25. S. Haykin, Neural Network, Prentice-Hall, Englewood Cliffs, NJ, USA, 1994.
  26. J. Zupan and J. Gasteiger, Neural Networks in Chemistry and Drug Design, Wiley-VCH, Weinheim, Germany, 1999.
  27. N. K. Bose and P. Liang, Neural Network, Fundamentals, McGraw-Hill, New York, NY, USA, 1996.
  28. L. S. Anker and P. C. Jurs, “Prediction of carbon-13 nuclear magnetic resonance chemical shifts by artificial neural networks,” Analytical Chemistry, vol. 64, no. 10, pp. 1157–1164, 1992.
  29. M. T. Beal, H. B. Hagan, and M. Demuth, Neural Network Design, PWS, Boston, Mass, USA, 1996.
  30. J. Zupan and J. Gasteiger, Neural Networks for Chemists: An Introduction, Wiley-VCH, Weinheim, Germany, 1993.
  31. P. K. Hopke and X. Song, “Classification of single particles by neural networks based on the computer-controlled scanning electron microscopy data,” Analytica Chimica Acta, vol. 348, no. 1–3, pp. 375–388, 1997.
  32. S. Haykin, Neural Networks: A Comprehensive Foundation, vol. 1, Pearson Education, Saddle River, NJ, USA, 2nd edition, 1999.
  33. T. Masters, Practical Neural Network Recipes in C++, Academic Press, 1993.
  34. M. Jalali-Heravi and M. H. Fatemi, “Prediction of flame ionization detector response factors using an artificial neural network,” Journal of Chromatography A, vol. 825, no. 2, pp. 161–169, 1998.
  35. M. Jalali-Heravi and M. H. Fatemi, “Prediction of thermal conductivity detection response factors using an artificial neural network,” Journal of Chromatography A, vol. 897, no. 1-2, pp. 227–235, 2000.
  36. M. H. Fatemi, M. Jalali-Heravi, and E. Konuze, “Predictions of bioconcentration factors using genetic algorithm and artificial neural network,” Analytica Chimica Acta, vol. 486, pp. 101–108, 2003.
  37. R. Todeschini and V. Consonni, Handbook of Molecular Descriptors, Wiley-VCH, Weinheim, Germany, 2000.
  38. A. T. Balaban and T. S. Balaban, “New vertex invariants and topological indices of chemical graphs based on information on distances,” Journal of Mathematical Chemistry, vol. 8, no. 1, pp. 383–397, 1991.
  39. P. A. P. Moran, “Notes on continuous stochastic phenomena,” Biometrika, vol. 37, pp. 17–23, 1950.
  40. J. Gasteiger, J. Sadowski, J. Schuur, P. Selzer, L. Steinhauer, and V. Steinhauer, “Chemical information in 3D space,” Journal of Chemical Information and Computer Sciences, vol. 36, no. 5, pp. 1030–1037, 1996.
  41. M. C. Hemmer, V. Steinhauer, and J. Gasteiger, “Deriving the 3D structure of organic molecules from their infrared spectra,” Vibrational Spectroscopy, vol. 19, no. 1, pp. 151–164, 1999.

Copyright © 2013 Elaheh Konoz et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

