Research Article  Open Access
A Comparison of Second-Order Calibration Methods Applied to Excitation-Emission Matrix Fluorescence Data
Abstract
Because modern instruments generate a wide variety of second-order data and many mathematical algorithms are available for analyzing them, second-order calibration is gaining widespread acceptance in the analytical community. It offers the so-called second-order advantage; that is, it enables the concentration and spectral profiles of sample components to be extracted even in the presence of unexpected interferences. This paper presents a comprehensive performance comparison of alternating trilinear decomposition (ATLD) and its two variants, alternating penalty trilinear decomposition (APTLD) and self-weighted alternating trilinear decomposition (SWATLD). The experiment was based on the simultaneous determination of three dihydroxybenzenes, namely catechol, hydroquinone, and resorcinol, by excitation-emission matrix fluorescence (EEMF) spectroscopy. Two measures, the consistency (COS) between the resolved and actual profiles and the mean recovery, were used for evaluation. The best result was obtained by the APTLD model with five components. No perceptible difference in convergence speed was found. The results indicate that EEMF combined with the APTLD algorithm can serve as a potential tool for quantifying dihydroxybenzenes simultaneously in environmental samples.
1. Introduction
Determining analytes of interest in a complex matrix is a challenging task in many fields. The traditional practice is to resort to time-consuming, laborious, and costly physical or chemical separation. Often, the separation may disturb the equilibrium that exists in the mixture and can therefore mislead subsequent quantification. With the development of modern second-order instruments capable of generating matrix signals, a number of new analytical methods have become available [1–5]. In recent years, the fluorescent properties of many substances have been widely exploited for analytical purposes owing to progress in fluorescence detection. However, selectivity may be a problem because of heavy spectral overlap and interferences [6]. Fluorescence detection combined with chemometrics has greatly changed this situation. One possible strategy consists of collecting the excitation-emission matrix fluorescence and extracting useful information with second-order calibration algorithms. More specifically, second-order calibration includes two main steps: decomposing a three-way array into three matrices and building a regression equation between the resolved relative concentrations of the analytes of interest and the corresponding actual concentrations. Such a procedure has a property named the second-order advantage [7], which enables the concentrations and spectral profiles of sample components to be extracted even in the presence of any number of unknown constituents. It also makes it possible to quantify several components simultaneously. Such strategies have been successfully applied in many analytical fields, for example, food, pharmaceutical, environmental, and biomedical matrices.
For analytical applications based on second-order calibration, the choice of algorithm is decisive. Available algorithms can be classified into three types [8]. The first type is based on generalized eigenanalysis, such as the generalized rank annihilation method (GRAM) [9] and direct trilinear decomposition (DTLD) [10]. However, GRAM is constrained to use only one calibration sample and one unknown sample at a time. DTLD allows a direct decomposition of multiple samples, but it requires the construction of two pseudosamples. Both algorithms perform well only when the signal-to-noise ratio is high; otherwise, imaginary solutions can be observed. The second type comprises iterative algorithms, such as classic parallel factor analysis (PARAFAC), proposed by Harshman [11] and Bro [12], and alternating trilinear decomposition (ATLD), proposed by Wu et al. [13]. These algorithms use different loss functions to obtain the second-order advantage. Although PARAFAC has been successfully applied to many chemical problems, it may lead to chemically meaningless solutions in certain cases, and its solutions sometimes become unstable, especially when the factor number is not appropriate. Besides, PARAFAC often suffers from slow convergence. By using the Moore-Penrose generalized inverse computed from a truncated singular value decomposition and extracting the diagonal elements, ATLD makes the results insensitive to the component number. Also, because the calculation is implemented on slice matrices, convergence is relatively fast. In ATLD, all factors related to the diagonal elements influence the resolved results. Subsequently, its modified versions, alternating penalty trilinear decomposition (APTLD) [14] and self-weighted alternating trilinear decomposition (SWATLD) [15], were developed. Both algorithms have the advantages of being insensitive to the component/factor number and converging quickly.
The well-known algorithm named multivariate curve resolution-alternating least squares (MCR-ALS) [16] also belongs to the iterative type. In essence, MCR-ALS is a bilinear method and can be used for a three-way data array only when the matrix signal of each sample obeys the bilinear model. Similarly, the other algorithms require the data to obey the trilinearity condition. The last type rearranges the high-order array into vectors and applies a first-order algorithm; it includes unfolded principal component regression (U-PCR) and unfolded partial least squares (U-PLS) [17]. These algorithms were used to handle second-order data before the true second-order algorithms were developed. The popular multiway partial least squares (N-PLS) [18] is a genuine multiway algorithm. However, neither U-PLS nor N-PLS exploits the second-order advantage. Although many algorithms are available, it is important to be able to select a second-order calibration method that is appropriate for the task at hand.
In the present work, a comprehensive performance comparison of three second-order calibration algorithms, that is, ATLD and its variants (APTLD and SWATLD), is presented. The experiment was based on the simultaneous determination of three dihydroxybenzenes (catechol, hydroquinone, and resorcinol) in water samples by excitation-emission matrix fluorescence (EEMF) spectroscopy. Two measures, the consistency (COS) between the resolved and actual profiles and the mean recovery, were used for evaluation. The best result was obtained by the APTLD model with five components. No perceptible difference in convergence speed was found. The results indicate that EEMF combined with the APTLD algorithm can serve as a potential tool for quantifying dihydroxybenzenes simultaneously in environmental samples.
2. Theory and Methods
In second-order calibration, the trilinear model can be written as follows [2, 8]:

$$x_{ijk} = \sum_{n=1}^{N} a_{in} b_{jn} c_{kn} + e_{ijk}, \quad i = 1, \ldots, I;\; j = 1, \ldots, J;\; k = 1, \ldots, K. \tag{1}$$

Here $N$ denotes the number of factors/components, which is related to the number of detectable species, including the components of interest, background, and interferences. $I$, $J$, and $K$ denote the numbers of excitation wavelengths, emission wavelengths, and samples, respectively. $x_{ijk}$ is an element of the three-way array $\underline{\mathbf{X}}$ with dimensions $I \times J \times K$, and $e_{ijk}$ is the corresponding element of the three-way residual array $\underline{\mathbf{E}}$. $a_{in}$, $b_{jn}$, and $c_{kn}$ are elements of the matrices $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$, with dimensions $I \times N$, $J \times N$, and $K \times N$, which hold the excitation profiles, emission profiles, and relative concentration profiles, respectively. Second-order calibration consists of two main steps: (1) decomposing the three-way data array to produce three matrices and (2) regressing the relative concentrations of the components of interest against the reference concentrations. Different strategies for decomposing the array lead to different second-order calibration algorithms.
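The trilinear structure can be illustrated with a short Python sketch. It assumes the usual form of the model, in which each array element is a sum over components of the product of one excitation loading, one emission loading, and one relative concentration. The array sizes (19 excitation wavelengths, 136 emission wavelengths, 30 samples, 3 components) and the random profiles are illustrative placeholders, not data from this study, and the function name `trilinear_array` is hypothetical.

```python
import numpy as np


def trilinear_array(A, B, C):
    """Return X with X[i, j, k] = sum_n A[i, n] * B[j, n] * C[k, n]."""
    return np.einsum("in,jn,kn->ijk", A, B, C)


# Illustrative sizes only (assumed for this sketch).
rng = np.random.default_rng(0)
n_ex, n_em, n_samples, n_comp = 19, 136, 30, 3
A = rng.random((n_ex, n_comp))       # excitation profiles
B = rng.random((n_em, n_comp))       # emission profiles
C = rng.random((n_samples, n_comp))  # relative concentrations

X = trilinear_array(A, B, C)

# Each frontal slice (one sample) is a bilinear matrix A diag(c_k) B^T.
k = 4
slice_k = A @ np.diag(C[k]) @ B.T
print(X.shape, np.allclose(X[:, :, k], slice_k))
```

Written this way, every sample contributes one excitation-by-emission matrix that shares the same spectral profiles and differs only in its row of relative concentrations.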
2.1. ATLD Algorithm
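By alternately converting the three-way array to matrix form, Wu et al. [13] developed the alternating trilinear decomposition (ATLD) algorithm, which employs the following loss function to compute $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ in the true trilinear sense:

$$\sigma(\mathbf{A}, \mathbf{B}, \mathbf{C}) = \sum_{i=1}^{I} \left\| \mathbf{X}_{i..} - \mathbf{B}\,\mathrm{diag}(\mathbf{a}_{(i)})\,\mathbf{C}^{T} \right\|_F^2 = \sum_{j=1}^{J} \left\| \mathbf{X}_{.j.} - \mathbf{C}\,\mathrm{diag}(\mathbf{b}_{(j)})\,\mathbf{A}^{T} \right\|_F^2 = \sum_{k=1}^{K} \left\| \mathbf{X}_{..k} - \mathbf{A}\,\mathrm{diag}(\mathbf{c}_{(k)})\,\mathbf{B}^{T} \right\|_F^2, \tag{2}$$

where $\mathbf{X}_{i..}$, $\mathbf{X}_{.j.}$, and $\mathbf{X}_{..k}$ denote the $i$th horizontal, $j$th lateral, and $k$th frontal slices of $\underline{\mathbf{X}}$, respectively; the slices $\mathbf{E}_{i..}$, $\mathbf{E}_{.j.}$, and $\mathbf{E}_{..k}$ of $\underline{\mathbf{E}}$ are defined similarly; $\mathbf{a}_{(i)}$, $\mathbf{b}_{(j)}$, and $\mathbf{c}_{(k)}$ are the $i$th, $j$th, and $k$th rows of $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$; and $\mathrm{diag}(\cdot)$ builds a diagonal matrix from the given elements. By minimizing (2), the updates of $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ can be obtained as

$$\begin{aligned} \mathbf{a}_{(i)}^{T} &= \mathrm{diagm}\!\left( \mathbf{B}^{+} \mathbf{X}_{i..} (\mathbf{C}^{T})^{+} \right), \quad i = 1, \ldots, I, \\ \mathbf{b}_{(j)}^{T} &= \mathrm{diagm}\!\left( \mathbf{C}^{+} \mathbf{X}_{.j.} (\mathbf{A}^{T})^{+} \right), \quad j = 1, \ldots, J, \\ \mathbf{c}_{(k)}^{T} &= \mathrm{diagm}\!\left( \mathbf{A}^{+} \mathbf{X}_{..k} (\mathbf{B}^{T})^{+} \right), \quad k = 1, \ldots, K, \end{aligned} \tag{3}$$

where the superscript $+$ denotes the Moore-Penrose generalized inverse.
The function $\mathrm{diagm}(\cdot)$ extracts the diagonal elements of a matrix and arranges them as a column vector. Strictly speaking, (3) is not the least-squares solution of (2) but a strategy for calculating $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$. The ATLD algorithm focuses on extracting the trilinear part of the three-way data, which makes the iterative procedure more efficient. The combination of the generalized inverse based on truncated singular value decomposition with the diagm operation makes ATLD insensitive to the component number. Also, calculating on slice matrices needs less memory and considerably reduces the computational load. In ATLD, any factor that influences the diagonal elements has a corresponding influence on the results.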
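The slice-wise updates described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Matlab programs used in this work: each row of a profile matrix is taken as the diagonal of a slice multiplied on both sides by generalized inverses of the other two profile matrices; `np.linalg.pinv` (an SVD-based pseudoinverse) stands in for the truncated-SVD inverse; the columns of the two spectral matrices are normalized after each update; and the stopping rule, the function names `atld_sweep` and `atld`, and the synthetic noise-free data are all assumptions of this sketch.

```python
import numpy as np


def _unit_columns(M):
    """Scale each column to unit Euclidean norm (guarding against zero norm)."""
    norms = np.linalg.norm(M, axis=0, keepdims=True)
    return M / np.maximum(norms, np.finfo(float).tiny)


def atld_sweep(X, B, C):
    """One ATLD-style sweep. X is (I, J, K), B is (J, N), C is (K, N)."""
    PB, PC = np.linalg.pinv(B), np.linalg.pinv(C)
    # Row i of A: diagonal of B^+ X[i, :, :] (C^T)^+, for all i at once.
    A = _unit_columns(np.einsum("nj,ijk,nk->in", PB, X, PC, optimize=True))
    PA = np.linalg.pinv(A)
    # Row j of B: diagonal of C^+ X[:, j, :]^T (A^T)^+.
    B = _unit_columns(np.einsum("nk,ijk,ni->jn", PC, X, PA, optimize=True))
    PB = np.linalg.pinv(B)
    # Row k of C: diagonal of A^+ X[:, :, k] (B^T)^+; C carries the scale.
    C = np.einsum("ni,ijk,nj->kn", PA, X, PB, optimize=True)
    return A, B, C


def atld(X, n_components, n_iter=200, tol=1e-10, seed=0):
    """Repeat sweeps from a random start until the residual stops changing."""
    rng = np.random.default_rng(seed)
    _, J, K = X.shape
    B = rng.random((J, n_components))
    C = rng.random((K, n_components))
    previous = np.inf
    for _ in range(n_iter):
        A, B, C = atld_sweep(X, B, C)
        residual = np.sum((X - np.einsum("in,jn,kn->ijk", A, B, C)) ** 2)
        if abs(previous - residual) <= tol * max(residual, 1.0):
            break
        previous = residual
    return A, B, C


# Synthetic, noise-free trilinear data (illustrative sizes, not the study's data).
rng = np.random.default_rng(1)
A_true = _unit_columns(rng.random((19, 3)))
B_true = _unit_columns(rng.random((136, 3)))
C_true = rng.random((30, 3))
X = np.einsum("in,jn,kn->ijk", A_true, B_true, C_true)

# Starting from the true profiles, one sweep should leave them unchanged.
A1, B1, C1 = atld_sweep(X, B_true, C_true)
print(np.allclose(A1, A_true), np.allclose(B1, B_true), np.allclose(C1, C_true))

# Starting from random profiles; the relative residual depends on the run.
A_hat, B_hat, C_hat = atld(X, n_components=3)
X_hat = np.einsum("in,jn,kn->ijk", A_hat, B_hat, C_hat)
print(np.linalg.norm(X - X_hat) / np.linalg.norm(X))
```

The first check only confirms that the true profiles are a fixed point of the updates for noise-free data; it says nothing about behaviour on real spectra with noise and interferences.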
2.2. APTLD Algorithm
Based on the alternating least-squares principle and alternating penalty constraints, Xia et al. [14] developed the alternating penalty trilinear decomposition (APTLD) algorithm. In APTLD, the authors use three new objective functions, each of which introduces a penalty term. Taking the calculation of one profile matrix as an example, the loss function given in [14] combines a least-squares residual term with penalty terms; in it, $\sqrt{\cdot}$ is the square-root operator and $\mathbf{1}$ is a vector with all elements equal to one. The loss functions for the other two profile matrices are built similarly. By minimizing these alternating penalty errors simultaneously, the intrinsic profiles are obtained. Compared to the traditional parallel factor analysis (PARAFAC) algorithm, APTLD can avoid the two-factor degeneracy problem and speed up convergence. APTLD has also been found to be insensitive to the estimated component number, thus avoiding the difficulty of finding the correct component number for a model. In general, as long as the component number is not less than the actual number of components, APTLD can perform well.
2.3. SWATLD Algorithm
Self-weighted alternating trilinear decomposition (SWATLD) is a second-order calibration algorithm proposed by Chen et al. [15]. SWATLD is derived from ATLD and is based on the same idea. Each row of a profile matrix is updated with a self-weighted equation that leads to a least-squares solution; a proof is available in [8]. SWATLD is not only insensitive to the component number but also converges very quickly. In addition, its built-in way of updating makes the final solution more stable than that of ATLD.
2.4. Figures of Merit
To compare the results of different second-order calibration algorithms, two figures of merit are used: the consistency (COS) between the resolved and actual profiles, and the recovery. Second-order calibration algorithms provide qualitative information such as the excitation and emission spectra, which is very important because a compound of interest can be identified through its spectrum. The COS value is computed from the reference profiles of a component and the corresponding estimated profiles. The higher the value of COS is, the closer the resolved profile is to the real one.
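The two figures of merit, together with the regression step of second-order calibration, can be illustrated with the Python sketch below. Several points are assumptions of this sketch rather than definitions taken from this paper: the consistency is taken here as the cosine of the angle between a reference profile and an estimated profile; the recovery is taken as the predicted concentration expressed as a percentage of the actual one, skipping samples with zero actual concentration; and the calibration is an ordinary straight-line fit. The function names and all numbers are hypothetical.

```python
import numpy as np


def cosine_consistency(reference, estimated):
    """Cosine of the angle between two profiles (assumed form of COS)."""
    reference = np.asarray(reference, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    denom = np.linalg.norm(reference) * np.linalg.norm(estimated)
    return float(reference @ estimated / denom)


def fit_calibration(relative, reference):
    """Fit reference = slope * relative + intercept on calibration samples."""
    slope, intercept = np.polyfit(relative, reference, 1)
    return slope, intercept


def mean_recovery(actual, predicted):
    """Mean of 100 * predicted / actual over samples with nonzero actual value."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    keep = actual > 0  # recovery is undefined when the analyte is absent
    return float(np.mean(100.0 * predicted[keep] / actual[keep]))


# Hypothetical numbers, for illustration only.
wavelengths = np.arange(230, 501, 2)
reference_profile = np.exp(-0.5 * ((wavelengths - 320) / 15.0) ** 2)
estimated_profile = 2.0 * reference_profile  # same shape, different scale
cos_value = cosine_consistency(reference_profile, estimated_profile)

relative_cal = np.array([0.1, 0.2, 0.3])
reference_cal = 2.0 * relative_cal + 1.0
slope, intercept = fit_calibration(relative_cal, reference_cal)

recovery = mean_recovery([2.0, 4.0, 0.0], [1.9, 4.2, 0.1])
print(cos_value, slope, intercept, recovery)
```

A scale-invariant measure such as the cosine suits resolved profiles, because trilinear decomposition determines each profile only up to a scaling factor.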
3. Experimental and Data
A total of thirty samples containing different quantities of three dihydroxybenzenes (catechol, hydroquinone, and resorcinol) were analyzed by excitation-emission fluorescence spectroscopy. Indole was added as the interference. All reagents and chemicals were of analytical reagent grade. The concentration range of each analyte of interest was 0–9 × 10^{−5} mol/L. The concentration of indole was varied randomly in the range of 0–2 × 10^{−5}. One half of the samples were used as calibration samples and the other half as prediction samples. All response matrices were recorded on a PerkinElmer fluorescence spectrophotometer, with excitation wavelengths varying from 230 to 320 nm at intervals of 5 nm and emission wavelengths varying from 230 to 500 nm at intervals of 2 nm. A 1 cm quartz cell was used for all measurements. The effect of Rayleigh scattering on the response matrices was roughly reduced by subtracting the response matrix of an average blank solution from the response matrices of all samples. Both the excitation and emission monochromator slit widths were 5 nm. For each sample, a matrix spectrum of size 136 × 19 (emission × excitation) was recorded. All programs were implemented in the Matlab environment on a personal computer running Windows XP.
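As a quick check of the data layout, the following Python sketch rebuilds the wavelength grids from the stated ranges and intervals, subtracts an average blank, and stacks the samples into calibration and prediction arrays. The random numbers stand in for measured spectra, the number of blank measurements is arbitrary, and the first-half/second-half split is an assumption, since the text does not say how the two halves were chosen.

```python
import numpy as np

# Wavelength grids from the stated ranges and intervals.
excitation = np.arange(230, 321, 5)  # 230 to 320 nm, every 5 nm
emission = np.arange(230, 501, 2)    # 230 to 500 nm, every 2 nm
print(len(emission), len(excitation))  # 136 19

# Placeholder data: 30 sample matrices and a few blank matrices.
rng = np.random.default_rng(0)
samples = rng.random((30, len(emission), len(excitation)))
blanks = rng.random((3, len(emission), len(excitation)))

# Subtract the average blank matrix from every sample matrix.
corrected = samples - blanks.mean(axis=0)

# Assumed split: first 15 samples for calibration, last 15 for prediction.
# The sample axis is moved last so that each array is emission x excitation x sample.
X_cal = np.moveaxis(corrected[:15], 0, -1)
X_pred = np.moveaxis(corrected[15:], 0, -1)
print(X_cal.shape, X_pred.shape)
```

The grid lengths reproduce the 136 × 19 matrix size reported for each sample.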
4. Results and Discussion
In a second-order instrument, the response signal for a single chemical sample is a matrix, which can be visualized as a two-dimensional surface or landscape. When a group of samples is considered, all second-order data can be stacked into a single three-dimensional array. Therefore, second-order data are also called a three-way array. As an example, Figure 1 shows the excitation-emission matrix fluorescence spectrum of a mixture. The excitation and emission wavelengths range from 230 to 320 nm and from 230 to 500 nm, respectively. These wavelength ranges were selected to cover the regions of maximum signal for the components of interest while avoiding useless background signals, including Rayleigh and Raman scattering. Figure 2 gives the contours of the excitation-emission matrix fluorescence spectra of the pure components of interest and a representative mixture. The spectral overlap of the pure components of interest is very significant. Such heavy overlap hinders direct fluorescence quantification and restricts the use of univariate calibration. A modern strategy for overcoming this problem is to resort to second-order calibration, which opens a new avenue by replacing physical or chemical separation with mathematical separation, that is, by extracting the signal of the components of interest from that of the background or interferences.
Figure 3 gives the excitation and emission profiles of samples of the three pure components of interest. As seen in Figure 3, the spectral overlap is very obvious. Three second-order calibration algorithms, ATLD, SWATLD, and APTLD, were used to resolve the spectral and concentration profiles. To develop these models, it is necessary to determine the number of components/factors, and several ways are available for this purpose. In this study, the core consistency diagnostic was applied [19]. When a sequence of second-order calibration models was constructed with an increasing number of components, the core consistency tended to start high and then drop abruptly at the point where too many components were used. The optimal number of components was taken as that of the largest model with a high core consistency. According to the results, the optimal number was five for each of these algorithms. Figure 4 displays the resolved and actual excitation and emission profiles for the three kinds of models with the same number of components. It can be observed in Figure 4 that both the SWATLD and APTLD algorithms work well, with no significant difference between them. However, the ATLD algorithm struggles to give satisfactory results.
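The selection rule described above (take the largest model whose core consistency is still high) can be written as a few lines of Python. The threshold of 60% and the core consistency values below are invented for illustration; they are not the values obtained in this study, and the helper name `select_components` is hypothetical.

```python
def select_components(core_consistency, threshold=60.0):
    """Return the largest component number whose core consistency is high.

    core_consistency maps a number of components to a core consistency value
    in percent; threshold is an assumed cut-off for "high".
    """
    eligible = [n for n, value in core_consistency.items() if value >= threshold]
    if not eligible:
        raise ValueError("no model reaches the core consistency threshold")
    return max(eligible)


# Made-up values showing the typical pattern: high, then an abrupt drop.
example = {1: 100.0, 2: 98.5, 3: 91.0, 4: 22.0, 5: 3.0}
print(select_components(example))  # 3
```

In practice the cut-off is a judgment call, so the abrupt drop in the sequence is usually more informative than any single threshold.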
To further analyze the sensitivity of the models to the selected number of components, models with 4–8 components were constructed, and the mean values of COS and recovery on the test set are summarized in Table 1; the results are consistent with the trend in Figure 4. Figure 5 displays the mean values of COS for the three algorithms and a varying number of components. When the number of components is 4, 5, or 6, the COS value of each algorithm remains almost the same. This is consistent with other studies reporting that, compared to PARAFAC, these second-order calibration algorithms are insensitive to the chosen component number and often work well provided that the number of components is not smaller than the actual number. However, it is also clear in Figure 5 that they still lead to a dangerously overfitted solution when too many components are allowed. Therefore, it should be remembered that the insensitivity to component number is conditional and limited, since using too many factors inevitably introduces spurious solutions.

Figure 6 compares the mean recovery obtained in quantifying the dihydroxybenzenes with the three algorithms and a varying number of components. As can be seen, on average, ATLD models containing 4–7 factors exhibit the same recovery, and the recovery of both the APTLD and SWATLD models is highest when five factors are used. In addition, a high COS value does not necessarily mean a high recovery; their trends are not entirely consistent. This indicates that more than one measure is necessary when comparing different algorithms or models.
5. Conclusions
Based on the second-order advantage, the combination of excitation-emission matrix fluorescence spectroscopy and second-order calibration was investigated for the simultaneous determination of three dihydroxybenzenes, and a comprehensive comparison of three algorithms was made. In terms of the two measures, that is, the consistency between the resolved and actual profiles (COS) and the mean recovery on the test set, the APTLD algorithm outperformed the others. The convergence rates of the different algorithms were similar. The results indicate that such an approach can be a promising alternative for practical application in environmental quality control.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (21375118), the Applied Basic Research Programs of Science and Technology Department of Sichuan Province of China (2013JY0101), the Yibin Municipal Innovation Foundation (2013GY018), the Innovative Research and Teaching Team Program of Yibin University (Cx201104), and the Scientific Research Foundation of Sichuan Provincial Education Department of China (12ZA201 and 13ZB0300).
References
[1] Y.-N. Li, H.-L. Wu, X.-D. Qing et al., “The maintenance of the second-order advantage: second-order calibration of excitation-emission matrix fluorescence for quantitative analysis of herbicide napropamide in various environmental samples,” Talanta, vol. 85, no. 1, pp. 325–332, 2011.
[2] G. M. Escandar, A. C. Olivieri, N. M. Faber, H. C. Goicoechea, A. Muñoz de la Peña, and R. J. Poppi, “Second- and third-order multivariate calibration: data, algorithms and applications,” Trends in Analytical Chemistry, vol. 26, no. 7, pp. 752–765, 2007.
[3] K. R. Murphy, C. A. Stedmon, D. Graeber, and R. Bro, “Fluorescence spectroscopy and multi-way techniques. PARAFAC,” Analytical Methods, vol. 5, no. 23, pp. 6557–6566, 2013.
[4] X. Zhang and R. Tauler, “Application of multivariate curve resolution alternating least squares (MCR-ALS) to remote sensing hyperspectral imaging,” Analytica Chimica Acta, vol. 762, pp. 25–38, 2013.
[5] V. Gómez and M. P. Callao, “Analytical applications of second-order calibration methods,” Analytica Chimica Acta, vol. 627, no. 2, pp. 169–183, 2008.
[6] M. C. Ortiz, L. A. Sarabia, M. S. Sánchez, and D. Giménez, “Identification and quantification of ciprofloxacin in urine through excitation-emission fluorescence and three-way PARAFAC calibration,” Analytica Chimica Acta, vol. 642, no. 1–2, pp. 193–205, 2009.
[7] H. A. L. Kiers and A. K. Smilde, “Some theoretical results on second-order calibration methods for data with and without rank overlap,” Journal of Chemometrics, vol. 9, pp. 179–195, 1995.
[8] Y.-J. Yu, H.-L. Wu, J.-F. Nie et al., “A comparison of several trilinear second-order calibration algorithms,” Chemometrics and Intelligent Laboratory Systems, vol. 106, no. 1, pp. 93–107, 2011.
[9] S. Li, J. C. Hamilton, and P. J. Gemperline, “Generalized rank annihilation method using similarity transformations,” Analytical Chemistry, vol. 64, no. 6, pp. 599–607, 1992.
[10] E. Sanchez and B. R. Kowalski, “Tensorial resolution: a direct trilinear decomposition,” Journal of Chemometrics, vol. 4, pp. 29–45, 1990.
[11] R. A. Harshman, UCLA Working Papers in Phonetics, vol. 1, 1970.
[12] R. Bro, “PARAFAC. Tutorial and applications,” Chemometrics and Intelligent Laboratory Systems, vol. 38, no. 2, pp. 149–171, 1997.
[13] H.-L. Wu, M. Shibukawa, and K. Oguma, “An alternating trilinear decomposition algorithm with application to calibration of HPLC-DAD for simultaneous determination of overlapped chlorinated aromatic hydrocarbons,” Journal of Chemometrics, vol. 12, no. 1, pp. 1–26, 1998.
[14] A.-L. Xia, H.-L. Wu, D.-M. Fang, Y.-J. Ding, L.-Q. Hu, and R.-Q. Yu, “Alternating penalty trilinear decomposition algorithm for second-order calibration with application to interference-free analysis of excitation-emission matrix fluorescence data,” Journal of Chemometrics, vol. 19, no. 2, pp. 65–76, 2005.
[15] Z.-P. Chen, H.-L. Wu, J.-H. Jiang, Y. Li, and R.-Q. Yu, “A novel trilinear decomposition algorithm for second-order linear calibration,” Chemometrics and Intelligent Laboratory Systems, vol. 52, no. 1, pp. 75–86, 2000.
[16] A. de Juan and R. Tauler, “Multivariate Curve Resolution (MCR) from 2000: progress in concepts and applications,” Critical Reviews in Analytical Chemistry, vol. 36, no. 3–4, pp. 163–176, 2006.
[17] A. C. Olivieri, “On a versatile second-order multivariate calibration method based on partial least-squares and residual bilinearization: second-order advantage and precision properties,” Journal of Chemometrics, vol. 19, no. 4, pp. 253–265, 2005.
[18] R. Bro, “Multiway calibration. Multilinear PLS,” Journal of Chemometrics, vol. 10, no. 1, pp. 47–61, 1996.
[19] R. Bro and H. A. L. Kiers, “A new efficient method for determining the number of components in PARAFAC models,” Journal of Chemometrics, vol. 17, no. 5, pp. 274–286, 2003.
Copyright
Copyright © 2014 Hui Chen et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.