Abstract

The qualitative and quantitative determination of the components of textile fibers plays an important role in quality control. A fast and nondestructive method for simultaneously analyzing four fiber components in blended fabrics was studied using near-infrared (NIR) spectroscopy combined with multivariate calibration. Two sample sets of 39 and 25 samples were designed by the simplex lattice mixture design method and used for the experiments. The four components are wool, polyester, polyacrylonitrile, and nylon, and their mixture is one of the most popular textile formulas. Uninformative variable elimination-partial least squares (UVEPLS) and full-spectrum partial least squares (PLS) were used as the calibration tools. On the test set, the mean standard error of prediction (SEP) and the mean ratio of the standard deviation of the response variable to SEP (RPD) were 0.38 and 7.6 for the full-spectrum PLS model and 0.32 and 8.3 for the UVEPLS model. This result shows that UVEPLS can construct local models with acceptable performance that is better than that of full-spectrum PLS. It indicates that this method is valuable for nondestructive analysis in the field of wool content detection since it avoids time-consuming, costly, and laborious wet chemical analysis.

1. Introduction

Blending fibers of different types is a common practice in the textile industry to obtain desired characteristics. According to the national standard of China, textile products must be labeled with fabric type and composition, and the quantitative composition is mandatory information [1]. Thus, determining the composition of a textile blend is a key issue in the textile industry. Current standard methods are mainly based on physical, chemical, or microscopic techniques; they are time-consuming and costly and often require undesirable chemicals to dissolve fibers [2, 3]. Alternatives to these methods, including various spectroscopic techniques, have shown considerable potential in recent years [4–6].

In particular, near-infrared (NIR) spectroscopy has shown great potential and gained wide acceptance in the food industry [7–9], drug industry [10–12], cigarette manufacturing [13], fuel processing [14], wood industry [15], etc. It is also a green method for multicomponent analysis of complex samples. Unlike conventional analytical methods, the NIR spectroscopic technique relies on multivariate models that correlate the spectral data with the index of interest, which makes it fast, nondestructive, and potentially capable of multicomponent analysis. It is also inexpensive and environment-friendly because it needs no solvents or reagents, thus avoiding a major expense. The NIR spectrum records signals from overtones and combinations of the fundamental molecular vibrations [16]. NIR spectral information is thus hardly selective, but quantitative analysis can be carried out with the aid of chemometrics. The main advantage of simultaneous multicomponent analysis is that several components in a mixture can be quantified without the prior separation that overlapped signals would otherwise make necessary.

In NIR-based quantitative applications, a reliable calibration model is of great importance, and its predictive performance directly determines whether it can be used in practice [17]. Partial least squares (PLS) is the most commonly used calibration algorithm since it is a full-spectrum method and can utilize information from the whole spectrum to construct a predictive model. Even so, both theoretical and experimental evidence has shown that efficient variable selection can significantly improve the performance of PLS and greatly reduce its complexity [18, 19]. Uninformative variable elimination (UVE) [20] is a variable selection method capable of eliminating variables that are no more informative for modeling than noise. When combined with PLS, it provides a way of constructing a simpler calibration model without loss of predictive ability. UVE is also widely used in NIR spectroscopy and has shown great advantage in eliminating uninformative spectral variables [21].

In the present work, a fast and nondestructive method for simultaneously analyzing four components in blended fabrics was studied using near-infrared (NIR) spectroscopy combined with multivariate calibration. Two sample sets of 39 and 25 samples were designed by the simplex lattice mixture design method and used as the training set and the independent test set, respectively. The four components are wool, polyester, polyacrylonitrile, and nylon, which represent one of the most popular textile formulas. Uninformative variable elimination-partial least squares (UVEPLS) and full-spectrum partial least squares (PLS) were used as the tools for variable selection and multivariate calibration. The results reveal that UVEPLS can construct local models with acceptable performance that is better than that of full-spectrum PLS. This indicates that the method can serve as a tool for fast and nondestructive analysis of fiber contents since it avoids time-consuming, costly, and laborious wet chemical analysis.

2. Theory and Methods

2.1. Partial Least Squares (PLS)

Partial least squares (PLS) [22, 23], one of the most widely used methods in multivariate calibration, aims at predicting a dependent variable, $\mathbf{Y}$, from a matrix of independent variables/predictors, $\mathbf{X}$, by projecting $\mathbf{X}$ and $\mathbf{Y}$ onto latent variable (LV) subspaces that maximize their covariance. It differs from principal component regression (PCR), which is a two-step process in which the projection stage is separate from and independent of the regression stage. PLS actively uses the information in $\mathbf{Y}$ for defining the latent variable subspaces. Indeed, PLS looks for components that compromise between explaining the variation in $\mathbf{X}$ and predicting the responses in $\mathbf{Y}$. This corresponds to a bilinear model as follows:

$$\mathbf{X} = \mathbf{T}\mathbf{P}^{T} + \mathbf{E}, \qquad \mathbf{Y} = \mathbf{T}\mathbf{Q}^{T} + \mathbf{F},$$

where $\mathbf{T}$ is a score matrix, $\mathbf{P}$ and $\mathbf{Q}$ are matrices of coefficients that relate to the independent variables and dependent variables, respectively, and $\mathbf{E}$ and $\mathbf{F}$ represent the corresponding residual matrices. PLS is a sequential algorithm: the latent variables are computed so that the first component is the direction of maximum covariance with the dependent variable, the second component is orthogonal to the first and has maximal residual covariance, and so on. The estimation of $\mathbf{T}$ can be obtained by the NIPALS algorithm as follows:

$$\mathbf{T} = \mathbf{X}\mathbf{W}\left(\mathbf{P}^{T}\mathbf{W}\right)^{-1},$$

where $\mathbf{W}$ is a matrix of weights of size $p \times A$, with $p$ the number of predictors and $A$ the number of latent variables. It is possible to obtain the general equation for prediction of $\mathbf{Y}$:

$$\hat{\mathbf{Y}} = \mathbf{X}\mathbf{B}, \qquad \mathbf{B} = \mathbf{W}\left(\mathbf{P}^{T}\mathbf{W}\right)^{-1}\mathbf{Q}^{T},$$

where $\mathbf{B}$ is the matrix of estimated regression coefficients. If only one dependent variable is considered, $\mathbf{Y}$ and $\mathbf{B}$ will be vectors. Algorithms differ mainly in how they calculate $\mathbf{B}$, and many are available for obtaining a satisfactory $\mathbf{B}$.
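To make the sequential nature of the algorithm concrete, the following is a minimal sketch of single-response PLS (PLS1) fitted by NIPALS, in which the regression coefficients are assembled from the weights, the predictor loadings, and the response loadings. It is an illustration under stated assumptions, not the code used in this work (the calculations here were done in MATLAB): the function names `pls1_nipals` and `pls1_predict` and the synthetic data are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def pls1_nipals(X, y, n_components):
    """Fit PLS1 by NIPALS on mean-centered data; return (b, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean          # residuals, deflated after each component
    n_vars = X.shape[1]
    W = np.zeros((n_vars, n_components))   # weights
    P = np.zeros((n_vars, n_components))   # predictor loadings
    q = np.zeros(n_components)             # response loadings
    for a in range(n_components):
        w = E.T @ f                        # direction of maximum covariance with y
        w /= np.linalg.norm(w)
        t = E @ w                          # scores of this latent variable
        tt = t @ t
        p = E.T @ t / tt
        q[a] = f @ t / tt
        E = E - np.outer(t, p)             # remove what this component explains in X
        f = f - q[a] * t                   # and in y
        W[:, a], P[:, a] = w, p
    b = W @ np.linalg.solve(P.T @ W, q)    # coefficients: W (P^T W)^{-1} q
    return b, x_mean, y_mean

def pls1_predict(X, b, x_mean, y_mean):
    """Predict the response for new rows of X."""
    return (X - x_mean) @ b + y_mean

# Synthetic, noiseless example (made-up data, not the fiber spectra).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 6))
true_b = np.array([0.5, -1.0, 0.0, 2.0, 0.0, 0.3])
y = X @ true_b + 1.5
b, x_mean, y_mean = pls1_nipals(X, y, n_components=6)
y_hat = pls1_predict(X, b, x_mean, y_mean)
```

With as many components as predictors and noiseless data, the PLS solution coincides with ordinary least squares, which is why the sketch recovers `true_b`; in practice far fewer components are retained.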

2.2. Uninformative Variable Elimination-Partial Least Squares (UVEPLS)

Uninformative variable elimination (UVE) is a classic method of variable selection based on analyzing the stability of the regression coefficients [24, 25]. UVE aims at eliminating variables that are no more informative for modeling than noise. A better PLS model can then be constructed from the variables that UVE retains. The combination of UVE and PLS is named UVEPLS. Taking the case of a single response as an example, the main steps of UVEPLS are summarized as follows:
(1) First, PLS regression is performed on the instrumental signal matrix ($\mathbf{X}$) and the reference values ($\mathbf{y}$) of the property of interest for the calibration/training set, and the optimal number of PLS factors is determined.
(2) Then, a noise matrix of suitable size is generated whose elements are random numbers in the interval 0–1. The elements are multiplied by a small constant so that their influence on the model is negligible. This noise matrix is appended to the original signal matrix to form an extended matrix.
(3) PLS models are constructed on the extended matrix using leave-one-out cross-validation. This leads to a matrix of regression coefficients with as many rows as samples and one column for each variable, both original and random.
(4) The reliability of each variable is quantitatively measured by its stability value, which is defined as the mean of the corresponding column divided by the standard deviation of that column in the matrix of regression coefficients.
(5) Since any variable with less stability than a random variable is uninformative and should be eliminated, a cutoff value is calculated as the maximum of the stability values among the random variables. Every original variable with a stability value lower than the cutoff is assumed to contain nothing but noise and is therefore eliminated.
(6) Based on the remaining variables, a final PLS model can be constructed and optimized.
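Steps (2) to (5) above can be sketched in a few lines. This is a hedged illustration rather than the implementation used in this work: the names `pls1_coefficients` and `uve_select`, the scaling constant `1e-10`, and the synthetic data are assumptions, NumPy is assumed to be available, and stability values are compared by absolute value, which the steps above do not state explicitly.

```python
import numpy as np

def pls1_coefficients(X, y, n_components):
    """Compact NIPALS PLS1 on mean-centered data; returns the coefficient vector."""
    E, f = X - X.mean(axis=0), y - y.mean()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f
        w /= np.linalg.norm(w)
        t = E @ w
        tt = t @ t
        p, qa = E.T @ t / tt, f @ t / tt
        E, f = E - np.outer(t, p), f - qa * t
        W.append(w)
        P.append(p)
        q.append(qa)
    W, P = np.array(W).T, np.array(P).T
    return W @ np.linalg.solve(P.T @ W, np.array(q))

def uve_select(X, y, n_components, n_random, scale=1e-10, seed=0):
    """Return (mask of retained original variables, stability values, cutoff)."""
    n, p = X.shape
    rng = np.random.default_rng(seed)
    # Step (2): uniform noise in [0, 1) scaled down so it barely affects the model.
    noise = rng.uniform(0.0, 1.0, size=(n, n_random)) * scale
    X_ext = np.hstack([X, noise])
    # Step (3): one coefficient vector per left-out sample.
    coefs = np.array([
        pls1_coefficients(np.delete(X_ext, i, axis=0), np.delete(y, i), n_components)
        for i in range(n)
    ])
    # Step (4): stability = column mean / column standard deviation.
    stability = coefs.mean(axis=0) / coefs.std(axis=0, ddof=1)
    # Step (5): cutoff = largest (absolute) stability among the random variables.
    cutoff = np.abs(stability[p:]).max()
    keep = np.abs(stability[:p]) > cutoff
    return keep, stability, cutoff

# Synthetic example: only the first two of five variables carry signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 5))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.05 * rng.normal(size=40)
keep, stability, cutoff = uve_select(X, y, n_components=2, n_random=20)
```

Step (6) then amounts to refitting PLS on `X[:, keep]`. Using the maximum over the random variables is the strictest cutoff; a quantile of the random stabilities is a common, more tolerant alternative.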

3. Experimental

3.1. Sample Design

This work used the simplex lattice design for preparing the four-component mixture samples. For a four-component system, the regular simplex is a tetrahedron in which each vertex represents a pure component, each edge represents a binary system, and each face represents a ternary one. Points inside the tetrahedron correspond to quaternary systems. The basis of designing experiments of this kind is a uniform scatter of experimental points on the so-called simplex lattice. The design points form a [q, n] lattice in a (q − 1)-dimensional simplex, where q is the number of components in a composition and n is the degree of the polynomial. The design was generated with MINITAB software. The degree of the lattice was set to 4 and 3 for generating the training and test sets, respectively. Each design was also augmented with the center point and the axial points. Thus, a total of 39 and 25 mixtures were obtained for the training and test sets, respectively. Table 1 shows the composition of each mixture from the simplex lattice design for both the training and test sets, in which A, B, C, and D denote wool, polyester, polyacrylonitrile, and nylon, respectively. For each mixture, all fibers were first weighed separately according to the given blend ratio and then mixed. Textiles made up of these components are produced industrially and are popular in the Chinese market, and the content of each component ranges from 0 to 1 (w/w, weight fraction).
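The sample counts can be checked by enumerating the lattice directly. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the common convention that each axial point lies midway between the overall centroid and a vertex, which is not stated in the text. Under that assumption the augmented designs of degree 4 and 3 contain 39 and 25 distinct blends, matching the numbers above.

```python
from fractions import Fraction
from itertools import product

def simplex_lattice(q, m):
    """All q-component blends whose proportions are multiples of 1/m and sum to 1."""
    return {
        tuple(Fraction(k, m) for k in counts)
        for counts in product(range(m + 1), repeat=q)
        if sum(counts) == m
    }

def augmented_design(q, m):
    """Lattice points plus the overall centroid and one axial point per component."""
    points = simplex_lattice(q, m)
    points.add(tuple(Fraction(1, q) for _ in range(q)))  # center point
    for i in range(q):
        # Assumed axial point: midway between the centroid and vertex i.
        points.add(tuple(
            Fraction(q + 1, 2 * q) if j == i else Fraction(1, 2 * q)
            for j in range(q)
        ))
    return sorted(points, reverse=True)

training = augmented_design(q=4, m=4)   # degree-4 lattice
test = augmented_design(q=4, m=3)       # degree-3 lattice
print(len(training), len(test))         # prints: 39 25
```

Exact fractions are used so that the centroid (1/4, 1/4, 1/4, 1/4), which already belongs to the degree-4 lattice, is not counted twice.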

3.2. Instrument and Spectra

The instrument used in this study was an Antaris II Fourier transform near-infrared (FT-NIR) spectrometer equipped with an integrating sphere, an InGaAs detector, and a tungsten lamp as the light source. To collect a spectrum, the mixed sample was poured into a rotatable sample cup with a diameter of 50 mm, and the stacking height was kept above 10 mm to prevent light leakage. An internal gold reference was used for background collection. The rotating cup allows multipoint diffuse reflection measurements of the same sample, so the final spectrum is the mean of the spectra measured at different locations. All NIR spectra were recorded in the region of 4000–10000 cm−1 with 32 co-added scans. The resolution was set to 3.856 cm−1, and each spectrum consisted of 1557 data points. To account for the nonuniformity of solid samples, two spectra were recorded for each sample. Thus, a total of 78 spectra and 50 spectra were obtained for the training set and the test set, respectively. The experimental temperature and the relative humidity were controlled at 25°C and 60%, respectively. To remove the influence of light scattering and path-length variations on the spectra, the standard normal variate (SNV) transformation was first used to preprocess all original spectra. That is, each spectrum was corrected individually by centering it and scaling it by its standard deviation [26]. All calculations were performed with MATLAB 2015 for Windows.
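As a small illustration of the preprocessing step, the following sketch applies SNV to a single spectrum. The function name `snv` and the absorbance values are made up for the example, and the sample standard deviation (n − 1 in the denominator) is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev

def snv(spectrum):
    """Center one spectrum on its own mean and scale it by its own standard deviation."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 in the denominator)
    return [(a - m) / s for a in spectrum]

raw = [0.42, 0.45, 0.51, 0.60, 0.58, 0.49]   # made-up absorbance values
shifted = [2.0 * a + 0.3 for a in raw]        # same profile with scatter-like gain and offset
corrected = snv(raw)
corrected_shifted = snv(shifted)
```

Because each spectrum is normalized on its own, a multiplicative gain and an additive offset applied to the whole spectrum cancel out: `corrected` and `corrected_shifted` agree up to rounding.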

4. Results and Discussion

4.1. Preliminary Sample and Spectral Analysis

Given a dataset, partitioning the available samples into a representative training/calibration set and a test set is of great importance. The training set is used to construct a calibration model, while the test set is used to evaluate its performance. Theoretically, the evaluation is valid only when the test set has the same distribution as the training set. In this work, the simplex lattice design was used to generate two independent sample sets, one for training and the other for testing. As Table 1 shows, although most of the samples in the training set and the test set have different compositions, the two sets cover the same concentration range and have roughly the same distribution. This is also evident from Figure 1, which shows the concentrations of the four fibers in the designed experimental sample sets. The first 39 samples correspond to the training set, and the remaining 25 samples correspond to the test set. In Figure 1, different colors represent different components, and the length of the rectangular block of a given color in each column represents the content of that component. In the present sample design, both the training set and the test set contain pure materials as well as binary, ternary, and quaternary mixtures.

Each spectrum was composed of 1557 data points, and its profile depended on the sample components. That is, the weight percentage of each component in each sample was associated with the respective spectrum, so the NIR spectrum reflects the compositional variation of a sample. The NIR spectra of the pure fabrics have different profiles, although the differences are very small. Figure 2 shows the NIR spectra of the training and test sets. The signal bands in the NIR spectrum are generally severely overlapped and difficult to resolve for several reasons. For example, although the typical chemical bonds in wool include C–H, N–H, and C=O, which correspond to absorption peaks in specific regions of the spectrum, a bond vibration can also be affected by adjacent bonds, such as hydrogen bonds, and by overtone and combination vibrations. The typical absorption bands of animal hair can be observed. The shoulder at 7000 cm−1 is assigned to the first overtone of the O–H stretching vibration of water, while the band around 5200 cm−1 is assigned to a combination of the O–H stretching and H–O–H bending vibrations of the hydroxyl group from water. The doublet at 5800 cm−1 originates from an overtone of the C–H stretch of protein side chains and lipids. The region of 4000–5000 cm−1 gives information on the amino acid composition as well as the characteristic molecular conformation of animal keratin fibers. The bands and peaks around 4090 cm−1, 4430 cm−1, 4690 cm−1, and 6020 cm−1 are associated with the C–H combination bands and first overtones of polyester. Some specific bands of polyacrylonitrile and nylon can be analyzed similarly.

Besides the chemical composition of the samples, the NIR spectrum of a mixture can be affected by other factors such as light scattering, baseline shift, and path-length variations arising from heterogeneity. These effects seriously complicate the calibration task. To reduce them, as described above, all spectra were first preprocessed by SNV, followed by calculation of the first derivative. The preprocessed spectra were then ready for subsequent variable selection and calibration.

4.2. Calibration Model

Based on the training set, both the full-spectrum PLS and UVEPLS methods were used for multivariate calibration; the latter includes variable selection. For UVEPLS, the maximum number of components, the number of random variables, the number of cross-validation folds, and the cutoff level were set to 15, 300, 5, and 0.99, respectively, by trial and error. Figure 3 shows the variables selected by UVEPLS for quantifying the four fibers. In detail, the numbers of retained variables were 139, 145, 52, and 85, and the corresponding compression ratios were 91.1%, 90.7%, 96.7%, and 94.5%, for wool (A), polyester (B), polyacrylonitrile (C), and nylon (D), respectively. It is also clear from Figure 3 that different components correspond to different optimal subsets of variables. By removing the uninformative variables, the subsequent models become simpler and more robust since the ratio of variables to samples is significantly reduced.
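The compression ratios follow from the retained-variable counts if the ratio is read as the fraction of the 1557 spectral variables that UVE removed. That reading is an assumption (the text does not define the term), but it reproduces the reported percentages, as the short check below shows; the variable names are illustrative.

```python
n_total = 1557  # data points per spectrum
retained = {
    "wool (A)": 139,
    "polyester (B)": 145,
    "polyacrylonitrile (C)": 52,
    "nylon (D)": 85,
}
# Compression ratio in percent: share of variables eliminated by UVE.
compression = {name: round(100 * (1 - k / n_total), 1) for name, k in retained.items()}
for name, ratio in compression.items():
    print(f"{name}: {ratio}%")
```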

The key parameter in the PLS model is the number of components. Based on 5-fold cross-validation, a series of candidate PLS models with different numbers of components was constructed. Figure 4 shows the root mean squared error of cross-validation (RMSECV) versus the number of PLS components for the full-spectrum model and the local model (with the variables selected by UVE). As seen from Figure 4, in both cases RMSECV first decreases to a minimum and then gradually increases as the number of PLS components grows. The optimal numbers of PLS components were 9, 5, 5, and 6 for A, B, C, and D, respectively. The optimal values were also the same for the two kinds of models, i.e., the full-spectrum PLS model and the local PLS model (using only the selected variables). The optimal number of components is largest for A, which is reasonable since A denotes wool, which is more complex than the other fiber components. This also confirms from another perspective that UVE removes only the uninformative variables and has no obvious influence on the useful information.
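The selection of the number of components can be sketched as a loop over candidate models inside a k-fold split. The code below is a hedged illustration, not the procedure as implemented in this work: the function names, the random fold assignment, and the synthetic data are assumptions, and NumPy is assumed to be available.

```python
import numpy as np

def pls1_coefficients(X, y, n_components):
    """Compact NIPALS PLS1 on mean-centered data; returns the coefficient vector."""
    E, f = X - X.mean(axis=0), y - y.mean()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f
        w /= np.linalg.norm(w)
        t = E @ w
        tt = t @ t
        p, qa = E.T @ t / tt, f @ t / tt
        E, f = E - np.outer(t, p), f - qa * t
        W.append(w)
        P.append(p)
        q.append(qa)
    W, P = np.array(W).T, np.array(P).T
    return W @ np.linalg.solve(P.T @ W, np.array(q))

def rmsecv_curve(X, y, max_components, n_folds=5, seed=0):
    """RMSECV for 1..max_components PLS components using k-fold cross-validation."""
    n = X.shape[0]
    folds = np.array_split(np.random.default_rng(seed).permutation(n), n_folds)
    press = np.zeros(max_components)       # accumulated squared prediction errors
    for held_out in folds:
        train = np.setdiff1d(np.arange(n), held_out)
        x_mean, y_mean = X[train].mean(axis=0), y[train].mean()
        for a in range(1, max_components + 1):
            b = pls1_coefficients(X[train], y[train], a)
            y_hat = (X[held_out] - x_mean) @ b + y_mean
            press[a - 1] += np.sum((y[held_out] - y_hat) ** 2)
    return np.sqrt(press / n)

# Synthetic example (made-up data, not the fiber spectra).
rng = np.random.default_rng(2)
X = rng.normal(size=(40, 10))
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=40)
curve = rmsecv_curve(X, y, max_components=6)
best = int(np.argmin(curve)) + 1           # number of components with the lowest RMSECV
```

The means used for centering are recomputed from the training part of each fold, so the held-out samples play no role in fitting the model that predicts them.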

Based on the optimal number of PLS components, the final calibration models were constructed. Figures 5 and 6 show the predicted versus actual values of fiber concentrations for the full-spectrum PLS model and the local PLS model, respectively. In such plots, the points fall on the diagonal only if the model predicts the concentration perfectly. As shown in Figures 5 and 6, although the differences are small, Figure 6 exhibits a more compact distribution, especially for the A component in the top-left subplot. Furthermore, Table 2 compares the two kinds of PLS models based on three indices, i.e., the standard error of calibration (SEC) on the training set, the standard error of prediction (SEP) on the test set, and the ratio of the standard deviation of the response variable to SEP (RPD).

From Table 2, it is evident that the local PLS model achieves higher accuracy than the corresponding full-spectrum PLS model. As a rule of thumb, a calibration model can be considered very good when the RPD value is above 6.5 and excellent when it reaches 8. Thus, these local PLS models are satisfactory, especially the local PLS model for wool, which improves the RPD value from 4.8 to 8. UVEPLS combined with the NIR technique therefore appears feasible and promising for simultaneously determining several components of textile products.
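For readers who want to reproduce the indices, the sketch below computes a prediction error and the RPD and applies the rule of thumb quoted above. It assumes that SEP is taken as the root mean squared prediction error, which the text does not specify, and the function names, the label for values at or below 6.5, and the example values are illustrative only.

```python
from math import sqrt
from statistics import stdev

def rmse(actual, predicted):
    """Root mean squared error between reference and predicted values."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def rpd(actual, predicted):
    """Ratio of the standard deviation of the reference values to the prediction error."""
    return stdev(actual) / rmse(actual, predicted)

def rate(rpd_value):
    """Rule of thumb from the text: above 6.5 is very good, 8 or more is excellent."""
    if rpd_value >= 8:
        return "excellent"
    if rpd_value > 6.5:
        return "very good"
    return "below very good"

actual = [0.0, 0.25, 0.5, 0.75, 1.0]          # made-up weight fractions
predicted = [0.02, 0.23, 0.52, 0.73, 1.02]    # made-up predictions, each off by 0.02
sep = rmse(actual, predicted)
quality = rate(rpd(actual, predicted))
```

Applied to the mean RPD values reported in the abstract, `rate(7.6)` gives "very good" for full-spectrum PLS and `rate(8.3)` gives "excellent" for UVEPLS.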

5. Conclusions

This paper demonstrated an NIR spectroscopic method for simultaneously determining the fiber contents of four-component blends. The simplex lattice design was used to produce two independent sample sets for training and testing, respectively. The UVEPLS algorithm constructs local PLS models with satisfactory performance compared to the full-spectrum PLS algorithm. The procedure is simple, fast, and environment-friendly and has potential for quality control of textile products. The main challenge may be the heterogeneous nature of textiles, since it is difficult to blend whole fiber samples uniformly. Even so, the method is still valuable and can serve as an alternative to some wet chemical methods for similar tasks.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (21375118, J1310041), Scientific Research Foundation of Sichuan Provincial Education Department of China (17TD0048), Scientific Research Foundation of Yibin University (2017ZD05), Sichuan Science and Technology Program of China (2018JY0504), and Opening Fund of Key Lab of Process Analysis and Control of Sichuan Universities of China (2018005).