Abstract

The flowers of Elaeagnus angustifolia L. have been used as a homologous variety in China, and their quality depends strongly on their chemical composition during the flowering period. Unfortunately, variations of volatile compounds during the flowering season have rarely been reported. Herein, a gas chromatography-mass spectrometry-based untargeted metabolomic methodology is proposed for the comprehensive analysis of volatile compounds in E. angustifolia flowers to classify various flowering stages. Samples from four flowering stages were collected: the initial bloom stage, the pre-full bloom stage (70–80% of flowers), the full bloom stage, and the ending of the bloom stage. Simultaneous distillation extraction was used to extract volatile compounds from the flowers, and the resulting data were analyzed with a newly developed chemometric data analysis tool, autoGCMSDataAnal. An advantage of the developed methodology is that compounds can be accurately screened and identified. Finally, 59 compounds that showed significant differences among the four flowering stages were screened, and 31 of them were identified. Sample clustering results from principal component analysis and hierarchical clustering analysis suggested that flowers from the pre-full bloom and full bloom stages may be more suitable as raw materials for industrial products.

1. Introduction

As a homologous variety, the flowers of Elaeagnus angustifolia L. (oleaster, Russian olive, or wild olive) have been used in the western regions of China [1–4]. Traditionally, the flowers were used as a medicine [5] for chronic bronchitis and tetanus [6], asthma [7–9], arthritis, and cough [10, 11]. They can also be used as a food additive for flavoring some wines [12, 13]. E. angustifolia flowers are widely used as a tea beverage in the northwest of China [14]. It was reported that E. angustifolia flowers have an anti-saccharification effect and can be used as a food additive to inhibit undesirable saccharification reactions in food processing [15]. The essential oils of E. angustifolia flowers and leaves have been extracted and identified as preservatives for the food industry and as natural pesticides for agriculture [7].

In China, E. angustifolia has been widely planted in the northwest, including Ningxia, Qinghai, Gansu, and Xinjiang provinces [16]. The flowering season usually lasts about 20–30 days and mostly falls between May and June in Ningxia province. During this period, the flowers are collected for extracting essential oil [17]. However, the volatile compounds in the flowers change rapidly during the flowering season, which can seriously influence the quality of the product.

A number of works have been published on characterizing the compounds in the flowers of E. angustifolia. For instance, Torbati et al. [7] utilized gas chromatography-mass spectrometry (GC-MS) to study the compounds in the essential oil; about 53 compounds could be quantified, including ethyl cinnamate, hexahydrofarnesyl acetone, and palmitic acid, which together accounted for about 96.59% of the essential oil content. Chen et al. [18] and Han et al. [9] separated a number of new compounds from the flowers of E. angustifolia and characterized them by nuclear magnetic resonance (NMR); these included a macrocyclic flavonoid glycoside, a triterpenoid saponin, and lignan glycosides [18]. Most published works focused on the use of GC-MS for characterizing the compounds in the flowers or related products [19]. Very little work has been reported on how the volatile compounds vary during the flowering season.

In this work, a GC-MS-based untargeted metabolomic strategy [20] is proposed for compound identification and for characterizing content variations during the flowering season. The recently proposed automatic GC-MS data analysis software, autoGCMSDataAnal [21], is applied to flower samples of E. angustifolia for the first time. Samples from Ningxia province were collected as examples to demonstrate the developed strategy. Coeluted compounds in the GC-MS data can be reasonably separated with the aid of a chemometric mathematical separation algorithm and then accurately identified.

2. Experiment

2.1. Sample Collection

The flower samples were collected during the flowering season between 16 May and 2 June in Yinchuan, Ningxia province of China, with a sampling interval of 4 or 5 days. The flowering season was manually divided into four stages, i.e., the initial bloom stage (about 25% flowers, the perianth is not open), the pre-full bloom stage (70–80% flowers, the perianth opens), the full bloom stage (100% flowers, the perianth is completely open), and the ending of the bloom stage (100% flowers with some abscission, the perianth begins to wilt) [22]. A total of 34 samples from different trees in the same area were collected, including 8 samples from the initial bloom stage, 10 samples from the pre-full bloom stage, 8 samples from the full bloom stage, and 8 samples from the ending of the bloom stage. After collection, all the samples were treated in the lab for analysis.

2.2. Sample Pretreatment

A simultaneous distillation extraction (SDE) procedure was utilized for volatile compound extraction. For each sample, about 20 g of flowers was weighed into a 1 L round-bottom flask, and then 350 mL of pure water (Watsons, China) and 40 g of NaCl were added. CH2Cl2 (Thermo Fisher, USA) was selected as the extraction solvent, and about 40 mL was added into a 250 mL round-bottom flask. The SDE was performed for about 2 h. Then, about 10 mL of the extraction solvent was transferred into a 25 mL flat-bottom flask and 2 g of anhydrous sodium sulfate was added. Finally, 1 μL of the extract was injected for GC-MS analysis.

2.3. Preparation for Quality Control Sample

A quality control (QC) sample was prepared by equally mixing the extraction solutions of 34 samples. During the sample injection procedure, the QC sample was injected after every 6 samples.

2.4. Instrumental Analysis

Agilent GC-MS was used for data collection. A DB-WAXETR (30 m × 0.25 mm, 0.25 μm) column was used. He (99.999%) was used as carrier gas. The temperature of front injection was set as , with a split ratio of 5 : 1. The oven temperature was started with and maintained for 3 min, followed by an increment of to . A 15 min post-run was used under the temperature of . The solvent delay time was set as 4 min.

The parameters of mass spectrometry were optimized. The full scan mode was used with a scan range of 50–500 Da and a scan speed of 5 spectra/s. The EI temperature was , and the ionization energy was 70 eV.

2.5. Data Analysis

All collected GC-MS data were converted into the “mzdata.xml” file format and imported into the autoGCMSDataAnal platform for total ion current chromatogram (TIC) peak detection, peak deconvolution, time-shift correction, component registration, and statistical analysis, including analysis of variance (ANOVA), principal component analysis (PCA), and hierarchical clustering analysis (HCA). The resolved spectra of the screened compounds from autoGCMSDataAnal were finally used for compound identification against the National Institute of Standards and Technology (NIST) mass spectral library.

3. Results and Discussion

3.1. TIC Peak Detection and Deconvolution

The success of GC-MS data analysis depends heavily on the extraction of compound information. The performance of autoGCMSDataAnal on TIC peak detection was therefore investigated first. A typical example is shown in Figure 1, where about 107 TIC peaks were detected. A major peak eluting at 43.49 min can be clearly seen in the TIC signal, accounting for about 64.42% of the content in the sample. The enlarged plot in Figure 1 shows the peak detection results for some minor peaks in more detail. Evidently, the minor components can be successfully extracted by autoGCMSDataAnal.

In complex sample analysis, it is very common to find coeluted compounds whose TIC peaks overlap with each other [23]. In this case, compound identification based on the mass spectrum collected at the peak apex may give inaccurate results, so it is reasonable to perform peak deconvolution first. Analysts usually resort to AMDIS for TIC peak deconvolution. However, AMDIS can produce a number of false-positive compound resolution results. Herein, TIC peak deconvolution was performed with the multivariate curve resolution-alternating least squares algorithm (MCR-ALS) [24], which is implemented in autoGCMSDataAnal. An advantage of MCR-ALS is that its iterative strategy makes it insensitive to the initial parameters. autoGCMSDataAnal automatically provides the initial parameters for MCR-ALS, such as the number of components and the chromatographic profiles of the components under a TIC peak, which are then iteratively optimized to retrieve the underlying chromatographic and mass spectral profiles of the compounds.
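The alternating least-squares idea can be illustrated with a minimal sketch. It is not the autoGCMSDataAnal implementation: the function name `mcr_als`, the synthetic two-component peak, the use of clipping as a non-negativity constraint, and the choice of two selective m/z channels as the initial chromatographic estimates are all assumptions made for illustration.

```python
import numpy as np

def mcr_als(D, C0, n_iter=200, tol=1e-8):
    """Resolve D (scans x m/z) into C (scans x k) and S (k x m/z), D ~ C @ S.

    Non-negativity is imposed by clipping after each least-squares step,
    a simple stand-in for a proper non-negative least-squares solver.
    """
    C = np.clip(np.asarray(C0, dtype=float), 0.0, None)
    prev_err = np.inf
    for _ in range(n_iter):
        # Update the spectra with the chromatographic profiles held fixed.
        S = np.clip(np.linalg.lstsq(C, D, rcond=None)[0], 0.0, None)
        # Update the chromatographic profiles with the spectra held fixed.
        C = np.clip(np.linalg.lstsq(S.T, D.T, rcond=None)[0].T, 0.0, None)
        err = np.linalg.norm(D - C @ S) / np.linalg.norm(D)
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return C, S, err

# Synthetic overlapped peak: two Gaussian elution profiles, five m/z channels.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 60)
C_true = np.column_stack([np.exp(-((t - 0.45) / 0.08) ** 2),
                          np.exp(-((t - 0.55) / 0.08) ** 2)])
S_true = np.array([[1.0, 0.2, 0.0, 0.6, 0.0],
                   [0.0, 0.5, 1.0, 0.0, 0.3]])
D = C_true @ S_true + 0.001 * rng.standard_normal((60, 5))

# Initial estimates: EICs of two channels assumed to be selective (0 and 2).
C_hat, S_hat, rel_err = mcr_als(D, D[:, [0, 2]])
print(f"relative residual: {rel_err:.4f}")
```

In this sketch the rows of `S_hat` play the role of the resolved mass spectra that would be sent to a library search, and the columns of `C_hat` are the resolved elution profiles.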

An example of TIC peak deconvolution in autoGCMSDataAnal is shown in Figure 2. Figure 2(a) provides a TIC peak that was extracted by autoGCMSDataAnal. The extracted ion chromatograms (EICs) at various m/z values are shown in Figure 2(b). Figure 2(c) shows the smoothed EICs after baseline correction. autoGCMSDataAnal classified these EICs into two clusters to generate the initial chromatograms for MCR-ALS. Finally, two compounds were retrieved from the TIC peak, with the resolved chromatographic profiles shown in Figure 2(d).

The benefit of compound identification after TIC peak deconvolution is shown in Figures 2(e) and 2(f). The mass spectrum of the resolved compound eluting under the apex of peak 32# was matched with heptacosane by NIST with a match factor (MF) of 761. The other retrieved compound (33#) was identified as benzeneacetaldehyde by NIST with an MF of 915. It seems that both compounds can be matched with acceptable match factors. However, further investigation indicated that it was the resolved compound benzeneacetaldehyde that could be accurately identified. The results in Figure 2 indicate that compounds under detected TIC peaks can be successfully retrieved by autoGCMSDataAnal. Thus, in the following part of this work, autoGCMSDataAnal was used for TIC peak deconvolution and compound registration to screen compounds showing significant differences among the flowering stages.

3.2. Compound Screening and Statistical Analysis

With the aid of autoGCMSDataAnal, a registered compound list (266 × 34) was obtained, where 266 and 34 are the numbers of compounds and samples, respectively. Further analysis indicated that 59 compounds could be found in at least 80% of samples. Compound screening was performed by ANOVA with a p value <0.05, and compounds that could not be detected in at least 80% of samples were removed. Finally, 59 compounds were screened.
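The two screening criteria (detection in at least 80% of samples and a one-way ANOVA p value below 0.05) can be sketched as follows. The function name `screen_compounds`, the convention that a zero peak area means "not detected", and the toy data are assumptions; only the thresholds and the stage sizes (8, 10, 8, 8) come from the text.

```python
import numpy as np
from scipy.stats import f_oneway

def screen_compounds(X, groups, min_detect=0.8, alpha=0.05):
    """Return indices of compounds (rows of X) that pass both filters.

    X      : compounds x samples matrix of peak areas, 0 = not detected (assumed)
    groups : flowering-stage label of each sample (column of X)
    """
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    kept = []
    for i, row in enumerate(X):
        # Detection-rate filter: skip compounds found in too few samples.
        if np.mean(row > 0) < min_detect:
            continue
        # One-way ANOVA across the flowering stages.
        p = f_oneway(*[row[groups == g] for g in labels]).pvalue
        if p < alpha:
            kept.append(i)
    return kept

# Toy data: 3 compounds x 34 samples, stage sizes as in Section 2.1.
sizes = (8, 10, 8, 8)
groups = np.repeat([1, 2, 3, 4], sizes)
jitter = np.concatenate([np.arange(n) % 2 for n in sizes]).astype(float)
X = np.vstack([
    10.0 * groups + jitter,            # content changes with stage
    10.0 + jitter,                     # same distribution in every stage
    np.r_[np.full(10, 5.0), np.zeros(24)],  # detected in only 10 of 34 samples
])
kept = screen_compounds(X, groups)
print(kept)
```

Only the first toy compound survives: the second has identical group means, and the third fails the detection-rate filter before ANOVA is applied.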

PCA and HCA were used to analyze the grouping characteristics of the samples on the basis of the screened compounds. Figure 3(a) shows the sample distribution on the first two principal components, which explained approximately 78.6% of the information in the dataset. Samples from the first (initial bloom) and the fourth (ending of the bloom) stages of the flowering season can be clearly separated from the others, i.e., the pre-full bloom stage and the full bloom stage. Samples from the second (pre-full bloom) and the third (full bloom) flowering stages were located quite close to each other in the PCA plot (Figure 3(a)), with their 95% confidence ellipses partly overlapping.
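As an illustration of how the score plot and the explained-variance figure are obtained, the following sketch computes PCA by singular value decomposition. The autoscaling step, the function name `pca_scores`, and the random 34 × 59 matrix are assumptions; the preprocessing actually used by autoGCMSDataAnal is not stated in the text.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """X: samples x compounds. Returns (scores, explained-variance ratios)."""
    X = np.asarray(X, dtype=float)
    # Autoscale each compound to zero mean and unit variance (assumed choice;
    # columns with zero variance would need to be removed beforehand).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    ratio = s ** 2 / np.sum(s ** 2)
    return scores, ratio[:n_components]

# Synthetic stand-in for the 34 samples x 59 screened compounds.
rng = np.random.default_rng(0)
stage = np.repeat([0, 1, 2, 3], [8, 10, 8, 8])
X = rng.normal(loc=10.0, scale=1.0, size=(34, 59))
X[:, :20] += 3.0 * stage[:, None]   # first 20 compounds drift with stage
scores, ratio = pca_scores(X)
print(scores.shape, f"PC1+PC2 explain {100 * ratio.sum():.1f}% of the variance")
```

The sum of the first two ratios corresponds to the kind of figure quoted for Figure 3(a); plotting `scores[:, 0]` against `scores[:, 1]`, colored by stage, gives the score plot.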

Similar results were obtained from HCA (Figure 3(b)): samples from the second and the third flowering stages were clustered first, followed by the fourth and the first flowering stages. A likely reason is that the volatile compounds varied dramatically during the first and the fourth flowering stages. In fact, the sample clustering results were consistent with practical experience, as the flavor characteristics of the second and the third stages were similar. In the fourth flowering stage, the flavor content released by the flowers decreased gradually.

3.3. Compound Identification

Mass spectra of the 59 screened compounds were written to an MSP file by autoGCMSDataAnal and then imported into NIST for compound identification. The linear retention index (RI) was used to limit the number of candidates; in this work, the RI tolerance was set at 30. An example of compound identification is shown in Figure 4. Figure 4(a) provides the resolved chromatogram and mass spectrum of a screened compound whose content gradually decreased from the first to the fourth flowering stage. The chromatographic and mass spectral profiles of this compound resolved by autoGCMSDataAnal are depicted in Figure 4(b). Importing the resolved mass spectrum into NIST resulted in a number of candidate compounds. Figure 4(c) lists several acceptable candidates on the basis of MF. In this case, it is hard to decide on the matched compound from MF alone, especially between the first two compounds in Figure 4(c). With the aid of RI, the candidates can be narrowed down to only one, which is the first one in the candidate compound table (Figure 4(c)). Finally, the resolved compound was confirmed with a standard compound.
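The RI-based candidate filtering can be sketched as below. The linear retention index is computed by interpolation between the two n-alkanes that bracket the compound (the usual definition for temperature-programmed runs); the alkane retention times, candidate names, and library RI values are hypothetical, and treating the thresholds (MF 700, RI tolerance 30) as inclusive is an assumption.

```python
from bisect import bisect_right

def linear_retention_index(rt, alkanes):
    """alkanes: list of (carbon_number, retention_time), sorted by time."""
    times = [t for _, t in alkanes]
    i = bisect_right(times, rt) - 1
    if i < 0 or i >= len(alkanes) - 1:
        raise ValueError("retention time outside the alkane ladder")
    (n0, t0), (n1, t1) = alkanes[i], alkanes[i + 1]
    # Linear interpolation between the bracketing alkanes.
    return 100.0 * (n0 + (n1 - n0) * (rt - t0) / (t1 - t0))

def filter_candidates(candidates, ri_exp, min_mf=700, ri_tol=30):
    """Keep library hits with an acceptable MF and a library RI close to ri_exp."""
    return [c for c in candidates
            if c["mf"] >= min_mf and abs(c["ri"] - ri_exp) <= ri_tol]

# Hypothetical alkane ladder (carbon number, retention time in min).
alkanes = [(10, 5.0), (11, 7.0), (12, 9.5)]
ri_exp = linear_retention_index(8.0, alkanes)

# Hypothetical library hits for one resolved spectrum.
candidates = [
    {"name": "candidate A", "mf": 912, "ri": 1148},
    {"name": "candidate B", "mf": 905, "ri": 1302},
    {"name": "candidate C", "mf": 650, "ri": 1135},
]
hits = filter_candidates(candidates, ri_exp)
print(round(ri_exp, 1), [c["name"] for c in hits])
```

Candidates A and B have nearly identical match factors, which is the situation described for Figure 4(c); the RI constraint is what separates them in this sketch.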

Combining autoGCMSDataAnal with NIST showed that, among the 59 screened compounds, 31 could be matched by NIST with MFs above 700 and RI errors below 30. These comprised 11 esters (hexanoic acid ethyl ester, benzoic acid ethyl ester, benzeneacetic acid ethyl ester, benzenepropanoic acid ethyl ester, ethyl cinnamate, 2-propenoic acid-3-phenyl methyl ester, 2-propenoic acid-3-phenyl ethyl ester, 2-propenoic acid-3-phenyl-2-methylpropyl ester, hexadecanoic acid ethyl ester, octadecanoic acid ethyl ester, and 9-octadecenoic acid ethyl ester), 7 aldehydes (hexanal, heptanal, 2-hexenal, nonanal, furfural, benzaldehyde, and benzeneacetaldehyde), 4 alcohols (benzyl alcohol, phenylethyl alcohol, 3-phenyl-2-propen-1-ol, and phytol), 3 organic acids (2-methylbutanoic acid, heptanoic acid, and n-hexadecanoic acid), 2 phenols (2-methoxy-4-vinylphenol and trans-isoeugenol), and 4 unclassified compounds. Detailed information on the matched compounds, including retention time, RI, and MF, is shown in Table 1.

The content variations of the identified compounds during the flowering season are shown as a heatmap in Figure 5. Ten ester compounds showed higher content at the first stage (initial bloom): 9-octadecenoic acid ethyl ester, isobutyl cinnamate, benzoic acid ethyl ester, benzeneacetic acid ethyl ester, ethyl cinnamate, hexanoic acid ethyl ester, 2-propenoic acid-3-phenyl methyl ester, octadecanoic acid ethyl ester, 2-propenoic acid-3-phenyl ethyl ester, and hexadecanoic acid ethyl ester. By contrast, fifteen compounds showed higher content at the fourth flowering stage: benzyl alcohol, nonanal, heptanal, heptanoic acid, 2-methoxy-4-vinylphenol, phenylethyl alcohol, 2-hexenal, hexanal, 6,10,14-trimethyl-2-pentadecanone, furfural, benzenepropanoic acid ethyl ester, n-hexadecanoic acid, 2-methylbutanoic acid, benzeneacetaldehyde, and benzaldehyde. Notably, all of the identified aldehydes increased significantly at the fourth stage. A similar finding was reported by Hou et al. [25], who suggested that aldehydes gradually increase during the maturation stage.

The results indicated that most compounds changed slowly during the second and the third flowering stages, whereas the compounds in the first and the fourth stages were quite different from those in the other stages. These compound distribution characteristics are reflected in the sample clustering results, as samples from the second and third stages were clustered first (Figure 3). A relatively constant composition of E. angustifolia flowers is an evaluation criterion for industrial products. According to the results shown in this work, the quality of E. angustifolia flowers can be maintained during the second and the third flowering stages, so flowers from these stages may be more suitable as a raw material for industrial products.

4. Conclusion

This work proposed a strategy for investigating the volatile compounds in E. angustifolia flowers during the flowering season. Samples from four flowering stages, namely the initial bloom stage, the pre-full bloom stage, the full bloom stage, and the ending of the bloom stage, were collected for the study. Finally, 59 compounds were screened and 31 of them were identified. Both PCA and HCA indicated that samples from the second and third stages were closer to each other than to those from the remaining two stages. In conclusion, flowers from the second and third stages may be more suitable as raw materials for industrial products.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no potential conflicts of interest.

Acknowledgments

This study was supported by Henan University of Animal Husbandry and Economy (grant nos. 2018HNUAHEDF014, XKYCXJJ2020002, and 2019HNUAHEDF007).