Abstract

An analytical method for the metabolomics of 60 rice seed samples from two main rice origins in Heilongjiang Province was developed based on gas chromatography coupled with mass spectrometry (GC-MS). The differential metabolites specific to each origin were identified, and the two origins were distinguished by processing the GC-MS data with the XCMS package on the R platform, combined with multivariate statistical analysis software. A total of 173 peaks were detected, 54 of which were structurally identified, covering amino acids, aliphatic acids, sugars, polyols, and other compounds. Comparison of the Wuchang and Jiansanjiang origins revealed 9 characteristic metabolites in the Wuchang origin and 8 in the Jiansanjiang origin. Ten differential metabolites with significant changes (P ≤ 0.05, VIP ≥ 1) were screened. The results indicate that the metabolites of rice carry information about its origin and that the metabolite profiles of rice from the two origins differ. The proposed method is expected to be useful for metabolomic research on rice.

1. Introduction

Metabolomics is the science that takes the low-molecular-weight metabolites in biological samples (such as organic acids, fatty acids, amino acids, and sugars) as its research object and studies them through high-throughput detection, data processing, information integration, and biomarker identification [1]. Since the concept of metabolomics was put forward by Oliver [2] in 1997, it has been widely applied in various fields and has become a powerful means of exploring the inner mechanisms of matter [3]. With the growing emphasis on food safety, the identification and tracing of the origin of agricultural products have become a research focus in recent years, and mineral element fingerprint analysis [4] and isotope fingerprint analysis [5] are commonly used for this purpose.

From a biological point of view, metabolomics studies all the endogenous metabolites in a plant qualitatively and quantitatively and explains metabolite changes in terms of systems biology and biological life activity [6]; it is therefore expected to become an effective analytical platform for the identification of agricultural products. Nicholson et al. studied the metabolomics of Arabidopsis thaliana from different origins and found that differences in the growth environment led to differences in its amino acids and sugars [7]. Giansante et al. [8] analyzed the fatty acid composition and content of olive oil from four different origins in Italy by gas chromatography. The results showed that the contents of palmitic acid and linoleic acid in the olive oil differed significantly among origins. With its high resolution, sensitivity, and reproducibility, gas chromatography-mass spectrometry (GC-MS) has become one of the main analytical platforms in metabolomic research [9].

Rice (Oryza sativa L.) is an important food crop that is rich in nutrients and essential trace elements and suited to human needs [10], and it has become a hot topic in plant metabolomics research in recent years [11–16].

The environment of origin has an important influence on the growth of rice, so the variety and content of the metabolites in rice may carry information about its origin, and the metabolites of rice from different origins may differ markedly.

In this work, rice seeds from the two main rice origins in Heilongjiang Province, Wuchang farm (WC) and Jiansanjiang farm (JSJ), were comparatively analyzed by GC-MS-based metabolomics. The results provide a theoretical basis for the identification and discrimination of rice origins.

2. Materials and Methods

2.1. Plant Material

Rice plants (Oryza sativa L.) were collected from two geographical indication rice reserves, located in Jiansanjiang farm (JSJ) and Wuchang (WC) in Heilongjiang Province. Within the protected areas, the main varieties were randomly collected by the checkerboard sampling method according to the principle of representative sampling [17]. At each sampling point, rice panicles (Japonica rice, 1-2 kg) were collected from different directions. Thirty samples were selected from each origin.

2.2. Chemicals and Materials

Methanol and pyridine (≥99.0% purity) (chromatographic grade) were purchased from Aladdin Reagent Co., Ltd. (Shanghai, China). 2-Chlorophenylalanine (98.5% purity), methoxyamine hydrochloride (98.5% purity), and N,O-bis(trimethylsilyl)trifluoroacetamide (99%BSTFA + 1%TMCS) were purchased from Macklin Reagent Co., Ltd. (Shanghai, China). HPLC-grade water was obtained from a Milli-Q water purification system (Millipore Corp., USA) and used to prepare all aqueous solutions. All other reagents of analytical grade were purchased from Beijing Chemical Factory (Beijing, China).

2.3. Apparatus

A 7890A/5975C GC-MS system (Agilent J&W Scientific, USA) was used. Chromatographic separation of metabolites was performed on an HP-5ms column (30 m × 0.25 mm × 0.25 μm) (Agilent J&W Scientific). A Termovap sample concentrator (Automatic Science Instrument Co., Ltd., China), a DHG-9123A electric drying oven (Jinghong Laboratory Equipment Co., Ltd., Shanghai, China), an FC2K rice huller (Dazhu Production Co., Ltd., Japan), an incubator (Senxin Instrument Co., Ltd., Shanghai, China), and a TGL-16B high-speed centrifuge (Anting Instrument Co., Ltd., Shanghai, China) were also used in the experimental procedure.

2.4. Sample Preparation

The sample processing and chromatographic methods followed the literature [18, 19] with slight modifications.

Rice seeds were pulverized under liquid nitrogen to obtain powdered samples. Then, 50 mg of powdered sample, 800 µL of methanol, and 10 µL of internal standard (2-chlorophenylalanine) were mixed by vortexing for 30 s. Subsequently, the mixture was centrifuged at 12,000 rpm for 15 min at 4°C. After centrifugation, 200 µL of supernatant was transferred to a GC vial (1.5 mL autosampler vial) and dried under a stream of nitrogen. The dried residue was dissolved in 30 µL of 20 mg·mL−1 methoxyamine hydrochloride in pyridine and kept at 37°C for 90 min, followed by the addition of 30 µL of BSTFA and derivatization at 70°C for 60 min. All samples were analyzed within 24 h after derivatization.

2.5. GC-MS Analysis

A sample volume of 1 µL was injected with an autosampler. Gas chromatography was performed on a 30 m HP-5ms column with 0.25 mm inner diameter and 0.25 μm film thickness (Agilent J&W Scientific). The injection temperature was 280°C, the interface was set to 250°C, the ion source to 230°C, and the quadrupole to 150°C. Helium (>99.999% purity) was used as the carrier gas at a constant flow rate of 2 mL·min−1. The oven temperature was held at 80°C for 2 min, ramped at 10°C·min−1 to 320°C, and held at 320°C for 6 min. The system was then equilibrated at 80°C for 6 min prior to injection of the next sample. Full-scan mode was used with a scanning range of m/z 50–550.
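The oven program described above fixes the length of each run, which can be checked with a little arithmetic. The following is a minimal sketch using only the temperatures, ramp rate, and hold times stated in this section; the class and method names are illustrative and do not correspond to any instrument software.

```python
from dataclasses import dataclass


@dataclass
class OvenProgram:
    """Single-ramp GC oven program (names are illustrative only)."""
    initial_temp_c: float     # starting oven temperature
    initial_hold_min: float   # isothermal hold at the start
    ramp_c_per_min: float     # linear ramp rate
    final_temp_c: float       # temperature at the end of the ramp
    final_hold_min: float     # isothermal hold at the end

    def ramp_min(self) -> float:
        """Duration of the linear ramp in minutes."""
        return (self.final_temp_c - self.initial_temp_c) / self.ramp_c_per_min

    def run_min(self) -> float:
        """Total run time: initial hold + ramp + final hold."""
        return self.initial_hold_min + self.ramp_min() + self.final_hold_min

    def temp_at(self, t_min: float) -> float:
        """Oven temperature (deg C) at t_min minutes after injection."""
        if t_min <= self.initial_hold_min:
            return self.initial_temp_c
        ramped = self.initial_temp_c + (t_min - self.initial_hold_min) * self.ramp_c_per_min
        return min(self.final_temp_c, ramped)


# Values taken from the method description in this section.
program = OvenProgram(80.0, 2.0, 10.0, 320.0, 6.0)
EQUILIBRATION_MIN = 6.0

print("ramp time (min):", program.ramp_min())
print("run time (min):", program.run_min())
print("cycle time incl. equilibration (min):", program.run_min() + EQUILIBRATION_MIN)
```

Under these settings the ramp takes 24 min, so one acquisition lasts 32 min and one injection-to-injection cycle, including re-equilibration, lasts 38 min.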

2.6. Data Analysis

GC-MS data analysis was performed at Suzhou Bionovogene (Suzhou, China). The raw GC-MS data were preprocessed with the XCMS package on the R platform. The edited data matrix was then imported into SIMCA-P software (Umetrics AB, Umea, Sweden) for principal component analysis (PCA), partial least squares-discriminant analysis (PLS-DA), and other multivariate statistical analyses. Differences in the sample metabolome between the groups were examined through the PCA and PLS-DA score plots, and differential metabolites were screened according to the variable importance in the projection (VIP) and the significance (P). Most metabolites were identified by comparison with the standard spectral library of the National Institute of Standards and Technology (NIST) and the Wiley Registry metabolomic database. The metabolites were further qualitatively identified by comparing their alkane retention indices with the retention indices provided by the Golm Metabolome Database (GMD). In addition, most substances were further confirmed with reference standards.
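The text does not state which formula was used for the alkane retention index. For temperature-programmed GC, a common choice is the linear retention index of van den Dool and Kratz, which interpolates a peak's retention time between the two n-alkanes that bracket it. The sketch below assumes that formula; the function name and the alkane retention times are hypothetical and are not data from this study.

```python
def linear_retention_index(rt, alkane_rts):
    """Linear (van den Dool and Kratz) retention index.

    rt         -- retention time of the analyte (min)
    alkane_rts -- dict mapping n-alkane carbon number -> retention time (min)
    """
    # Sort the alkane ladder by retention time and find the bracketing pair.
    ladder = sorted(alkane_rts.items(), key=lambda item: item[1])
    for (n_lo, t_lo), (n_hi, t_hi) in zip(ladder, ladder[1:]):
        if t_lo <= rt <= t_hi:
            fraction = (rt - t_lo) / (t_hi - t_lo)
            return 100.0 * (n_lo + (n_hi - n_lo) * fraction)
    raise ValueError("retention time lies outside the alkane ladder")


# Hypothetical alkane ladder (carbon number -> retention time in min).
alkanes = {10: 6.0, 11: 7.5, 12: 9.0}

ri = linear_retention_index(8.25, alkanes)
print(f"retention index: {ri:.0f}")
```

An index computed this way can then be compared with a database value within a tolerance chosen by the analyst; the tolerance itself is not given in the text.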

3. Results and Discussion

3.1. The Result Analysis of GC-MS
3.1.1. GC-MS Total Ion Chromatogram

In all samples, 173 peaks were detected, and 54 metabolites (Table 1) were identified by analysis of the GC-MS raw data, including 11 sugars, 9 fatty acids, 12 polyols, 14 other derivatives, 14 organic acids, 2 amino acids, 1 phosphoric acid, and 1 nucleotide.

Comparison of the total ion chromatograms of the two groups of samples (Figure 1) showed that the chromatograms of rice from the different origins are similar but slightly different. As can be seen from Figure 1, the baseline is stable, indicating good instrument stability.

3.1.2. The Differential Metabolites of Rice in Two Origins

The metabolites common to the 30 rice samples of each origin were found by statistical analysis. A total of 26 metabolites were found in the WC origin and 25 in the JSJ origin. After removal of the metabolites shared by the two origins, nine characteristic metabolites were found in the WC origin: pentanoic acid, pyran glucose, stearic acid, eicosanoic acid, docosanoic acid, 1-monooctadecyl glycerol, trehalose, β-tocopherol, and β-sitosterol. Eight characteristic metabolites were found in the JSJ origin: benzoic acid, fumaric acid, xylose, xylitol, glucose, inositol, sorbitol, and raffinose (Table 1).
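The removal of shared metabolites described above is a set difference, and the reported counts can be cross-checked against each other: 26 minus 9 and 25 minus 8 both give 17 shared metabolites. The sketch below illustrates this with the characteristic metabolites named in the text; the 17 shared metabolites are not listed in this section, so placeholder names are used for them.

```python
# Characteristic metabolites as named in the text.
wc_characteristic = {
    "pentanoic acid", "pyran glucose", "stearic acid", "eicosanoic acid",
    "docosanoic acid", "1-monooctadecyl glycerol", "trehalose",
    "β-tocopherol", "β-sitosterol",
}
jsj_characteristic = {
    "benzoic acid", "fumaric acid", "xylose", "xylitol",
    "glucose", "inositol", "sorbitol", "raffinose",
}

# Placeholders only: the shared metabolites are not named in this section.
shared = {f"shared_metabolite_{i}" for i in range(1, 18)}

wc_all = wc_characteristic | shared     # metabolites found in the WC origin
jsj_all = jsj_characteristic | shared   # metabolites found in the JSJ origin

only_wc = wc_all - jsj_all              # characteristic of WC
only_jsj = jsj_all - wc_all             # characteristic of JSJ

print(len(wc_all), len(jsj_all), len(wc_all & jsj_all))
print(len(only_wc), len(only_jsj))
```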

3.2. The Differences Analysis of Rice Metabonomics in Two Origins
3.2.1. PCA Analysis

PCA [20] reduces the dimensionality of the data, eliminates overlapping information, and explains most of the information with a small number of factors, so as to distinguish similar variables and find differences. PCA (SIMCA-P) was performed on the data of the rice samples from the two groups to study the effect of different growth environments on rice metabolites. The model has three principal components, with cumulative R2X = 0.664 and Q2 = 0.572. Both R2X and Q2 are greater than 0.5, and the difference between them is less than 0.2. According to the criteria for these two indicators given in the literature [21], the PCA model has good fit and predictability, and there is no overfitting. As can be seen from Figure 2, except for 5 abnormal samples, the remaining 55 rice samples lie within the confidence interval. An abnormal sample is one whose data differ considerably from those of the other samples in the same group, so it appears far from them in the figure. There are two possible reasons for such anomalies: error in the sample pretreatment process, or a large individual difference between the sample and the others. In the PCA score plot, the samples from the JSJ origin are distributed on the left side of the confidence interval, and the samples from the WC origin are mostly distributed on the right side, which indicates that there are differences between the samples of the two origins. Some samples of the Jiansanjiang origin overlap, indicating high similarity between them, whereas other sample points lie far apart, indicating low similarity and large differences. Thus, differences exist both between samples from the two regions and between samples from the same place.
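The acceptance criterion applied to the PCA model above (both R2X and Q2 greater than 0.5 and a gap between them below 0.2) can be written as a simple check. This is only a sketch of the rule as stated in the text; the function name and default thresholds are illustrative and are not part of SIMCA-P.

```python
def pca_fit_acceptable(r2x, q2, floor=0.5, max_gap=0.2):
    """Return True if R2X and Q2 both exceed `floor` and differ by less than `max_gap`."""
    both_high = r2x > floor and q2 > floor
    close_enough = abs(r2x - q2) < max_gap
    return both_high and close_enough


# Values reported for the PCA model in this section.
print(pca_fit_acceptable(0.664, 0.572))
```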

Between the groups, the samples from the JSJ origin are distributed on the left side of the confidence interval, while the samples from the WC origin are mainly distributed on the right. This shows that there were clear differences between the two groups of samples, but some overlap between the groups remained. Therefore, unsupervised principal component analysis alone could not completely distinguish the rice samples from the different origins.

3.2.2. PLS-DA Analysis

In addition to reducing dimensionality, PLS-DA (Figure 3) incorporates a regression model and performs discriminant analysis on the regression results with a certain discriminant threshold, which helps to find intergroup differences and differential compounds more efficiently. The model has two principal components, with R2X = 0.715, R2Y = 0.87, and Q2 = 0.629, and the values of the two principal components are similar. When PLS-DA is used for data correlation or discriminant model analysis, a permutation test can be used to validate the model (Figure 4). In Figure 4, the abscissa indicates the permutation retention of the permutation test (the proportion of the Y variable that keeps the order of the original model; the points at a permutation retention of 1 are the R2Y and Q2 values of the original model), and the ordinate indicates the value of R2Y or Q2. The green dots indicate the R2Y values obtained by the permutation test, and the blue squares indicate the Q2 values. The two dashed lines represent the regression lines of R2Y and Q2, respectively. The R2Y of the original model is 0.87, which lies between 0.5 and 1 and is close to 1, indicating that the established model agrees well with the real situation of the sample data. The Q2 of the original model is 0.629, which is greater than 0.5, indicating that if a new sample is added to the model, an approximate distribution will be obtained. In general, the original model explains the difference between the two sets of samples well. The Q2 values of the permuted models are smaller than the Q2 value of the original model, and the intercept of the Q2 regression line on the vertical axis is less than zero. Moreover, as the permutation retention decreases, the proportion of the permuted Y variable increases, and the Q2 of the random model gradually decreases. This shows that the original model is robust and is not overfitted.
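The two permutation-test criteria described above (all permuted Q2 values below the original Q2, and a negative vertical-axis intercept of the Q2 regression line) can be illustrated with an ordinary least-squares fit. The sketch below is an assumption-laden illustration: the permuted (retention, Q2) points are hypothetical and are not data from this study, and only the original Q2 of 0.629 is taken from the text.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


ORIGINAL_Q2 = 0.629  # reported for the original PLS-DA model

# Hypothetical (permutation retention, Q2) pairs for permuted models.
permuted = [(0.1, -0.30), (0.2, -0.22), (0.3, -0.15), (0.5, 0.05), (0.7, 0.30)]

# The original model sits at a permutation retention of 1.
xs = [retention for retention, _ in permuted] + [1.0]
ys = [q2 for _, q2 in permuted] + [ORIGINAL_Q2]
slope, intercept = fit_line(xs, ys)

all_below_original = all(q2 < ORIGINAL_Q2 for _, q2 in permuted)
passes = all_below_original and intercept < 0

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, passes = {passes}")
```

A positive slope together with a negative intercept corresponds to the behavior described in the text: Q2 falls as more of the Y variable is permuted and drops below zero for fully random labels.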

In summary, the PLS-DA model has good predictability and is not overfitted. Compared with the PCA model (Figure 2), Q2 is increased, which indicates that the repeatability of the test is good and the accuracy of the model is high.

Figure 3 shows that the samples of the two groups are completely separated, with no overlapping samples. The samples from the JSJ origin are distributed on the left side of the confidence interval, while the samples from the WC origin are mainly distributed in the upper part of the confidence interval. Two abnormal samples fall outside the confidence interval. These results show that the origin (growth environment) has a great influence on rice metabolites.

3.2.3. Exploration and Identification of Differential Metabolites between the Two Groups

In this experiment, the variable importance in the projection (VIP, threshold > 1) of the PLS-DA model, together with the P value of Student’s t-test (threshold ≤ 0.05), was used to search for differentially expressed metabolites. The rice samples of the two origins were analyzed, and ten differential metabolites with significant changes were screened: 4 fatty acids, 3 other derivatives, 1 polyol, 1 sugar, and 1 nucleotide. These differential metabolites include both primary and secondary metabolites (Table 2). Eight differential metabolites were higher in rice from the WC origin than from the JSJ origin (upregulated, 1.51–4.24 times). The contents of glycerol and α-D-methylfructofuranoside were lower in rice from the WC origin than from the JSJ origin (downregulated, 0.52 and 0.06 times, respectively).
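The screening rule stated above (VIP > 1 and P ≤ 0.05) can be expressed as a simple filter, with the direction of change read from the WC/JSJ ratio of group means. The following is a minimal sketch under the assumption that the VIP and P values have already been computed by the statistical software; the metabolite names and all numeric values are hypothetical and are not taken from Table 2.

```python
from typing import NamedTuple


class Metabolite(NamedTuple):
    name: str
    vip: float       # variable importance in the projection (from PLS-DA)
    p_value: float   # from Student's t-test between the two origins
    mean_wc: float   # mean relative content in WC samples
    mean_jsj: float  # mean relative content in JSJ samples


def screen(metabolites, vip_min=1.0, p_max=0.05):
    """Keep metabolites with VIP > vip_min and P <= p_max; report WC/JSJ fold change."""
    selected = []
    for m in metabolites:
        if m.vip > vip_min and m.p_value <= p_max:
            fold = m.mean_wc / m.mean_jsj
            direction = "up" if fold > 1 else "down"
            selected.append((m.name, round(fold, 2), direction))
    return selected


# Hypothetical input values for illustration only.
candidates = [
    Metabolite("metabolite_A", 1.8, 0.003, 3.0, 2.0),  # passes both thresholds
    Metabolite("metabolite_B", 1.4, 0.200, 2.0, 1.0),  # fails the P threshold
    Metabolite("metabolite_C", 0.7, 0.001, 5.0, 1.0),  # fails the VIP threshold
    Metabolite("metabolite_D", 2.1, 0.010, 0.5, 1.0),  # passes, lower in WC
]

for row in screen(candidates):
    print(row)
```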

3.2.4. Hierarchical Cluster Analysis

The data set was scaled with the pheatmap package in R (v3.3.2), and bidirectional clustering analysis of samples and metabolites was conducted. Figure 5 is the hierarchical clustering diagram of the relative quantitative values of the metabolites in this experiment. As can be seen from Figure 5, the heat map is divided into red and green parts, indicating that the contents of the metabolites differ markedly. The clustering of the samples from the two origins is shown at the top of the graph; the clustering effect is very good, with the samples on the left mainly from the JSJ origin and the samples on the right from the WC origin. Thus, the samples from the two origins can be well distinguished by the ten selected differential metabolites.
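The text does not say how the data were scaled before clustering. A common choice with pheatmap is row-wise scaling (its `scale = "row"` option), in which each metabolite is centered on its mean and divided by its sample standard deviation across samples. The sketch below assumes that choice and reproduces the arithmetic in plain Python; the function name and the input values are hypothetical.

```python
from statistics import mean, stdev


def scale_rows(matrix):
    """Z-score each row: subtract the row mean, divide by the sample standard deviation.

    Rows are metabolites and columns are samples. A row with zero variance
    would cause a division by zero and should be removed beforehand.
    """
    scaled = []
    for row in matrix:
        row_mean = mean(row)
        row_sd = stdev(row)
        scaled.append([(value - row_mean) / row_sd for value in row])
    return scaled


# Hypothetical relative contents: 2 metabolites x 4 samples.
contents = [
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 10.0, 30.0, 30.0],
]

for row in scale_rows(contents):
    print([round(z, 3) for z in row])
```

After this scaling every metabolite has mean 0 and unit variance, so the heat map colors reflect relative differences between samples rather than absolute abundance.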

3.2.5. The Pathway Analysis of Differential Metabolites

The related metabolic pathway analysis was carried out on the differential metabolites of two rice origins by using the enrichment analysis of MetaboAnalyst and metabolic pathway retrieval of KEGG [22]. Five metabolic pathways were found, and the specific information is shown in Table 3.

The 10 differential metabolites of the two origins were searched against metabolic pathways, and five of them (glycerol, indole, thymidine, myristic acid, and 9-(Z)-octadecanoic acid) were found in target metabolic pathways. The matched pathways were fatty acid biosynthesis, glycerol metabolism, phenylalanine, tyrosine, and tryptophan biosynthesis, galactose metabolism, and pyrimidine metabolism. This indicates that the difference in origin has an obvious effect on various terminal metabolic pathways of rice.

4. Conclusions

In this experiment, the metabolomics of rice seeds from two main rice origins (JSJ and WC) in Heilongjiang Province was studied by GC-MS; 173 peaks were detected, and 54 metabolites were identified. Comparison of the two origins revealed 9 unique metabolites in the WC origin and 8 in the JSJ origin, and 10 differential metabolites between the JSJ and WC origins were selected. This shows that the metabolites of rice from different origins carry information about their origin and that metabolite differences can be used to distinguish rice origins. The proposed method is feasible for the separation and identification of metabolites in rice seeds. In agreement with the literature [19, 23], the origin affects rice metabolites, and the metabolites differ between origins. The analysis of metabolic pathways shows that the difference in origin has an obvious effect on various terminal metabolic pathways of rice. This research provides a reference for plant metabolomics, and it seems possible to extend the method to the separation and identification of metabolites in other similar samples by varying the experimental conditions.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Informed consent was obtained from all individual participants included in the study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

Changyuan Wang and Dongjie Zhang conceived and designed the study. Yuchao Feng and Tianxin Fu performed acquisition of data. Liyuan Zhang and Changyuan Wang analyzed the data. Changyuan Wang and Yuchao Feng drafted the manuscript. Dongjie Zhang revised the manuscript.

Acknowledgments

This study was funded by the Heilongjiang Province Postdoctoral Science Foundation (Grant no. LBH-Z15217), Science and Technology Research Project of Heilongjiang Agricultural Reclamation Administration (Grant no. HNK135-05-01), Special Funding of Postdoctoral in Heilongjiang Bayi Agricultural University, the Innovation Fund of Postgraduate in Heilongjiang Bayi Agricultural University, and Program for Young Scholars with Creative Talents in Heilongjiang Bayi Agricultural University (Grant no. CXRC2017011).