Abstract

Verifying the geographical origin of green and roasted coffee beans with a single chemical analysis method is important for protecting the legal interests of stakeholders. A total of 131 green coffee bean samples were collected from six regions of origin and prepared at three roasting degrees (green, middle roasted, and dark roasted). Five stable isotope ratios (δ13C, δ15N, δ18O, δ2H, and δ34S) and twelve elemental contents (Al, Cr, Ni, Zn, Ba, Cu, Na, Mn, Fe, Ca, K, and Mg) were measured in all green, middle roasted, and dark roasted samples (131 × 3). The fractionation of stable isotopes and the variation of elemental contents during roasting were evaluated: only hydrogen (δ2H) fractionated significantly, and elemental concentrations increased at a nearly constant rate. One-way analysis of variance (ANOVA) was used to compare the stable isotope ratios and elemental concentrations of the coffee bean samples from the six regions of origin. Random forest (RF) was used to build a discriminating model that verifies the origin of green and roasted coffee bean samples simultaneously; this model achieved 100% accuracy. The model had strong discriminating capability and was not affected by the fractionation of hydrogen (2H) or by the variation of elemental contents during roasting.

1. Introduction

Over the past decades, China’s economy has grown substantially, and the coffee industry has shown an overall upward trend. China’s coffee bean imports exceeded 100,000 tons for the first time in 2021, reaching 122,800 tons; most were imported from coffee-producing tropical countries, and imports of specialty coffee increased sharply [1]. In recent years, the Chinese government has carried out “Internet plus” projects; numerous trading companies and groups were founded, and many Internet celebrities appeared. These “Internet plus commercial companies” projects made coffee consumption convenient and efficient and helped coffee culture spread quickly, and more and more Chinese consumers have developed preferences for coffees that have unique flavor characteristics or are less abundant in the open market [2]. The flavor, taste, aroma, and style of coffee vary significantly with the geographical origin of the beans. Coffee beans from the most coveted regions of the world can command prices many times higher than the average global price, consumers have their own preferences for particular origins, and they are paying increasing attention to the quality and safety of coffee beans [3, 4]. For these reasons, mislabeling the geographical origin of coffee beans has become another area of fraud. However, the relevant Chinese government agencies (customs and the administration for market regulation) have few official technical methods for detecting mislabeled or fake geographical origins. At present, inspection protocols for imported coffee beans are based mainly on two national standards (GB/T19181-2018 and GB/T18007-2011) [5], neither of which includes a technical method for tracing the region of origin of coffee beans.
It is therefore highly valuable to develop a stable, inexpensive, and reliable method to identify the geographical origin of coffee beans and so protect the legitimate interests of the many stakeholders in the coffee industry, such as consumers, coffee farmers, traders, importers, and distributors. This is especially important for the relevant government agencies, which need reliable technical tools to ensure the sustainable and healthy development of the coffee industry. Consumers usually purchase roasted coffee beans, so they care most about the origin of roasted beans. Distributors and coffee farmers trade bags of green coffee beans, so they care most about the origin of green beans. Government agencies regulate the whole industry and therefore care about the origins of both green and roasted beans. A reliable technical tool that can simultaneously classify the region of origin of green and roasted coffee beans is thus needed to meet the needs of all stakeholders.

Up to now, various analytical techniques [6–11] have been used to authenticate the geographical origin of coffee beans, such as terahertz (THz) spectroscopy, near infrared (NIR) spectroscopy, nuclear magnetic resonance (NMR), inductively coupled plasma mass spectrometry (ICP-MS), gas chromatography time-of-flight mass spectrometry (GC-TOF-MS), and isotope ratio mass spectrometry (IRMS). However, considering stability, accuracy, dependability, and cost, stable isotope and trace element analyses are considered the most economical and effective means of identifying the region of origin of green coffee beans. Several studies [12–15] have also reported that stable isotope ratios and elemental contents are good indicators of region of origin and that, combined with multivariate analysis, they are powerful tools for tracing the origin of food resources and agricultural products, including coffee beans, tea, and wine.

Multielement and stable isotope profiles of coffee bean samples from four regions of Ethiopia were analyzed by ICP, XRF (X-ray fluorescence spectrometry), and IRMS and then coupled with linear discriminant analysis, giving 80–89% successful classification [16]. Rodrigues et al. [11] combined stable isotope ratios (δ13C, δ18O, δ15N, δ34S, and δ87Sr) and 30 elements of green coffee samples with multivariate statistical methods and verified five Hawaiian subregions (Hawaii, Kauai, Maui, Molokai, and Oahu) with 100% accuracy. Stable isotopes and elements are used to determine the geographical origin of coffee beans [17] because they do not degrade during storage [18] and can serve as indicators of the soil and ecological environment. However, coffee beans lose 14–20% of their mass during roasting, which involves water loss, a rise in elemental contents, and changes in carbohydrate, oil, and protein composition [19]. It is therefore uncertain whether stable isotopes fractionate during roasting and whether any such fractionation affects the discriminating capability of a mathematical model. Trace elements are not volatilized during roasting, so their contents should increase, which may obscure the information about the geographical origin of roasted coffee beans [4]. Previous studies mostly focused on determining the geographical origin of green beans, and the few studies that involved roasted coffee beans did not account for roasting-related changes in mineral concentrations and stable isotopes [4, 6, 8, 17]. For these reasons, the roasting-related changes in mineral contents and isotope ratios are evaluated in this work.

The objective of this study was to develop a stable, reliable, and inexpensive approach that can simultaneously verify the region of origin of green and roasted coffee beans. Such a tool could be used by consumers, traders, distributors, and other stakeholders to expose counterfeit coffee beans in the open market and to protect stakeholders dealing in coffee beans from coveted regions.

2. Materials and Methods

2.1. Samples Collection and Preparation

A total of 131 green coffee bean samples (C. arabica) were obtained from reliable importers and traders in six coffee-growing countries: Colombia, Nicaragua, Panama, Brazil, China, and Rwanda. All samples carry location information more specific than the country of origin; this information is provided in Table 1 and Figure 1. The green coffee beans from Table 1 were roasted in 1 kg batches to different degrees as follows. Green bean samples received no treatment. For middle roasted beans, the roaster was preheated to 200°C, 1 kg of green beans was added, and the roaster was slowly heated to 210°C over 10–11 min to reach an Agtron value of 75. For dark roasted beans, the roaster was preheated to 200°C, 1 kg of green beans was added, and the roaster was slowly heated to 223°C over 13–15 min to reach an Agtron value of 50.

Sample powder was prepared in a similar way to a previous study [4]. Each green and roasted coffee bean sample (100 g) was ground in a mill (AG204, Mettler Toledo, Switzerland) three times for 5 min each to obtain a homogeneous powder with a particle size of <1 mm. After grinding, samples were dried overnight at 45°C, and 5 mg of sample powder was weighed into tin capsules (4 × 6 mm), which were then folded closed (P6, Mettler Toledo, Switzerland).

2.2. Stable Isotope Ratios Analysis

The δ13C, δ15N, and δ34S values of coffee beans (green and roasted) were analyzed [20] by IRMS-Flash HT 2000 (Thermo Fisher, Bremen, Germany). Samples and reference materials were placed in an automatic sampler, combusted in a reactor at 1020°C, and converted to CO2, N2, and SO2, respectively. After dehydration with Mg(ClO4)2, the gases were separated on a column (70°C) and then entered the IRMS ion source. The carrier gas was helium at a constant flow of 180 mL/min, and the reference gas was CO2 at a constant flow of 50 mL/min. To check the measurement precision of the isotope ratios, one reference material was included in every test batch (8 samples). δ13C, δ15N, and δ34S data were calibrated against USGS-43 (in-house standard, δ13CV-PDB = −21.28‰, δ15NAir = 8.44‰, and δ34SV-CDT = 10.46‰).

The δ2H and δ18O values of coffee beans (green and roasted) were analyzed [20] by IRMS-Flash HT 2000 (Thermo Fisher, Bremen, Germany). For hydrogen isotope ratio (δ2H) analysis, samples encapsulated in silver capsules were placed in a desiccator together with the reference material for more than 72 h to equilibrate with the atmosphere. Samples and reference materials were placed in an automatic sampler, pyrolyzed to H2 and CO at 1300°C, and separated on a column at 85°C before entering the IRMS ion source. The carrier gas was helium at a constant flow of 200 mL/min, and the reference gas was CO2 at a constant flow of 110 mL/min. To check the measurement precision of the isotope ratios, one reference material was included in every test batch (5 samples); in addition, every sample was analyzed twice to prevent the memory effect when the hydrogen isotope ratio was measured. δ2H and δ18O data were calibrated against USGS-56 (in-house standard, δ2H = −44.0‰ and δ18O = 27.23‰).
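
The δ values reported in this section are not defined explicitly in the text. As a reminder, and assuming the conventional delta notation used in IRMS work (the symbols R_sample and R_standard are introduced here for illustration only), each value expresses the heavy-to-light isotope ratio of a sample relative to that of the reference standard, in per mil:

```latex
\delta X = \left( \frac{R_{\text{sample}}}{R_{\text{standard}}} - 1 \right) \times 1000,
\qquad
R \in \left\{
\frac{{}^{13}\mathrm{C}}{{}^{12}\mathrm{C}},\;
\frac{{}^{15}\mathrm{N}}{{}^{14}\mathrm{N}},\;
\frac{{}^{18}\mathrm{O}}{{}^{16}\mathrm{O}},\;
\frac{{}^{2}\mathrm{H}}{{}^{1}\mathrm{H}},\;
\frac{{}^{34}\mathrm{S}}{{}^{32}\mathrm{S}}
\right\}
```

Under this convention, a negative value such as the δ13C of −21.28‰ quoted for USGS-43 means that the material is depleted in the heavy isotope relative to the standard, and a positive value means that it is enriched.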

2.3. Elemental Contents Analysis

The elemental contents of the coffee bean samples were measured by a standard method [14], briefly described as follows: 1.0 g of coffee powder was placed in an acid-cleaned Teflon vessel and reacted sequentially with HNO3 until completely digested; the digest was dried, redissolved in 5% HNO3, and filtered through a 0.45 μm membrane. Ca, K, and Mg were determined by inductively coupled plasma atomic emission spectroscopy (Optima 8300, PerkinElmer, USA), and Al, Cr, Ni, Zn, Ba, Cu, Na, and Fe were determined by ICP-MS (NexION 1000, PerkinElmer, USA).

2.4. Statistical Methods and Data Analysis

The dataset comprised 131 × 3 coffee bean samples (131 samples from six producing regions, each at three roasting degrees: green, middle roasted, and dark roasted). R version 3.5.1 (https://www.r-project.org/) was used to analyze the data matrix of stable isotope ratios and mineral concentrations. One-way analysis of variance (ANOVA) was used to analyze the roasting-related changes in isotope ratios and elemental contents and to compare the isotope ratios and mineral concentrations of green, middle roasted, and dark roasted coffee bean samples among the six regions of origin at a 5% significance level. Random forest (RF) was used to evaluate the discriminating power of the isotope ratios and mineral concentrations and to build models that distinguish the geographical origin of green, middle roasted, dark roasted, and mixed coffee bean samples. RF is a powerful machine learning classifier [21, 22]; its key advantages include its nonparametric nature, high classification accuracy, and ability to determine variable importance. In recent years, RF has been very successful in modeling high-dimensional datasets for a range of purposes, such as product authentication in food analysis.
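
The one-way ANOVA comparison described above can be illustrated with a minimal sketch. The analysis in this study was run in R; the Python below is only an illustration of how the F statistic is formed from between-group and within-group variation. The function name `one_way_anova_f` and the three groups of δ2H values are hypothetical and are not data from this work.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical delta-2H values (per mil) for one region at three roasting
# degrees; invented numbers for illustration only.
green = [-62.0, -60.0, -58.0]
middle = [-61.0, -59.0, -57.0]
dark = [-58.0, -56.0, -54.0]

f_stat, df_b, df_w = one_way_anova_f([green, middle, dark])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The sketch stops at the F statistic. Deciding significance at the 5% level additionally requires the F distribution with the stated degrees of freedom, which a statistics package provides.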

3. Results and Discussion

3.1. The Fractionation Effect of Stable Multi-Isotopes from Green to Dark Roasted Coffee Beans

The five stable isotope ratios (δ15N, δ13C, δ34S, δ18O, and δ2H) of coffee bean samples at three roasting degrees (green, middle roasted, and dark roasted; 131 × 3 samples in total) from six regions of origin (Colombia, Nicaragua, Panama, Brazil, China, and Rwanda) were analyzed by one-way ANOVA. Only δ2H changed significantly; that is, only hydrogen isotopes fractionated significantly during roasting, whereas the other four isotope ratios showed no significant fractionation (Table 2).

Hydrogen (2H) has the largest range of natural stable isotope abundance variation, up to 250%, and isotope fractionation is strongly affected by temperature [23]. The fractionation of hydrogen (2H) may be related to the evaporation of water from the coffee samples during roasting, when the temperature changes dramatically (up to almost 200°C). Several authors [24–26] have also shown that hydrogen (2H) fractionates significantly during the processing of food products (tomato, noodles, and tea), which was likewise related to changes in temperature and water content.

For oxygen (18O), the relative mass difference between 18O and 16O is smaller, so the fractionation of oxygen during evaporation is far less than that of hydrogen [23]. Our work showed that oxygen (18O) did not fractionate significantly during roasting. Previous studies [24, 27] likewise found no obvious oxygen fractionation in beef and tea during roasting. These studies help explain why hydrogen fractionated significantly and oxygen did not during the roasting of the coffee bean samples.

Intense chemical reactions take place during the roasting of coffee beans, such as the Maillard reaction between amino acids and reducing sugars, which gives coffee beans their brown color, distinctive flavor, and aroma [19]. Nevertheless, there was no obvious carbon isotope fractionation during roasting. These reactions probably involve only bonds such as -C-OH and H–C- and not the breaking of carbon-carbon bonds (-C-C-) [23]. Bostic and Guo [27, 28] also reported no significant change in carbon (13C) in yeast buns, sweet cookies, and roasted beef, all of which undergo the Maillard reaction during roasting or baking. Chemical conversion, physical transport, the nitrogen cycle, and sulfate fixation may cause stable isotope fractionation, especially in mineralization and nitrification processes [23, 28]. Table 2 shows no significant enrichment of the heavy isotopes (15N and 34S) or depletion of the light isotopes (14N and 32S) during the roasting of coffee beans, which indicates that 15N and 34S are relatively stable during the roasting or baking of food materials. Several studies [24, 27, 29] also found no significant fractionation of 15N and 34S in roasted beef, coffee beans, and tea, which means that both nitrogen and sulfur isotopes are stable in the process.

In our work, no isotope except hydrogen showed a significant fractionation effect during roasting, which is consistent with previous studies. The δ18O and δ2H values of plants are related not only to the latitude of the growing region but also to local precipitation. The δ13C values depend on the plant species (C-3 or C-4 plants), the δ15N values are significantly affected by chemically synthesized fertilizers, and δ34S derives mainly from soil sulfate and atmospheric SO2; isotopes can therefore contribute effectively to determining the geographical origin of food resources. Several authors [30–32] have shown that stable isotope ratios are excellent indicators of the geographical origin of agricultural and food products, including grape wine, olive oil, onions, coffee beans, tea, and beef (fresh and roasted).

3.2. The Variation of the Multielements from Green to Dark Roasted Coffee Beans

Coffee beans are agricultural products; their elements come mainly from the soil of the coffee plantation and are linked to the local soil composition, so elements can be used to discriminate coffees by growing area [33]. Furthermore, elements are neither volatilized nor degraded, but their concentrations change with the mass loss of the beans during roasting. Numerous studies [6, 9, 11, 29] have used concentration ratios of elements to identify the geographical origin of coffee beans and demonstrated that many of the world’s coffee-producing regions can be distinguished from one another on the basis of element ratios. However, the variation of elemental contents may obscure the geographical information of coffee beans [19]. For this reason, we measured the contents of twelve minerals (Al, Cr, Ni, Zn, Ba, Cu, Na, Fe, Mn, Ca, K, and Mg) in all coffee bean samples from the six countries (131 × 3) by ICP-OES and ICP-MS and then divided the average element contents of green, middle roasted, and dark roasted beans by the average contents of green beans. Table 3 shows the resulting element ratios (CGb/CGb, CMb/CGb, and CDb/CGb, where C is content, Gb is green beans, Mb is middle roasted beans, and Db is dark roasted beans). First, the CGb/CGb ratios are 1 by definition. Second, the CMb/CGb ratio is 1.17, except for Al and Fe from Colombia and K from Brazil, which are slightly higher at 1.18. Third, the CDb/CGb ratio is 1.21, except for Al from Nicaragua (1.20) and Mn from China (1.58); the latter is much larger than the other ratios, probably because of some abnormal samples or measurement errors.
These ratios show that elemental contents were lowest in green beans and higher in roasted beans. Although trace element concentrations increased, the ratios between element concentrations stayed relatively constant, an effect also demonstrated by Van Cuong et al. [33]. Belitz et al. [19] reported that coffee beans lose 14–20% of their mass during roasting, which raises the mineral contents of roasted coffee beans. The increase in mineral contents is therefore nearly uniform across elements; whether it affects the accuracy of the origin classification model is examined in the following sections.
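
The link between mass loss and the concentration ratios in Table 3 can be checked with simple arithmetic. The sketch below assumes that an element is fully retained in the bean while the bean loses a fraction of its mass, so that the roasted-to-green concentration ratio is 1 / (1 − mass loss). The function names are hypothetical, and the assumption of full retention is ours rather than a measured result.

```python
def concentration_ratio(mass_loss_fraction):
    """Roasted/green concentration ratio if the element is fully retained."""
    return 1.0 / (1.0 - mass_loss_fraction)


def implied_mass_loss(ratio):
    """Mass-loss fraction that would produce a given roasted/green ratio."""
    return 1.0 - 1.0 / ratio


# Range of ratios expected from the 14-20% mass loss cited from [19].
low = concentration_ratio(0.14)
high = concentration_ratio(0.20)
print(f"expected ratio range: {low:.2f} to {high:.2f}")

# Mass loss implied by the typical ratios reported in Table 3.
for label, ratio in [("middle roasted", 1.17), ("dark roasted", 1.21)]:
    print(f"{label}: ratio {ratio} implies {implied_mass_loss(ratio):.1%} mass loss")
```

Under this assumption, the typical ratios of 1.17 and 1.21 correspond to mass losses of roughly 14.5% and 17.4%, both inside the cited 14–20% range, whereas the ratio of 1.58 for Mn from China would require a mass loss of about 37% and is therefore better explained by abnormal samples or measurement error.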

3.3. Determining Origins of Green, Middle, and Roasted Coffee Beans

In this study, 131 samples (C. arabica) were collected from six countries: Colombia (Valle del Cauca), Nicaragua (Matagalpa), Panama (Volcan-candela), Brazil (Sul de Minas), China (Baoshan city), and Rwanda (Nyamagabe). Because the coffee species may influence the stable isotope ratios and elemental contents, only C. arabica was used. Colombia [34] is the third largest coffee-producing country and supplies high quality beans. Valle del Cauca in particular, located in the Andes Mountains, has high altitude (1450–2000 m) and fertile soil, which provide ideal conditions for coffee trees; its coffee has pleasant acidity and elegant sweetness together with a complex floral flavor, good balance, a silky mouthfeel, and a long aftertaste, and it has many admirers worldwide. The coffee industry [35] is an important economic pillar of Nicaragua, where mainly traditional varieties (Caturra and Bourbon) are planted, yielding high quality coffee with diverse tastes; Matagalpa in particular produces mainly specialty coffee. All six regions are excellent coffee-growing areas that supply beans to consumers and buyers from different countries and attract the attention of coffee consumers worldwide [36]. The stable isotope ratios and mineral compositions of coffee beans are influenced not only by agronomic practices and climate but also by the altitude, soil, and water of the growing area [12, 29]. To build a reliable and stable discrimination model for the geographical origin of green coffee beans, one-way ANOVA and random forest (RF) were applied in this part. ANOVA is used to analyze the differences among groups of data [14]. RF is an ensemble learning method with high classification accuracy that has been widely applied to classify the region of origin of food products, and it requires a relatively large data matrix [21, 22].
The five isotope ratios and twelve elemental contents were combined and analyzed by one-way ANOVA. The results showed significant differences among the regions of origin at the 95% confidence level (Table 4), which means that these parameters are effective for identifying the geographical origin of green coffee bean samples. For the random forest analysis, the dataset was randomly divided into a training group (70%) and a test group (30%); the training group was used to build the discrimination model, and the test group was used to assess its accuracy. The results are shown in Table 5. All seven green coffee bean samples from Colombia in the test group were classified as Colombian, and the samples from the other five regions were likewise all assigned to their own regions. The model thus had 100% accuracy and strong capability to classify the geographical origin of coffee bean samples from different regions, and combining isotopes and elements with random forest is an effective approach to verifying the region of origin of coffee beans.
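
The evaluation procedure described above (a random 70/30 split followed by scoring of test-set predictions) can be sketched as follows. This is an illustrative outline in Python rather than the R workflow used in this study, and it does not implement random forest itself; the function names, the seed, and the example labels are hypothetical.

```python
import random
from collections import Counter


def split_train_test(samples, train_fraction=0.7, seed=0):
    """Randomly split samples into a training list and a test list."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def confusion_matrix(true_labels, predicted_labels):
    """Count (true region, predicted region) pairs."""
    return Counter(zip(true_labels, predicted_labels))


def accuracy(matrix):
    """Fraction of samples whose predicted region equals the true region."""
    total = sum(matrix.values())
    correct = sum(n for (true, pred), n in matrix.items() if true == pred)
    return correct / total


# 131 sample indices stand in for the measured samples.
train, test = split_train_test(range(131))
print(len(train), "training samples,", len(test), "test samples")

# Invented labels: one Colombian sample is wrongly assigned to Brazil.
true_regions = ["Colombia", "Colombia", "Brazil", "Rwanda"]
predicted = ["Colombia", "Brazil", "Brazil", "Rwanda"]
print("accuracy:", accuracy(confusion_matrix(true_regions, predicted)))
```

A model reported as 100% accurate corresponds to a confusion matrix whose off-diagonal counts are all zero. With test groups of this size, repeating the split with several seeds would show how stable that result is.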

The same procedure was applied to verify the producing region of middle roasted and dark roasted coffee bean samples, and the results matched those for green beans: both showed significant differences among regions of origin at the 95% confidence level (Tables 6 and 7), and both were classified by region with 100% accuracy (Tables 8 and 9). This indicates that the fractionation of hydrogen during roasting did not affect the stability of the discriminating model for samples of the corresponding roasting degree and that the variation of element contents did not obscure the discriminating capability of the model. In other words, these results provide sufficient support for establishing a single discriminating model of region of origin regardless of whether the samples are green or roasted coffee beans.

This work and previous studies [9, 11, 17, 29] demonstrate that stable isotope ratios and elements are excellent indicators of the geographical origin of coffee beans and other agricultural products; the origin signal they carry is associated with the growing area and was not obscured by the roasting process. Furthermore, with stable isotopes and elements as indicators, coupled with chemometrics, green and roasted coffee bean samples can be classified by geographical region at the same time.

3.4. Simultaneously Verifying Geographical Regions of Green and Roasted Coffee Bean Samples

All the data were then combined, and RF was run again to build a single model for discriminating geographical origin. Table 10 and Figure 2 show the results: the accuracy of this model was 100%, and all samples were correctly assigned to their own producing regions. To understand the strong predictive ability of this method, the contributions of the parameters in the four coffee bean models were analyzed (Table 11). The largest contributing group consists of six parameters (Na, δ2H, Al, Ni, Mn, and Cr), whose cumulative contribution exceeds 59%; δ2H contributed 9.67%, 9.31%, 8.65%, and 9.69% in the four models, respectively. The stability of the discriminating model was therefore not affected by the fractionation of hydrogen. The high accuracy of the method in this work was probably due to the large distances between the regions of origin, which lead to large differences in the growth environments of the sampled beans.

In this study, a mathematical model with 100% accuracy that can simultaneously identify the geographical origin of green, roasted, or mixed coffee beans was adopted for the first time; Figure 3 shows the procedure. Our work reveals that the fractionation of hydrogen and the variation of elements do not influence the predictive ability of the model, which means that combining stable isotopes and elements is a reliable way to trace the geographical origin of green and roasted coffee beans. The approach is based on isotope ratios and multielement contents and gives promising results for regions of origin with different types of soil and climate; however, it may fail to distinguish growing areas in adjacent countries with similar climatic environments. Verifying the geographical origin of coffee beans is challenging because the chemical profile of the beans is influenced not only by species and growing area but also by postharvest processing, storage conditions, and roasting procedures [3]. Using the phenolic and methylxanthine profiles of coffee beans, Alonso-Salces et al. [37] misclassified 30–60% of Cameroonian samples by region of origin, which showed that phenolic profiles are not reliable indicators: they are unstable and easily influenced by species, roasting, and other factors. The aroma and volatile compounds of coffee beans have been analyzed by PTR-ToF-MS (proton transfer reaction time-of-flight mass spectrometry), and significant differences were found [10, 38]; this is a rapid and direct technique, but it is much more expensive than the approach used here. Serra et al. [39] identified the region of origin of green coffee beans with 88% successful classification using carbon, nitrogen, and boron isotopes, and Anderson and Smith [40] classified the geographical origin of coffee beans using 11 elements with 70–85% accuracy; the accuracy of both methods needs to be improved. NMR has also been applied to trace the geographical origin of coffee bean samples [8, 41]; it is a powerful technique, but it requires a complicated extraction process and is expensive. Because of their cost and efficiency, stable isotope ratios (IRMS) and multielement analysis (ICP) have attracted growing interest from researchers tracing the geographical origin of agricultural products.

4. Conclusions

The five stable isotope ratios (δ15N, δ13C, δ34S, δ18O, and δ2H) and twelve elemental contents (Al, Cr, Ni, Zn, Ba, Cu, Na, Mn, Fe, Ca, K, and Mg) are excellent indicators of the growing area of coffee beans, and random forest is a powerful tool for predicting their geographical origin. In our work, the mathematical model showed strong ability to predict the geographical origin of green, roasted, or mixed coffee beans at the same time, with accuracy of up to 100%. The stability of the model was not changed by the fractionation of hydrogen isotopes or by the variation of element contents during roasting. However, the growing areas sampled here were large in scale, so the model has only been used to verify the producing area of coffee beans from different countries. Samples from smaller growing areas or subregions should be collected in further studies to classify local growing areas. More work is needed to protect the economic interests of stakeholders in the coffee industry.

Data Availability

The data used in this study are available from the corresponding author on request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Key Research Programme of SZPT (6020310006K).