Abstract

Coffee is an important commercial product that is subject to various quality issues. Different techniques have been applied to assess coffee quality. This review focuses on recent updates in coffee detection methods from a targeted versus nontargeted perspective. It introduces case studies of current research progress on targeted and nontargeted detection approaches and evaluates their merits and demerits for the analysis of coffee quality. The targeted approach, including liquid chromatography (LC), gas chromatography (GC), thin-layer chromatography (TLC), and capillary electrophoresis (CE), evaluates the quality of coffee through specific markers, whereas the nontargeted approach, usually coupled with chemometrics, tests whether a sample is abnormal without prior knowledge of what caused the abnormality. The nontargeted techniques commonly involve LC, GC, nuclear magnetic resonance (NMR), infrared spectroscopy (IR), ultraviolet-visible spectroscopy (UV-Vis), laser-induced breakdown spectroscopy (LIBS), and mass spectrometry (MS). This work may provide guidance for resolving most aspects of the quality problems in coffee, such as adulterant detection, species identification, and geographical origin discrimination.

1. Introduction

Coffee is one of the most consumed beverages around the world [1] and is the second largest natural trade product after petroleum [2, 3]. Besides its wide acceptance, a series of recent studies also indicated that coffee has an inhibitory effect on obesity, diabetes [4], Alzheimer’s disease [5], cardiovascular disease, inflammation, cancer, and other chronic diseases [6]. Due to their excellent bioactivities, some by-products of coffee have been utilized in cosmetics, pharmaceutics, and other sectors [7, 8] to replace some plant-derived extracts [9–11]. In particular, spent coffee grounds (SCG) show significant antioxidant activity and are a promising sustainable resource, and their valorization has been explored in the latest studies [12–14]. The International Coffee Organization (ICO) estimates that 10.3 million tons of coffee were consumed worldwide in 2020/2021, with Arabica (Coffea arabica) and Robusta (Coffea canephora) as the two main commercial varieties. Between September 2021 and September 2022, exports of Arabica totaled 4.8 million tons, whereas Robusta exports amounted to 2.9 million tons [15]. Coffee is cultivated in more than 80 countries, of which Brazil is the world’s leading producer and exporter, with a total production of about 4.1 million tons in the crop year 2020/2021, followed by Vietnam with 2.6 million tons [16]. In the Chinese market [17], coffee consumption rose more than four-fold in the 25 years between 1992 and 2017. For fresh coffee, China showed the strongest growth in consumption, with an average annual per capita growth of 17%, followed by the Republic of Korea (9%). The consumption of soluble or instant coffee increased at a high rate of 12% per year, second only to Vietnam (13%). The coffee industry therefore has huge development potential in many developing countries.

Due to the attractive commercial value of coffee, diverse quality issues arise in the coffee industry, which may harm customers’ economic interests or even threaten public health [18]. Figure 1 depicts quality issues in coffee. Detection of potential adulteration, identification of varieties, and discrimination of the origins of beans are all effective measures to ensure the quality of coffee. Among these issues, the most typical and widely addressed one may be economically motivated adulteration (EMA) [19]. There are three types of coffee on the market: green, roasted, and processed coffee [20], all of which are susceptible to adulteration with the by-products of coffee or other plant materials, such as coffee husks [21, 22], waste coffee grounds [23], brown sugar [24], acai [25], cheaper grains (wheat [26], rice [27], rye [28], barley [2, 29, 30], corn [26, 31–33], soybean [32]), and chicory [30, 34]. Besides, intentionally mislabeling lower-valued beans as their higher-valued counterparts and blending beans of different species or origins are other possible forms of fraud [35, 36]. In response to increasing public concern about fraudulent practices in coffee, consumers and actors in the food industry, including the related government agencies and food companies, demand methods for the characterization of coffee that are reliable, accurate, high-throughput, low-cost, and easy to operate with minimal human intervention [37].

Authenticating coffee according to its quality traits has been an important field for nearly two decades [37]. Typically, the detection workflow consists of sample pretreatment, such as liquid-liquid extraction (LLE), dispersive liquid-liquid microextraction (DLLME) [38], solid-phase extraction (SPE) [39], and solid-phase microextraction (SPME) [40], followed by an appropriate detection technique [41].

Targeted and nontargeted techniques are commonly applied for food quality detection. The targeted approach selects a target or marker compound of interest that threatens food safety or alters product quality and builds a clear extraction, purification, and analysis workflow around it for a specific case. It is reliable and effective when the compounds have been well studied or the metabolic pathways are well understood [42]. In contrast, the nontargeted approach treats the sample as a complete chemical entity and looks for overall changes in the proportions of its chemical constituents [43, 44]. Authentic or pure food is compared with its counterfeited or adulterated counterparts by fingerprinting methods combined with chemometric modeling [45, 46]. In short, both targeted and nontargeted approaches can be used for analyzing the quality of coffee from different aspects [42].

This review provides a brief summary of recent developments in the quality assurance of coffee. Interested readers may also refer to previous reviews [37, 41, 47–49] for a more comprehensive understanding of the detection techniques. Figure 2 shows a complete workflow for detection. Table 1 summarizes the advantages and limitations of different detection methods for coffee quality.

2. Targeted Approach: Analysis of Specific Markers by Chromatography and Other Techniques

Chromatography may be the most commonly used technique for detecting possible adulteration in coffee [47]. Various chromatographic techniques have been reported to accurately measure specific markers in coffee, including liquid chromatography (LC), gas chromatography (GC), thin-layer chromatography (TLC), and capillary electrophoresis (CE). Table 2 summarizes the key characteristics of liquid and gas chromatography in food analysis. The results can be used to detect possible adulteration of coffee and to distinguish coffee beans of different species, genotypes, cultivation modes, or degrees of roasting. Figure 3 summarizes the quantitative chromatographic analysis of specific marker compounds in coffee adulteration detection.

Beyond the analysis of specific metabolites, many other targeted techniques are applied in quality analysis. For example, the morphological, thermal, and immunological properties of coffee can be characterized by microscopy, digital image processing, array sensors, sequencing methods, and thermal and immunological detection techniques. Although not as widely applied as traditional chromatographic analysis, they offer attractive properties, such as low cost and ease of use. Table 3 summarizes all works related to targeted analytical approaches covered in this review.

2.1. Targeted High-Performance Liquid Chromatography (HPLC) for Characterization of Coffee Quality

Different components, whether major or minor, can be utilized as markers in evaluating coffee quality. Polysaccharides account for about 50% of the dry weight of green coffee beans, of which branched arabinogalactan, linear mannan (or galactomannans), and unsubstituted cellulose are the three main polysaccharides [41]. As one of the main components in coffee, carbohydrates vary between products. For instance, arabinogalactans constitute about 1/3 of the polysaccharides in mature coffee beans, with a lower content in green Arabica beans (14% w/w) than in Robusta beans (17% w/w). Mannan and cellulose account for around 22% (w/w) and 7% (w/w), respectively [89], in these two green coffee beans. These compositional differences make carbohydrates effective markers in coffee authenticity evaluation. Carbohydrates endow coffee with a unique aroma during roasting through the Maillard reaction. Consequently, changes in these compounds caused by intentionally added materials can be detected and characterized by chromatographic analysis. High-performance anion-exchange chromatography with pulsed amperometric detection (HPAEC-PAD) is frequently used for detecting carbohydrates. HPAEC-PAD exploits the weak acidity of carbohydrates to achieve selective separation at elevated pH on an anion-exchange stationary phase, without derivatization [90]. Girard et al. [67] measured the maximum concentrations of total glucose and xylose in pure soluble coffee by HPAEC-PAD, which were 2.32% and 0.42% (w/w), respectively. Adulteration of commercial soluble coffee can then be detected when glucose or xylose exceeds these levels. Garcia et al. [68] used HPAEC-PAD combined with chemometrics to identify possible adulteration of coffee with coffee husks and corn. The results showed that galactose and mannose concentrations were higher in pure coffee, with levels of 8.25% (w/w) and 9.65% (w/w), respectively. Pan et al. [69] established an HPAEC-PAD analysis for the detection of five major monosaccharides, including mannitol, arabinose, galactose, glucose, xylose, mannose, and fructose in roasted coffee beans, berry powder, black corn, and barley. The five monosaccharides were all present in coffee, while only glucose was detected in berry husks (41.57 ± 2.16%) and no mannose was found in black corn. Apart from HPAEC, high-performance liquid chromatography (HPLC) is also commonly used for carbohydrate analysis. Domingues et al. [70] applied HPLC-HPAEC-PAD and postcolumn derivatization HPLC-UV-Vis for carbohydrate analysis to detect the adulteration of Brazilian coffee with triticale and acai seeds. The results showed that galactose, glucose, and mannose were the main monosaccharides in Arabica coffee, triticale, and acai seeds, respectively. Principal component analysis (PCA) was performed for coffee samples with the same adulteration ratio; the first two principal components together explained 99.0% of the data variance obtained by the HPAEC-PAD method and 95.9% of that obtained by postcolumn derivatization HPLC-UV-Vis. In general, the resolution and sensitivity of postcolumn derivatization HPLC-UV-Vis were lower than those of HPLC-HPAEC-PAD. However, considering ease of operation and instrument availability, the former may be preferable for routine screening of adulterants, while the latter is superior for the development of quantitative and predictive models. Cai et al. [18] established a routine for detecting oligosaccharides by ultraperformance liquid chromatography (UPLC) coupled with high-resolution mass spectrometry (HRMS) and used it to detect rice and soybean adulterants in coffee. Oligosaccharides were extracted, purified, and then derivatized with 2,4-bis(diethylamino)-6-hydrazino-1,3,5-triazine before being injected into the UPLC-HRMS system for analysis. Partial least squares-discriminant analysis (PLS-DA) was used to identify 17 oligosaccharides that are present only in rice and soybeans. As little as 5% of rice or soybeans in coffee can be detected with this approach. To sum up, HPAEC-PAD is a promising technique for the detection of carbohydrates in coffee, but few studies have applied it to the detection of coffee adulteration in recent years, possibly because of limited instrument availability.
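The threshold idea behind the glucose and xylose markers can be sketched in a few lines. This is only an illustration, assuming that the maximum levels reported above for pure soluble coffee (2.32% and 0.42% w/w) are used directly as screening limits; the function name and the sample values are hypothetical and are not taken from the cited studies.

```python
# Upper limits for pure soluble coffee as reported in the text (% w/w).
MAX_TOTAL_GLUCOSE = 2.32
MAX_TOTAL_XYLOSE = 0.42


def screen_soluble_coffee(glucose_pct: float, xylose_pct: float) -> list[str]:
    """Return the markers that exceed the pure-coffee limits (empty list = no flag)."""
    flags = []
    if glucose_pct > MAX_TOTAL_GLUCOSE:
        flags.append("glucose")
    if xylose_pct > MAX_TOTAL_XYLOSE:
        flags.append("xylose")
    return flags


# Hypothetical measurements, for illustration only.
print(screen_soluble_coffee(1.80, 0.30))  # []
print(screen_soluble_coffee(5.10, 0.35))  # ['glucose']
```

A flagged sample would still need confirmation, since a simple limit check ignores measurement uncertainty and natural variation between products.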

Other active components may also serve as potential quality markers. Tocopherols and tocotrienols are vitamin E homologues naturally occurring in plants and their by-products [41], and they can protect coffee lipids from oxidation. Tocopherols exist in α, γ, or δ-forms in most plants, but coffee is uniquely rich in β-tocopherol and poor in α-tocopherol [41]. As a result, tocopherols have become another significant discriminator for coffee adulteration detection. Using HPLC, Jham et al. [71] determined that the percentages of α, β, γ, and δ-tocopherol in six coffee varieties were 29.0, 61.7, 3.3, and 6.0%, respectively, while 3.6, 91.3, and 5.1% in six corn varieties, respectively. These differences can then be used to detect coffee adulterated with corn. Moreover, γ-tocopherol was the best marker for distinguishing coffee from corn, with an R2 of 0.8848. Winkler et al. [72] analyzed 14 Brazilian roasted coffees mixed with 1–20% (w/w) roasted and ground corn. The tocopherol contents were analyzed by HPLC. A multiple linear regression (MLR) model of corn adulteration was established by stepwise linear regression analysis; the R2 values of the training and validation sets were 0.950 and 0.961, and the root mean squared errors (RMSE) of the model were 1.66 and 1.43, respectively. This work could detect corn adulteration in coffee at levels as low as 5%. Tavares et al. [73] mixed 5–50% (w/w) of roasted husks, cleaned roasted husks, and roasted maize into ground roasted Arabica coffee. Lipids were extracted and analyzed by normal-phase HPLC. Scott–Knott mean tests, linear regression analysis, PCA, soft independent modeling of class analogy (SIMCA), and linear discriminant analysis (LDA) were performed to evaluate the ratio of adulteration from the tocopherol amounts. The PCA and LDA models identified 5% adulterants in coffee and accurately predicted corn adulteration at levels of 10% or more and coffee by-product adulteration at levels of 20% or more.
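The figures of merit quoted for these regression models (R2 and RMSE) can be illustrated with a minimal sketch. It deliberately uses a single-predictor least-squares fit rather than the stepwise MLR of the cited work, and the calibration data are synthetic values invented for illustration; the function names `fit_line` and `r2_and_rmse` are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_and_rmse(x, y, slope, intercept):
    """Coefficient of determination and root mean squared error of the fit."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / len(y))


# Synthetic calibration set (NOT data from the cited studies):
# a tocopherol marker response (arbitrary units) versus corn content in % (w/w).
marker = [1.1, 2.9, 5.2, 6.8, 9.0]
corn_pct = [0.0, 5.0, 10.0, 15.0, 20.0]

# The marker is the predictor and corn content the response,
# so the RMSE is expressed in % (w/w) of corn.
slope, intercept = fit_line(marker, corn_pct)
r2, rmse = r2_and_rmse(marker, corn_pct, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R2={r2:.4f}, RMSE={rmse:.3f}")
```

In practice the same statistics would be computed separately on a training set and an independent validation set, as in the values quoted above.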

Phenolic acids are also important and ubiquitous markers for plants. They are functional compounds in fruits and vegetables [91]. The main phenolic acid in coffee is chlorogenic acid (CGA) [91]. CGAs, especially caffeoylquinic and feruloylquinic acids, contribute bitter, sour, and astringent flavors to brewed coffee [92]. Feruloylquinic and caffeoylquinic acids also have high antioxidant, antibacterial, antiviral, and chemopreventive capacity, which improves the nutritional value of brewed coffee [93]. Differences in phenolic compounds can serve as markers of roasting degree, species, and origin [94]. Gornas et al. [74] investigated the phenolic acid profiles of brewed coffee. A variety of factors were studied, including geographical source, species, caffeinated versus decaffeinated coffee, and degree of roasting. The phenolic acids were determined using UPLC. The results indicated that the total amount of phenolic acids in Robusta was higher than that in Arabica. Besides, raw coffee beans contained two to six times more phenolic acids than roasted coffee. Mehari et al. [75] analyzed 100 phenolic compounds by UPLC-MS in Arabica coffee beans from Ethiopia, and the subsequent PCA results showed that 3-caffeoylquinic acid, 3,4-dicaffeoylquinic acid, 3,5-dicaffeoylquinic acid, and 4,5-dicaffeoylquinic acid were the most discriminative phenolic compounds for the main regional and subregional raw coffee beans of Ethiopia. The recognition and prediction abilities of LDA were 91% and 90%, respectively, at the regional level and 89% and 86%, respectively, at the subregional level.

Homostachydrine is a potential marker compound for detecting Robusta coffee. It is a positively charged betaine found in alfalfa, citrus, and the genus Achillea [91]. It is considered an osmolyte in plants and plays an important role in the adaptation of plants to arid or high-salt environments [95]. Servillo et al. [50] measured homostachydrine in coffee by HPLC-ESI-MS, and the results showed that the content of homostachydrine was 1.5 ± 0.5 mg/kg in Arabica beans and 31.0 ± 10.0 mg/kg in Robusta beans. Therefore, the proportions of Robusta and Arabica in roasted coffee mixtures can be determined.
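How a single marker translates into a blend estimate can be sketched with a two-component linear mixing model. This is an assumption made for illustration, not the procedure of the cited study: it uses only the mean contents quoted above (1.5 and 31.0 mg/kg) and ignores their reported spread, so the result is a rough estimate. The function name and the sample reading are hypothetical.

```python
# Mean homostachydrine contents reported in the text (mg/kg).
ARABICA_MEAN = 1.5
ROBUSTA_MEAN = 31.0


def estimate_robusta_fraction(measured_mg_per_kg: float) -> float:
    """Estimate the Robusta mass fraction from a two-component linear mixing model."""
    fraction = (measured_mg_per_kg - ARABICA_MEAN) / (ROBUSTA_MEAN - ARABICA_MEAN)
    # Clamp to the physically meaningful range [0, 1].
    return min(1.0, max(0.0, fraction))


# Hypothetical blend reading of 16.25 mg/kg, halfway between the two means.
print(estimate_robusta_fraction(16.25))  # 0.5
```

Because the reported standard deviations are large relative to the Arabica mean, low Robusta fractions estimated this way carry a wide uncertainty.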

Some common adulterants can themselves be used directly as markers for adulteration detection. Phosphodiesterase 5 (PDE 5) inhibitors can be blended into the premix of instant coffee to enhance the sensory properties of coffee. Yusop et al. [76] used LC-HRMS to detect PDE 5 inhibitors in coffee. Within the expected sample concentration range, the accuracy was 88.1–119.3%, with a limit of detection (LOD) and limit of quantification (LOQ) of < 70 ng/mL and 80 ng/mL, respectively. In subsequent research [77], the chromatographic separation, MS conditions, and sample preparation were optimized, and the specificity, linearity, range, LOD, and LOQ were determined for 23 targeted PDE 5 inhibitors.

2.2. Targeted Gas Chromatography (GC) for Characterization of Coffee Quality

Gas chromatography is one of the classical techniques for the characterization of volatile substances [60]. Lipids in coffee are mainly composed of triglycerides (TAGs), sterols, and tocopherols. TAGs account for 75% of the total lipids, in which linoleic acid and palmitic acid are the main fatty acids [41]. The remaining unsaponifiable fraction consists of 19% total free and esterified diterpene alcohols, 5% total free and esterified sterols, and a small amount of tocopherols [41]. The unique fatty acid profiles of different coffee varieties have been used for quantifying the percentage of Robusta in coffee mixtures. Martin et al. [78] extracted lipids from green and roasted coffee beans and then analyzed the contents of fatty acids by GC. Ten fatty acids, including myristic, palmitic, palmitoleic, stearic, oleic, linoleic, linolenic, arachidic, eicosenoic, and behenic acids, were used to distinguish Arabica from Robusta beans. PCA and LDA completely differentiated Arabica from Robusta coffee, identifying oleic, linolenic, linoleic, and myristic acids as the most discriminating fatty acids. However, the fatty acid profile of coffee is affected by roasting conditions. Roasting at a high temperature can increase the content of trans fatty acids; thus, consistent roasting conditions should be ensured. Similarly, Alves et al. [79] analyzed the fatty acid profiles of 24 coffee samples and effectively distinguished different species of coffee such as Arabica and Robusta. Romano et al. [80] analyzed the fatty acid profile of mixed coffee by GC. After Soxhlet extraction, the lipids were derivatized to methyl esters, which were then analyzed by gas chromatography with flame ionization detection (GC-FID). Total monounsaturated fatty acids (ΣMUFA), linolenic acid (cis18:3n–3) concentration, the 18:0/cis18:1n–9 ratio, and the ΣMUFA/ΣSFA (total saturated fatty acids) ratio could be used to determine the proportion of Robusta in Arabica coffee. Hung et al. [51] obtained profiles of the eight most abundant fatty acids (palmitic, stearic, oleic, linoleic, linolenic, arachidic, eicosenoic, and behenic acids) by Soxhlet extraction followed by GC. PCA identified oleic, linoleic, and linolenic acids as the characteristic fatty acids of coffee beans, consistent with the model built by Martin et al. [78]. LDA, clustering analysis, and neural network analysis discriminated and classified Arabica and Robusta coffee well. The results showed that Robusta and Arabica beans were rich in oleic and linolenic acid, respectively. These studies highlight the value of applying multiple chemometric analyses to a targeted dataset.
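The fatty acid indices named above (ΣMUFA, linolenic acid, the 18:0/cis18:1n–9 ratio, and ΣMUFA/ΣSFA) are simple to derive once a methyl ester profile is available. The sketch below shows the arithmetic only; the profile values are hypothetical, the function name is an assumption, and no calibration against Robusta content from the cited studies is implied.

```python
# Fatty acid classes among the acids named in the text
# (saturated vs. monounsaturated is fixed by each acid's structure).
SFA = {"myristic", "palmitic", "stearic", "arachidic", "behenic"}
MUFA = {"palmitoleic", "oleic", "eicosenoic"}


def fatty_acid_indices(profile: dict[str, float]) -> dict[str, float]:
    """Compute summary indices from a profile given in % of total fatty acids."""
    sum_sfa = sum(value for name, value in profile.items() if name in SFA)
    sum_mufa = sum(value for name, value in profile.items() if name in MUFA)
    return {
        "sum_mufa": sum_mufa,
        "linolenic": profile.get("linolenic", 0.0),
        "stearic_to_oleic": profile["stearic"] / profile["oleic"],  # 18:0 / cis18:1n-9
        "mufa_to_sfa": sum_mufa / sum_sfa,
    }


# Hypothetical profile (% of total fatty acids), for illustration only.
example_profile = {
    "palmitic": 34.0, "stearic": 8.0, "oleic": 10.0, "linoleic": 42.0,
    "linolenic": 1.5, "arachidic": 3.0, "eicosenoic": 0.5, "behenic": 1.0,
}
indices = fatty_acid_indices(example_profile)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

Such indices would then be regressed against known blend ratios to estimate the proportion of Robusta in an unknown sample.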

Diterpenes are also compounds of interest for detecting coffee adulteration. Cafestol, kahweol, and 16-O-methylcafestol (16-OMC) are the three major diterpenes found in coffee lipids [41]. Arabica coffee contains a large amount of cafestol and kahweol, while Robusta coffee mainly contains cafestol. The cafestol content of Arabica and Robusta coffee is 5.2–11.8 g/kg and 1.2–4.2 g/kg, respectively [41]. Pacetti et al. [81] presented a method based on GC-FID, in which the ratio of the peak area of kahweol to the sum of the peak areas of kahweol and 16-OMC was determined and correlated with the content of Robusta coffee by a cubic polynomial function with an R2 of 0.998. Kresse et al. [82] established the relationship between the content of 16-OMC and the proportion of Robusta beans using LC-GC-FID.

2.3. Thin-Layer Chromatography (TLC) for Characterization of Coffee Quality

Sterols can be used as specific markers of coffee. Valdenebro et al. [83] isolated sterols by TLC and then determined the percentage of Arabica beans in adulterated coffee by principal component regression (PCR). In addition, the stereospecific structure of TAGs can also be used to distinguish Arabica and Robusta coffee. Cossignani et al. [52] separated TAGs by TLC and then obtained fatty acid methyl esters (FAMEs) by esterification, from which high-resolution gas chromatographic fingerprints were acquired. LDA based on these fingerprints effectively distinguished pure roasted coffee from coffee mixtures containing 10% Robusta coffee through the stereospecific analysis of TAGs.

2.4. Capillary Electrophoresis (CE) for Characterization of Coffee Quality

In addition to HPAEC, capillary electrophoresis (CE) can also be used for the analysis of carbohydrates as specific markers in coffee. Daniel et al. [32] reported a capillary electrophoresis-tandem mass spectrometry (CE-MS/MS) method, which could detect adulteration of coffee with corn or soybean by evaluating the monosaccharide profiles after hydrolysis. The detection was completed within 12 min, and the LOQ of all monosaccharides was less than 1.8 mg/100 g dry matter. In an earlier study on the detection of adulterants in coffee, Nogueira and do Lago [84] analyzed only xylose and glucose, obtained by acid hydrolysis, by CE. The LOQ of the two monosaccharides was 0.2 g/100 g dry matter. According to these works, CE can be used to identify adulterants in coffee and is especially suitable when coupled with mass spectrometry.

2.5. Other Targeted Techniques for Characterization of Coffee Quality

Besides the specific metabolites determined by chromatography, the morphological, thermal, and immunological properties of coffee can be characterized by targeted approaches such as microscopy, digital image processing, array sensors, sequencing methods, and thermal and immunological detection techniques. These techniques may offer some unique advantages over traditional chromatographic methods [41].

2.5.1. Microscopic Morphology Detection

The scanning electron microscope (SEM) forms images with an electron beam. Combining SEM-based histological techniques with digital analysis enables magnifications of more than 300,000 times, well beyond those of the optical microscope, thus enabling direct detection of morphological changes in processed foods. It can be used to detect barley, corn, and other grains mixed into coffee [28]. The SEM can also identify adulterated coffee by observing its particle size. For example, Liu et al. [25] examined coffee mixed with acai fruit, fried barley, and black corn powders with the electron microscope and successfully distinguished the adulterants through their morphological differences.

2.5.2. Digital Imaging Morphology Detection

Besides microscopic imaging, conventional digital imaging techniques can also provide information about coffee. The results depend on the processing and analysis techniques applied. Souto et al. [85] discriminated pure coffee from coffee adulterated with husks and sticks using digital images combined with the successive projection algorithm for variable selection in association with LDA (SPA-LDA). Compared with SIMCA and PLS-DA, SPA-LDA performed well, attaining a mean accuracy of 92.5% for both the training and test sets. Lopez et al. [58] trained a convolutional neural network (CNN) on camera images of coffee to detect the inclusion of chicory and barley, with a classification error of less than 1%. de Araujo et al. [96] established models with one-class partial least squares (OC-PLS) and data-driven soft independent modeling of class analogy (dd-SIMCA) for the nondestructive certification of high-quality coffee by NIR and digital images. The dd-SIMCA model using RGB histograms in chemometrics-assisted color histogram-based analytical systems (CACHAS) achieved 100% sensitivity and specificity for Brazilian gourmet, superior, and traditional ground roasted coffee.

2.5.3. Array Sensor Techniques

Array sensors are able to collect combined responses simultaneously from sensors that respond highly selectively to the presence of complex compounds, without complex separation [41]. Typical array sensors include the electronic nose and the electronic tongue, which can be applied to the identification of coffee adulterants by using multiple nonspecific, cross-reactive sensors that respond to changes in physical properties or surface reactions. Lopetchart et al. [97] successfully identified coffee samples from different origins with an electronic nose. The electronic nose and electronic tongue can also be used in combination or as auxiliary tools for classification methods other than sensory analysis [98, 99]. Suslick et al. [86] identified the complex aroma of coffee with a colorimetric sensor combined with PCA and hierarchical cluster analysis (HCA). Kim et al. [56] discriminated coffee of different roasting degrees by a colorimetric sensor array coupled with PCA and HCA. Rodrigues et al. [100] obtained electrochemical impedance spectra of water-based extracts of coffee samples by using a single unlabeled sensor. PLS-DA and SIMCA were used for supervised pattern recognition. The PLS-DA model correctly identified 100% of the adulterated and 96% of the unadulterated samples.

2.5.4. Sequencing Analysis

Sequencing is another powerful tool for the specific detection of adulterants in coffee. The most widely used sequence-based approach nowadays is to quantitatively detect small amounts of nucleotide sequences specific to each species by real-time fluorescence quantitative polymerase chain reaction (PCR). Ferreira et al. [87] detected barley adulteration in ground roasted coffee and soluble coffee by developing a real-time fluorescence quantitative PCR method that used specific nucleotide sequences. The DNA of barley, corn, and rice could be quantified at as low as 0.614 pg/mL and 16 pg/mL, respectively. This method has high sensitivity and specificity and is suitable for the detection of adulteration in roasted and soluble coffee. Combes et al. [57] amplified single nucleotide polymorphisms (SNPs) in Arabica and Robusta coffee beans using PCR. The SNP sequence of the targeted chloroplast genome was analyzed by the high-resolution melting (HRM) method. Mixing proportions as low as 1% (w/w) Robusta in Arabica coffee could be quantified, achieving effective identification of the two species of coffee. Couto et al. [101] mixed Arabica and Robusta coffee in different percentages and identified and quantified the percentage of Arabica in the mixtures through PCR.

2.5.5. Thermal Analysis

Coffees of different qualities, geographical origins, and degrees of roasting have different thermal properties. Brondi et al. [31] determined the incorporation of corn into coffee by differential scanning calorimetry (DSC). Combined with PCA and PLS models, coffee samples adulterated with corn at levels below 1% could be discriminated from unadulterated samples, and there was a good correlation between estimated and reference concentrations. Pereira et al. [59] analyzed corn, coffee husks, and coffee sticks mixed into coffee by derivative thermogravimetric analysis (DTG) coupled with partial least squares regression, and the prediction errors were 2.6%, 1.4%, and 7.7%, respectively.

2.5.6. Immunological Techniques

In addition, some immunological techniques can also be used to detect the adulteration of coffee. For instance, Trantakis et al. [102] developed a low-cost, single-use, rapid qualitative paper-based test for detecting Robusta in Arabica coffee. Suryoprabowo et al. [88] developed a rapid, simple, sensitive, and semiquantitative lateral flow immunoassay (LFIA) based on a gold- and fluorescence-labelled monoclonal antibody (mAb), achieving rapid determination of tadalafil (TDL) in coffee.

3. Nontargeted Fingerprinting Techniques Coupled with Chemometrics for Analyzing Coffee Quality

Nontargeted or untargeted approaches aim at drawing conclusions from the overall fingerprints, i.e., without skipping any minor components that might contain useful information. Fingerprinting, combined with chemometrics, provides compositional analysis in a nonselective way from entire chromatograms or spectra [103]. Thanks to their rapid development, nontargeted methods have become predominant in coffee fraud detection in recent years, as they are high-throughput, easy to operate, and information-rich. Nontargeted chromatographic, mass spectrometric, and spectroscopic fingerprinting techniques can be applied to coffee safety and quality assurance. Table 4 lists the papers referred to in this section.

3.1. Chromatography-Based Adulterant Identification with Chemometrics
3.1.1. Liquid Chromatography (LC) Applied for Nontargeted Detection

Nontargeted fingerprints obtained by liquid chromatography and analyzed with chemometrics can effectively identify the adulteration of coffee. Cheah and Fang [104] developed an untargeted analysis of coffee adulteration based on UPLC and chemometrics and built two models from the PCA data: one distinguishing pure coffee samples from adulterated coffee and one identifying the specific adulterants in adulterated coffee. The sensitivity (SE), specificity (SP), reliability rate (RLR), positive likelihood ratio (+LR), and negative likelihood ratio (−LR) of the first model were 0.875, 0.938, 0.813, 14.1, and 0.133, respectively; the SE, RLR, and LR of the second model were 0.333, 0.333, and 0.667, respectively. Nu et al. [34] used a high-performance liquid chromatography-ultraviolet-fluorescence detector (HPLC-UV-FLD) instrument to obtain HPLC-UV and HPLC-FLD fingerprints at the same time and used PLS-DA to classify instant coffee, instant decaffeinated coffee, and chicory powder. The results showed good classification, with R2 greater than 0.999 and calibration errors less than or equal to 0.8%. The prediction errors ranged from 2.9% to 3.2%. Nunez et al. [30] also used HPLC-UV-FLD to identify chicory, barley, and flour in coffee. The classification accuracy of the PLS-DA calibration and prediction models was 100%, the LOD of the adulteration level was as low as 15%, and the calibration and prediction errors were 1.4% and 2.4%, respectively. Moreover, when the same method was applied to coffee blends (Colombian coffee mixed with Ethiopian coffee and Vietnamese Arabica coffee mixed with Vietnamese Costa coffee), the classification accuracy of PLS-DA reached 100%, the R2 of the PLS calibration model was greater than or equal to 0.988, the calibration error was less than 3.4%, and the prediction error was 3.5%–7.5% [105].
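The performance figures quoted for the first model are related to one another, which the following sketch makes explicit. The formulas are assumptions based on commonly used definitions (reliability rate as SE + SP − 1, and the likelihood ratios derived from SE and SP); they are not stated in the text, but they reproduce the reported 0.813, 14.1, and 0.133 from the reported SE and SP to the quoted precision.

```python
def diagnostic_metrics(sensitivity: float, specificity: float) -> dict[str, float]:
    """Derive summary metrics of a two-class model from its SE and SP."""
    return {
        # Reliability rate: 1 - false positive rate - false negative rate.
        "RLR": sensitivity + specificity - 1.0,
        # Positive likelihood ratio: true positive rate / false positive rate.
        "+LR": sensitivity / (1.0 - specificity),
        # Negative likelihood ratio: false negative rate / true negative rate.
        "-LR": (1.0 - sensitivity) / specificity,
    }


# SE and SP reported for the first model in the text.
metrics = diagnostic_metrics(0.875, 0.938)
for name, value in metrics.items():
    print(f"{name} = {value:.3f}")
```

A +LR well above 1 and a −LR close to 0 indicate that a positive or negative result from the model is informative; the function is undefined for a specificity of exactly 0 or 1.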

3.1.2. Gas Chromatography (GC) Applied for Nontargeted Detection

Nontargeted fingerprints of volatile substances can be obtained directly by GC. Oliveira et al. [29] evaluated the feasibility of detecting roasted barley in coffee by GC-MS. In this method, solid-phase microextraction (SPME) was first performed on several ground coffee and barley samples, and the headspace volatiles were then analyzed using GC-MS. The adulterated samples were separated effectively with PCA. The authors found that the higher the degree of roasting of the adulterated samples, the easier it was to tell them apart. This method could detect as little as 1% (w/w) of roasted barley in dark roasted coffee. da Silva et al. [106] classified coffee from different geographical sources and genotypes using GC-Q/MS and multivariate analysis based on metabonomics. PCA and PLS-DA classified the samples with an accuracy of 100%. The score and loading plots showed that galactitol, fructose, inositol, and other substances were potential metabolic markers of coffee. Dong et al. [99] detected volatile compounds in roasted coffee beans by headspace-solid phase microextraction-gas chromatography-mass spectrometry (HS-SPME-GC-MS). PCA and cluster analysis (CA) could effectively distinguish coffee beans subjected to different drying treatments.

3.2. Spectroscopy-Based Adulterant Identification with Chemometrics

Visible, near-infrared (NIR), midinfrared (MIR), Raman (RS), ultraviolet (UV), and nuclear magnetic resonance (NMR) spectra contain unique fingerprinting information. Because the spectral data are complex, it is generally necessary to reduce their dimensionality with chemometrics.

Chemometric tools such as pattern recognition and outlier detection [44] can be applied to identify the quality of coffee. Unsupervised methods such as PCA, factor analysis (FA), and cluster analysis are often used for nontargeted monitoring. Moreover, chemometrics can support not only modeling but every aspect of the analysis workflow. For instance, before spectral data are modeled, the stability of the spectrometer and the spectral noise should be examined, and data pretreatment should be applied to avoid inaccurate predictions. In recent years, models based on spectral methods have been established to detect adulterants in ground coffee and to identify coffee of different types, degrees of roasting, and geographical origins. Although this process requires complex equipment and mathematical data analysis, detection is often rapid and simple once the parameters and methods are established.
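
As an illustration of how PCA reduces a set of spectra to a few scores, the following minimal sketch mean-centers a small data matrix and extracts the first principal component by power iteration. The absorbance values are hypothetical and chosen only to show the idea; they are not taken from any cited study. The function names are likewise illustrative, and in practice an established numerical library would normally be used instead of this hand-written routine.

```python
import math


def mean_center(rows):
    """Subtract the column means (the mean spectrum) from every sample."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) of PC1 using power iteration on X^T X."""
    X = mean_center(rows)
    p = len(X[0])
    # Cross-product matrix; it has the same eigenvectors as the covariance matrix.
    C = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    # Non-uniform start vector (assumed not orthogonal to PC1).
    v = [float(i + 1) for i in range(p)]
    for _ in range(iterations):
        w = [sum(C[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(a * b for a, b in zip(r, v)) for r in X]
    return v, scores


# Hypothetical absorbances at four wavelengths (illustrative values only).
pure = [
    [0.50, 0.80, 0.30, 0.10],
    [0.52, 0.79, 0.31, 0.11],
    [0.49, 0.81, 0.29, 0.09],
]
adulterated = [
    [0.50, 0.70, 0.30, 0.40],
    [0.51, 0.69, 0.31, 0.42],
    [0.49, 0.71, 0.29, 0.39],
]

loadings, scores = first_principal_component(pure + adulterated)
print("PC1 loadings:", [round(x, 3) for x in loadings])
print("PC1 scores:  ", [round(s, 3) for s in scores])
```

In this toy example the two groups fall on opposite sides of zero along the first component, which is the kind of separation usually inspected in a PCA score plot. The sign of a principal component is arbitrary, so only the separation is meaningful.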

3.2.1. Nuclear Magnetic Resonance (NMR)

NMR probes the nuclear spin states of carbon and hydrogen atoms. Based on H-NMR, the number of specific hydrogen nuclei and their chemical environment can be determined without derivatization or separation. In recent years, NMR combined with chemometrics has become increasingly popular in food quality analysis, for example, in the identification of beer [125], the determination of fat content in milk [126], and the detection of peanut oil adulteration [127]. It is also widely used to detect coffee adulteration. For example, Ribeiroa et al. [107] used H-NMR to detect the adulteration of roasted coffee with corn, coffee husks, barley, and soybean. The mixed samples containing the four adulterants were identified, and PCA showed good classification performance. Toci et al. [94] created fingerprints of roasted coffee by NMR coupled with a multivariate statistical program, through which coffee from different origins could be identified. Gunning et al. [108] used low-field NMR to judge the presence of 16-OMC in Arabica coffee by the occurrence of an abnormal marker peak (3.16 ppm). Adulterants at levels as low as 1–2% (w/w) in Arabica coffee can be identified by this method. Milani et al. [109] quantified the adulteration of coffee with barley, corn, coffee husk, soybean, rice, and wheat by H-NMR combined with chemometric analysis. The LOD for detecting adulterants in medium- and dark-roasted coffee was 0.31%–0.86%. PCA effectively identified adulterated coffee, and the SIMCA model classified all samples in both the training and prediction sets correctly. They indicated that NMR could be a faster and more accurate choice for improving coffee quality control. Happyana et al. [110] obtained NMR spectra of roasted Arabica coffee from 4 Indonesian regions. The PLS-DA and orthogonal projections to latent structures discriminant analysis (OPLS-DA) models clearly distinguished coffee from the different geographical origins and revealed the discriminant metabolites for each coffee. Burton et al. [27] studied 292 coffee samples by H-NMR. They quantitatively predicted the percentage of Robusta coffee in a mixture from 12 coffee components, with integration and spectral deconvolution supporting the quantitative analysis. Alves et al. [53] evaluated coffee roasted for different times and at different temperatures by NMR and discriminated the different coffee varieties. To conclude, NMR is highly precise and suitable for chemometric modeling. It has been widely studied in recent years and continues to deliver accurate fingerprints.

3.2.2. Infrared Spectroscopy (IR)

Infrared spectroscopy is traditionally an effective tool for obtaining “fingerprints” of complex organic samples. In fact, IR spectral analysis is one of the foremost applications of chemometrics. In addition, the method is usually nondestructive and requires only a small sample volume, making it a powerful tool for rapid analysis of the quality of coffee samples.

NIR has been applied to a range of coffee quality issues, including the determination of geographical origins, varieties, degrees of roasting, and adulterants [128]. Buratti et al. [98] analyzed the spectral and morphological data of raw coffee, ground coffee, and coffee beverage by NIR, electronic nose, and electronic tongue coupled with PCA. The results showed that the classification rate of PCA was 100% with NIR. Chen et al. [111] established a rapid identification method for coffee by combining NIR with an adulterant screen algorithm, which effectively identified coffee mixed with acai powder and barley. The lowest identifiable contents of acai powder and barley mixed with coffee were 2% and 5% (w/w), respectively. Giraudo et al. [112] identified 191 coffee samples from nine countries by Fourier transform-near infrared spectroscopy (FT-NIR) coupled with partial least squares regression (PLSR) and PLS-DA. The prediction accuracy of the model was higher than 98%. Chakravartula et al. [113] analyzed the adulteration of coffee with chicory, barley, and corn by FT-NIR and a convolutional neural network (CNN). The results showed that the CNN had excellent predictive performance (R2 higher than 0.98), better than the PLS model. Manuel et al. [54] conducted PCA and HCA for unsupervised identification using NIR data from 34 coffee samples. With the SIMCA algorithm, the accuracy of the model in the validation and test sets was 100% and 87%, respectively.

MIR coupled with chemometrics can also provide a fast and accurate technique to detect and predict adulteration based on the food matrix [129]. Fourier transform infrared spectroscopy (FTIR), which is widely used in the midinfrared region, can distinguish defective from nondefective coffee [3]. Sampling accessories for FTIR, including attenuated total reflection (ATR) for liquid samples and diffuse reflectance (DRIFT) for solid samples, further improve the convenience of operation. ATR collects information from the surface of the sample, while DRIFT provides information from the entire sample [21]. Compared to ATR-FTIR, the spectrum provided by DRIFT has a higher absorption intensity and shows a more efficient performance in distinguishing between different qualities of rough coffee, making it more suitable for analyzing roasted coffee. Briandet et al. [114] distinguished instant coffee adulterated with glucose, starch, and chicory using FTIR spectroscopy in ATR and DRIFT modes, combined with PCA and LDA. Reis et al. [115] quantitatively evaluated adulterants in roasted coffee by FTIR-DRIFT and chemometrics. Later, they [116] detected coffee husks, spent coffee grounds, barley, and corn mixed into coffee by FTIR-ATR and PLS. The correlation coefficients of both the calibration and validation sets reached 99%, with errors of 0.69% and 2%, respectively.

Some researchers have compared the performance of NIR, MIR, and NMR spectroscopy in coffee classification and identification. As early as 1997, Downey et al. [130] identified coffee varieties using NIR and mid-IR spectroscopy coupled with chemometrics, including PCA, factorial discriminant analysis (FDA), and PLS, and found that data from the combined spectral regions performed better. Bona et al. [131] successfully classified coffee from different geographical origins by combining near-infrared and midinfrared spectroscopy with support vector machines (SVMs). The results showed that both the sensitivity and accuracy of the method were 100%. By comparison, in some work, it was found that NIR yields better performance than MIR. In contrast, Medina et al. [132] compared the ability of several spectral techniques, namely attenuated total reflectance midinfrared (ATR-MIR), near-infrared, and H-NMR spectroscopy, to identify the geographical origin of coffee and found that NIR was inferior to the ATR-MIR and H-NMR techniques.

Raman spectroscopy (RS) can be used to identify coffee varieties. Luna et al. [117] classified coffee varieties by RS coupled with direct sample analysis and chemometrics. LDA, mixture discriminant analysis (MDA), quadratic discriminant analysis (QDA), regularized discriminant analysis (RDA), partial least squares-discriminant analysis with Bayesian inference (PLS-DA with Bayesian inference), and SIMCA were compared after pretreatment by mean-centering (MC) or multiplicative scattering correction (MSC). The results were promising: with MSC, 98.7% of the samples were correctly classified by LDA, while MDA, RDA, QDA, PLS-DA, and SIMCA all achieved 100% correct classification. With MC, the reported correct classification rates were considerably lower, at 62.7%, 70.7%, 62.7%, and 62.7%. MSC thus provided more accurate results than MC, further underlining the importance of spectral data preprocessing.
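
The difference between the MC and MSC pretreatments compared above can be illustrated with a short sketch. It assumes the common formulation of MSC, in which each spectrum is regressed against the mean spectrum and then corrected by the fitted offset and slope; the exact procedure used in [117] is not given here. The spectra are hypothetical: three copies of one base spectrum with different additive offsets and multiplicative factors, mimicking scatter effects.

```python
def mean_spectrum(spectra):
    """Average spectrum over all samples (used as the MSC reference)."""
    n = len(spectra)
    return [sum(col) / n for col in zip(*spectra)]


def mean_center(spectra):
    """MC: subtract the mean spectrum from every sample."""
    ref = mean_spectrum(spectra)
    return [[x - r for x, r in zip(s, ref)] for s in spectra]


def msc(spectra):
    """MSC: fit s = offset + slope * ref by least squares, then undo both."""
    ref = mean_spectrum(spectra)
    ref_mean = sum(ref) / len(ref)
    ref_var = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / len(s)
        slope = sum((r - ref_mean) * (x - s_mean) for r, x in zip(ref, s)) / ref_var
        offset = s_mean - slope * ref_mean
        corrected.append([(x - offset) / slope for x in s])
    return corrected


# Hypothetical base spectrum distorted by (offset, scale) pairs.
base = [0.2, 0.5, 0.9, 0.4, 0.1]
distortions = [(0.0, 1.0), (0.1, 1.2), (-0.05, 0.8)]
spectra = [[a + b * x for x in base] for a, b in distortions]

for row in msc(spectra):
    print("MSC:", [round(x, 4) for x in row])
for row in mean_center(spectra):
    print("MC: ", [round(x, 4) for x in row])
```

In this idealized case, MSC maps all three spectra onto the same curve because they differ only by an offset and a scale, whereas MC leaves the multiplicative differences in place. Real spectra also differ chemically, so the correction is never this complete.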

3.2.3. Ultraviolet-Visible Spectroscopy (UV-Vis)

UV-Vis can be used either hyphenated, as a chromatographic detector, or independently, as a spectral method for the quality assurance of coffee. UV-Vis usually has a lower cost and wider instrument availability than IR. Yulia and Suhandy [33] obtained UV-Vis spectral data in the range of 250–400 nm for coffee mixed with different proportions of corn. PLSR, multiple linear regression (MLR), and PCR were used to predict the adulteration ratio of corn in coffee, and PLSR yielded the best prediction. Suhandy and Yulia [118] quantified the percentage of adulteration in roasted coffee by UV-Vis combined with chemometrics. They established multivariate calibration models using PLSR for low (0–20%, w/w), medium (30–50%, w/w), and high (60–90%, w/w) levels of adulteration. The percentages of adulteration were accurately predicted with R2 = 0.977. Suhandy and Yulia [119] also obtained the spectra of coffee and roasted rice in the range of 200–400 nm with a UV-Vis spectrophotometer and established PLS models using three data pretreatment algorithms. Based on the t-test results, they selected the spectral data in the range of 250–390 nm, and the optimal PLS model was established in this range.
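
Calibration quality in the studies above is summarized by R2 and by calibration or prediction errors. The sketch below shows one common way to compute such figures from reference and predicted adulteration levels, taking R2 as the coefficient of determination and the error as the root mean squared error. This is an assumption, since the cited studies may define these quantities differently (for example, as a squared correlation coefficient). The data values are hypothetical and serve only as an illustration.

```python
import math


def r_squared(reference, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((y - p) ** 2 for y, p in zip(reference, predicted))
    ss_tot = sum((y - mean_ref) ** 2 for y in reference)
    return 1.0 - ss_res / ss_tot


def rmse(reference, predicted):
    """Root mean squared error, in the same units as the reference values."""
    n = len(reference)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(reference, predicted)) / n)


# Hypothetical adulteration levels in % (w/w); not data from any cited study.
reference = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [1.0, 9.0, 21.0, 29.0, 41.0, 49.0]

print(f"R2   = {r_squared(reference, predicted):.4f}")
print(f"RMSE = {rmse(reference, predicted):.2f} % (w/w)")
```

Applying the same two functions separately to the calibration set and to an independent prediction set gives the pair of errors that is typically reported for PLS models.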

3.2.4. Laser-Induced Breakdown Spectroscopy (LIBS)

LIBS is a rapid, low-cost, and residue-free spectroscopic technique. It has been reported to identify the geographical origin and variety of coffee, as well as to detect adulteration. The elemental spectrum is obtained by analyzing the microplasma induced by a laser pulse fired at the sample surface. Anggraeni et al. [120] used LIBS to distinguish Arabica and Robusta beans from different geographical sources. Discriminant analysis showed that Ca, W, Mg, Be, Na, and Sr were the most abundant elements in raw coffee beans. Silva et al. [121] distinguished defective beans in coffee by LIBS, which can be used as a quality assessment tool for coffee. Zhang et al. [122] effectively differentiated coffee varieties by combining the LIBS technique with chemometrics. The classification accuracy of the radial basis function neural network (RBFNN) and SVM models exceeded 80%. Sezer et al. [26] detected wheat, corn, and chickpea adulterants in coffee by LIBS. The PLS calibration and validation models had R2 > 0.989, and the LODs of the three adulterants were 0.45%, 0.52%, and 0.56%, respectively. Silva et al. [123] detected black, immature, and sour coffee beans in mixed coffee samples by LIBS and MLR, and the R2 of the fitted linear regression model was higher than 80%.

3.3. Mass Spectrometric (MS)-Based Adulterant Identification with Chemometrics

Electrospray ionization-mass spectrometry (ESI-MS) has been widely applied in the analysis and characterization of food. Combined with chemometrics, it can be applied to detect the adulteration of coffee. Aquino et al. [22] identified adulterated coffee containing coffee husks by ESI-MS coupled with PCA, which quickly and reliably detected coffee husks (10%, w/w) in adulterated roasted coffee samples. Correia et al. [124] compared negative-ion mode electrospray ionization Fourier transform-ion cyclotron resonance mass spectrometry (ESI-FT-ICR MS) with attenuated total reflection Fourier transform infrared spectroscopy (ATR-FTIR) in the midinfrared region. The PLS model demonstrated that the former had a lower LOD and LOQ. Monteiro et al. [55] compared proton-transfer reaction-mass spectrometry (PTR-MS) with NIR. The PLS-DA and LDA-k-nearest neighbors (LDA-kNN) models revealed the differences between coffee from different geographical origins, and the results obtained from the PTR-MS data were slightly better than those from NIR. Lotz et al. [133] designed a liquid extraction pen (LEP), which is used in conjunction with ESI-MS for the rapid certification of food products including coffee. Compared with the direct injection mass spectrometry applications of more than ten years ago [134], current applications of mass spectrometry to coffee adulteration focus more on developing simple and rapid methods and attach more importance to chemometric analysis, which can distinguish coffee samples with smaller differences.

4. Conclusion and Future Prospects

This review highlighted targeted and nontargeted detection approaches for ensuring the quality of coffee. The targeted approach is reliable and effective when the compounds have been well studied or their metabolism is well understood, and its results are accurate and sensitive. However, the targeted approach has some disadvantages. Foremost, despite the growing number of food safety incidents, food administration departments have often responded only after serious safety accidents have occurred. This highlights the fact that the targeted approach is insufficient to handle the ever-increasing list of analytes that must be included in a typical food safety assay, especially given the limited knowledge of the normal range of each constituent in ordinary lots of food. In nontargeted detection, no single chemical is used as a standard or reference. Therefore, it may be able to provide early warnings of unknown coffee frauds. Moreover, nontargeted detection is capable of large-scale screening for chemicals or biomarkers by comparing normal and abnormal samples. The candidate biomarkers can then be subjected to detailed targeted analyses. However, it requires complex data processing with various chemometric methods. In recent years, the nontargeted approach has attracted wider research interest in coffee detection.

Although the analysis of coffee quality can be considered a well-studied field after many years of research, there are still areas worth investigating further. For instance, the simultaneous presence of multiple types of adulterants remains a challenge. If this problem is resolved, regulators could use spectroscopy-based methods for the simple detection of products. Besides, given the different demands of manufacturers and consumers, the selection of detection methods varies with the aim. Combining multiple detection techniques to exploit their respective advantages may be an interesting trend. Looking forward, the nontargeted approach offers fast, easy, low-cost, and nonspecific fingerprinting identification that may become popular in the coffee industry. The improvement of relevant laws and the establishment of a traceability system can enable effective control of coffee quality, and the continuous development of big data can lay a foundation for the continuous development of the coffee industry. As for coffee products, the chemical composition of raw coffee, roasted coffee, instant coffee, ground coffee, ready-to-drink coffee, and capsule coffee is greatly different due to the disclosed production ingredients and methods, and the changes occurring during the processing of coffee products need further exploration.

Abbreviations

16-OMC:16-O-methylcafestol
AM:Average model
ANOVA:Analysis of variance
ATR:Attenuated total reflection
CE:Capillary electrophoresis
CNN:Convolutional neural network
CVA:Canonical variate analysis
DA:Discriminant analysis
DAD:Diode array detector
dd-SIMCA:Data-driven soft independent modeling of class analogy
DLLME:Dispersive liquid-liquid microextraction
DRIFT:Diffuse reflectance
DSC:Differential scanning calorimetry
DTG:Derivative thermogravimetric analysis
ECD:Electron capture detector
EM:Electron microscope
ESI-MS:Electrospray ionization-mass spectrometry
FA:Factor analysis
FAME:Fatty acid methyl ester
FDA:Factorial discriminant analysis
FID:Flame ionization detector
FLD:Fluorescence detector
FPD:Flame photometric detector
FT-ICR MS:Fourier transform-ion cyclotron resonance mass spectrometry
FTIR:Fourier transform infrared spectroscopy
GC:Gas chromatography
HA:Hierarchical analysis
HCA:Hierarchical cluster analysis
HPAEC-PAD:High-performance anion-exchange chromatography with pulsed amperometric detection
HPLC:High-performance liquid chromatography
HRMS:High-resolution mass spectrometry
ICO:International Coffee Organization
kNN:k-nearest neighbors
LC:Liquid chromatography
LDA:Linear discriminant analysis
LFIA:Lateral flow immunoassay
LIBS:Laser-induced breakdown spectroscopy
LLE:Liquid-liquid extraction
LOD:Limit of detection
LOQ:Limit of quantification
mAb:Monoclonal antibody
MC:Mean-centering
MDA:Mixture discriminant analysis
MIR:Mid-infrared spectroscopy
MLR:Multiple linear regression
MLRM:Multiple linear regression model
MS:Mass spectrometry
MSC:Multiplicative scattering correction
NIR:Near-infrared spectroscopy
NMR:Nuclear magnetic resonance
NPD:Nitrogen phosphorus detector
PCA:Principal component analysis
PCR:Principal component regression
PDE-5:Phosphodiesterase 5
PLS-DA:Partial least squares-discriminant analysis
PLSR:Partial least squares regression
PTR-MS:Proton-transfer reaction-mass spectrometry
QDA:Quadratic discriminant analysis
RDA:Regularized discriminant analysis
RID:Refractive index detector
RMSE:Root mean squared error
RS:Raman spectroscopy
RSD:Relative standard deviation
SCG:Spent coffee grounds
SEM:Scanning electron microscope
SIMCA:Soft independent modeling of class analogy
SPA-LDA:The successive projections algorithm for variable selection in association with linear discriminant analysis
SPE:Solid-phase extraction
SPME:Solid-phase microextraction
SVM:Support vector machines
TAGs:Triglycerides
TCD:Thermal conductivity detector
TDL:Tadalafil
TLC:Thin-layer chromatography
UPLC:Ultrahigh performance liquid chromatography
UV-Vis:Ultraviolet-visible spectroscopy.

Data Availability

All data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to acknowledge the financial support by the National Key Research and Development Program of China (grant no. 2018YFC1602400).