Abstract

Sensory analysis, or cup testing, is widely used in the coffee production chain to validate final quality. Tasters are responsible for defining the patterns and qualitative profiles of the drink based on sensory analysis and according to their gustatory sensibilities, which are often acquired through professional experience. However, the literature has not discussed in detail the relationship between the number of tasters and the consistency of sensory analysis. Thus, using the bootstrap simulation methodology for estimating optimum plot size, this study quantifies and proposes a specific number of tasters for the sensory analysis of specialty coffees. The results indicate that 6 tasters are sufficient to conduct sensory analysis following the SCA and BSCA protocols for coffees in the Arabica group, and likewise 6 tasters for Conilon coffees. Beyond this number, adding tasters brings no gain in the precision of the sensory analysis of coffee.

1. Introduction

In the coffee industry, the tasting procedure is used to negotiate the price based on the quality of the drink, which tasters describe using personal opinion and tasting experience accumulated over the years [1]. Although the tasting process is widely used, for Di Donfrancesco et al. [2] it is not the best method to evaluate coffee quality, owing to a range of factors that interfere with it.

According to Alvarado and Linnemann [3], the “taster” is a judge who performs the sensory evaluation. This agent is in charge of evaluating the quality of the coffee, and the evaluation is consequently influenced by his sensory perceptions.

The taster tends to prefer one sensory profile to the detriment of others, or to follow commercial and industrial standards in order to meet the demand of certain clients. This sensory classification of coffees is based on the taste of the cup. Tasting technique may vary and/or be modified depending on the location of the tasting [4]. The field of sensory evaluation of coffees grew rapidly in the second half of the twentieth century, along with the expansion of the processed food and consumer products industries. Sensory evaluation is now an irreplaceable tool in the food industry and is used in key sectors of specialty coffee production [5].

However, there is no consensus in the literature about the number of tasters to be used in the sensory analysis of coffee, or about the terminology for the profession. Traditionally, judges should be screened for sensory capabilities that meet the requirements of the assessments [6]. In companies that employ tasters, the team commonly consists of a quality control leader and assistants; however, in many producing regions (origins/rural areas) the purchasing offices use only one person in this role.

Tasters trained and experienced in the sensory analysis of coffee are known to be uncommon, because the method is not usually taught in colleges and technical schools in Brazil. For Dzung and Dzuan [5], one of the main problems in using experts in sensory evaluation is that the qualification of the tasters is not well defined. According to ISO 8586-2, experience is not the only criterion for a specialist; he must also be trained and have high sensory sensitivity.

Some methodologies, such as the Specialty Coffee Association (SCA) and Brazil Specialty Coffee Association (BSCA) protocols for Arabica coffee (Coffea arabica L.) and the Uganda Coffee Development Authority (UCDA) protocol for Conilon coffee (Coffea canephora Pierre ex Froehner), define procedures and provide protocols for the sensory evaluation of specialty coffees. These methodologies are commonly adopted in Brazil and the rest of the world for coffee quality contests and for scientific studies that apply sensory analysis.

The use of trained judges is an important part of the sensory evaluation tradition [6], but for Ross [7] sensory tests become very expensive when a large number of specialists is used, owing to the difficulty of accessing these professionals.

The number of tasters used in the sensory analysis of coffee may compromise the quality of the study. On the one hand, using too few tasters can reduce the accuracy of the analysis. On the other hand, using more tasters than necessary adds cost without benefit. In addition, an environment with many people can cause external noise, thus compromising the quality of the study, as already demonstrated by Pereira et al. [8].

Nevertheless, studies have not yet discussed how to define the optimal number of trained coffee tasters for sensory analysis. Aiming for greater consistency in the implementation of the SCA, BSCA, and UCDA sensory analysis protocols, this article used the bootstrap method [9], a statistical technique of resampling with replacement used in several academic fields [10], to estimate the variation as a function of the number of tasters.

Using the framework presented above, the study estimates the optimal number of Q-Graders and R-Graders to be used in sensory tests that adopt the SCA, BSCA, and UCDA methodologies. (The Q Coffee System identifies quality coffees and brings them to market through a credible and verifiable system; a common standard for both Q Arabica (specialty grade) and Q Robusta (Fine Robusta grade) coffee has resulted in a universally shared language and standard for top-scoring lots.)

2. Materials and Methods

2.1. Preparation of Samples

The studies with Q-Graders and R-Graders were conducted and elaborated in the Laboratory of Analysis and Research in Coffee, LAPC, of the Federal Institute of Espírito Santo in 2016. All the coffees (Arabica and Conilon) come from the 2015/2016 harvest.

The roasting was carried out with a Laboratto TGP2 roaster, and all the samples were roasted for between 8 and 10 minutes. To standardize the roasting process, the set of Agtron-SCA disks was used, and the degree of roast of the samples was kept between the colors of disks #65 and #55 for both Arabica and Conilon coffee. Roasting was performed 24 hours in advance, and after roasting and cooling the samples remained sealed. Grinding was performed after these 24 hours of rest, according to the sensory analysis methodology established by the SCA, in a BUNN G3 electric mill set to a particle size of 20 mesh (US standard).

For the evaluation of the Arabica coffees, the protocols of the SCA [11] and BSCA [12] were used. For the samples of Conilon coffee, the same procedure of roasting, resting, and grinding of Arabica coffee was established, but the UCDA protocol was used for sensory evaluation.

2.2. Method of Sampling

The quality of the Arabica coffee was evaluated using the SCA protocol and is expressed on a centesimal numerical scale. The tasting form provides an opportunity to evaluate eleven important attributes of coffee: fragrance/aroma, uniformity, clean cup, sweetness, flavor, acidity, body, aftertaste, balance, overall, and total score. Highly positive results arise from the perception of a balanced set formed by the evaluated attributes. The attributes of Arabica coffee are the same in the two protocols, SCA and BSCA.

The results of this sensory evaluation are established on a scale of 16 (sixteen) units representing quality levels, with intervals of 0.25 (one-quarter of a point) between numerical values from “6” to “10.” Coffees were considered good from 6.00 to 6.75, very good from 7.00 to 7.75, excellent from 8.00 to 8.75, and exceptional from 9.00 to 10.00 points. The same interval procedure applies to all three protocols. Theoretically, the scale ranges from a minimum of 0 to a maximum of 10 points. There are a few differences between the protocols: BSCA begins with 30 points, and UCDA scores sweetness differently from the SCA protocol. Nevertheless, the final classification is the same for all coffees: with 80 points or more, a coffee is considered specialty grade; below 80 points, it is considered below specialty quality.
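The scoring rules described above can be sketched in a few lines of Python. This is only an illustration: the function names `quality_level` and `is_specialty` are hypothetical, and treating the 80-point cut-off as inclusive is an assumption based on the samples in this study, whose scores start at exactly 80 points.

```python
def quality_level(score):
    """Map one attribute score (6.00 to 10.00, steps of 0.25) to its quality level."""
    # Reject values outside the range or off the quarter-point grid.
    if not 6.0 <= score <= 10.0 or score * 4 != int(score * 4):
        raise ValueError("score must lie between 6 and 10 in steps of 0.25")
    if score < 7.0:
        return "good"
    if score < 8.0:
        return "very good"
    if score < 9.0:
        return "excellent"
    return "exceptional"


def is_specialty(total_score, cutoff=80.0):
    """Classify a total score; the inclusive cut-off is an assumption."""
    return total_score >= cutoff


print(quality_level(7.25))  # very good
print(is_specialty(82.5))   # True
```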

For Conilon coffee, the quality was evaluated using the UCDA protocol, and the following attributes were assessed: total score, fragrance/aroma, flavor, balance, salinity/acidity, body, bitter/sweetness, and clean cup.

All the samples of Arabica and Conilon coffee adopted in the experiments presented a minimum score of 80 points, with scores ranging from 80 to 84 points in both groups.

2.3. Conduction of the Experiment with Q-Graders and R-Graders

The tests were performed in a sensory laboratory under actinic artificial light, at 23°C and with air circulation. The samples were arranged on two rectangular tables 110 cm in height, 70 cm in width, and 135 cm in length.

For the study, a blank experiment was conducted in which 10 tasters evaluated 20 samples of Arabica coffee with a minimum score of 80 points, considered specialty coffee according to the SCA, BSCA, and UCDA sensory protocols.

The cupping of the 20 samples of Conilon coffee was carried out separately from that of Arabica coffee, using the same 10 tasters who evaluated the Arabica samples. For Conilon coffee, a cut-off score of at least 80 points was used to select the 20 samples for cupping.

All the tasters who took part in the studies hold certifications, either Q-Grader, R-Grader, or COB, and have experience in performing sensory analyses with the SCA, BSCA, and UCDA protocols.

2.4. Ideal Number of Tasters

To pair each number of tasters with its respective coefficient of variation, we used the bootstrap method, in which 1000 sample simulations were performed for panels of 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10 tasters [13].
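The resampling step can be illustrated with a minimal Python sketch (the study itself used R). Several details are assumptions, not taken from the source: the scores are hypothetical, the function name `bootstrap_cv` is invented for illustration, and the coefficient of variation is computed here across the 1000 resampled panel means, which is one plausible definition that remains valid for a panel of a single taster.

```python
import random
import statistics

# Hypothetical total scores given by 10 tasters to one coffee sample.
scores = [80.5, 82.0, 81.25, 83.0, 80.0, 82.5, 81.75, 84.0, 80.75, 82.25]


def bootstrap_cv(scores, panel_size, n_boot=1000, seed=0):
    """CV (%) of the mean score over n_boot panels resampled with replacement."""
    rng = random.Random(seed)
    # Each bootstrap panel draws `panel_size` tasters with replacement.
    means = [
        statistics.fmean(rng.choices(scores, k=panel_size))
        for _ in range(n_boot)
    ]
    return 100.0 * statistics.stdev(means) / statistics.fmean(means)


# One coefficient of variation per panel size, from 1 to 10 tasters.
cv_by_size = {n: bootstrap_cv(scores, n) for n in range(1, 11)}
for n, cv in cv_by_size.items():
    print(f"{n:2d} tasters: CV = {cv:.3f}%")
```

Under this definition the coefficient of variation shrinks as the panel grows, which is the pattern the plateau regression is then asked to summarize.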

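In order to determine the optimal size of the tasting panel, the linear response plateau regression method was used [14]. The optimal number of tasters was taken as the point at which the linear model becomes a plateau:

$$
Y_i =
\begin{cases}
\beta_0 + \beta_1 X_i + \varepsilon_i, & \text{if } X_i \le X_0, \\
P + \varepsilon_i, & \text{if } X_i > X_0.
\end{cases}
$$

Here $Y_i$ is the response variable, $X_i$ is the number of tasters, $\beta_0$ is the linear coefficient of the linear model of the segment before the plateau, $\beta_1$ is the angular coefficient of this same segment, $\varepsilon_i$ is the error associated with the $i$th observation, $P$ is the plateau, and $X_0$ is the point of attachment of the two segments. $P$ and $X_0$ should be estimated.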
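A plateau regression of this kind can be fitted with a simple grid search, sketched below in Python. The sketch assumes that the descending line and the plateau meet at the join point (continuity), which the text does not state explicitly. For each candidate join point it fits ordinary least squares to the clipped predictor min(x, join point) and keeps the candidate with the smallest residual sum of squares. The function names and the data are hypothetical, and this is not the R routine used in the study.

```python
def fit_line(z, y):
    """Ordinary least squares for y = b0 + b1 * z; returns (b0, b1, sse)."""
    n = len(z)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    szz = sum((zi - z_bar) ** 2 for zi in z)
    b1 = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y)) / szz
    b0 = y_bar - b1 * z_bar
    sse = sum((yi - b0 - b1 * zi) ** 2 for zi, yi in zip(z, y))
    return b0, b1, sse


def fit_linear_plateau(x, y, steps=900):
    """Grid search over the join point of a continuous linear-plateau model."""
    lo, hi = min(x), max(x)
    best = None
    for s in range(1, steps):
        x0 = lo + (hi - lo) * s / steps
        # Clipping x at x0 makes the fitted curve flat beyond the join point.
        z = [min(xi, x0) for xi in x]
        b0, b1, sse = fit_line(z, y)
        if best is None or sse < best[0]:
            best = (sse, b0, b1, x0)
    sse, b0, b1, x0 = best
    return {"intercept": b0, "slope": b1, "join": x0, "plateau": b0 + b1 * x0}


# Hypothetical coefficients of variation for panels of 1 to 10 tasters.
tasters = list(range(1, 11))
cv = [2.5, 2.0, 1.5, 1.0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
fit = fit_linear_plateau(tasters, cv)
print(fit)
```

With these illustrative values the coefficient of variation stops falling at 5 tasters, so the fitted join point lands there. Real bootstrap output is noisier and the join point is generally not a whole number, so some rounding convention would have to be chosen.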

For the statistical analysis, the free software R was used to perform the bootstrap simulations and to obtain the statistics of the method of reaching the optimal number of tasters [15].

3. Results

Tables 1 and 2 show the results obtained from the 1000 sample simulations of the gustatory characteristics for Arabica and Conilon coffees, respectively, using the bootstrap method, with tasters and their respective coefficients of variation.

According to the results presented in Tables 1 and 2, the coefficient of variation decreases as the number of Q-Graders and R-Graders increases up to a certain point; beyond it, adding tasters to the sensory analysis does not increase accuracy.

Figure 1 shows that, according to the linear plateau regression method, 6 tasters are required to evaluate total score (a) and fragrance (b) and 5 to evaluate flavor (c) and balance (d) of Arabica coffee.

In the graphs shown in Figure 2, it is estimated that 5 tasters are needed to evaluate the acidity (a), body (b), sweetness (c), and aftertaste (d) characteristics of Arabica coffee, by the same regression method.

The results related to Figure 3 show that 5 tasters are needed to evaluate fragrance (a), flavor (b), and acidity (c) and 6 tasters for sweetness (d) of Conilon coffee.

Figure 4 shows the number of tasters necessary to evaluate the gustatory quality of Conilon coffee: 5 tasters are required to evaluate the attributes aftertaste (a), balance (c), and total score (d) and 6 tasters to evaluate clean cup (b).

4. Discussion

These results confirm the work of Bhumiratana et al. [16] and Di Donfrancesco et al. [2], who used six trained tasters to perform sensory analysis of coffee. Further, these results are in line with Cook et al. [17], who initially adopted 15 tasters and, after a more careful selection based on sensory sensitivity, used 6 tasters for Arabica coffee.

Other authors used fewer tasters than the number found and recommended in this study. Bosselmann et al. [18] adopted 3 tasters and used the SCA methodology to perform the coffee quality analysis; Alvarado and Linnemann [3] used 1 taster and a judge, with 12 trained consumers, to conduct their study; and Pereira et al. [19] adopted 3 tasters.

In the study by Ribeiro et al. [20], 11 experienced tasters were used to perform the sensory analysis of coffee.

However, many authors do not indicate in their materials and methods whether the tasters had any type of international standardization, which again points to the lack of standardization in the sensory analysis of coffee. In the work of Pereira et al. [19] and Evangelista et al. [21], the authors do not even state the number of tasters used to perform the sensory analysis of coffee.

Relevant considerations proposed by Chambers et al. [22] emphasize that, in addition to the minimum number of tasters, it is necessary to study the consistency of those carrying out the analysis. For the authors, in a sensory analysis study of chicken, turkey, and other birds, a trained and experienced three-member panel had a residual error mean square smaller than or equal to that of the semitrained panel, indicating that, in addition to number, consistency should be respected and widely observed. This reinforces the need to use professionals such as Q-Graders and R-Graders, since they are previously trained to perform such activities. This is also verified by Chambers et al. [23], for whom training time contributes to the level of accuracy of the evaluator, reinforcing the need for evaluators’ training.

Thus it is evident that standardized methodologies can give greater robustness to research and academic works that use this technique. Many studies use Q-Graders, R-Graders, and experts in sensory analysis but often do not report the certification of these specialists. It is necessary to demand greater reliability from the sensory analysis of coffee, which is impossible without consistency in the number of tasters.

Nebesny and Budryn [24] observed differences between the perceptions of women and men during the sensory analysis of coffee (specifically in aroma evaluation). Along the same lines, Cook et al. [17] verified that different perceptions in sensory analysis have been attributed to differences in gender and age, as well as to psychological factors. As Pereira et al. [8] have described, noise has been pointed out as one of the greatest threats to the consistency of sensory analysis with Q-Graders.

It is plausible that judges’ perceptions vary according to these criteria, but the method used here makes the classification process more standardized.

The results therefore indicate that 5 to 6 Q-Graders and/or R-Graders would be sufficient to ensure the accuracy of the sensory analysis (cup tests) and that using more tasters would bring no significant gain for decision making. The results presented in Figures 1, 2, 3, and 4 show that the coefficient of variation decreases as more Q-Graders and/or R-Graders take part in the sensory analysis. Moreover, the optimum number of tasters occurs where the linear model becomes a plateau, and the linear plateau regression method is a robust tool to quantify and validate the number of tasters needed in sensory analysis.

5. Conclusions

The modeling applied in this study makes it possible to recommend a minimum number of evaluators for the conditions and data tested. However, this approach is limited to the data of this study. Where the availability of Q-Graders or R-Graders is limited, the simulation and plateau regression methods can serve as a solution model. Based on this, the conclusions are as follows.

It is necessary to use 6 or more Q-Graders to perform the sensory analysis with the SCA and BSCA protocol in scientific studies and in routine taste tests for marketing purposes.

For the UCDA protocol, the use of 6 or more R-Graders for the sensory analysis of Conilon coffee is recommended, in research and in sensory analysis for commercialization.

Further studies should be developed on the accuracy and consistency of Q-Graders and R-Graders. In addition, it is necessary to improve and refine the scientific methods for validating the level of accuracy of these professionals, so that institutions that offer such courses can raise their level of precision through less subjective techniques.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The authors acknowledge the Federal Institute of Espírito Santo for supporting this research and the translation and review of this article, as well as the Q-Graders and R-Graders who participated in and dedicated themselves to this study. They also acknowledge the National Council for Scientific and Technological Development, CNPq (469058/2014-5), and the Secretariat of Professional and Technological Education of the Ministry of Education, SETEC, for the availability of resources for research, and the Credit Unions System in Brazil, SICOOB, for the support and funding of research.