Reference calibration is a useful technique when sensory evaluation is not feasible or practical. This study was conducted to predict the perceived crispness of potato chips from instrumental measurements using the reference-calibration method. To investigate the relationship between sensory evaluation and instrumental measurement data, six different standard references for crispness were used. Instrumental crispness was estimated by measuring the total area under the force curve with a texture analyzer. Nine chips with various textures were subjected to a mechanical test, and the samples were also evaluated by six trained panelists. The Fechner, Stevens, and Beidler models were applied to relate the reference scores to the total area and then to predict the sensory crispness of the nine chips. The relationship between the instrumental total area and the sensory crispness of the standard references was found to be nonlinear over the standard crispness scale. The Fechner model seemed to be the best predictive model (R2 = 0.8; RMSEC = 1.3; robustness = 2.4; model discrimination index = 3.3) for predicting the sensory crispness of chips. This study suggests that standard references with a reference-calibration method can be used to calibrate the crispness of potato chips. It is the first study to demonstrate that the sensory crispness of potato chips could be predicted using comprehensive references for crispness. Based on the results of this study, an equation established by the Fechner model could be used to estimate the crispness of potato chips under circumstances in which sensory evaluation is not practical or available.

1. Introduction

Sensory texture has been considered to be a multidimensional sensory attribute that is perceived and evaluated by humans [1]. Therefore, the sensory texture of food has traditionally been determined using a trained sensory panel as an effective tool [2]. However, evaluating sensory texture with trained human panelists is cumbersome because panel training is time consuming, and compensating the panel for their time and effort is costly [3]. In addition, variability between human subjects can become an issue even when the subjects are highly trained for descriptive analyses, so the demand for instrumental measurements of food texture has been increasing.

Since texture is best perceived by humans, instrumental measurements of texture should demonstrate a high correlation with human perception if they are to replace sensory panels [1, 4]. To correlate the sensory and instrumental texture measurements, a common practice is to compare the results from both sensory evaluation and instrumental measurements to determine how well they correlate with each other. If a strong correlation (r = ±0.8 to ±0.9) is observed [5], predictive models based on the relationship between these two variables can be established [6]. A weak point of this practice is that if sensory evaluation is not feasible under certain circumstances, obtaining the predictive models through conventional means would not be possible.

Szczesniak et al. [4] first introduced the sensory standard rating scales. The intensities of textural attributes such as hardness, brittleness, chewiness, gumminess, viscosity, and adhesiveness were evaluated for the entire scale range, and commercial products were then used as anchors for references. In this way, the intensity of a given textural attribute for the samples to be tested can be evaluated by a simple comparison with the standard ratings of reference samples. This standard rating scale method based on references has been adopted and executed in sensory evaluations [7]. The reference points are selected to indicate the different intensities of the scale and are applied to accurately calibrate the panelists in a manner similar to that of a pH buffer calibrating a pH meter [8]. Following training, all of the panelists must use scales in the same way and must rate the specific attributes of a particular sample at the same intensity [9]. Standard references can help calibrate the tool for predicting sensory scores. Sanahuja et al. [10] stated that the psychophysics of multisensory integration can play a key role in improving relationships between sensory and instrumental results. Psychophysics deals with the relationship between measurable stimuli (i.e., sensory intensity) and the corresponding responses (i.e., sensory ratings). Psychophysical models such as the Stevens, Fechner, and Beidler models have been used to correlate instrumental measurements and sensory intensity scales [8]. Briefly, Stevens’s law, or Stevens’s power law, describes the relationship between the magnitude of a physical stimulus and the resulting perceived intensity [11]. Fechner’s law is based on the fact that each individual would have a different sensitivity to a stimulus. It states that the perceived sensation is proportional to the logarithm of the stimulus [12]. Lastly, Beidler proposed a model to overcome a weakness of Stevens’s and Fechner’s laws, namely, that they merely use mathematical equations to better fit observed sensory data [13]. In his model, the human psychophysical response is proportional to the neurophysiological response.

Meullenet et al. [14] successfully employed standard references for sensory and instrumental hardness. They reported that the sensory hardness of 21 food products was satisfactorily predicted using psychophysical models based on the relationship between the instrumental peak forces and the standard scale values for the references. Furthermore, Xiong and Meullenet [8, 15] presented a new alternative approach in which instrumental measurements are related to the sensory standard rating scales to determine the correlation between the sensory and instrumental measurements for texture parameters. Specifically, the standard references are used to calibrate instrumental data, which can then be used as a panel to predict the sensory intensity ratings for the calibrated attributes. The authors referred to this approach as a reference-calibration method and reported that it showed potential for predicting the sensory hardness of food products using instrumental texture measurements.

Crispness of food is mostly assessed by trained sensory panels and is the most frequently used term to describe the texture of potato chips, and it is the primary attribute to determine their quality [10, 16]. The reference-calibration method has not yet been applied to measure the crispness of snacks. Therefore, the objective of this study was to apply the reference-calibration method to predict the sensory crispness of potato chips using three psychophysical models: Stevens, Fechner, and Beidler models.

2. Materials and Methods

2.1. Sample Preparation

Original flavor potato chips (325 g in each bag) commercially available in Korea were purchased from potato chip producers. The manufacture dates on the packages of the samples were all identical, and all samples came from the same lot. To obtain a wide spectrum of crispness from potato chips of the same quality, moisture content (MC) was selected as the key factor to be adjusted, since Peleg [17] reported that dry cellular snacks lose brittleness by becoming soggy with moisture uptake. A constant temperature and humidity chamber (DA-HC-280, Dong-A Science, Si-hung, Korea) was used to modify the MC of the potato chips. The chips were placed at a relative humidity (RH) of 30–70% and stored for 1–15 h in the temperature and humidity chamber. Chips of similar size were selected, and approximately 130–150 chips were then placed on 8 different trays at the same time in the chamber. As shown in Table 1, chips with 9 different MCs, including the control (i.e., untreated chips), were produced at 30% RH for 1, 3, 8, and 15 h and at 40%, 50%, 60%, and 70% RH for 15 h. Once each treatment was finished, approximately 20 g of each sample was repackaged in polypropylene bags (Seongwon Vinyl, Seoul, Korea), which were identical to the commercial packaging but unlabeled, and stored at 20 ± 1°C for further analyses. The MC of the repackaged chips was monitored during 8 days of storage, which was the period of the entire experiment, to check whether the MC of the repackaged chips remained constant over the entire period of the analysis. The MC of the repackaged chips was stable over that period (data not shown), and all experiments were conducted within 8 days. The MC of the chips was measured in triplicate.

2.2. Mechanical Measurements

A single compression test was performed using a texture analyzer (TA.XT.plus, Stable Micro Systems, Surrey, UK) equipped with a 5 kg load cell. The total area under the curve (TA) and the number of force peaks (NFP) in a force versus time curve were obtained during fracturing of the potato chips. These two parameters were selected because they are widely used to estimate texture, particularly crispness, in the evaluation of textural attributes of snack products [18]. A three-point bending rig was used to assess the crispness of the potato chips instrumentally. The chips were mounted on three-point bending supports with a distance of 10 mm between the two supports. A spherical probe with a 6.35 mm diameter was used to fracture the chips. A trigger force of 25 g, a travel distance of 5 mm, a crosshead pretest speed of 10 mm/s, a test speed of 5 mm/s, and a posttest speed of 10 mm/s were used as the instrumental settings [19]. The data acquisition rate was 500 points/s for the instrumental measurements. The TA and NFP were calculated using the macro options of Texture Exponent.
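As a rough illustration of how the two parameters can be derived from a sampled curve, the sketch below integrates force over time with the trapezoidal rule and counts local maxima. The function names, the `min_drop` threshold, and the force readings are hypothetical; the actual Texture Exponent macro may define the TA and NFP differently.

```python
def total_area(force, rate_hz=500.0):
    """Trapezoidal area under a force-time curve sampled at a fixed rate."""
    dt = 1.0 / rate_hz  # 500 points/s gives 0.002 s between samples
    return sum((force[i] + force[i + 1]) * 0.5 * dt
               for i in range(len(force) - 1))


def count_force_peaks(force, min_drop=0.0):
    """Count local maxima that are followed by a drop larger than min_drop."""
    peaks = 0
    for i in range(1, len(force) - 1):
        rises = force[i] > force[i - 1]
        drops = force[i] - force[i + 1] > min_drop
        if rises and drops:
            peaks += 1
    return peaks


# Hypothetical force readings in grams (not data from this study).
curve = [0.0, 10.0, 30.0, 20.0, 40.0, 15.0, 5.0, 0.0]
ta = total_area(curve)            # area in g*s
nfp = count_force_peaks(curve)    # number of force peaks
```

Raising `min_drop` filters out small fluctuations, which is one simple way to keep sensor noise from being counted as fracture events.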

2.3. Descriptive Analysis

Six trained panelists (two males and four females, aged 30–49 years) with at least 100 h of Spectrum™ method training participated in the descriptive analysis of the crispness of potato chips. Since the panelists had extensive experience in describing snack products, they received 5 h of training for assessing the crispness of the reference chips, including potato chips. The reference products for crispness were taken from Jeong et al. [20]. The panelists then conducted evaluations in two steps for the nine potato chips that had different MCs. The panelists were instructed on the definition, technique, and evaluation of the crispness of the reference samples during training [21]. The panelists were then instructed on how to evaluate the crispness intensity of the standard references using a 15-point scale. After the panelists conducted two practice evaluations to review the whole testing procedure, the actual testing was performed in two sessions in individual booths under a red lighting system to minimize the color effect across the samples. A warm-up sample was provided for each panelist in each session prior to the start of the test. The test samples were presented one by one to the panelists in duplicate. The order of the sample presentation followed a completely randomized block design (block = replication). Warm spring water and crackers were served as palate cleansers. The panelists rated the crispness intensity of the samples using a paper ballot. Monetary compensation was provided after the final evaluations.

2.4. Calibration Models

Six references were used for the sensory crispness attribute. The pairs of scale values and the TA for the references were used to establish the regression models (i.e., calibration models) that explain the relationship between the TA and sensory crispness. The number of force peaks (NFP) was not used for the model calibration because of its low correlation with sensory crispness (refer to Section 3.1). The calibration models in this study were developed using three well-known psychophysical models: the Fechner model, the Stevens model, and the Beidler model [8]. These models were used to develop a calibrated predictive model for the crispness of potato chips.

The Fechner model is given by

$$R = k \log S, \tag{1}$$

where $S$ is the absolute intensity of the stimulus (TA in this study), $R$ is the perceived intensity of the sensory response (crispness in this study), and $k$ is a constant between 0 and ∞.

The Stevens model (power law) is given by

$$R = k S^{n}, \tag{2}$$

where $n$ is the exponent of the power function, a measure of the rate of growth of the perceived intensity (i.e., crispness in this study) as a function of the stimulus intensity. When $n$ is greater than 1, the sensation grows faster than the stimulus (e.g., electric shock); when $n$ is less than 1, the sensation grows more slowly than the stimulus (e.g., smells); when $n$ is equal to 1, the sensation is proportional to the stimulus.

The Beidler model (dose-response model) is given by

$$R = \frac{R_{\max} K S}{1 + K S}, \tag{3}$$

where $R_{\max}$ is the maximum intensity of the response (i.e., maximum crispness intensity in this study) and $K$ is the association constant. Unlike the Fechner and Stevens models, it assumes that the response has an upper limit ($R_{\max}$) that cannot be exceeded regardless of how great the intensity of the stimulus is. In practice, $R_{\max}$ is usually the maximum value of the scale used for a given sensory attribute. For example, a line scale of 0–15 was used in this study, and $R_{\max}$ was set to 15.

In general, the intensity of the stimulus is known, and its corresponding sensory response is unknown. Conversely, all sensory responses (standard scale values) are known, and the intensity of the stimulus is unknown for the reference-calibration method.
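To make the two directions concrete, the sketch below writes each psychophysical model as a function from stimulus to response together with its algebraic inverse from response to stimulus. The functional forms (R = k log S, R = k S^n, and R = R_max K S / (1 + K S)), the base-10 logarithm, and all function and parameter names are assumptions chosen for illustration, not the exact parameterization used in this study.

```python
import math


def fechner(s, k):
    """Fechner: response proportional to the log of the stimulus."""
    return k * math.log10(s)


def fechner_inverse(r, k):
    """Stimulus that would produce response r under the Fechner form."""
    return 10.0 ** (r / k)


def stevens(s, k, n):
    """Stevens power law: response = k * stimulus ** n."""
    return k * s ** n


def stevens_inverse(r, k, n):
    """Stimulus that would produce response r under the power law."""
    return (r / k) ** (1.0 / n)


def beidler(s, assoc, r_max=15.0):
    """Beidler: saturating response with upper limit r_max."""
    return r_max * assoc * s / (1.0 + assoc * s)


def beidler_inverse(r, assoc, r_max=15.0):
    """Stimulus that would produce response r (requires 0 <= r < r_max)."""
    return r / (assoc * (r_max - r))
```

The inverse forms correspond to the reference-calibration setting, in which the scale value is known and the stimulus is the quantity being modeled.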

2.5. Statistical Analysis

To examine the goodness-of-fit of the models established in this study, the root mean square error of calibration (RMSEC) and root mean square error of prediction (RMSEP) were calculated. The RMSEC and RMSEP measure the average differences between the predicted and the observed response values at the calibration stage and at the prediction stage, respectively. The coefficient of determination (R2) was also calculated for each predictive model. In addition, model robustness was assessed based on the ratio of RMSEP to RMSEC. The closer this ratio is to 1, the more robust the model is considered. The model discrimination (MD) ability was calculated as the ratio of the standard deviation of a sensory attribute to the RMSEP. Models with a large ratio (i.e., MD ≥ 2) are generally considered discriminative [22].
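The three indices described above can be sketched as follows. The function names and the example scores are hypothetical, and the use of the sample standard deviation (rather than the population form) for MD is an assumption.

```python
import math
import statistics


def rmse(observed, predicted):
    """Root mean square error; gives RMSEC on calibration data, RMSEP on test data."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def robustness(rmsep, rmsec):
    """Ratio of prediction error to calibration error; near 1 suggests a robust model."""
    return rmsep / rmsec


def model_discrimination(sensory_scores, rmsep):
    """Standard deviation of the sensory attribute divided by RMSEP."""
    return statistics.stdev(sensory_scores) / rmsep


# Hypothetical panel scores and model predictions (not data from this study).
observed = [2.0, 4.0, 6.0]
predicted = [3.0, 4.0, 5.0]
rmsep_value = rmse(observed, predicted)
md_value = model_discrimination(observed, rmsep_value)
```

As a consistency check on the definitions, a robustness of 2.4 with an RMSEC of 1.3 implies an RMSEP of roughly 3.1 on the same rating scale.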

For the instrumental measurements, one-way analysis of variance (ANOVA) using a GLM model was applied, and the mean values were separated using Fisher’s least significant difference test. For the sensory data, two-way ANOVA was conducted using a mixed model that treats MC as the fixed effect and the panelist and panelist-by-product interaction (if significant, α = 0.05) as the random effects. All analyses were performed using SAS (version 9.1, SAS Inst. Inc., Cary, NC, USA).

3. Results and Discussion

3.1. Mechanical Measurements and Sensory Evaluation of Potato Chips

The mechanical textural properties (TA and NFP) of each sample are shown in Table 2. The mechanical textural properties changed significantly as the MC of the potato chips increased. This result indicates that structural changes in texture occurred because of the moisture in the chips. The TA tended to decrease as the MC of the potato chips increased, showing a correlation coefficient of 0.935 (data not shown), whereas the NFP did not follow a linear correlation with the MC of the potato chips. The NFP correlated positively with MC over 2.2–5.0% but negatively thereafter. This finding indicates that the TA is a better predictor than the NFP of the crispness of the potato chips over the entire range of the MC. Thus, only the TA was used to establish the regression models to explain the relationship between mechanical texture and sensory crispness. Similar research was conducted by Salvador et al. [23], who reported that a random pattern of crispness intensities was observed with the MC of commercial potato chips. On the other hand, there was a linear correlation between the linear expansion and the number of sound peaks in cassava crackers [24].

The sensory crispness intensity of the potato chips as the MC changes is also presented in Table 2. The intensity of sensory crispness decreased from 11.92 to 3.33 as the MC increased from 2.2 to 9.2%, indicating a strong negative correlation (r = −0.975; data not shown). Thus, the trained panelists discriminated the differences in crispness among the samples, except between the potato chips with 3.0% and 3.3% MC. There was a strong negative correlation between the TA from the mechanical measurements and the sensory crispness intensity (r = −0.974), and a poor correlation was found between the NFP and the sensory crispness intensity (r = −0.599). Therefore, selecting the TA to develop the regression models for predicting the crispness intensity of potato chips from a mechanical parameter was justified.

3.2. Instrumental Test for the Food References

Six standard references for a sensory crispness scale developed for potato chips are shown in Table 3. The references were purchased at local markets. They were screened from a number of snack lists with a wide range of texture diversity. Their relative crispness intensity was finalized with a consensus from all panelists. All references were cut into half-inch cubes except the Goldfish due to its geometric characteristic.

The mean TA of the six standard references from the instrumental measurements is also shown in Table 3. The ANOVA showed that all the references were significantly different from one another. This result was somewhat expected because the references had been chosen by trained panelists through discussion to represent different intensities of crispness. The TA increased from Dr. You to Corn chip but not in a linear manner. According to Table 3, the relationship between the sensory crispness scale values and the TA values for the six reference samples was nonlinear over the entire sensory scale, indicating that psychophysical models would be more appropriate than a simple linear model for predicting the perception of potato chip crispness. Thus, the psychophysical models were used for this purpose.

3.3. Calibration Models

For the reference-calibration method, the sensory scale values of the standard references are known, whereas the TA values for the references are unknown [8]. Therefore, the sensory scale value is treated as the independent variable, and the TA is treated as the dependent variable. The relationship between the sensory scale values and the TA was fitted using the Stevens, Fechner, and Beidler models shown in equations (4)–(6), respectively. The important parameters were calculated as follows:

Following the model calibration, all the fitted models (equations (4)–(6)) were used to predict the sensory crispness of the potato chips.
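The calibrate-then-predict sequence can be sketched for a Fechner-type curve as follows. The sketch regresses ln(TA) on the scale value by ordinary least squares and then inverts the fitted line to predict crispness from a measured TA. The intercept term, the natural-log form, the function names, and the reference values are all assumptions made for illustration; they are not the fitted parameters of equations (4)–(6).

```python
import math


def fit_log_linear(scale_values, ta_values):
    """Least-squares fit of ln(TA) = a + b * scale_value."""
    n = len(scale_values)
    log_ta = [math.log(v) for v in ta_values]
    mean_x = sum(scale_values) / n
    mean_y = sum(log_ta) / n
    sxx = sum((x - mean_x) ** 2 for x in scale_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(scale_values, log_ta))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_crispness(ta, a, b):
    """Invert the calibration line: scale_value = (ln(TA) - a) / b."""
    return (math.log(ta) - a) / b


# Hypothetical references: TA values generated from a = 1.0 and b = 0.2,
# so the fit should recover those two parameters.
ref_scale = [2.0, 5.0, 8.0, 11.0, 14.0]
ref_ta = [math.exp(1.0 + 0.2 * x) for x in ref_scale]

a, b = fit_log_linear(ref_scale, ref_ta)

# TA of a hypothetical test chip whose true scale value is 6.5.
predicted = predict_crispness(math.exp(1.0 + 0.2 * 6.5), a, b)
```

Because only the references are needed to estimate `a` and `b`, no panel scores for the test chips enter the calibration; they are required only to compute the RMSEP afterwards.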

3.4. Prediction of Sensory Crispness

The sensory crispness intensity for the 9 potato chips and their predicted values using the three psychophysical models (Stevens, Fechner, and Beidler) are shown in Table 4. For the sensory crispness of chips, all the predictive models showed similar coefficients of determination (R2), from 0.7 for the Beidler model to 0.8 for both the Fechner and Stevens models. The RMSEC value, the average error of calibration expressed in the same unit as the sensory ratings, was less than 1.0 for the Stevens model but greater than 1.0 for the Fechner and Beidler models. In terms of the RMSEC, the Stevens model (RMSEC = 0.9) was the best among the three psychophysical models, and the Fechner and Beidler models had slightly higher RMSEC values at 1.3 and 1.9, respectively.

The RMSEP value, the average error of prediction expressed in the same unit as the sensory ratings, was smallest for the Fechner model (Table 4). This finding disagreed with the results of Meullenet et al. [14], who reported that the Beidler model had the smallest RMSEP and that the other models did not satisfactorily fit the texture of 21 food samples ranging from cream cheese to carrots. One possible reason for the disagreement between the two studies is the samples. The previous study by Meullenet et al. [14] covered the entire range (0–15) of intensities for hardness, whereas the potato chips used in this study covered a narrower intensity range (3.3–11.2) of the standard scale of crispness. This difference may suggest that the references used for sensory hardness were better established than those for sensory crispness and thus that the crispness intensity is unlikely to be linear. The relationship between the sensory scale values and the TA values did not seem to be linear for the range of the crispness scale (3.3–11.2) in this study, resulting in a better fit by the Stevens model.

Furthermore, the validation coefficient of determination for the responses was compared with that of the calibration to determine the robustness of the models established (Table 4). Given that a robustness close to 1 indicates a robust model, the Fechner and Beidler models (robustness = 2.4 and 2.2, respectively) were more robust than the Stevens model (robustness = 5.2). The relatively larger robustness value observed for the Stevens model resulted from its higher RMSEP. This result may indicate that the relationship between sensory crispness and the TA values is not satisfactorily predicted by the Stevens model. As shown in Table 4, the MD values indicated that the Fechner and Beidler models (MD = 3.3 and 3.8, respectively) were more discriminative than the Stevens model (MD = 2.3), although all three models were discriminative. Models with a large ratio (i.e., MD ≥ 2) were reported to be highly discriminative [22].

According to Bourne [25], the long debate on the most suitable psychophysical model continues. For example, the Stevens model was found to be the best model for predicting meat tenderness perception in a study that used psychophysical models [8]. However, a consensus is unlikely to be reached, given that the Fechner model appeared to be better in our study, in which it satisfactorily predicted the sensory crispness of the chips (R2 = 0.8; robustness = 2.44; SD = 11.2; and MD = 3.3). Therefore, the reference-calibration method is sensory-perception specific and should be tested on various food products.

In this study, the main advantage of the reference-calibration method for crispness intensity was the use of standard references with scale values to identify the correlation between sensory ratings and instrumental measurements, which is then used to predict the sensory crispness of unknown samples. The major disadvantages of the reference-calibration method are as follows: (1) the sample sizes must be precise for instrumental measurements and (2) controlling the range of predicted scores, which is normally narrower than the range of sensory intensity ratings evaluated by trained panelists, might be difficult.

4. Conclusions

This study suggests that using the reference-calibration method to predict the crispness of potato chips is possible and that references can be used to calibrate not only the panelists but also the mechanical instruments for crispness measurement. This method can be useful under circumstances in which sensory evaluation is not practical or available. For potato chip crispness, the Fechner model seemed the most appropriate for estimating crispness from mechanical measurements. More research is necessary to determine whether other sensory attributes can be predicted using the reference-calibration method.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Acknowledgments

This research was supported by the Main Research Program (Grant number: E0187000-01) of the Korea Food Research Institute (KFRI) funded by the Ministry of Science and ICT, Republic of Korea.