Abstract

Stroop interference and facilitation effects have been documented in the visual, auditory, olfactory, and gustatory modalities. This study extends the Stroop phenomena to kinesthetic and haptic tasks. In a touch-enabled computer interface, participants touched and manipulated virtual objects (cylinders, cubes, and tiles) through a pen-like stylus and identified their haptic qualities (weight, firmness, vibrations). In a separate task, a mechanical force pushed participants’ hands lightly in a specific direction, which they had to identify. While performing these identification tasks, participants were simultaneously presented with words or symbols that were congruent, neutral, or incongruent with the experienced kinesthetic/haptic sensations. Error rates and response times were affected in the following order: congruent < neutral < incongruent. As technologies advance into multisensory systems, engineers and designers can improve human-computer interactions by ensuring optimal congruence between all the inter- and intra-sensory elements in the display.

1. Introduction

The Stroop effects are among the most famous examples of interference and facilitation in the perception-cognition-action loop. In the original “Color-Word” version, Stroop [1] found that naming the ink color of incompatible color words (e.g., the word RED printed in green ink) was much slower and more error-prone than naming the ink color of control items (e.g., the letters DDD printed in green), while naming the ink color of compatible color words (e.g., the word GREEN printed in green ink) was much faster and less error-prone than naming the color of a control item. These effects occurred even though participants focused on their task (rapidly naming the ink colors of the words presented on a list). They showed that, for literate adults, the processing of language is automatic and unconscious, so participants could not ignore the meaning of the words and were affected by it [1–6].

The Stroop interference and facilitation effects were replicated in many different visual tasks. For instance, when the task was changed and participants were asked to count the characters in a display, ignoring their semantic identity (e.g., say “three” to XXX (neutral), 333 (congruent), 555 (incongruent)), their response times (RTs) were longer in the incongruent condition and shorter in the congruent condition as compared to the neutral condition [7]. Similarly, when participants named geometrical shapes while incongruent words were embedded into them (e.g., the word CIRCLE embedded in a square), an interference effect of longer RT was observed [8]. The spatial relationship of elements in a display may also give rise to the effects. A delayed response was found when participants had to identify the spatial location of a word in relation to a fixation point in an incongruent condition (e.g., the word BELOW presented above the fixation point) [9]. Similarly, when participants had to judge whether an arrow was pointing up or down at various heights inside a surrounding rectangle, RTs were faster for an arrow pointing up the higher it was inside the rectangle, and for an arrow pointing down the lower it was inside the rectangle [10]. The Stroop effects were also observed for a more primal visual task, identifying emotions in facial expressions: happy, sad, and angry expressions were recognized faster when paired with congruent words than with incongruent words [11]. Even the mere font size of printed words can interact with semantic congruity. Participants presented with word pairs of animal names (Ant-Lion, Elephant-Cat, etc.) judged which one in each pair was printed in larger font size. An increase in RT occurred when the relative font size was incompatible with the visual images of the animals evoked by the semantics of the words (e.g., the word “Ant” printed in font size 32 paired with “Lion” printed in font size 16 required a longer RT than vice versa) [12]. Similarly, the arrangement of word pairs can create interference or facilitation. In semantic relatedness judgments, faster RTs were reported for word pairs with congruent iconic relations with their referents (e.g., the word ATTIC presented above the word BASEMENT) compared to pairs with a reverse iconic relation (e.g., STEM above BRANCH) [13].

Experiments with auditory tasks revealed similar results. When the word LEFT or RIGHT was uttered to participants’ right or left ear while they located the source of the sound, longer RTs were observed in the incongruent condition (e.g., the word LEFT uttered to the right ear, etc.) [14]. An interference effect was also demonstrated when subjects heard the words MAN or WOMAN and were asked to identify the speaker’s gender in a conflict versus no-conflict comparison [15, 16]. The same results were found for musical tones. When the words HIGH and LOW were presented together with musical tones and participants had to rate them (high or low pitch), RTs were longer in the incongruent conditions [17–19].

The Stroop effects were also demonstrated in the chemical senses. An olfactory perception study reported that odor detection was faster and more accurate when the odor was presented with a semantically congruent picture (e.g., the odor of diesel and a picture of a bus) compared to an odor-only condition or an odor paired with an incongruent picture [20]. In another study employing the priming effect, participants were stimulated with a pleasant or unpleasant odor and afterwards presented with words describing pleasantness (sweet, fruity, fragrant, etc.) and unpleasantness (bad, disgusting, repulsive, etc.) while their task was to rapidly name the ink colors of the words. Background odor affected RT differentially. That is, the odors primed and facilitated the processing of congruent words, which interfered with the task of naming the ink color of the words (a task that requires ignoring their meaning), while the presentation of incongruent odor-word pairs resulted in faster RTs because the word’s meaning was more easily ignored [21].

An interesting case of semantic interference and facilitation in the gustatory modality was also reported. A synaesthetic musician, who experienced specific tastes in response to hearing particular musical intervals, was asked to identify these tones while experimenters delivered different flavors to her tongue. She responded significantly faster when the flavor and tone were congruent and slower when the flavor and tone were incongruent, compared to the neutral condition [22].

The classical demonstration of the Stroop effects and most of its versions were unisensory; that is, the different elements of the stimulus, whether congruent, neutral, or incongruent with each other, were all within the same sensory modality (vision in the classic Color-Word example, and audition in the aforementioned examples [14–16]). Nevertheless, there are strong indications that the Stroop phenomena are not limited to unisensory congruity/incongruity. For instance, when participants had to name color patches, presented visually, while listening to incongruent words (e.g., seeing purple and hearing YELLOW), RTs were significantly slower compared to pairing the patches with neutral noncolor words [23–27]. Similar results were obtained in the Picture-Word versions, where picture-naming slowed when participants simultaneously heard incongruent words, relative to neutral words [28–31]. The aforementioned reports on the Stroop effects in olfactory and gustatory senses [20–22] also support the existence of cross-sensory Stroop effects.

The sense of touch is different from the other senses. As touch is limited to the area of contact with the object, in order to apprehend the whole object or the qualities of a surface, voluntary movement must be made to compensate for the smallness of the tactile perceptual field [32–34]. Therefore, the hand acts as both a perceptual system exploring the environment and a motor system reacting to the tactile-kinesthetic cues. In the context of interference/facilitation in the perception-cognition-action flow, this makes the sense of touch unique, as perception and action are linked more closely in touch than in the other senses. Thus, the current study aimed to further investigate the Stroop cognitive interference and facilitation phenomena in various kinesthetic and haptic tasks. For that, we conducted four experiments employing different kinesthetic/haptic sensations in various cross-sensory and multisensory (haptic-audio-visual) settings.

2. Methods and Results

2.1. Apparatus

We used a haptic virtual environment system enabling users to see virtual objects and touch them through a haptic device. This touch-enabled computer interface (by Reachin) consists of a computer screen tilted 45 degrees and reflected on a flat horizontal mirror (Figure 1). Wearing stereo goggles (by StereoGraphics), users can see these visual reflections in a three-dimensional space underneath the mirror. In their hand, users hold a pen-like stylus connected to an engine that is able to generate a wide range of mechanical forces, with six degrees of freedom allowing the stylus to move in all four directions, upward and downward (PHANTOM desktop by SensAble Technologies). The stylus is represented on the screen as a stick with a small black ball at its tip (see Figures 3, 4, and 5), and since the visual display is fully synchronized with the force-feedback mechanism, all manual manipulations of the stylus and the various objects on the screen are visible. Thus, users have a unique experience of seeing 3D virtual objects in a 3D space and being able to touch and manipulate these objects, through the stylus, and feel in their hand sensations simulating fingertip contact with the objects, similar to what one feels when using a stick to touch or manipulate similar real objects in the real physical world. Figure 1 presents the system. Further details and technical descriptions are available at: http://www.reachin.se, http://www.sensable.com, http://www.stereographics.com and http://www.3dconnexion.com.

2.2. Experiment 1: Hand Movements in Different Directions

In this experiment participants held a stylus in a fixed location, and in each trial the computer generated a force pushing the stylus, and the hand holding it, in a certain direction. Arrows that were congruent, neutral, or incongruent to the direction of the force appeared on the screen, simultaneously with the movements. We hypothesized that RTs and error rates would be affected in the following order: congruent < neutral < incongruent.

2.2.1. Experimental Design

Participants were presented with a circle located at the center of their visual field. With their hand, they held the stylus and positioned its visual representation (a stick with a small ball at its tip) at the middle of the circle. On each trial the computer generated a constant force (0.6 Newton), through the stylus’s engine, in one of three directions: rightward, leftward, or upward. Thus, the participant’s hand, holding the stylus, was pushed lightly outside the circle in one of these three directions.

2.2.2. Participants

Twenty one students, 10 females and 11 males, were paid for participating in the experiment. Mean age was years (range: 20–27). The experiments were carried out under the guidelines of the Technion’s ethical committee and with their approval. All participants were unaware of the purpose of the experiment.

2.2.3. Procedure

Participants were instructed to fixate their gaze at the circle, to hold the stylus with their dominant hand, and to place its visual representation inside the circle. On each trial their task was to rapidly report the direction of the forces exerted on their hand by pressing (with their other hand) designated buttons on a SpaceMouse (by 3DConnexion) for rightward, upward, and leftward hand-movements.

To ensure that participants relied solely on their kinesthetic sensations, and to prevent them from seeing their hand and the direction in which it had been pushed, the stylus’s visual representation was presented only before the first trial, in order to assist participants in locating it inside the circle (while their hand was under the mirror and invisible). Once the trials began, the stylus’s visual representation remained hidden for the entire experimental session. In addition, the laboratory room was kept darkened. These arrangements ensured that participants looking at the horizontal mirror were able to see only the reflection of an empty gray screen with a circle at its center (Figure 2), but not the stylus’s visual representation, their hand, or its movements.

On each trial, simultaneously with the exertion of the force, a label containing a symbol appeared inside the circle. This symbol was congruent with the kinesthetic sensation (i.e., the hand was lightly pushed leftward and the label contained an arrow pointing leftward “←”, the hand was pushed upward and the label contained an arrow pointing upward “↑”, the hand was pushed rightward and the label contained an arrow pointing rightward “→”), incongruent (i.e., the hand was lightly pushed leftward but the label contained an arrow pointing rightward, the hand was lightly pushed upward but the label contained an arrow pointing downward “↓”, or the hand was lightly pushed rightward but the label contained an arrow pointing leftward), or neutral (i.e., the hand was lightly pushed either rightward, leftward, or upward and the label contained a small circle “O”). Participants were instructed to provide their judgments on the force direction by relying solely on their kinesthetic sensations regardless of all other cues. The total number of trials for each participant was 240, containing an equal share of congruent, neutral, and incongruent conditions (80 trials of each).

Before the experimental session, participants were trained briefly on their task, experienced kinesthetically the forces in the different directions, and were all able to easily discriminate between the three directions. The trials in the different directions and conditions were arranged randomly within each experimental block. Response times were measured and registered by the computer. Time counting started once the force was exerted through the stylus and stopped once participants pressed the buttons.
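The trial structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_trials`, the even cycling through the three force directions, and the shuffling of the whole session at once (rather than within blocks, whose number the text does not give) are hypothetical choices, not the authors’ implementation.

```python
import random

DIRECTIONS = ["left", "up", "right"]
# Arrow shown in the incongruent condition, as described in the procedure.
OPPOSITE = {"left": "right", "right": "left", "up": "down"}


def build_trials(n_per_condition=80, seed=None):
    """Return a shuffled list of (force_direction, condition, label) tuples."""
    rng = random.Random(seed)
    trials = []
    for condition in ("congruent", "neutral", "incongruent"):
        for i in range(n_per_condition):
            # Assumption: directions are cycled so each appears about equally
            # often; the paper does not report the per-direction split.
            direction = DIRECTIONS[i % len(DIRECTIONS)]
            if condition == "congruent":
                label = direction            # arrow matches the push
            elif condition == "incongruent":
                label = OPPOSITE[direction]  # arrow contradicts the push
            else:
                label = "circle"             # neutral symbol "O"
            trials.append((direction, condition, label))
    rng.shuffle(trials)  # random order across directions and conditions
    return trials


trials = build_trials(seed=1)
print(len(trials), trials[:3])
```

With the default of 80 trials per condition, the list holds the 240 trials reported for each participant.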

2.2.4. Data Analysis

A repeated-measures ANOVA (with Bonferroni adjustments) tested whether there was an overall effect between the congruent, neutral, and incongruent conditions. The ANOVA was followed by pairwise t-test comparisons. This analysis was done for both accuracy rates and response times, in all four experiments reported henceforth.
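As a rough illustration of this analysis pipeline, the sketch below computes a one-way repeated-measures ANOVA F statistic and a paired t statistic using only the Python standard library. The function names, the toy numbers, and the choice to apply the Bonferroni correction to the three pairwise comparisons are assumptions made for the example; the paper does not describe its software or the exact form of the adjustment, and p-values are omitted because they would require an F/t distribution routine.

```python
from math import sqrt
from statistics import mean, stdev


def rm_anova_f(data):
    """One-way repeated-measures ANOVA.

    data: one row per participant, one value per condition.
    Returns (F, df_condition, df_error).
    """
    n = len(data)        # number of participants
    k = len(data[0])     # number of conditions
    grand = mean(v for row in data for v in row)
    cond_means = [mean(row[j] for row in data) for j in range(k)]
    subj_means = [mean(row) for row in data]
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((v - grand) ** 2 for row in data for v in row)
    # Error term: variability left after removing condition and subject effects.
    ss_error = ss_total - ss_cond - ss_subj
    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    return (ss_cond / df_cond) / (ss_error / df_error), df_cond, df_error


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


# Made-up numbers for three participants (congruent, neutral, incongruent);
# these are NOT data from the study.
toy = [[1, 2, 4], [2, 3, 6], [3, 4, 5]]
f_stat, df1, df2 = rm_anova_f(toy)
t_stat, df_t = paired_t([r[0] for r in toy], [r[2] for r in toy])
bonferroni_alpha = 0.05 / 3  # assumed: three pairwise comparisons
print(f_stat, (df1, df2), t_stat, df_t, bonferroni_alpha)
```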

2.2.5. Results

The ANOVA for the accuracy rates in the congruent, neutral, and incongruent conditions revealed an overall effect for condition . Paired t-tests showed that when the arrow was congruent with the kinesthetic perception participants made errors only in 0.9% of the trials, significantly less than their errors when the arrow was incongruent with the kinesthetic perception (3.3%). In the neutral condition, where a small circle appeared instead of an arrow, participants erred in 1.2% of the trials, significantly less than the incongruent condition , although not significantly different from the congruent condition.

For the correct answers RTs, the ANOVA revealed an overall effect for condition . Paired t-tests showed that the mean RT in the congruent condition ( ; mean S.D.) was significantly shorter than RT in the incongruent condition . Mean RT in the neutral condition was significantly longer than RT in the congruent condition, and significantly shorter than RT in the incongruent condition . Results are summarized in Table 1.

2.2.6. Discussion

This experiment provides clear evidence for the occurrence of the Stroop effects also in kinesthetic perception. Participants did not have any redundant cue from another sensory channel except their kinesthetic sensations. Nevertheless, RTs as well as the error rates were affected differentially according to the semantic congruency between the kinesthetic perception and the symbols. Since participants in this experiment were mainly passive when they received the kinesthetic stimulation, we designed the next three experiments with participants engaged in active exploration of virtual objects.

2.3. Experiment 2: Light/Heavy Cylinders

This experiment was designed to further investigate the Stroop effects also in tasks where participants made haptic judgments by actively manipulating virtual objects.

2.3.1. Experimental Design

This experiment was designed as a weight discrimination task. In each block of trials, participants were presented with a visual display of 15 cylinders ordered in 3 rows (5 cylinders in each row). With the stylus they lifted each cylinder and decided whether it was light or heavy. All cylinders were visually identical and varied only in their haptic features, which were activated once participants manipulated them. Since these were only “virtual” cylinders that could be programmed freely, and for practical considerations, we eliminated the gravitational force entirely in this experiment. Thus, a cylinder’s mass was determined solely by its inertial mass, that is, the ratio between the force exerted on it and its acceleration (Newton’s 2nd law). Half the cylinders had a small mass (equivalent to 5 kg), and the other half had a relatively larger mass (equivalent to 200 kg). In the real physical world, where gravitational force influences the weight of objects, a small cylinder with a mass of 5 kg can be considered “heavy” and a cylinder with a mass of 200 kg is almost impossible to lift. But since we programmed these virtual cylinders without any gravitational force (zero), these masses (5 and 200 kg) refer only to the inertial masses. Hence, when participants lifted the 5 kg cylinder it felt “light,” and the 200 kg cylinder felt “heavy.” The force exerted on each cylinder was determined by the participant and was typically lower than 1 Newton. A force of 1 Newton exerted on a light cylinder accelerated it (upward) at 0.2 m/s², while the same force accelerated a heavy cylinder at 0.005 m/s².
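The two figures quoted at the end of this paragraph follow directly from Newton’s second law with gravity removed, a = F/m, evaluated at the upper bound of 1 Newton. A minimal sketch (the function name `acceleration` is hypothetical, and a 1 N push is used only because it is the reference force mentioned in the text):

```python
def acceleration(force_newton, inertial_mass_kg):
    """Newton's second law with gravity switched off: a = F / m (in m/s^2)."""
    return force_newton / inertial_mass_kg


LIGHT_MASS_KG = 5.0
HEAVY_MASS_KG = 200.0

for mass in (LIGHT_MASS_KG, HEAVY_MASS_KG):
    # Same 1 N push on each cylinder; only the inertial mass differs.
    print(mass, "kg ->", acceleration(1.0, mass), "m/s^2")
```

For the same push, the light cylinder therefore accelerates 40 times faster than the heavy one, which is the difference participants felt and saw.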

2.3.2. Participants and Procedure

Thirty students, 15 females and 15 males, participated in the experiment. Mean age was years (range: 18–30). Participants held the stylus in their dominant hand and saw its visual representation (a stick; see Figure 3) on the screen. Since the stylus’s position in space was fully synchronized with its visual representation, any movement of the stylus (located under the flat horizontal mirror; see Figure 1) entailed similar movements of its representation on the screen. Participants were instructed to place the stylus under each cylinder and “lift” it by pushing it upwards with the stylus. The task was to report whether the cylinder was light or heavy, by rapidly pressing a specified button for a light cylinder and another button for a heavy one. As a cylinder was lifted and moved past the end of the screen, it disappeared, and the participant proceeded to the next cylinder and repeated the same procedure.

Touching a cylinder and exerting a minimal force (0.5 Newton) on its bottom activated a label that appeared on the cylinder. The label contained a word or a symbol that was congruent to the haptic perception (i.e., the cylinder was felt as light and the label read LIGHT, or the cylinder was felt as heavy and the label read HEAVY), incongruent (i.e., the cylinder was felt as light but the label read HEAVY or vice versa), or neutral (i.e., the cylinder was felt as light or heavy while the label read ###; see Figure 3). Participants were instructed to provide their weight discrimination judgments by relying solely on their haptic sensations regardless of all other cues (the labels).

In this experiment, the main cues were haptic. Nevertheless, there were also visual cues available, since a light cylinder accelerated faster than a heavy cylinder. These visuo-haptic cues were congruent in all experimental conditions, meaning that a haptically light cylinder was always presented visually as accelerating faster than a heavy cylinder, and the experimental manipulation was only on the word labels attached to the cylinders which were congruent, neutral, or incongruent with the haptic sensation.

Since the magnitude of the Stroop effects is optimal in speeded reactions where participants are hurried to respond as quickly as possible, participants were instructed to follow a structured path in each block of trials. Specifically, they started with either the rightmost or the leftmost cylinder on the lower row and followed with the adjacent cylinder on the same row until they lifted all five cylinders of that row. Then, remaining on the same side of the workspace, they proceeded up to the next row. In addition, to further encourage participants to respond rapidly, a feedback screen was presented following every block of trials (each screen with 15 cylinders) that indicated whether that experimental block was completed below or above a predefined standard level.

The total number of trials for each subject was 180, containing an equal share of congruent, neutral, and incongruent conditions (60 trials of each). All the general procedural details (e.g., a brief training prior to the experimental session, randomization of trials, receiving the stimuli in the dominant hand and responding with the non-dominant hand) were similar to those of the 1st experiment. Time counting started when participants pushed the cylinder upward and stopped once they pressed the buttons.

2.3.3. Results

The ANOVA for the accuracy rates in the congruent, neutral and incongruent conditions revealed an overall effect for condition . Paired t-tests showed that when the word’s meaning was congruent with the haptic perception participants made errors only in 3% of the trials, significantly less than their errors when the word’s meaning was incongruent with the haptic sensation (10.4%). In the neutral condition where the characters were irrelevant to the haptic sensation participants erred in 4.4% of the trials, significantly less than the incongruent condition , although not significantly different from the congruent condition.

For the correct answers RTs, the ANOVA revealed an overall effect between the congruent, neutral and incongruent conditions . Paired t-tests showed that the mean RT in the congruent condition (  ms; mean S.D.) was significantly shorter than RT in the incongruent condition . Mean RT in the neutral condition was significantly longer than RT in the congruent condition , and significantly shorter than RT in the incongruent condition ; see Table 1.

2.3.4. Discussion

Participants, in this experiment, had a visual cue in addition to the kinesthetic cue. However, they were instructed to provide their judgments by focusing solely on their haptic sensations. RTs and the error rates were affected differentially according to the semantic congruency between the haptic stimulation and the symbols. This experiment extends the Stroop effects also for bi-sensory conditions where participants actively explore the haptic qualities of an object. We designed two additional experiments in order to test also other haptic sensations (softness/hardness judgments, vibration detection) in tri-sensory (visuo-audio-haptic) cues combinations.

2.4. Experiment 3: Soft/Hard Cubes

This experiment investigated the Stroop effects in another haptic task where participants made softness/hardness judgments by actively exploring virtual objects.

2.4.1. Experimental Design

The 3rd experiment was designed as a firmness discrimination task while the general procedural details were all similar to the 2nd experiment. Participants were presented with a visual display of 18 virtual cubes ordered in 3 rows (6 cubes in each row). Half the cubes were “soft” and generated a slight resistance (stiffness of 100 Newton/meter) when participants pressed them, and the other half were “hard” and generated a relatively stronger resistance (stiffness of 1000 Newton/meter).
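To give a feel for the two stiffness values, the sketch below treats each cube as a linear spring (Hooke’s law, F = k·x). That model is an assumption suggested by the units of Newton/meter, not something the text states, and the 5 mm press depth is a purely hypothetical value chosen for illustration.

```python
SOFT_STIFFNESS = 100.0    # N/m, "soft" cubes
HARD_STIFFNESS = 1000.0   # N/m, "hard" cubes


def resisting_force(stiffness_n_per_m, depth_m):
    """Linear-spring resistance F = k * x; zero when the surface is not pressed."""
    return stiffness_n_per_m * max(depth_m, 0.0)


depth = 0.005  # hypothetical 5 mm press, not a value from the study
for name, k in (("soft", SOFT_STIFFNESS), ("hard", HARD_STIFFNESS)):
    print(name, resisting_force(k, depth), "N")
```

Under this assumed model, a hard cube pushes back ten times more strongly than a soft cube at any given press depth.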

2.4.2. Participants and Procedure

Twenty three students, 11 females and 12 males, participated in this experiment. Mean age was years (range: 19–26). Participants were instructed to gently “press” each cube inwards with the stylus, and report whether it was a soft or a hard cube, by rapidly pressing designated buttons for soft/hard cubes. Touching a cube immediately activated a label that appeared on the cube. The label contained a word or a symbol which were either congruent, incongruent or neutral to the haptic perception (i.e., the cube was felt haptically as soft while the label read SOFT, HARD or ###, resp.; see Figure 4). Again, participants were instructed to provide their firmness discrimination judgments by relying solely on their haptic sensations regardless of all other cues. The total number of trials for each subject was 210, with an equal share of the congruent, neutral and incongruent conditions (70 trials of each).

In addition to the main haptic cues about the cube’s softness or hardness (felt by pressing the cubes) there were also visual and auditory cues. The surface of a soft (but not a hard) cube deformed and wrinkled a bit when pressed inward, and the computer generated a sound (of a natural deformation of a rigid plastic material) only at the hard cubes but not at the soft ones. These audio-visuo-haptic cues were congruent in all experimental conditions, meaning that a hard cube always felt hard, did not deform visually, and generated a specific sound when it was pressed, and vice versa for the soft cubes. Thus, the experimental manipulation was only with the labels appearing on the cubes which were either congruent, neutral or incongruent with the haptic sensation.

2.4.3. Results

The ANOVA for the accuracy rates in the congruent, neutral and incongruent conditions revealed an overall effect for condition . Paired t-tests showed that in the congruent condition participants erred only in 1% of the trials, significantly less than their 3.6% errors in the incongruent condition . In the neutral condition participants erred in 1.6% of the trials, significantly less than the incongruent condition , although not significantly different from the congruent condition.

The ANOVA for the correct answers RTs revealed an overall effect between the congruent, neutral and incongruent conditions . Paired t-tests showed that the mean RT in the congruent condition (  ms; mean S.D.) was significantly shorter than RT in the incongruent condition (  ms). Mean RT in the congruent condition was even significantly shorter than RT in the neutral condition (  ms). The difference in RT between the incongruent condition and the neutral condition was not significant; see Table 1.

2.4.4. Discussion

This experiment employed a multisensory combination of audio-visual-haptic cues that were always congruent with each other, but were presented together with a linguistic cue that was either congruent, neutral or incongruent with these tri-sensory cues. Here again, participants were instructed to provide their judgments by focusing solely on their haptic sensations. RTs and the error rates were affected differentially according to the semantic congruency between the haptic stimulation and the symbols. This experiment extends the Stroop effects also for softness/hardness judgments.

2.5. Experiment 4: Vibrating/Still Tiles

The 4th experiment investigated the Stroop effects in another haptic task where participants actively touched virtual objects and had to detect which of the objects generated vibrations.

2.5.1. Experimental Design

The 4th experiment was designed as a vibration detection task while the general procedural details were all similar to the 2nd and 3rd experiments. Participants were presented with a visual display of 27 virtual tiles ordered in 3 rows (9 tiles in each row). Half the tiles generated vibrations (in the X and Y axes, at an amplitude of 0.55 Newton and a frequency of 5.5 Hz) once participants touched them, while the other half did not generate any vibrations and the stylus remained still when participants touched them.
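The vibration stimulus can be pictured as a periodic force applied through the stylus. The sketch below assumes a sinusoidal waveform that is in phase on the X and Y axes; the text gives only the amplitude and frequency, so the waveform, the phase relation, and the function name `vibration_force` are all hypothetical.

```python
import math

AMPLITUDE_N = 0.55    # force amplitude reported in the text
FREQUENCY_HZ = 5.5    # vibration frequency reported in the text


def vibration_force(t_seconds, vibrating=True):
    """Return the (x, y) force in Newtons at time t for a touched tile."""
    if not vibrating:
        return (0.0, 0.0)  # still tiles exert no vibration force
    # Assumed waveform: a sine wave, identical on both axes.
    f = AMPLITUDE_N * math.sin(2.0 * math.pi * FREQUENCY_HZ * t_seconds)
    return (f, f)


# Sample the first 100 ms of contact at 10 ms steps.
for i in range(0, 101, 10):
    print(i, "ms", vibration_force(i / 1000.0))
```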

2.5.2. Participants and Procedure

Eighteen students, 8 females and 10 males, participated in the experiment. Mean age was years (range: 19–25). Participants were instructed to gently touch each tile with the stylus and report whether it generated vibrations or not, by pressing designated buttons. A label that was either congruent, incongruent or neutral to the haptic perception (i.e., a vibrating tile with a label VIBRATING, STILL or ###, resp., etc.) was activated immediately upon touching a tile (Figure 5). Participants were instructed to provide their judgments by relying solely on their haptic sensations regardless of all other cues. The total number of trials for each subject was 240, containing an equal share of the congruent, neutral and incongruent conditions (80 trials of each).

In addition to the main haptic cues (vibrations) there were also visual and auditory cues. A vibration-generating tile caused the stylus’s visual representation to shake in a congruent manner, and it was accompanied with an engine sound. These multisensory cues were not present in the still tiles. In this experiment too, the audio-visuo-haptic cues were congruent in all experimental conditions (a vibrating tile always occurred with a visually shaking stylus and an engine sound and vice versa for the still tiles). Thus, the experimental manipulations were only on the labels attached to the tiles which were either congruent, neutral or incongruent with the haptic sensation.

2.5.3. Results

Participants’ error rates were 0.7%, 0.4% and 1.7% in the congruent, neutral and incongruent conditions respectively. However, none of these differences between the conditions was statistically significant. The ANOVA for the RTs of the correct responses revealed an overall effect for condition . Paired t-tests showed that mean RT in the congruent condition (  ms; mean S.D.) was significantly shorter than RT in the incongruent condition (  ms). Mean RT in the neutral condition (  ms) was significantly shorter than RT in the incongruent condition , and longer than RT in the congruent condition, although not statistically significant; see Table 1.

2.5.4. Discussion

This experiment employed a multisensory combination of audio-visual-haptic cues that were always congruent with each other, but were presented together with a linguistic cue that was either congruent, neutral or incongruent with these tri-sensory cues. Here again, participants were instructed to provide their judgments by focusing solely on their haptic sensations. RTs were affected differentially according to the semantic congruency between the haptic stimulation and the symbols. This experiment extends the Stroop effects also for a vibration detection task.

3. General Discussion

The experiments presented herein investigated the occurrence of the Stroop interference and facilitation effects in four different tasks employing active and passive kinesthetic and haptic sensations, in different cross-sensory, bi-sensory, and tri-sensory combinations.

The sense of touch encompasses several distinct sensory systems. The tactile (cutaneous) system receives inputs from nerve endings embedded in the skin. The kinesthetic (proprioceptive) system receives inputs from mechanoreceptors connected to the muscles, tendons, and joints. When the tactile and kinesthetic systems operate together, the combination is termed the haptic system [35]. In Experiment 1, participants were passive and the stimulation was mainly kinesthetic; that is, inputs came via mechanoreceptors connected to the muscles signaling a light push of the hand. Experiments 2–4 extended the study to participants who actively explored the qualities of virtual objects and to haptic sensory inputs, that is, cases where both the tactile (skin receptors) and kinesthetic (mechanoreceptors attached to muscles, tendons, and joints) systems were stimulated.

A cross-sensory Stroop type, in which the kinesthetic sensation was congruent, neutral or incongruent with the semantics of the visual information, was employed in the 1st experiment, where a force lightly pushed the stylus-holding hand in one of three directions. The experimental arrangement ensured that only kinesthetic sensations were available, without any visual cues.

Experiments 2–4 contained multisensory cues, so the experimental manipulations of the semantic (in)congruency were both inter- and intrasensory. In the 2nd experiment, participants received bi-sensory cues about the cylinder’s heaviness or lightness. The main cues were haptic, but participants also received visual cues, since a light cylinder accelerated faster than a heavy cylinder. Tri-sensory cues were available in the 3rd and 4th experiments. The main cue about the cube’s softness or hardness was haptic, but there were also visual and auditory cues: the deformation of the soft cubes was visible and the rigidity of the hard cubes was audible. Similarly, the main cue of a vibrating or still tile was haptic, but visual and auditory cues were also available. The vibrating tile was accompanied by the shaking of the stylus’s visual representation and by an engine sound.

Typically, multisensory signals are processed faster than a unisensory signal [36–40]. Nevertheless, when semantic symbols are also involved (in the form of words, pictures, etc.), the multi-sensory stimulation enhances performance only when the semantic information is congruent with the other sensory signals and conveys the same meaning. Otherwise, the semantic incongruity cancels the advantage and enhancement of the multi-sensory stimulation [41].

Our 1st experiment provides clear evidence for the occurrence of the Stroop effects in kinesthetic perception as well. Participants had no redundant cue from another sensory channel besides their kinesthetic sensations, and their RTs and accuracy rates were affected differentially according to the semantic congruity with the linguistic symbols. The following three experiments (2nd–4th) were designed with redundant visual and/or auditory cues to ensure ecological validity, since many of our haptic experiences involve cues from multiple sensory systems. Altogether, the results of the four experiments of the current study extend the Stroop interference and facilitation phenomena to kinesthetic and haptic tasks (discrimination of weight and firmness, and detection of vibration and movement direction) when participants’ attention is focused mainly on the haptic sensation, as they were instructed.

An interference between linguistic symbols and haptic stimuli was already demonstrated in a previous study [42], in which participants were visually presented with words associated with size (STRONG, HEAVY, WEAK, LIGHT) while responding manually with incongruent, small or large, knobs that they held in their hands. However, in that study the incompatibility was between the stimulus (e.g., a word associated with a large size) and the response object (e.g., the participant’s task was to press a small knob). That paradigm resembles the Simon effect [43] more than the Stroop effect, which involves an incongruence between the different elements of a given stimulus or of combined stimuli. It is not clear whether the Stroop and the Simon effects share similar mechanisms (e.g., [44–48]). Therefore, the current study focused on incompatibilities between the different elements of a given stimulus (or multi-sensory stimuli) rather than on incompatibilities between a stimulus and its response requirements. Furthermore, in that previous study [42] the task demands directed participants’ attention more towards the visual modality, whereas in the current study participants were clearly instructed to base their judgments on their haptic sensations.

The current demonstration of the Stroop effects in kinesthetic and haptic tasks converges with the existing Stroop literature to suggest that the cognitive mechanisms of the Stroop facilitation and interference effects are not limited to a specific sensory modality. Rather, the Stroop types of cognitive facilitation and interference are amodal phenomena that occur with all sensory stimuli at a higher, abstract level of representation and cognitive processing.

4. Applications

Human-computer interaction technologies are advancing towards multisensory systems that include visual, auditory and (when applicable) haptic sensory feedback. The principles of the Stroop effects, as shown here, can be used by engineers and designers to improve users’ performance. For instance, ensuring maximal congruence between all the (uni- or multi-sensory) elements in a given display can enhance the cognitive experience and facilitate responses. Similarly, where applicable, the addition of linguistic descriptions that are compatible with the other sensory cues can significantly enhance performance. In certain situations, gaps in graphics fidelity and other technological limitations may be bridged and compensated for by adding congruent linguistic cues. It is also noteworthy that, from a cost-effectiveness standpoint, it is generally better to give priority to the visual cues, since vision is often the dominant sensory modality [49].

Acknowledgments

This research was funded by the EU project PRESENCCIA—Presence: Research Encompassing Sensory Enhancement, Neuroscience, Cerebral-Computer Interfaces and Applications. The authors thank Mr. Gad Halevy and Dr. Moran Furman for their invaluable assistance in programming the system for the experiments.