Abstract

Classifying the many types of video games is difficult, as their genres and platforms differ, but they all seek to engage the player through exciting emotions and challenges. Since the income of the video game industry exceeds that of the film industry, the field of inducing emotions through video games and virtual environments is attracting more attention. Our theory, widely supported by the literature, is that the chromatic stimuli intensity, brightness, and saturation of a video game environment produce an emotional effect on players. We have observed a correlation between the components of video game images in the RGB additive color space and in the HSV, HSL, and HSI color spaces, as presented to participants, and the emotional statements expressed in terms of arousal and valence, collected in a subjective semantic questionnaire. Our results show a significant correlation between luminance, saturation, lightness, and the emotions of joy, sadness, fear, and serenity experienced by participants viewing 24 video game images. We also show strong correlations between the colorimetric diversity, saliency volume, and stimuli conspicuity and the emotions expressed by the players. These results allow us to propose video game environment development methods in the form of a circumplex model, aimed at game designers developing emotional color scripting.

1. Introduction

It is widely accepted today that emotions are a key element in the success of video games. They can create a variety of experiences for players, and these can usually be described using a formal taxonomy, such as flow [1], presence [2], immersion [3], and fun [4]. Producing emotions through environments or narration is a key challenge in this industry. Human emotions have been studied in literature since antiquity, and they were initially opposed to reason [5]. Scientific research on emotions, in the fields of neuroscience and psychology, has recently shown that, although their role is primarily to take part in the survival of the species by contributing to preservation and reproduction, they are not exclusively reflexes relayed through the limbic brain but are also cognitive [6]. Many phenomena and contextual stimuli result in the production of complex emotions, which can be classified into categories with 5 basic emotions [7] and 21 known combinatorial emotions [8], or within a dimensional space based on their valence and activation level [9]. For several years, the literature has shown strong links between visual stimuli and emotions [10]. The effects of colors on emotions have been studied several times and, although results diverge about which colors promote a positive or negative mood, these works show a strong correlation between colors, hue, saturation, and brightness and emotional arousal, valence, and dominance [11]. Several studies show that the human brain (or the macaque’s, as reference) is much more active in environments of colors like red than in environments of colors like yellow [12]. A 23-culture semantic differential study of affective meanings also reveals cross-cultural similarities in emotional color perception [13]. We will not discuss the case of each color in a virtual environment, but we will instead focus on the general effects of hue, saturation, and lightness on the emotions of players. 
We present a study involving participants on the correlations between the RGB, hue, saturation, and value components of 24 frames of video game environments and the emotional effects collected in a semantic subjective questionnaire, which was based on the IAPS survey [14]. This method is based on a schematic circumplex dimensional model of emotions in a virtual reality context [15]. This model is set in an evolutionary perspective of cognitive overgeneralization; it does not take into account the personal experiences of players, which inevitably lead to an idiosyncratic positioning. We do not question the strength of these environmental aspects, but we seek to define a more general methodology.

This method allows game designers to create emotional environments in accordance with the curves of interest (script pacing) to keep players in an optimal state of flow, depending on the ratios of challenge, boredom, and engagement [16]. Since the flow experience in games is accepted as being related to the emotional involvement of the player [17–19], emotions are seen as an essential part of the production of a video game. These emotions are the result of both gameplay and narrative but can also be produced by games with active environments. This is the kind of emotional environment that we offer to help build with color scripting via our circumplex model of emotion induction (Figure 1).

2. Methods

2.1. Image Analysis

There have been many studies on emotional colors (term defined below), either isolated or combined in pairs [20–22], and such colors have long been used in areas such as advertisement design. However, studies on the emotional impact of complex images are rarer. The difficulty of choosing semantic subjective interpretation models and of contextualizing complex images makes this work both fascinating and difficult. Most of these studies use clustering principles that segment images into small regions of homogeneous color [23]. However, excluding this segmentation allows for a more comprehensive analysis of the emotional picture. In a 2007 multicultural study (eight groups from different international regions), Gao et al. showed that participants attached strong importance to lightness and chroma as emotional color attributes [24]. It seemed interesting to us to study the emotional influences of colors in video games, with each image taken as a whole context. We therefore selected 24 still images from video games categorized into different genres, using the Elverdam and Aarseth dynamic classification to ensure a wide range [25]: Racing, FPS, RPG, Casual, Strategic, and Experimental. For each of these six themes, four still images were randomly selected from a real-time footage database comprising 20 frames per theme. All these still images were produced by the studios that own them. The 120 preselected video games come from the database of video games awarded by the Academy of Interactive Arts & Sciences. We calculated the RGB value of each still image by adding up the red, green, and blue values. Color analysis showed 4 color groups, defined by ranges of the total RGB and total HSV values. These analyses aim to identify images with homogeneous color values. 
All images are presented in an online questionnaire as standardized 800 × 450 pixel, 72 dpi, JPEG-compressed files (ISO/IEC 10918-1, ITU-T T.81). Our random selection of images, drawn from the 120 initial images corresponding to the 6 selected categories of video games, allows us to avoid a subjective selection driven by the presence of contextual stimuli such as emotional facial expressions or contextual actions. We chose an online questionnaire to reach a larger and more diverse set of participants. This method implies, however, that we were not able to control the quality of the color reproduction on the participants’ screens. It should be noted, however, that the color reproduction quality available to web users today is considered good enough for visual online surveys. For example, browser statistics show that more than 99% of the visitors of a well-known web coding site, W3Schools, use 24-bit or 32-bit color mode, also known as “16.7 million” colors (cf. http://www.w3schools.com/browsers/browsers_Display.asp). Numerous online surveys, either on color perception or with emphasis on color, have been conducted for several years (e.g., [26]).

Figure 2 shows the chromatic analysis applied to each image of groups 1 to 6. We analysed the colors of each image in the RGB additive color space and in the HSV, HSL, and HSI color spaces, and were thus able to determine their perceptual uniformity. We chose to compare image values in several color spaces because there is no consensus on how to measure image lighting, color, or saturation, except in the CIE guide models [27]. Some terms are synonyms. We computed the hue from the RGB components as described by Preucil [28]. Hue is defined technically as “the degree to which a stimulus can be described as similar to or different from stimuli that are described as red, green, blue, and yellow” [29]. Brightness is “an attribute of visual perception in which a source appears to be radiating or reflecting light” [27]; in the RGB color space, it can be determined from the R, G, and B components. Chroma is the “purity” of a color in relation to saturation; the lower the chroma level is, the more the colors are washed out [30]. We calculated the average chroma of each picture following [31]. Value is defined as the largest component of a color, V = max(R, G, B), as described by Smith in the HSV “Hexcone” model [32]. Lightness is the midpoint between the minimum and maximum of the R, G, and B values, L = (max(R, G, B) + min(R, G, B))/2, and intensity is the average of the R, G, and B values, I = (R + G + B)/3. In the HSI color space, intensity and color saturation are derived from these components.

For a second method of image analysis, we used a color segmentation method based on a component difference threshold [33]. The difference threshold is the limit beyond which a color is considered to belong to a new group rather than to an existing one. The higher the threshold, the smaller the number of color groups created by the analysis. The lower the threshold, the smaller the groups of similar color created by the analysis. The color difference is calculated as the combined difference between the components of the color model used (RGB, HSV, or, for a more perceptually adapted model, a CIE color space). 
To minimize color groups in our analysis, we used a difference threshold of 40 with a CIE model.
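The component definitions above can be illustrated with a short Python sketch. It is only an approximation under stated assumptions, not the authors' implementation: the hue uses the polar formula commonly attributed to Preucil, chroma is assumed to be the difference between the largest and smallest channel, and the HSI saturation is assumed to be 1 − min(R, G, B)/I. Brightness is left out because several competing definitions exist. The function names `color_components` and `image_means` are hypothetical.

```python
import math


def color_components(r, g, b):
    """Return color components for one RGB pixel with channels in [0, 1]."""
    mx, mn = max(r, g, b), min(r, g, b)
    # Hue angle in degrees (polar form commonly attributed to Preucil).
    hue = math.degrees(math.atan2(math.sqrt(3) * (g - b), 2 * r - g - b)) % 360
    intensity = (r + g + b) / 3  # HSI intensity: mean of the three channels
    return {
        "hue": hue,
        "value": mx,                  # HSV hexcone value: largest component
        "lightness": (mx + mn) / 2,   # HSL lightness: midpoint of max and min
        "intensity": intensity,
        "chroma": mx - mn,            # assumption: chroma as max minus min
        # Assumption: HSI saturation; defined as 0 for a black pixel.
        "saturation_hsi": 0.0 if intensity == 0 else 1 - mn / intensity,
    }


def image_means(pixels):
    """Average the non-angular components over 8-bit (R, G, B) pixels."""
    keys = ("value", "lightness", "intensity", "chroma", "saturation_hsi")
    totals = dict.fromkeys(keys, 0.0)
    count = 0
    for r, g, b in pixels:
        comps = color_components(r / 255, g / 255, b / 255)
        for key in keys:
            totals[key] += comps[key]
        count += 1
    if count == 0:
        raise ValueError("no pixels given")
    return {key: totals[key] / count for key in keys}
```

Hue is excluded from the averages because an arithmetic mean of angles is not meaningful near the 0°/360° boundary; a circular mean would be needed for that component.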

The data are visualized on 2D and 3D hue and saturation spaces with colored proportional circles and balls representing the size of each group. (The tool is freely available online at: https://www.geotests.net/couleurs/frequs_svg.html?l=en.) All the data are saved in a CSV file. The analysis presents the hierarchy of color groups in the image (number of pixels) and their relations in terms of color components (gradients, oppositions, etc.). We are trying to establish whether a correlation exists between the volume of color groups in each image and the emotions described by participants. It is possible to refine the analysis of the image’s colored areas by estimating a visual saliency map of the image. Salience is a concept from cognitive psychology and robotics, which involves estimating the areas of the image that will attract attention quickly and hold it. Thus, one can identify which parts of the image will actually be viewed and perceived by users and determine their color composition. In the context of a game where the images are changing rapidly, this can be particularly important. Several algorithmic methods for highlighting areas of image saliency exist. One of the most cited algorithms of salience modeling is the iLab Neuromorphic Vision C++ Toolkit or INVT [34]. It was developed by the iLab Laboratory at USC Los Angeles. Another, more recent, algorithm seems efficient: it is referred to as “Image Signature” [35]. However, the most common model used (in more than 100 scientific articles) seems to be the Walther and Koch model, an evolution of previous Walther and Koch work [36]. We chose to use this model with the MATLAB software [37]. We then tested if a correlation existed between the volume of pixels considered salient in each of the images and the valence and activation of the emotions of the players. 
We analysed the saliency maps rather than highly targeted rescaled binary shape maps because our research is not focused on the analysis of the supposed major point of interest in each image (Figure 4), but rather on the potential link between the number of stimuli and emotional state of the players. We have specifically chosen to measure the volume of salient pixels in the Intensity Conspicuity Map for each image (Figure 5).

Images are analysed with the same method as for the first test: we used a color segmentation method based on a component difference threshold. Data are saved in a CSV file. We then count the pixels in the salient selection after removing the black pixels. These counts are then compared with the values obtained in the subjective semantic questionnaires.
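A minimal sketch of threshold-based color grouping and of counting salient pixels is given below. It is an illustration under assumptions rather than the tool used in the study: the "combined difference" is taken to be the Euclidean distance between three-component colors, groups are formed greedily against the first color of each group, and a salient pixel is any non-black entry of a grayscale saliency map. The names `segment_colors` and `salient_pixel_count` are hypothetical. The colors may be RGB triples or coordinates already converted to a perceptual space.

```python
import math


def segment_colors(pixels, threshold=40.0):
    """Greedily group 3-component colors.

    A color joins the first group whose representative (the group's first
    color) lies within `threshold`; otherwise it starts a new group.
    Returns (representative, pixel_count) pairs, largest group first.
    """
    groups = []  # each entry: [representative_color, pixel_count]
    for color in pixels:
        for group in groups:
            # Assumption: combined difference = Euclidean distance.
            if math.dist(color, group[0]) <= threshold:
                group[1] += 1
                break
        else:
            groups.append([tuple(color), 1])
    ranked = sorted(groups, key=lambda g: g[1], reverse=True)
    return [(rep, n) for rep, n in ranked]


def salient_pixel_count(saliency_map):
    """Count the non-black pixels of a grayscale map given as rows of numbers."""
    return sum(1 for row in saliency_map for v in row if v > 0)
```

With this sketch, raising the threshold merges more colors into each group and so reduces the number of groups, which matches the behaviour described for the difference threshold.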

2.2. Questionnaire Subjective Semantics

Many studies have shown that the influence of colors can be affected by age and sex, as well as by nationality and cultural background. However, studies have also shown that, regardless of ethnic and geographical origin, humans experience some common emotions when faced with the same colors [24, 38, 39]. In most questionnaires of the previous experiments, participants were asked to define emotions that were more or less important from their point of view. Our approach is different: we did not use categorical classifications of emotions, but rather a dimensional classification of valence and arousal [9]. The IAPS survey is used to study the emotional impact of images on pleasure, arousal, and dominance. Initially, the IAPS system was developed to provide a set of emotional pictures for psychological investigations on emotions and attention, but we did not use this set of pictures. We used the IAPS scale semantic subjective questionnaire to measure the emotional impact of our own selected still images of video game environments. The questionnaire is an online web form sent to participants via social networks. Filling in the questionnaire takes a relatively long time (35 minutes on average); this represents a real investment in terms of both attention and time, which discourages deliberately erratic responses. The questionnaire is presented in English and French. The first page explains the terms of the measurement:

“HAPPY-UNHAPPY scale. At one extreme of the scale, you felt happy, pleased, satisfied, contented and hopeful. If you felt completely happy while viewing the picture, you can indicate this by placing a “Black Point” over the left. The other end of the scale is when you felt completely unhappy, annoyed, unsatisfied, melancholic, despaired, bored. EXCITED versus CALM dimension is the second type of feeling displayed here. At one extreme of the scale you felt stimulated, excited, frenzied, jittery, wide-awake, aroused. If you felt completely aroused while viewing the picture, place a “Point” over the number 0 at the left of the row. On the other hand, at the other end of the scale, you felt completely relaxed, calm, sluggish, dull, sleepy, unaroused. CONTROLLED versus IN-CONTROL. At one end of the scale you have feelings characterized as completely controlled, influenced, cared-for, awed, submissive, guided. Please indicate feeling controlled by placing a “Point” over the 0 number at the left. At the other extreme of this scale, you felt completely controlling, influential, in control, important, dominant, and autonomous.”

Then the instruction text invites the participant to rate the pictures: “Please rate each one as you actually felt while you watched the picture.” First, the participant evaluates a practice picture and then begins the questionnaire. Pictures are presented randomly. There are 6 pages to evaluate, each containing 4 pictures and their 3 evaluation scales from 1 to 10. Each unique IP address can submit only one questionnaire. No personal information about the participants is stored, except their age and sex.

3. Results

We recruited participants on the web through social networks. They answered the semantic subjective questionnaire via their web browser (Google Forms). We temporarily kept each participant’s IP address to detect duplicate questionnaires; all the IP addresses were erased after this verification. Given the time needed to answer the questionnaire (generally more than 25 minutes), we argue that this investment helped avoid made-up answers. The group was composed of 31 females and 54 males, with an average age of 32 years. The participants answered the semantic subjective questionnaire after observing each of the 24 video game frames, presented in random order (Figure 3). This method allowed us to collect data in a single online database. Since differences in emotional sensitivity to video games have been shown to correlate with the participant’s level of knowledge of the medium [40], we only asked people who play video games for more than 2 hours per week to participate in the experiment.

The correlations with feelings of joy versus sadness are all strong for brightness, value, chroma, and lightness. The correlation between hue and the happy/unhappy scale is low (see Table 1).

The emotional activation collected in the subjective semantic questionnaire on the calm versus excited scale produces only low correlations. The values are not significant for brightness, hue, value, chroma, or lightness (see Table 2).

The controlled versus dominant ratings, used to assess the players’ feelings of fear or confidence, are, like the feelings of joy versus sadness, significantly and positively correlated with brightness, value, and chroma. As with joy and sadness, hue has only a weak negative correlation (see Table 3).

We tested whether the amount of chromatic diversity in the images was significantly correlated with the different emotional states identified in the semantic subjective questionnaire. Pearson’s correlations between the volume of chromatic diversity and the sense of joy versus sadness show a strong correlation (p < 0.001), as do those with arousal, corresponding to calm versus excitement (p = 0.003). The correlation is low for the controlled versus dominant scale, that is, fear or confidence (see Table 4).

We finally evaluated the correlations between the emotions reported and the number of pixels identified as salient in each image by the MATLAB analysis, after removing black pixels. This showed a strong correlation with the sadness/joy ratings (p < 0.001) and also with the fear/confidence ratings (p = 0.004). On the other hand, the correlation between the volume of saliency and arousal was low (see Table 5).
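For readers who want to reproduce this kind of analysis, the Pearson coefficient between a per-image measure (for example, mean lightness) and the mean rating of the same images can be sketched as follows. This is a generic textbook formulation written for illustration, not the statistical package used in the study, and the function name `pearson_r` is hypothetical. Significance levels are not computed here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

In use, `xs` would hold one color measure per image and `ys` the corresponding mean rating on one questionnaire scale, so each sequence would have 24 entries in the setting described above.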

4. Discussion

The results of our research imply that links exist between feelings of joy/sadness and environment properties: brightness, value, saturation, chroma, and lightness. The greater the brightness and color saturation of the images, the more positive the valence of these feelings. This corroborates previous studies, which also showed that images leading to the perception of joy tend to be brighter, more saturated, and more colorful than images of sadness in a virtual environment [41]. The same links are observed with feelings of fear/confidence, presented in the survey as controlled/dominant. It also seems that when video game environments are less saturated, the negative valence and the feeling of fear are higher. This is also true for luminosity: lower brightness induces a sense of fear, while high brightness seems to give players more confidence. For brightness, chroma, value, and lightness, we have not observed, at this stage of the experiment, links with emotional arousal. However, a correlation was observed between arousal (a player’s emotional excitement or relative calm) and chromatic diversity. We can assume that if an image has more sources of different colors, there could be increased neuronal activity related to observation [42], thereby generating the conditions for greater emotional activity [43]. This high color diversity also allows the generation of a positive sense of joy, which is also correlated with the number of observable saliency points in an environment. Salient elements require more activation of the rods of the visual system, just as joy and fear require greater wakefulness. It is likely that in both cases the large volume of stimuli generates substantial neuronal analysis, leading to positive or negative cognitive valence. 
Several studies have shown that evolutionary selective pressures have resulted in slightly higher spatial acuity rather than increased chromatic sensitivity [44]. This could explain why color diversity produces lower emotional activity except for happiness and sadness. The pixel saliency number is more indicative of the volume of stimuli producing emotional activity. In our experiments we did not test the correlation between motion stimuli and emotion because several studies have already shown the very high influence of movement stimuli on emotional activation or the relative calm produced by the slow movement of these same stimuli [45, 46].

The circumplex model for color design is based on an analysis of evolutionary cognitive overgeneralization rather than on an idiosyncratic ecological context. However, we recognize that the thresholds of emotional activities are also linked to the personal experiences of each player; they are not discussed here. We obviously do not conclude that a saturated color environment could not induce an emotion such as fear or sadness. Other contextual factors and stimuli also explain the induction of emotion.

In an upcoming experiment, we will test participants in a unique interactive virtual game. In this environment, we will change colors, lights, and saturation as suggested by our circumplex model to try to induce several emotions. The context will remain unchanged and we will not intervene on the stimuli of different participants, except for the color diversity variations, lightness, saturation, and perspective view (stimuli saliency). In the same way, stimuli motion speed will differ.

5. Conclusion

5.1. General Conclusion

Our research was conducted in order to design a tentative tool for the essential emotional phase in video game design: color scripting. The color design defines the chromatic aspects of each scene according to the emotions that the authors want to suggest. This tool was designed according to the previous circumplex model [15]. Our study is an analytical study proposing a methodology for defining the chromatic atmosphere of interactive environments based on a wide range of inducible emotions. The circumplex model could allow for the generation of emotions based on their positive or negative valences, but also according to their arousal: game designers can use volume of chromatic diversity, color saturation, brightness, and even motion speeds.

Making this tentative tool available to the gaming community is a priority for our team. We hope to contribute to increasing the quality of emotional experiences in the future of the gaming industry.

5.2. How to Use the Circumplex Model

To use the tentative circumplex model tool, we recommend first defining what kind of emotion the environment should induce and then locating this emotion by its valence and arousal in the circumplex model (Figure 6). The position of the emotion in the diagram defines the color, light, intensity, and motion speed to use. The top right area defines a large chromatic diversity: it does not specify a particular color, but the farther the selected emotion is from the center, the higher the number of colors the virtual environment must contain. The lower right area defines poor chromatic diversity: the colors are organized in monochrome shades, and their kind depends on the subjective or cultural context, which is not treated here. If the chosen emotion is far from the center of the diagram, the gradation will be less colorful. The bottom left area uses the same concept of shades, but as the cursor moves away from the center of the diagram, the colors become less saturated and less bright. The top left area uses a variety of colors like the top right area, but if the chosen emotion is far from the center, these colors will also be desaturated and dimly lit. In the lower half of the circle, stimuli motions are slower: the lower the chosen emotion in the circle, the slower the stimuli. In the top half, the higher the emotion is on the ordinate axis, the faster the visual stimuli. The volume of salient stimuli is determined by a cutting plane dividing the circle at 45 degrees: emotions located at the top right, farthest from the center of the diagram, have the highest volume of stimuli, whereas emotions located at the bottom left, farthest from the center, show the least amount of visual stimuli.
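The quadrant rules above can be sketched as a small lookup function. This is a rough illustration only, resting on several assumptions that are not stated in the text: valence runs along the horizontal axis (positive to the right), arousal along the vertical axis (high at the top), both are scaled to [-1, 1], and every output is a relative scale in [0, 1] chosen for the example. The function name `color_script` and the output keys are hypothetical.

```python
import math


def color_script(valence, arousal):
    """Map an emotion at (valence, arousal) in [-1, 1] to relative design hints."""
    # Distance from the center of the diagram, capped at the rim (1.0).
    distance = min(1.0, math.hypot(valence, arousal))
    upper_half = arousal >= 0   # assumed: varied palette in the upper half
    left_half = valence < 0     # assumed: negative valence on the left
    return {
        # Upper half: many colors; lower half: monochrome shades.
        "palette": "diverse" if upper_half else "monochrome",
        # How strongly the palette rule applies, growing with distance.
        "palette_strength": distance,
        # Left half: saturation and brightness fall with distance.
        "saturation_brightness": 1.0 - distance if left_half else 1.0,
        # Faster stimuli toward the top, slower toward the bottom.
        "motion_speed": (arousal + 1) / 2,
        # Projection on the 45-degree diagonal: top right is highest.
        "stimuli_volume": (valence + arousal + 2) / 4,
    }
```

For instance, an emotion placed at the far top right yields a diverse palette with the fastest motion and the largest volume of stimuli, while one at the far bottom left yields monochrome, dark, desaturated shades with slow motion and few stimuli.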

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors wish to thank all the participants of the experiment for the time they spent on this research. Special thanks are due to Andrea Trapnell for her time.