Abstract

Human faces play a central role in our lives. Thanks to our capacity to perceive faces, how a face looks in a painting, a movie, or an advertisement can dramatically influence what we feel about it and what emotions it elicits. Facial information is processed by our brain in such a way that we immediately make judgements, such as attractiveness or masculinity, or interpret the personality traits or moods of other people. Because of the importance of appearance-driven judgements of faces, face perception has become a major focus not only for psychological research, but also for neuroscientists, artists, engineers, and software developers. New technologies can now create realistic-looking synthetic faces that are used in the arts, online activities, advertising, and movies. However, there is no method to generate virtual faces that convey a desired set of sensations to observers. In this work, we present a genetic-algorithm-based procedure to create realistic faces by combining facial features in adequate relative positions. A model of how observers perceive a face, based on the appearance and relative positions of its features, was developed and used as the fitness function of the algorithm. The model is able to predict 15 facial social traits related to aesthetics, mood, and personality. The proposed procedure was validated by comparing its results with the opinions of human observers. This procedure is useful not only for creating characters for artistic purposes, but also for online activities, advertising, surgery, or criminology.

1. Introduction

Since ancient times, people have believed that the face is a window to the true nature of a person, the most direct way to their emotions and feelings [1]. People use information from faces to identify others; to guess their gender, age, or race; to make attributions such as personality, intelligence, or trustworthiness [2]; or even to judge the emotions and intentions of the owners of the faces [3]. Our brain is especially efficient at perceiving faces [4, 5] and at processing the information extracted from them. These attributions are formed very fast; 34 milliseconds of exposure is enough for the human brain to create a first impression of a face. Hence, the appearance of faces plays a central role in our everyday decisions [6–8] and in our relationships with other people [9]. For example, voting decisions [6, 10], criminal justice decisions [11, 12], mate selection [13–15], and the choice of social partners [16] are all influenced by what we perceive in the faces of others.

Faces play a central role in art, design, and advertising to convey and elicit emotions. How a face looks in a painting or an advertisement can dramatically influence what we feel about it and what emotions it elicits. Studies are still being conducted on the face of the Mona Lisa and the emotions that it conveys [17]. Previous works have shown that, when looking at scenes containing human faces, observers tend to focus rapidly on the faces [18], even if the faces do not occupy most of the scene. But faces are not important only in the arts. Because of the importance of appearance-driven judgements of faces, face perception has become a major focus not only for psychological research, but also for neuroscientists, engineers, and software developers [19]. New human-machine interaction systems and online activities such as e-commerce, e-learning, games, dating, and social networks commonly use human digital representations that symbolize the user’s presence or act as a virtual interlocutor [20]. The importance of the communicative behaviour of these avatars in new interaction systems [21–25] has led to an increasing interest in creating realistic virtual faces able to convey appropriate sensations to users [26–29].

The objective of this work is to develop a system that generates realistic-looking synthetic faces conveying to human observers a set of social traits, each in a preestablished amount. The developed system must create faces with the facial features appropriate to achieve this objective. Hereinafter, social trait will denote any judgement that a human observer can make about the aesthetic characteristics of a face (e.g., attractiveness) or about the emotional state (e.g., sadness) or personality (e.g., dominance) of the owner of the face. Likewise, facial features will refer to the morphological characteristics of faces.

Developing such a system must overcome two great difficulties. The first is to establish the relationships between the facial features of a face and its social traits. Visual perception research has shown that the human brain processes faces differently from other kinds of objects [30]. Part-based perceptual models assume that objects are processed on the basis of their components or parts [31]; although it is commonly agreed that this is how we process most objects, faces are thought to be processed differently. In relational [32] or configural [33] models of perception, first-order features (such as isolated facial features) are processed in a part-based way, but second- and higher-order features emerge from the combination of several lower-order features, and these are used to make judgements from faces. The amount of information derived from second- and higher-order features depends on the kind of judgement being made [32]. For example, it has been suggested that face recognition depends mainly on first-order features and part-based information processing [34, 35], while more complex judgements require information from second- and higher-order features. Holistic perceptual models integrate the facial features into a gestalt whole when the human brain processes a face’s information (holistic face processing) [36]. Purely holistic processing of faces, with no decomposition into parts, is not supported by the evidence suggesting that some judgements rely mainly on part-based processing [30]. This leads to mixed holistic/part-based models, which do not exclude part-based processing from the global holistic processing during face perception [37, 38].

Therefore, establishing the relationships between facial characteristics and the social traits elicited in observers is challenging due to the complexity of the face perception process itself. But even if such a model relating facial features and social traits is developed, another difficulty remains: creating faces that convey a predefined set of social traits. A face can be considered as a set of facial features. The problem then becomes finding the optimal combination of facial features that simultaneously elicits a preestablished quantity of each social trait; that is, a multiobjective combinatorial optimization problem. Moreover, the number of facial features to be considered can be high (nose, mouth, eyes, eyebrows, relative distances, etc.), as can the number of possible types of each facial feature (how many types of noses, eyes, jaws, etc.). Therefore, the solution space of the problem can be huge.

Systems to generate realistic synthetic faces and to synthesize emotional facial expressions have existed since the last century [39–42]. A common approach for modelling social traits in artificially generated faces is to systematically modify one facial feature of an existing face and ask people to assess the modified face on the social traits of interest. The modified feature that obtains the best score is fixed, and the process is repeated for another facial feature. Under the holistic face perception model, this approach is far from optimal. Other techniques take into account that faces are perceived as a gestalt whole rather than as a collection of independently considered features. Among them, two sets of methods can be distinguished: psychological reverse correlation methods (PRCM) and reverse correlation methods in the context of face space models (FSRCM) [3]. PRCM alter faces using randomly generated noise. There are two popular PRCM techniques, both consisting of superimposing noise on images. In the first approach, the base face is unambiguous (e.g., a prototypical sad face), while in the second the face is ambiguous (e.g., two facial expressions morphed into one face) [43–45].

While the previous approach uses noise to achieve its objective, the FSRCM approach changes some characteristics of the faces directly. The procedure can be divided into two tasks: developing a model of face representation, and establishing the changes in the facial features that lead to the desired changes in social judgements. Like PRCM, FSRCM does not explicitly manipulate facial features. This approach makes use of a face space, where faces are represented as points in a multidimensional space and each dimension is a property of the face [46]. Oosterhof and Todorov [47] followed this approach to generate models of perceived face trustworthiness, threat, and dominance. In later work, they also built models of several other social traits, such as attractiveness [3, 48]. Walker and Vetter [49] used this procedure for aggressiveness, extroversion, likeability, risk-seeking, social skills, and trustworthiness and used the obtained models to manipulate real faces to elicit the expected social attributions.

However, these previous methods have some important limitations. The results of PRCM procedures are models of the strategy used by observers when they assess faces. These models are obtained from a survey in which each participant assesses a large set of artificially degraded faces. The enthusiasm of the participants will most likely decay with time, affecting the obtained models [43]. Moreover, both approaches need a large number of trials to model the expected social attributions, which can cause participants to lose motivation and worsen the quality of the results. Another limitation of reverse correlation methods is that they are restricted to creating models of one category (e.g., trustworthy, dominant, etc.) per task. Outcomes may change considerably when the objective is to create faces that convey several social traits simultaneously, each to a given extent.

In this work, we propose a very different approach to automatically create virtual realistic faces that convey several social traits simultaneously, each in a predefined quantity. The approach is, basically, to combine the appropriate set of facial features to form the faces. The facial features and their relative positions must be selected in such a way that the impressions elicited in observers are as similar as possible to those established by the designer. In the first step of this approach, an evolutionary algorithm that searches for the adequate set of facial features to elicit the desired social traits is proposed. Such algorithms have been used before in evolutionary systems that generate faces of a specific identity, like EFIT-V [50] or EvoFIT [51].

Secondly, a model that relates the facial features of a face to the social traits perceived by human observers is developed. This model is used as the fitness function of the evolutionary algorithm. Finally, the optimal set of facial features is combined to shape a realistic-looking face. Using this new approach, the designer of the virtual face establishes the amount of each social trait that must be elicited (the profile of social traits), and the system automatically generates the proper face.

2. A Genetic Algorithm to Generate Faces

Faces are characterized by their features (two specific eyes, a particular nose, a mouth, etc.) and by the spatial relation between them (relational information). The facial features considered in this work were selected considering previous studies. Internal features (i.e., eyes, nose, and mouth) seem to have significant importance in face recognition [52, 53]. Among the internal features, eyes play a key role in face information processing [54]. Some authors include the eyebrows in the eye area [55, 56] or consider the eyebrows a major factor in the perception of a face [57]. Blais et al. [58] found that the mouth area is an important cue for both static and dynamic facial expressions, which is consistent with previous research [59]. However, external facial features such as the hair or the shapes of the cheek, the chin, or the jaw also play an important role in the way the brain processes face information. According to Axelrod and Yovel [60], the fusiform face area of the brain is not only sensitive to external features but is also sensitive to their influence on the representation of internal facial features. Some works found that the face shape contributes significantly to face discrimination [61, 62]. Considering these previous works, we decided to include the internal facial features (eyebrows, eyes, nose, and mouth) and the jaw contour in this study. Although other features also affect face perception, e.g., hair and facial hair, skin tone, and facial proportions [14, 63–67], we limited our study to the features with a main effect on face perception, rather than features that may vary from time to time, like hair (people can get a haircut). In addition to these five facial features, the relative positions between them are considered. P_EB, P_E, P_N, and P_M are the vertical positions of the eyebrows, the eyes, the nose, and the mouth, respectively, measured from a horizontal line that passes through the base of the jaw line (Figure 1). D_E is the distance between the centres of the eyes. Therefore, one face can be defined by 10 parameters (EB, E, N, M, J, P_EB, P_E, P_N, P_M, and D_E).

The number of faces that can be generated as a combination of these parameters depends on the number of different values that each parameter can take (the number of different eyebrows, noses, mouths, etc.). The number of features of each class included in this study is discussed later. Assuming a minimum of 10 possible values for each of the 10 parameters, the size of the solution space is at least 10^10. Due to this complexity, the problem cannot be solved using enumerative or analytic procedures. Therefore, a genetic algorithm (GA) [68, 69] is used to search for the optimal combination of parameters. GAs explore the face space performing a stochastic guided search based on the evolution of a set (population) of structures (chromosomes). Each chromosome represents a solution to the problem (a face). The population of faces is evaluated using a fitness function that measures its suitability for the requirements of the problem. Based on the fitness of each chromosome, a new population of faces, which inherits the best characteristics of its predecessors, is obtained. The new population of faces is the result of several transformations guided by genetic operators (selection, crossover, and mutation), which combine or alter the chromosomes to obtain new faces. This iterative procedure is repeated for a predefined number of iterations or until another stopping criterion is met.

Each chromosome is composed of 10 genes (Figure 1). Genes 1, 3, 6, 8, and 10 codify one facial feature of each class. The remaining genes codify the positions in which the features will be located in the face. According to the fundamental theorem of genetic algorithms [69], codifications that favour short and low-order schemata are preferable. Therefore, genes that codify the position of one specific feature have been placed close to the gene that codifies that feature.
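To make the encoding concrete, the following sketch shows one possible chromosome layout under the assumptions stated above. The paper does not publish its implementation, so the names (N_ALLELES, GENE_ORDER) and the 10-level discretization of the position genes are illustrative only; the feature allele counts are those reported later in Section 3.1.

```python
import random

# Hypothetical per-gene allele counts. Feature genes (EB, E, N, M, J) use the
# cluster counts reported in Section 3.1; the position genes (P_EB, P_E, D_E,
# P_N, P_M) are assumed here to be discretized into 10 levels each.
N_ALLELES = {
    "EB": 10, "P_EB": 10,            # eyebrows and their vertical position
    "E": 19, "P_E": 10, "D_E": 10,   # eyes, eye position, inter-eye distance
    "N": 12, "P_N": 10,              # nose and its vertical position
    "M": 9,  "P_M": 10,              # mouth and its vertical position
    "J": 11,                         # jaw contour
}
# Genes 1..10: each position gene sits next to its feature gene, keeping the
# related schemata short and low-order, as recommended above.
GENE_ORDER = list(N_ALLELES)

def random_chromosome(rng=random):
    """Draw one random face: an allele index for each of the 10 genes."""
    return [rng.randrange(N_ALLELES[g]) for g in GENE_ORDER]
```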

The flow chart of the algorithm employed in this work is shown in Figure 2. An initial population of n (population size) chromosomes of faces is randomly generated. Roulette wheel selection [68] is used to choose the survivor and reproducer chromosomes in each generation. The ratio between survivors and reproducers is controlled by the crossover probability parameter, p_c. The number of survivors is n(1 - p_c) - 1, while the number of reproducers is n·p_c. A single-point crossover operator is used to obtain the offspring from the parents. The mutation operator acts on the survivors and the offspring to form a new generation. To complete the n chromosomes of the new generation, the best face of the previous generation is always carried over to the next (elitism).

The single-point crossover process is shown in Figure 3. After selecting two parents, a crossover point is randomly chosen. Two descendants are produced by merging the genes that remain on each side of the crossover point in each of the parents. The crossover is a closed operator, since it always produces chromosomes that represent feasible solutions to the problem. The mutation operator changes the allele that occupies a gene if a random number between 0 and 1 is less than the mutation probability, p_m; the new allele is selected randomly. A typical value for p_m ranges between 0 and 0.1 [70].
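The genetic operators described above can be sketched as follows. This is a minimal illustration under the same assumptions as the previous listing: the allele counts are illustrative, and the fitness values fed to the roulette wheel are assumed non-negative with higher meaning better (e.g., an inverted profile error).

```python
import random

# Hypothetical per-gene allele counts, in gene order 1..10 (see Section 3.1
# for the feature cluster counts; position discretization is an assumption).
ALLELE_COUNTS = [10, 10, 19, 10, 10, 12, 10, 9, 10, 11]

def roulette_select(pop, fitnesses, rng=random):
    """Roulette-wheel selection: pick a chromosome with probability
    proportional to fitness (non-negative, higher = better)."""
    r = rng.uniform(0, sum(fitnesses))
    acc = 0.0
    for chrom, fit in zip(pop, fitnesses):
        acc += fit
        if acc >= r:
            return chrom
    return pop[-1]

def single_point_crossover(a, b, rng=random):
    """Swap the gene tails at a random cut point. The operator is closed:
    any recombination of valid alleles is again a feasible face."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(chrom, p_m, rng=random):
    """Replace each allele by a random one with probability p_m."""
    return [rng.randrange(ALLELE_COUNTS[i]) if rng.random() < p_m else g
            for i, g in enumerate(chrom)]

def next_generation(pop, fitnesses, p_c=0.6, p_m=0.01, rng=random):
    """One generation: n(1 - p_c) - 1 survivors, n * p_c offspring, plus the
    elite of the previous generation, as described in the text."""
    n = len(pop)
    elite = pop[max(range(n), key=fitnesses.__getitem__)]
    survivors = [roulette_select(pop, fitnesses, rng)
                 for _ in range(int(n * (1 - p_c)) - 1)]
    offspring = []
    while len(survivors) + len(offspring) < n - 1:
        a, b = (roulette_select(pop, fitnesses, rng) for _ in range(2))
        offspring.extend(single_point_crossover(a, b, rng))
    mutated = [mutate(c, p_m, rng) for c in survivors + offspring]
    return ([elite] + mutated)[:n]
```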

3. A Model to Predict Social Traits Elicited from Facial Features

Two questions remain open in the evolutionary algorithm defined above. The first is to establish the alleles of each gene that represents a facial feature in the chromosomes, i.e., the different eyebrows, eyes, noses, mouths, and jaws that will be considered as alleles. The second is to create a model that relates the facial features composing a face to the social traits perceived by observers, i.e., the fitness function of the algorithm.

3.1. Alleles of the Facial Features’ Genes

The sensations that a face elicits in human observers arise from the visual characteristics of the face. It is not possible to establish the number of different shapes that a human facial feature can take, but it can be assumed that features with a similar appearance have the same effect on the perceived social traits. Considering this, we propose to create groups or clusters of features with the same appearance. All the features included in one cluster will elicit very similar sensations in observers; therefore, all of them can be properly represented by one feature of the cluster (the representative feature). In this way, the number of possible alleles of a gene is reduced to the number of clusters of that feature.

To obtain the feature clusters, a set of 93 images of faces (Figure 4(a)) was analysed. After reviewing several well-known databases [71], we selected the Chicago Face Database (CFD) [72]. This database contains high-resolution standardized images of real faces of Asian, Black, Latino, and White males and females with several expressions (including a neutral one). All the images in the database have the same size and resolution; the faces have the same position, pose, and orientation; and the background and illumination are uniform. The homogeneity of the conditions in which the images were obtained was an important factor in selecting this face database because, for example, differences in illumination can affect the way a face is perceived [73]. For this study, we selected the subset of 93 photographs of white males with a neutral expression.

Using the CFD has another advantage for our study. Each photograph is accompanied by information about the target face, and each face has been rated by a large sample of participants on several social traits. We selected the following social traits: Afraid, Angry, Attractive, Baby-Faced, Disgusted, Dominant, Feminine, Happy, Masculine, Prototypic, Sad, Surprised, Threatening, Trustworthy, and Unusual. Participants responded on a 1–7 Likert scale (1 = not at all, 7 = extremely), except for Prototypic, which was rated on a 1–5 Likert scale. Prototypic was defined as the degree to which the face seems typical; in our case, how much its physical features resemble the typical features of white people. Detailed information on the database generation and the characteristics of the participants is available in Ma et al. [72].

We developed an algorithm to automatically process images from the database and extract individual images of the facial features of each face (Figure 4(b)). Our objective was to extract the internal features (eyebrows, eyes, nose, and mouth) and the jaw contour. Two automatic facial landmark detectors were employed, one for the internal features [74] and another for the jaw contour [75]. Then, each feature was extracted individually, centred within the image, and cropped so that all images of a given type of feature have the same size and alignment.
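As an illustration of this step, the sketch below uses dlib's freely available 68-point landmark model as a stand-in for the two detectors cited in the paper [74, 75]; the model file path, the margin, and the single-face assumption are ours, and the per-class resizing and alignment mentioned above are omitted for brevity.

```python
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")  # assumed path

# Standard 68-point indexing: jaw 0-16, eyebrows 17-26, nose 27-35,
# eyes 36-47, mouth 48-67.
FEATURE_POINTS = {
    "jaw": range(0, 17), "eyebrows": range(17, 27), "nose": range(27, 36),
    "eyes": range(36, 48), "mouth": range(48, 68),
}

def extract_features(image, margin=10):
    """Return one cropped image per facial feature of the first detected face."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    face = detector(gray)[0]                      # assume one face per image
    landmarks = predictor(gray, face)
    coords = np.array([(p.x, p.y) for p in landmarks.parts()], dtype=np.int32)
    crops = {}
    for name, idx in FEATURE_POINTS.items():
        x, y, w, h = cv2.boundingRect(coords[list(idx)])
        y0, x0 = max(0, y - margin), max(0, x - margin)
        crops[name] = image[y0:y + h + margin, x0:x + w + margin]
    return crops
```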

Using this procedure, five databases of images, one per feature, were created. Then, eigenfaces (a holistic approach usually applied to whole faces) were used to characterize each facial feature by its global appearance [76] (Figure 4(c)). This method performs a principal component analysis over an ensemble of images to form a set of basis images. These basis images, known as eigenpictures, can be linearly combined to reconstruct the images in the original set. This procedure allows for an automatic, robust, fast, and objective characterization of the facial features that considers their global appearance while summarizing the central information needed to characterize them. In this case, each facial feature was characterized using 45 eigenvalues. The same value was chosen for all of them in order to facilitate the subsequent clustering process, bearing in mind that the explained variance was about 85% or higher in all cases.
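A minimal version of this characterization step can be written with scikit-learn's PCA, assuming the crops of one feature class have been stacked into an array (the function and variable names are hypothetical):

```python
import numpy as np
from sklearn.decomposition import PCA

def characterize(feature_images, n_components=45):
    """feature_images: (n_faces, height, width) grayscale crops of one
    feature class. Returns the fitted PCA and the (n_faces, 45) descriptors
    (the 'eigenvalues' used for clustering in the text)."""
    X = feature_images.reshape(len(feature_images), -1).astype(float)
    pca = PCA(n_components=n_components)
    coeffs = pca.fit_transform(X)
    # The text reports roughly 85% or more variance explained with 45 components.
    print(f"explained variance: {100 * pca.explained_variance_ratio_.sum():.1f}%")
    return pca, coeffs
```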

At this stage, the appearance of each feature could be characterized using 45 real values (eigenvalues). The K-Means clustering algorithm [77] was selected to cluster the facial features using their eigenvalues as characteristics (Figure 4(d)). A drawback of this method is that the number of clusters (K) must be predefined. To face this problem, several K-Means executions were performed varying K, and Dunn’s Index [78] was calculated for each set of clusters. Dunn’s Index measures the compactness and separation of the clusters obtained for each K. A higher Dunn’s Index indicates a small intracluster variance and a high intercluster distance; namely, the features included in each cluster are more similar among themselves and more different from the features belonging to other clusters. Therefore, the number of clusters for each feature was selected as the K that maximized Dunn’s Index. Using this procedure, eyebrows were classified into 10 clusters (EB1 to EB10), eyes into 19 (E1 to E19), noses into 12 (N1 to N12), mouths into 9 (M1 to M9), and jaws into 11 (J1 to J11). The classification of the facial features for each face in the CFD can be found in the Supplementary Materials of this work. Finally, the features closest to the centres of their clusters were selected as representatives of their groups; they are used as the alleles of the corresponding genes in the chromosomes of the faces (Figure 4(e)). In this way, all the features in the sample are represented by an allele with a similar appearance. As an example, Figure 5 shows the 9 mouths selected as representatives (alleles). Each allele represents all the mouths in its cluster; the mouths in clusters M3, M5, M6, and M7 are shown in Figure 5.
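The cluster-count selection and the choice of representatives can be sketched as follows; the range of K values explored is an assumption, and Dunn's Index is implemented directly from its definition (minimum intercluster distance divided by maximum intracluster diameter):

```python
import numpy as np
from scipy.spatial.distance import cdist, pdist
from sklearn.cluster import KMeans

def dunn_index(X, labels):
    """Minimum intercluster distance divided by maximum intracluster diameter."""
    clusters = [X[labels == k] for k in np.unique(labels)]
    max_diameter = max(pdist(c).max() for c in clusters if len(c) > 1)
    min_separation = min(cdist(a, b).min()
                         for i, a in enumerate(clusters) for b in clusters[i + 1:])
    return min_separation / max_diameter

def best_clustering(coeffs, k_range=range(2, 25)):
    """Fit K-Means for each K and keep the one maximizing Dunn's Index."""
    models = [KMeans(n_clusters=k, n_init=10).fit(coeffs) for k in k_range]
    return max(models, key=lambda km: dunn_index(coeffs, km.labels_))

def representatives(coeffs, km):
    """Index of the sample closest to each cluster centre (the allele)."""
    return [int(np.argmin(np.linalg.norm(coeffs - c, axis=1)
                          + 1e9 * (km.labels_ != k)))
            for k, c in enumerate(km.cluster_centers_)]
```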

3.2. Predicting Social Traits from Facial Features

The GA proposed in this work needs an objective function able to measure the fitness of a chromosome with respect to the social traits profile that is sought. A social traits profile of a face is composed of the scores of the 15 traits selected in the previous section: Afraid, Angry, Attractive, Baby-Faced, Disgusted, Dominant, Feminine, Happy, Masculine, Prototypic, Sad, Surprised, Threatening, Trustworthy, and Unusual. The fitness function for this problem can be formulated as in (1), where O_t is the desired score for the social trait t and T_t is the predicted score of the evaluated chromosome for that trait:

\[ F = \sum_{t=1}^{15} (O_t - T_t)^2 \tag{1} \]

While the O_t scores are known, the values of T_t must be obtained from 15 models, each of them able to predict how human observers would rate the face represented by a chromosome on one of the 15 social traits.

Although how the social traits of a face are perceived depends on the whole face, the individual effect of each feature can explain part of the variation within the face appraisals [79, 80]. A comprehensive discussion of this approach can be found in [81]. From this point of view, some studies have used additive models of the facial attribute appraisals that explain the majority of the feasible explained variance [82, 83], have related individual facial features to perceptions of the targets’ personality [84], or have predicted social trait evaluations from facial features with high accuracy [85]. Obviously, with these additive models some unexplained variation remains, due to the interactions among the considered features and because the facial features included in the models do not cover the whole face.

Let us suppose a chromosome with alleles EB, P_EB, E, P_E, D_E, N, P_N, M, P_M, and J. To predict T_t (the score of the face represented by this chromosome for the social trait t), we propose the additive model shown in (2). In this equation, each S_{f,t} is the individual score of the allele of the gene f assessed with respect to the trait t, and w_{f,t} is the weight of that gene in the assessment of the global face with respect to the trait t:

\[ T_t = \sum_{f=1}^{10} w_{f,t}\, S_{f,t} \tag{2} \]

The predicted scores of each allele of the feature f with respect to each social trait (S_{f,t}) are calculated using (3):

\[ S_{f,t} = \frac{\bar{s}_{f,t} - \mu_{f,t}}{\sigma_{f,t}}\, \sigma_t + \mu_t \tag{3} \]

In this equation, the cluster mean score \bar{s}_{f,t} is obtained from (4), where n_c is the number of features in the cluster represented by the allele and c_{i,t} is the score on the social trait t of the face to which the cluster member i belongs:

\[ \bar{s}_{f,t} = \frac{1}{n_c} \sum_{i=1}^{n_c} c_{i,t} \tag{4} \]

For example, Figure 6 shows how \bar{s}_{f,t} is calculated for the M5 allele (of the feature mouth) for a social trait t. The mouth M5 (a) is representative of a cluster of mouths (b). Each mouth in this cluster has been extracted from a whole face in the CFD (c), and these faces have scores (c_{i,t}) for all the social traits obtained from a group of human observers (d). \bar{s}_{f,t} is calculated as the mean value of these scores. The scores of each face in the CFD for each social trait can be found in the Supplementary Materials of this work.

As the \bar{s}_{f,t} are computed using the mean of the scores of the faces in the CFD, their variance is much smaller than that of the scores given by the human raters. The transformation in (3) corrects this: \mu_{f,t} and \sigma_{f,t} are the mean and standard deviation of the \bar{s}_{f,t} values of all the alleles of the feature f for the trait t, and \mu_t and \sigma_t are the mean and standard deviation of the CFD scores for the trait t. In this way, the S_{f,t} values have the same mean and standard deviation as the original CFD scores, and the models can reach the extreme values present in them.
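The reconstructed equations (1)–(4) translate directly into code. The sketch below assumes the array layout described in the comments (hypothetical names) and treats the five position genes analogously to the five feature genes, which the text implies but does not detail:

```python
import numpy as np

def allele_scores(cluster_scores_f, cfd_scores_t):
    """Eqs (4) and (3) for one gene f and one trait t. cluster_scores_f is a
    list with, per allele, the CFD ratings (c_it) of the faces in its cluster;
    cfd_scores_t holds the raw CFD ratings of all 93 faces for trait t."""
    s_bar = np.array([np.mean(c) for c in cluster_scores_f])     # eq (4)
    z = (s_bar - s_bar.mean()) / s_bar.std()
    return z * cfd_scores_t.std() + cfd_scores_t.mean()          # eq (3)

def predict_trait(chromosome, S_t, w_t):
    """Eq (2): T_t as the weighted sum of the allele scores of the 10 genes.
    S_t[f][a] is the rescaled score of allele a of gene f; w_t[f] its weight."""
    return sum(w_t[f] * S_t[f][chromosome[f]] for f in range(10))

def fitness(chromosome, targets, S, W):
    """Eq (1): squared distance between the desired profile O_t and the
    predicted profile T_t over the 15 traits (lower is better; negate or
    invert it when feeding a roulette-wheel selection)."""
    return sum((O_t - predict_trait(chromosome, S[t], W[t])) ** 2
               for t, O_t in enumerate(targets))
```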

The individual effect of each feature can explain part of the variation within the face appraisals [79, 80], but each facial feature has a different effect size. Using a weight per facial feature and social trait, as in (2), gives a different importance to each facial feature in the formation of the impression of each social trait. The capability of the developed models to predict the perceived social traits lies in achieving a good fit to the scores of human observers (available in the CFD). Therefore, it is necessary to find the best combination of weights. To do so, all the faces in the CFD were codified as their corresponding chromosomes. Then, we used a GA in which the fitness function was defined as the mean squared error between the model predictions on the chromosomes and the actual scores of the assessed faces. Given the characteristics of the problem, a gradient-based method such as a quasi-Newton method might have sufficed; however, we used a GA because the structure of our large dataset was well conditioned to be used by our calculation module, and using another procedure would have required time-consuming dataset processing.

The GA was configured to perform single-point crossover and uniform mutation. The crossover probability was set at 0.6 and the mutation probability at 0.001, with a population of 50 individuals. The permitted range for the weights was the interval [0, 1]. The selection method employed was Stochastic Universal Sampling, and the survivor selection policy was fitness-based with elitism. The number of iterations was set at 200,000; however, this limit was never reached due to the early stopping condition implemented, which allowed a maximum of 100 consecutive iterations without a change greater than 0.0001 in the best solution’s fitness. With this configuration, the optimization was performed individually for each social trait, resulting in 15 sets of weights, one per trait. The obtained weights, normalized to sum to 1 for each social trait, are shown in Table 1. Table 2 shows Pearson’s r correlation coefficients and mean squared errors (MSE) between the results of the models and the actual face scores. All the correlations were highly significant (p values under 0.01).
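For the weight fitting itself, the objective is just the MSE over the 93 codified CFD faces. The sketch below states it for one trait and, since the objective is smooth in the weights, also shows the gradient-based alternative mentioned above (SciPy's L-BFGS-B with box constraints); all variable names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def weight_mse(w, chromosomes, S_t, y_t):
    """w: 10 weights in [0, 1]; chromosomes: (93, 10) allele indices of the
    CFD faces; S_t[f][a]: rescaled allele scores for this trait; y_t: the
    mean human ratings of the 93 faces for this trait."""
    preds = np.array([sum(w[f] * S_t[f][c[f]] for f in range(10))
                      for c in chromosomes])
    return np.mean((preds - y_t) ** 2)

# Gradient-based alternative to the GA described in the text:
# res = minimize(weight_mse, x0=np.full(10, 0.5),
#                args=(chromosomes, S_t, y_t), bounds=[(0, 1)] * 10)
# weights = res.x / res.x.sum()   # normalized to sum to 1, as in Table 1
```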

4. Generating Realistic Looking Faces from Chromosomes

Once the GA has found the optimal combination of facial features for eliciting a preestablished social traits profile, a realistic-looking face must be generated by combining these facial features. To achieve a realistic face, an automatic seamless fusion method is needed, one that also adapts the illumination and tone of the different patches being sewn together. The algorithm used in this work is the Poisson Image Editing method [86], which makes use of the Poisson equation and the gradients of the images to achieve a seamless fusion.

The process is depicted in Figure 7. A base face on which to paste the different features was generated using the FaceGen software [87]. This base face is common to all the faces. The genes that codify the facial features (1, 3, 6, 8, and 10) are used to retrieve the images of the facial features to be pasted and to create masks using the landmarks of the features. The masks are positioned over the base face at the positions established by genes 2, 4, 5, 7, and 9 of the chromosome (Figure 7(a)). Then, the image of each feature is pasted over the corresponding mask (Figure 7(b)). Finally, the Poisson Image Editing method automatically composes the new face.
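OpenCV ships an implementation of Poisson Image Editing (cv2.seamlessClone), so this composition step can be sketched as follows; the file names, paste positions, and the use of a full rectangular mask are illustrative assumptions:

```python
import cv2
import numpy as np

def paste_feature(canvas, feature_img, center_xy):
    """Blend one feature patch onto the face at center_xy. seamlessClone
    solves the Poisson equation on the masked region, adapting illumination
    and tone at the seam."""
    mask = 255 * np.ones(feature_img.shape[:2], dtype=np.uint8)
    return cv2.seamlessClone(feature_img, canvas, mask, center_xy, cv2.NORMAL_CLONE)

# Hypothetical usage: paste each selected allele image at the position
# encoded in the chromosome, starting from the common FaceGen base face.
face = cv2.imread("base_face.png")
for img_path, pos in [("mouth_M5.png", (256, 380)), ("nose_N2.png", (256, 300))]:
    face = paste_feature(face, cv2.imread(img_path), pos)
cv2.imwrite("generated_face.png", face)
```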

5. Materials and Methods

A software application implementing the GA and the Poisson Image Editing method was developed (Figure 8). This application permits two different tasks. On the one hand, it makes it possible to evaluate an existing face, obtaining its predicted social traits profile. On the other hand, it allows defining a target social traits profile, establishing the parameters of the GA, and generating a realistic-looking face corresponding to the best chromosome found by the GA.

To test the ability of the GA to produce faces that elicit a preestablished social traits profile, and the capacity of the developed models to predict the sensations elicited, the software was used to generate 10 faces from 10 different social traits profiles. The 10 faces are shown in Figure 8. The objective was to compare the social traits profiles of the obtained faces with the opinions of human evaluators.

We must distinguish here between the desired profile of social traits initially established as the objective and the profiles finally obtained for the faces. There are correlations between the perceived social traits; for example, a highly masculine face is usually perceived as dominant [72], and a baby-faced one as trustworthy. Some of the profiles used to generate the faces combined social traits that are usually highly correlated, like Masculine, Threatening, and Dominant, or Baby-Faced and Trustworthy (faces 9 and 3 in Figure 8). In these cases, the algorithm was able to find a combination of facial features with a social traits profile very similar to the desired one. On the other hand, some other desired profiles joined an unusual combination of social traits, like Dominant and Feminine (face 10) or Angry, Threatening, and Feminine (face 4), or included Unusual as a main social trait (faces 2, 5, and 6). These combinations include social traits that are negatively correlated [72]. This means, for example, that changing a facial feature in a given face to increase the perception of Dominant will decrease the perception of Feminine. In these cases, the algorithm will find the face with the social traits profile nearest to the desired one; however, the differences between them will increase as the negative correlation between the desired social traits increases. In some extreme cases, the profile of the face finally obtained may be far from the desired one, for example, if the desired profile includes Feminine and Masculine simultaneously. In such cases, no combination of facial features can achieve the desired social traits profile.

Under each face in Figure 8, the 4 main social traits we used to define its desired profile are shown.

6. Results and Discussion

This work proposes an evolutionary algorithm to automatically create virtual realistic faces that convey 15 facial social traits, each in a predefined quantity, by combining the appropriate set of facial features. For each social trait, a model that predicts the scores of human raters was developed. 10 faces with different social traits profiles were generated using the proposed procedure. To test the performance of the system, the results were compared with the opinions of human evaluators. 35 people participated in the survey, 16 men and 19 women, aged between 18 and 71 years (mean age 37). Participants were asked to assess the 10 created faces using the same scale as the CFD (1–7 Likert). To avoid learning effects, the social traits and the faces were presented to each participant in random order.

Table 3 shows Pearson’s r correlation coefficients, p values, and MSE between the predicted scores and the actual face scores by social trait. Positive correlations were found for all the traits, being strong and statistically significant for 8 of them, namely, Afraid, Attractive, Baby-Faced, Dominant, Feminine, Masculine, Sad, and Unusual. For these traits, low MSE between the predicted and actual scores were obtained. Although moderate positive correlations were found for Angry, Surprised, and Threatening, they were not significant.

The main objective of this work was to generate faces that elicit a preestablished set of social traits in most observers. Figure 9 shows the results for each face. Blue bars represent the social traits profile predicted by the models. The orange lines are the means of the scores of the human participants (whiskers represent ±1 standard deviation about the mean). The MSE between the predicted scores and the means of the participants’ scores is shown for each face in Figure 9. The mean MSE between the predicted and actual scores of the 10 faces generated by the proposed system was lower than 0.64. Considering only the 8 social traits for which significant correlations were found (Afraid, Attractive, Baby-Faced, Dominant, Feminine, Masculine, Sad, and Unusual), the mean MSE for all the faces was 0.26.
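The reported metrics are straightforward to reproduce for each trait with SciPy; the arrays below are random placeholders standing in for the model predictions and the mean participant ratings of the 10 faces:

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
predicted = rng.uniform(1, 7, size=10)              # placeholder model scores
observed = predicted + rng.normal(0, 0.5, size=10)  # placeholder mean ratings

r, p = pearsonr(predicted, observed)
mse = np.mean((predicted - observed) ** 2)
print(f"r = {r:.2f}, p = {p:.3f}, MSE = {mse:.2f}")
```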

Despite the complexity of the face perception process, the results obtained show that 8 of the models developed in this work were able to establish the relationships between the facial features and the social traits elicited in the observers. In addition, the interrater agreement among people’s judgements of the social traits of faces is usually low [72]. Nevertheless, the proposed procedure was able to approximate the mean opinion of the human observers, finding strong correlations for these 8 social traits.

On the other hand, finding the combination of facial features that elicits several social traits simultaneously, each in a predefined amount, is a complex multiobjective problem. This work approached it using eigenfaces to create clusters of facial features with the same appearance and selecting one representative feature per cluster to be used as an allele in a GA. The mean MSE obtained for the tested faces (0.26 on a 1–7 Likert scale) supports the validity of this approach.

The models obtained in this work to predict social traits from facial features give insight into how important each facial feature is in the formation of each impression of a face. Each additive model considers the individual contribution of each facial feature to explain part of the variation within the appraisals of a social trait. The models add the individual contributions of the features, weighted by their relative importance in the social trait assessed. The weights presented in Table 1 suggest the effect of each facial feature on the variation of each social trait. For example, in the case of Afraid, the eyes, the mouth, and the position of the mouth seem to have a bigger effect than, for example, the nose or the jaw. Therefore, if the level at which a given face is perceived as Afraid needs to be changed, shifting the facial features with higher weights will have a bigger effect.

Even though some works exist on this topic, none of them allows creating realistic faces that convey more than one social trait at a time. Dotsch and Todorov [45] used grey images with superimposed noise to obtain faces that convey trustworthiness or dominance. Vernon et al. [88] proposed a system able to model social traits and produce cartoon-like computer-generated faces eliciting three social traits: approachability, youthfulness, and dominance. Perhaps the proposal closest to the one presented here is that of Walker and Vetter [49], which is capable of creating realistic faces expressing only one social trait at a time. To the best of our knowledge, the present work is the most comprehensive, in terms of the number of social traits considered, in generating realistic-looking faces that elicit a preestablished set of sensations in most observers.

However, some limitations of this study must be pointed out, mainly regarding the generalization of the findings. 93 faces of the Chicago Face Database were used to obtain the models relating facial features and facial assessments. The set of faces belongs to men between 18 and 40 years of age living in the Chicago (USA) area. The subjective ratings of the faces were made by a specific group of women and men, probably from the same city [72]. Therefore, both the faces and the appraisals used to develop the models come from a specific community, and the generalization of the results to faces of people from other communities must be carefully addressed.

Our future work will aim at developing similar studies for female faces and at extending the results to other races. Moreover, visual perception research has shown that the human brain processes faces in a very complex way [30]. Although first-order features play a central role in how a face is perceived, second- and higher-order features emerge from the combination of several lower-order features and are used to make judgements from faces. Using a larger face database in future work would allow us to consider interactions between the facial features, at least of second order, and probably to improve the results obtained.

7. Conclusions

This work proposes a new approach to automatically create virtual realistic faces that convey several social traits simultaneously, each in a predefined quantity. To create the faces, a genetic algorithm selects the appropriate facial features (eyebrows, eyes, nose, mouth, and jaw) and their relative positions, in such a way that the impressions elicited in observers are as similar as possible to those established by the designer. The facial features used by the algorithm as alleles are obtained using the eigenfaces method: clusters of facial features with the same appearance were created, and one representative feature of each cluster is used as an allele. Several models relating the facial features of a face to the social traits perceived by human observers were developed and used as the fitness function of the genetic algorithm. Finally, the Poisson Image Editing method is used to combine the selected facial features into a face.

15 models were developed to establish the relationships between the facial features and the social traits elicited in human observers. Positive, strong, and statistically significant correlations were found for 8 of them, namely, Afraid, Attractive, Baby-Faced, Dominant, Feminine, Masculine, Sad, and Unusual. To test the proposed procedure, several social traits profiles were established, and the developed system was used to generate faces with these profiles. The social traits of the generated faces predicted by the models were compared with the opinions of human observers. The mean squared error obtained for the tested faces (0.26 on a 1–7 Likert scale) supports the validity of this approach and shows that the system is able to approximate the mean opinion of the human observers.

Using the developed system, a designer can establish the amount of each social trait that must be elicited by a face, and the system automatically generates the proper face. People use information from faces to judge the emotions and intentions of their owners. How a face looks in a painting or an advertisement can dramatically influence what we feel about it and what emotions it elicits. In these fields, the procedure presented in this work can be used to create faces that convey a desired set of sensations to the observer. In the same way, it can be used in other fields, such as online activities or new human-machine interaction systems, in which it is common to use human digital representations that symbolize the user’s presence or act as a virtual interlocutor.

Data Availability

The Chicago Face Database used to support the findings of this study is freely accessible on http://faculty.chicagobooth.edu/bernd.wittenbrink/cfd/index.html. All the images employed in this study and the results of the facial features clustering are available on https://github.com/flifuehu/.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Supplementary Materials

Scores for each social trait in the CFD and classification of the facial features for each face.