Abstract

Robotics has developed rapidly in recent years. Many developing countries face a growing elderly population, which is the premise and impetus for research on humanoid robots that serve humans. Studies on various aspects of robotics are being carried out in different parts of the world, and many novel methods have been introduced for the design of the robot’s external appearance, internal mechanisms, and gestures. Recent humanoid robots are often designed as copies of the anthropometric indicators of real people, which may affect the security of other people’s identities. Besides, these robots cause a feeling of horror in the user if their appearance falls into the uncanny valley. Therefore, these designs need to be carefully considered before fabrication. Artificial skin is studied for various purposes, such as ensuring collision safety for industrial users and helping robots to perceive basic tactile sensations. This review covers the recent literature on the interaction of appearance and behavior in robot interaction, artificial skin, and humanoid robots, especially those with a human-like appearance, such as android and Geminoid robots. This work can provide a reference for humanoid robot research, including uncanny valley hypotheses, artificial skin, and humanoid robots.

1. Introduction

Nowadays, the development of intelligent technology and science is undeniable. Robots are researched with the purpose of assisting humans in dangerous, arduous, or boring jobs. Especially with the aging population in developed countries, robots are used by individuals or in service industries, for example, restaurant service robots, elderly care robots, and self-propelled vehicles [1, 2]. Many different types of robots are designed for specific applications, such as mobile robots, automatic systems, and cable robots [3–5]. However, the field of humanoid robots remains an exciting topic. There are many answers to the question “Why make a robot like a human?”. Hanson and Bar-Cohen [6] explain that every product is made to serve human needs, so humanoid robots are the right approach for applications that serve humans. In addition, humanoid robots can help people study human behaviors and gestures, a subject that attracts people because they are curious about themselves. For example, humanoid robots have been used in studies on human behavioral manifestations by Kostavelis et al. [7], in medicine, especially in treating psychological diseases [8], and for educational purposes [9]. Humanoid robots are also designed to serve entertainment needs, with different sizes and different numbers of degrees of freedom (DoF) depending on the application. One successful small-size robot is Nao [10], which has 21 DoF and is 58 cm tall. The Nao robot has been developed further, and the latest version was published in 2018. More recently, HR-OS1 [11], a kid-sized robot, provided a low-cost humanoid robotics platform with speech, listening, movement, vision, navigation, and other functions. Adult-size robots have also been developed, such as the Dimitri robot, an open humanoid robotic platform [12], and the KHR-2 humanoid robot [13].

Research is expected to create robots that resemble humans in both appearance and behavior. However, the design of humanoid robots must take the uncanny valley hypothesis into account, because a robot may cause a terrifying sensation during communication if its appearance falls into the uncanny valley. An android robot is a type of robot that resembles a human in appearance. Androids are usually covered with skin made from a flesh-like material. This type of robot has a complex structure in order to copy human behaviors, but it also causes fear in humans if its design is not completely human-like. To explain these problems, the uncanny valley hypothesis was defined by Mori et al. [14] and later analyzed in many studies that conducted experiments to test it; this is discussed in Section 3. In psychology, some situations of nonverbal communication can be highly purposeful [15]. This is why facial expressions affect the effectiveness of communication between humans and robots. In the design of humanoid robots, especially androids, long-established theories of anthropometric medicine and human psychology need to be carefully considered. Anthropometric data and facial landmarks are important references in robot design [16]. The face of an android robot is complex because its exterior is soft and must express all kinds of emotions like a human face. The deformation of the artificial skin of an android robot depends on the mechanical friction between the skin and the frame and on the material, elasticity, and thickness of the skin [17]. These are nonlinear properties that are difficult to determine, model, and correct during the design phase because of influencing factors in manufacturing. Therefore, based on established medical theories, the authors of [18] analyzed the movement of landmarks marked on the face of the android robot to evaluate the emotions it expresses.
The emotions generated by robots are often adapted by authors from the Circumplex Model of Affect theory, which is presented in Section 3.

The Geminoid robot is a special android robot defined in the study [19] to combine interdisciplinary research in robotics with the exploration of humans through robots. The first Geminoid robots were designed to look like humans and were used as a new communication device, creating copies of actual people. To our knowledge, humans only want to create robots with human-like appearances for purposes that serve them; these robots are not expected to make their own decisions, because human safety may be threatened. Copying a real person can also threaten the identity of that person, as described in Reference [20]. Therefore, most Geminoids are fabricated and applied by their authors only as a new communication system, except Erica and Otonaroid, which are discussed in Section 5. Besides, research on android robot heads is published with the expectation of creating a robot similar to a human, so human anatomy has also been studied. For example, the robot head may be made up of a frame and two cameras with control motors [21], which give the robot a way to interact with its environment and even communicate through eye movements. Because vision is the most effective method of gathering information from the environment [22], almost all robot heads have been created in this manner. However, this is not entirely true in some of the cases outlined in the study [23]. These mechanisms are used to develop complex robotic head systems. For robots to have human-like gestures, the robot’s organs must reproduce basic human movements. This review was undertaken with the desire to create a reference for the design of humanoid robots, especially robots with a human-like appearance that are capable of expressing facial emotions.

The objective of this work is to provide a reference for the manufacturing of humanoid robots. An overview of the state of the art in humanoid robotics research and development is presented, focusing on the main aspects of humanoid robot design. The work is organized around the following aims. First, the factors used to evaluate satisfaction in communicating with robots are summarized so that design techniques can avoid the uncanny valley. Next, research on robot skin is reviewed; this topic applies to many fields, not only robot design but also medicine, for example, through in-depth materials studies. Comments and discussions are then provided on humanoid robots, including their features, applications, and mechanical design. Finally, future research directions are suggested, and the key challenges are indicated. In brief, this review paper is structured as follows. Theories about emotions and some design considerations for avoiding the uncanny valley are reviewed in Section 3. Artificial skin studies are updated in Section 4. Section 5 reviews humanoid robots with emotional expression, including Geminoid robots, android head robots, and android robots. Discussions and limitations of this review and recommendations are in Sections 6 and 7, respectively. The final section presents the conclusions of this review.

2. Methodology

As stated above, with the desire to provide a document about the overall structure and characteristics of humanoid robots, we searched and reviewed studies by researchers from all over the world. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [24] were used to present this study.

2.1. Search Strategy

The literature was searched with the following keywords: human-computer interaction, Circumplex Model, robot emotion, the appearance of the robot, the behavior of the robot, uncanny valley, extended uncanny valley, human-likeness, artificial skin, artificial skin for the robot, electronic skin, actroid robot, android robot, gynoid robot, computers are social actors, cyborg robot, feminine robot, Geminoid robot, humanoid robot, face robot, and head robot. Studies were selected in three steps: first, the title had to be relevant to the topic in question; then, the abstract was screened to determine relevance; and finally, the full text was used for the review process. We used only English-language publications, and we searched mainly on Google Scholar, ResearchGate, IEEE Xplore, and ACM Digital Library.

2.2. Eligibility Criteria

We limited the scope of the review to publications that focus on elements that make up or relate to humanoid robots, with a particular focus on robots with lifelike structures. Theories of robot emotions are also briefly summarized, and we evaluated studies involving the uncanny valley, which is repeatedly referred to in the humanoid robot field. The selection criteria were publication in peer-reviewed conferences or journals, English language, and relevance to the design of the humanoid robot, the humanoid robot head, and the theories involved in their design. Exclusions include unrelated studies, gray literature, studies performed on participants with mental health problems, and outdated publications, except for long-established theories.

2.3. Risk-of-Bias Assessment

To check the quality of the publications, the list of 27 questions proposed by Downs and Black [25] was applied. This assessment was carried out independently by two authors; if the results were inconsistent, the evaluators revisited the publication or relied on an outside assessment to decide. The reviewers had been trained in the theories needed to assess quality.

3. Interaction of Appearance and Behavior of Robot Interaction

Studies on the theory of emotion in psychiatric and neuroscience research have shown that each human emotion is independent of other emotions in its behavioral and psychophysiological expression [26, 27]. Emotions are generated through the stimulation of neural pathways, and it is concluded that each specific emotion maps to a neural system. In psychology, valence is defined as intrinsic attractiveness, which means that its value is positive when the emotion is good and negative otherwise [28]. Emotions have developed through human evolution [26] and include happiness, sadness, surprise, anger, and so on. Feidakis et al. [29] stated that emotions are divided into 66 categories, with 10 basic emotions (anger, anticipation, distrust, fear, happiness, joy, love, sadness, surprise, and trust) and 56 secondary emotions. The assessed emotions focus on the emotional dimensions, which are largely variations of the model proposed by Russell [30], the Circumplex Model of Affect, which is adapted in Figure 1(a). According to several views, emotional responses have a number of different but connected elements [32, 33]. Breazeal [31] presented the role of emotions and behavior with a robotic head named Kismet, relying on theories from the field of psychology to help the robot recognize emotions and affective intentions in human communication. In that work, facial expressions are generated by interpolation in a 3D space based on 3 factors: arousal, valence, and stance. A proposed 3D diagram based on theories of psychology and the Circumplex Model of Russell is shown in Figure 1(b). Emotional robots are used in many different fields to assist in daily life tasks, such as education and entertainment [34, 35]. The study of Alexandros and Michalis [36] used software to evaluate the emotional states of people when interacting with the system.
The authors tried to create software that collects human emotions during communication with a machine and makes automatic judgments based on Russell’s earlier proposal. However, it is very difficult to assess whether emotions displayed by robots correspond to emotions in humans, because this assessment is based only on the subjective perception of observers, and cultural differences also lead to differences in evaluation [37]. For example, American people tend to observe cues from the mouth rather than the eyes; in contrast, Japanese people tend to rely more on eye cues when evaluating emotions. In human-robot interaction, the uncanny valley should also be considered in the emotional evaluation.
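The idea of generating an expression by interpolating in a three-factor affect space (arousal, valence, stance) can be illustrated with a minimal sketch. It is not Kismet’s actual implementation: the prototype coordinates, the function names, and the choice of inverse-distance weighting are all assumptions made for illustration.

```python
import math

# Hypothetical prototype emotions placed in (arousal, valence, stance) space,
# each axis in [-1, 1]. These coordinates are illustrative assumptions only.
PROTOTYPES = {
    "happy": (0.5, 0.8, 0.5),
    "sad": (-0.5, -0.7, 0.0),
    "angry": (0.8, -0.7, 0.6),
    "fearful": (0.8, -0.6, -0.8),
    "calm": (-0.6, 0.4, 0.0),
}


def blend_weights(point, prototypes=PROTOTYPES, eps=1e-9):
    """Return inverse-distance weights (summing to 1) over the prototypes."""
    distances = {name: math.dist(point, coords) for name, coords in prototypes.items()}
    # If the point coincides with a prototype, that prototype gets all the weight.
    for name, d in distances.items():
        if d < eps:
            return {n: (1.0 if n == name else 0.0) for n in prototypes}
    inverse = {name: 1.0 / d for name, d in distances.items()}
    total = sum(inverse.values())
    return {name: w / total for name, w in inverse.items()}


def dominant_emotion(point, prototypes=PROTOTYPES):
    """Return the prototype with the largest blend weight at the given point."""
    weights = blend_weights(point, prototypes)
    return max(weights, key=weights.get)
```

In a real face robot, the weights would scale the actuator targets of each prototype posture; here they only show how a single affective state can mix several basic expressions.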

Nowadays, designers of humanoid robots are interested in simulating human morphology for each specific application, such as providing health advice, taking care of the elderly and children, or simply studying human properties [38, 39]. Studies try to create robots that look like humans; however, humanoid robots face a challenge that was defined as the “uncanny valley” by Professor Mori et al. in 1970 [14]. Mori commented on the human response to humanoid robots. A robot’s familiarity increases with the similarity of its appearance and movements to those of a human up to a peak point, beyond which a subtle imperfection in appearance or movement becomes frightening. The uncanny valley occurs at around 70% similarity to humans, as shown by Cheetham et al. [40]; Mori’s diagram is adapted in Figure 2. The uncanny valley hypothesis is of great interest because it states that human responses to robots do not increase linearly with human-likeness; specifically, an uncanny valley occurs when the robot’s appearance is close to that of a human but not quite the same. Recently, the arguments put forward to explain this hypothesis are believed to depend on the perspective taken on communication processes, including the evolutionary psychology perspective and cognitive conflicts [43].

Uncanny valley studies are attracting more and more interest in robotic design, computer interaction, and psychology. In practice, the published results are based largely on the authors’ own evaluation methods applied to a limited set of participants. For example, Mara et al. [44, 45] used an android robot called Telenoid to assess the response during communication between humans and robots. The authors tried to reduce the uncanny valley effect through fictional stories that participants heard before communicating with the android robot. The participants were recruited in a museum center, so their awareness of science and technology was likely higher than average. Moreover, the authors did not create a private space for the participants, so they could be distracted by outside actions. This affects the strength of the conclusions in this study. Much research is done with the aim of preventing robot designs from falling into the uncanny valley. For example, Yam et al. [46] performed 3 experiments with 299 participants on “dehumanizing” robots, to find out the effects of dehumanizing and set the stage for methods of reducing the uncanny valley. This research opens the door to reducing the fear of humanoid robots based on the dehumanizing method. However, in experiments 2 and 3, only one robot was used for both the humanoid and nonhumanoid conditions, which may have affected the evaluation of the participants. A later study by Yam et al. [47] surveyed human responses to robots in the workplace. This is a rather detailed and extensive study covering 50 U.S. states, Singapore, and India, together with an online experiment, with 4 studies conducted independently. The authors focused on assessing the effectiveness of using embodied robots, stating that robots in a less agentic form had less impact, but work related to disembodied algorithms was not evaluated in this study.
Similarly, Mara and Appel [45] studied a method to reduce the uncanny valley in which participants hear and read a fictional story about the robot before communicating with it. The results of this experiment show that fear is significantly reduced in the sci-fi condition. Surveys of human-robot interaction attract many authors who aim to develop the social behaviors of artificial agents. Moreover, some authors have evaluated interaction through unconscious human behaviors during communication with the robot, such as gaze or pitch of voice. For example, Minato et al. [48, 49] carefully observed micromovements, such as shoulder movements caused by human breathing, in order to assess the level of human acceptance of android robots. Both the Repliee Q1 and Q2 androids were used for these assessments. Specifically, in the study [48], gaze is the factor that is analyzed carefully in the communication process. Some of the models were held constant, and this affects the outcome of the assessment. The authors concluded that gaze is influenced by the naturalness of the android robot. However, this evaluation method is not considered suitable because of some uncontrollable characteristics. In subsequent research [49], these authors assessed the breaking of eye contact while thinking. This can be one measure of an android’s human-likeness; however, to strengthen the conclusions, surveys with different questions, many robots, and randomly selected participants are needed. The authors then continued their research on human-robot interaction. The android robot used in this work was described in a later publication [50] and is possibly one of the most human-like androids [51]. Its silicone skin gives it a more realistic appearance, and it includes several sensors and actuators to allow for highly natural and pleasant interactions with humans.
The authors also reproduced the gentle movement of the thoracic cavity during breathing in these android robots. In this study, the authors extended the uncanny valley to a 3-dimensional axis system based on 2 factors, namely, behavior and appearance, to create a map that visualizes the uncanny valley.

In [52], Dio et al. performed a behavioral analysis of 33 children aged 5 to 6 years through play with robots. Children were invited to join this research because, at this age, they are not yet subject to external influences from society. From the experimental results, the authors noted that the children responded differently to robots and humans. On the other hand, because only one age group was surveyed, the conclusions are uncertain, and it is necessary to compare the cognitive evaluations of different age groups to strengthen the conclusions of the study. In a recent study conducted on 900 participants, Kim et al. [53] used 251 types of real-world robots to demonstrate that uncanny valleys also exist at low and medium levels of human likeness. These results do not disprove Mori’s hypothesis but supplement the hypotheses on human-robot interaction. This study did not offer measures to reduce the negative reactions associated with the uncanny valley; the authors focused only on analyzing emotions from experiments.

Ferrey et al. [41] suggested that the uncanny valley is created by category inhibition and cognitive conflict. Various morphing techniques were used in this study to transform images from human to robot and from human to animal. The published results show that the negative peak of the valley is not always near the human end (about 70%) but can be located near the midpoint. This study was conducted through 2 experiments with 129 participants. Unlike other studies, whose stimuli concern only the similarity between robots and humans, the authors also evaluated the response to animal transformations to study the position of the uncanny valley. Mathur et al. [42] had 358 participants evaluate 182 images of human and robot faces. The results indicate that the uncanny valley is skewed towards the robot side, which is far from Mori’s hypothesis about this valley. They also draw a line between humans and robots that differs from Mori’s earlier publications. As a result, the authors’ original hypothesis of category confusion cannot explain the valley effect. However, this report on the human-robot confusion hypothesis helps to supplement the hypotheses that explain the uncanny valley phenomenon. Table 1 summarizes the research of these authors, including objectives, sample sizes, results, and descriptions. To our knowledge, these are 3 hypotheses based on the results of surveys and assessments carried out in different regions, so they are influenced by factors such as the public opinion, culture, and economy of each subject invited to the survey. Future studies are expected to make robot designs more human-like in order to avoid the uncanny valley. Subjects participating in the assessments are expected to be more diverse than in previous studies. Instead of focusing primarily on teenagers, authors should expand their samples.
In addition, assessments in which participants choose between robot and human images may be affected by surrounding factors, such as the initial impression, assessment time, and assessment location. A study with a sufficiently large and broad sample is still needed to strengthen these conclusions, and it would be a great reference for all research on humanoid robot design.

4. Artificial Skin

The humanoid robot or android robot is defined as a robot designed to resemble a human in appearance; such robots are used for various purposes, which are discussed in Section 5. Artificial skin plays an important role in robot communication and is also significant for reconstructing damaged organs in medicine. The emotions on the human face are a combination of muscle bundles, blood surface, and skin movement, so artificial skin is an area that needs to be researched in order to create robots with the same appearance as humans. In addition to having a complex structure, human skin is organized as a network of sensors that generates the sense of touch. Artificial skins are designed to create tactile sensations for robots; they can simulate human touch, but mainly tactile contact sensing. Most authors focus their research on touch sensing for the outer skin of the robot. In this section, several advances in skin design are summarized with discussions of the prospects and challenges.

To simulate human skin, artificial skin is designed with the same key functions, such as protection and creating artificial sensations for robots. The skin is the most important organ for sensing and includes a variety of sensory receptors from the epidermis to the dermis. Artificial skins are usually designed with an outer layer of silicone, and sensors are placed behind this layer. Artificial skins give the robot the ability to be aware of its surroundings and to interact more naturally. Tactile sensors are used in many different applications; robots that include a tactile sensor and feedback system are able to do more delicate and complex tasks than those that use visual perception alone, such as grasping and handling delicate items or identifying the qualities of materials [54]. The main functions of human skin are sensation, regulation, and protection, and artificial skin is also designed to fulfill these functions. The processing performed by human skin and its simulation are depicted in Figure 3, with (a) describing the general processing of human skin and (b) the basic process used in the design of artificial skins. The stimuli received externally through the system of sensors are transmitted to the information processor, which outputs signals that control the peripheral devices according to predefined rules.
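The basic artificial-skin process described above (sensor stimuli, an information processor, and rule-based output to peripheral devices) can be sketched as follows. This is only an illustration under stated assumptions: the `TaxelReading` type, the normalized pressure scale, the thresholds, and the command names are hypothetical and do not come from any of the reviewed designs.

```python
from dataclasses import dataclass


@dataclass
class TaxelReading:
    """One hypothetical tactile element: an id and a normalized pressure in [0, 1]."""
    taxel_id: int
    pressure: float


# Hypothetical predefined rules: (minimum pressure, command), strongest first.
RULES = [
    (0.8, "stop_motion"),
    (0.4, "slow_down"),
    (0.1, "acknowledge_touch"),
]


def process(readings, rules=RULES):
    """Map raw sensor readings to one command for the peripheral devices."""
    if not readings:
        return "idle"
    # The processor here reduces the whole sensor network to its peak pressure.
    peak = max(r.pressure for r in readings)
    for threshold, command in rules:
        if peak >= threshold:
            return command
    return "idle"
```

For example, `process([TaxelReading(0, 0.05), TaxelReading(1, 0.5)])` selects the `"slow_down"` rule, because the peak pressure of 0.5 exceeds the 0.4 threshold but not the 0.8 threshold.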

In [55], Ulmen and Cutkosky proposed a type of artificial skin with a ridge that created a brittle structure. Their prototype consisted of 16 sensors arranged as a 4 × 4 skin model. This design meets the requirements of low manufacturing cost and low noise, but because each sensor element is connected to a small microcontroller, the model cannot be flexible. Compact, flexible designs and softer outer shells are expected in the future. Recently, Teyssier et al. [56] introduced an artificial skin for robots that has the same structure and appearance as human skin, for the purpose of performing tasks of interaction and cooperation with humans while keeping communication safe. The authors tried to create a skin that has mechanical properties similar to those of human skin and is capable of detecting touch. In these studies, experiments with applied forces gave very promising responses. Further experiments with different forces and different skin thicknesses are expected. The skin response is affected by external factors, so methods to remove the noise are also a challenge for the authors. Previously, a skin designed to measure surface forces and pressures was introduced by Hoshi and Shinoda [57]. The proposed skin includes two layers and is based on power measurement. In the large array design for the entire robot’s skin, the elements are connected consecutively to simplify the protocol, but the design is not yet optimal. Experiments of soft conveyors have been carried out with the aim of improving the safety of human-machine interaction. For the related task of detecting humans approaching mechanical devices, Klimaszewski et al. [58] proposed a low-cost distributed robotic skin. The authors aim at a very high level of safety for humans at risk by designing the electronic skin layer with both resistive and capacitive touch sensors.
When an object moves closer to the work plate, the capacitance changes compared to the reference capacitor plate. Previously, these authors demonstrated a tactile robotic skin [58] capable of locating contacts using two parallel layers of force-sensitive resistors. The results show the stability of the e-skin (average error near 3.72), but the size, resolution, and real-time operation are factors that remain open and are developed in later research. A new structure of silicone skin was proposed by Tomo et al. [59], and it was tested with different shear forces to create a human skin-like structure. The authors tried to improve their previous designs [60] by increasing the sampling frequency and by optimizing and reducing the size of components and conductors. The e-skin was designed with two layers, including a PCB made from silicon using a 3D printer and a fabric layer.
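The capacitive proximity principle mentioned above, in which a measured capacitance is compared against a reference capacitor, can be sketched in a few lines. This is not the method of the cited authors: the function names, the picofarad units, and the 5% threshold are assumptions chosen only to make the comparison concrete.

```python
def relative_change(measured_pf, reference_pf):
    """Relative deviation of the measured capacitance from the reference."""
    if reference_pf <= 0:
        raise ValueError("reference capacitance must be positive")
    return (measured_pf - reference_pf) / reference_pf


def is_object_near(measured_pf, reference_pf, threshold=0.05):
    """Flag proximity when the relative change reaches a hypothetical threshold."""
    return abs(relative_change(measured_pf, reference_pf)) >= threshold
```

With a 10.0 pF reference, a reading of 10.6 pF is flagged as proximity under this threshold, while a reading of 10.1 pF is not. A practical system would also need filtering and drift compensation, which this sketch omits.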

To test the quality of artificial skin, the tensile strength of silicone rubber was evaluated by Mochtar [61]. The authors calculated the conditions and factors affecting the creation of the robot’s facial expressions. The success of the expressions depends on the deformation of the skin on the outside of the robot, and the Facial Action Coding System (FACS) was applied to create emotions for the robot. Experiments were carried out to check the deformability of the silicone skin and examine the shape of the face. 3D images were collected by 3D laser scanning against a white background, with a matrix used for calibration. The research results focus on measuring the changes in the silicone skin and building 3D images from the scanning machine. In addition, the success of artificial skin for robots depends heavily on the materials that make up the skin. Recently, protein hydrogel materials have been studied extensively in many fields [62, 63]. In a study by Gogurla et al. [64], silk protein hydrogels are designed like skin tissue that is capable of harvesting biomechanical energy and sensing movements. This research opens up the possibility of energy-generating skins in robotics, soft robotics, and biomedical implants. The multifunctional hybrid hydrogel formed with hydrated melanin nanoparticles shows high potential in various applications, especially for humanoid robots and the biomedical field. An artificial skin constructed from polyimide/copper/polyvinylidene fluoride (PVDF) that detects 2D position, force, and humidity was proposed by Dai and Gao [65]. The proposed device, with its low cost and fast responsiveness, promotes research in the field of smart skin. However, this skin only detects the effects of 2D forces, whereas in practice forces come from different directions; tests on this are expected in future studies.
In addition, studies on artificial skin are carried out with the goal of ensuring safety when interacting with robots, such as relying on accelerometers to estimate the robot’s posture [66], adjusting the parameters of multiple e-skin sensors [67], or using proximity sensors to predict human-robot interactions [68]. Besides, the robot itself is used in position evaluation experiments to correct the spatial dimensions of the skin [69]. The authors did this carefully by comparing the experimental accuracy with that of studies in the same field. Many approaches, such as CAD modeling, 2D layout, 3D reconstruction, and robot kinematics, are used in combination to calibrate the skin. Overall, research on artificial skin tries to simulate the perception of human skin and is applied to many different fields, as presented above. Research on skin materials is especially anticipated in the field of medicine. Moreover, e-skins are applied to robots both in daily life and in industry to ensure safety for people when communicating with robots. Studies on materials for making human skin have also been carried out [70, 71] to create a realistic feeling like that of human skin. These studies need to be connected to the creation of artificial skins for robots that look and feel like human skin.

5. Humanoid Robot with Emotional Expression

“Robot” is a term that is no longer strange to humans. Various types of robots have been developed with different shapes for different applications, such as animal robots and interactive robots [72, 73]. Humanoid robots have been developed that can imitate or reproduce human movements. This type of robot is attracting a lot of attention and is at the center of many different research projects around the world. Studies by psychologists show that only 7% of information is conveyed by spoken language, while 38% is expressed through tone of voice and 55% is conveyed by facial expressions. It can be seen that the entry of facial expression robots into people’s lives will be an inevitable trend [74]. The head is the most important organ in social communication, where emotions are expressed through looks and facial expressions. In fact, in some cases, communication without language can convey specific feelings. Although emotions can be expressed through words, they cannot be as complete and true as when expressed by facial expressions. One of humanity’s first steps in bringing robots closer to humans was the KISMET interactive robot head in 1998 [75]. KISMET is a product of the MIT laboratory in America. It is one of the first robots capable of displaying social interactions and emotional expressions with humans. It was designed with a face similar to that of a cartoon character and a voice that resembles a child’s, so it easily creates sympathy in people. Most of the mechanical parts are made from steel, so the total weight of the robot is 7 kg. KISMET can recognize different human expressions and interact with them through emotional expression (happy, sad, angry, neutral, surprised, disgusted, and scared).

An emotional robot head was announced by Kobayashi et al. [76] in Japan. The study introduced robot emotions generated by 18 control points, with an artificial skin made from silicone to make the expressed emotions more realistic. However, the study was carried out in 1994, and no further publications followed in later years. In another study, Hackel et al. in 2005 [77] introduced a humanoid robot named BARTHOC based on the anthropometry of 4-year-old children. The proposed robot is capable of recognizing emotional content (happy, fearful, and neutral) from speech and reflects it with facial expressions. In [78], Rajruangrabin and Popa designed a robot head named Lilly, an upgraded version of the group’s previous design, the Hubo robot, which followed the facial anthropometric indices of Albert Einstein [79]. The telenoid robot [80] was created in 2010 by authors at a university in Japan. It is a new telecommunication system with the appearance of a white, childlike figure; the operator’s voice and facial and head movements are transmitted through the network and reproduced on the robot with the aim of transferring human presence. The telenoid hardware has 9 DoF and a mass of about 6 kg. It has been introduced for a variety of applications, such as caring for the elderly [81], communicating with students [82], and treating neurological disorders [83]. Although emotions are not yet expressed on the robot’s face, it received a positive response from participants [84]. Note that a telenoid is only a telecommunication system: it transmits information from one person to another and is not a robot capable of communicating by itself. Telenoid is widely used because of features such as its soft outer skin, actuators optimized to reduce costs, remote control, and compact size [85].

The “Geminoid robot” was introduced as a special remote-controlled robot; it is a form of android, that is, a humanoid robot covered with materials resembling human skin. Geminoid robots were introduced mostly by Professor Hiroshi Ishiguro and the robotics laboratory at Osaka University. They were first proposed by Nishio et al., whose first version, Geminoid HI-1 (HI-1 for short), was modeled on the anthropometric dimensions of Hiroshi Ishiguro, a professor of robotics in Japan. HI-1 is a remote-controlled android system; it has no intelligence and performs only preprogrammed actions such as breathing and blinking. HI-1 is 140 cm tall in a sitting position and cannot stand in this version; it has 50 DoF, of which 13 DoF in the head are used for facial expression [86]. Geminoid robots are widely used in studies that evaluate human communication with robots or implement AI models based on brain waves [87–89]. Later Geminoid robots followed, such as HI-2, HI-4, HI-5, Geminoid F, and Geminoid DK [90–92]; other androids, such as Otonaroid and Kodomoroid [93], were also announced. The properties and parameters of the Geminoids are presented in Table 2. In addition, a study published by Hashimoto et al. [94] introduced an android robot called Saya to serve the educational system. Saya is a teleoperated system set up for a variety of applications. The robot’s face is designed with 19 control points based on the Facial Action Coding System [95] to represent Saya’s facial emotions. The authors also present a survey on acceptance among students from elementary school to university. According to the results, Saya looks a bit intimidating to elementary students, and the university students think that Saya could be perfect if it had movable limbs. This is a study on the use of telerobots in education, but more experiments are needed to assess the differences between robotic and human teachers.
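As a minimal sketch of how control points based on the Facial Action Coding System can drive facial expressions, the Python example below maps an emotion to a set of action units (AUs) and then to activation levels for 19 control points. The AU combinations are commonly cited FACS prototypes and are used here only for illustration; the assignment of AUs to control-point indices, the names `EMOTION_TO_AUS`, `AU_TO_CONTROL_POINTS`, and `expression_command`, and the normalized activation range are all hypothetical and are not taken from the Saya study [94].

```python
from typing import Dict, List

NUM_CONTROL_POINTS = 19  # number of facial control points reported for Saya

# Commonly cited FACS prototypes (illustrative, not Saya's actual mapping).
EMOTION_TO_AUS: Dict[str, List[int]] = {
    "happiness": [6, 12],
    "sadness": [1, 4, 15],
    "surprise": [1, 2, 5, 26],
    "anger": [4, 5, 7, 23],
    "fear": [1, 2, 4, 5, 7, 20, 26],
    "disgust": [9, 15, 16],
}

# Hypothetical assignment of each action unit to control-point indices (0..18).
AU_TO_CONTROL_POINTS: Dict[int, List[int]] = {
    1: [0, 1],      # inner brow raiser
    2: [2, 3],      # outer brow raiser
    4: [4, 5],      # brow lowerer
    5: [6, 7],      # upper lid raiser
    6: [8, 9],      # cheek raiser
    7: [10, 11],    # lid tightener
    9: [12],        # nose wrinkler
    12: [13, 14],   # lip corner puller
    15: [15, 16],   # lip corner depressor
    16: [17],       # lower lip depressor
    20: [15, 16],   # lip stretcher
    23: [17],       # lip tightener
    26: [18],       # jaw drop
}


def expression_command(emotion: str, intensity: float = 1.0) -> List[float]:
    """Return one activation level in [0, 1] per control point."""
    if emotion not in EMOTION_TO_AUS:
        raise ValueError(f"unknown emotion: {emotion}")
    level = min(max(intensity, 0.0), 1.0)  # clamp the requested intensity
    command = [0.0] * NUM_CONTROL_POINTS
    for au in EMOTION_TO_AUS[emotion]:
        for point in AU_TO_CONTROL_POINTS[au]:
            # A point shared by several AUs keeps the strongest activation.
            command[point] = max(command[point], level)
    return command


if __name__ == "__main__":
    print(expression_command("happiness", 0.5))
```

In a real robot head, each activation level would still have to be converted into an actuator displacement for the corresponding control point, which depends on the mechanism and the skin material.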

These robots were designed with a focus on sophisticated facial expressions that represent emotions. According to the authors of these studies, Geminoid robots are created mainly to care for and talk with children and elderly people from a distance, and they can display gestures or take part in events remotely. A facial expression evaluation for the Geminoid F robot was performed by Becker-Asano and Ishiguro [91]. According to the authors, Geminoid F can represent 5 emotions (fear, anger, happiness, sadness, and surprise) using 7 DoF in the robot’s face, but it is incapable of moving around and standing upright. Most Geminoid robots can only operate the upper body, which also limits the danger to human identity. Nishio et al. presented an experiment using remote-controlled robots for different tasks [96]. However, Geminoids are currently developed mainly as telecommunication devices; that is, the authors do not embed AI algorithms that would give the robot the ability to act like a human. This avoids the threats posed by autonomous androids: the appearance of these Geminoids is so close to that of the people they copy that it is difficult to distinguish them from real people, which creates a risk of impersonation. Ferrari et al. [20] indicated that android robots threaten human identity because, in their view, these robots are likely to be taken for humans. Therefore, in our opinion, Geminoid and android robots should be produced only under clear regulations that strictly manage the identities of these robots to avoid threats to human identity.

Notably, an android robot called Erica, introduced by Glas et al. [97], is also a Geminoid made in Japan; it was developed as a conversational robot with a variety of roles. Unlike earlier Geminoids, it is designed with expressive, human-like speech synthesis and is considered an autonomous android. Erica also differs from the previously proposed Geminoid robots in that it is not designed to copy the appearance of any real person. In terms of appearance, Erica’s hardware was developed along with other Geminoids, with a telenoid-like face whose proportions are based on theories of plastic surgery. Erica was presented as the most human-like android; however, hand movements and the ability to move around are still missing. As shown, Erica is made to look like a real person and can give independent answers. This would raise concerns about a person’s identity if not well controlled; perhaps that is why Erica is not copied from any real person.

In addition to Geminoid robots, research on the emotional expression of humanoid robots has been carried out by many other authors. For example, the PKD head robot, made by Hanson Robotics with 28 DoF, was designed based on the anthropometric data of the famous writer Philip K. Dick [98]. This robot is covered with silicone skin that makes it look like a real human head, and its facial emotions are produced by 28 servo motors. The robot is still being improved and is used in many studies, such as pain detection in medicine and user surveys [99, 100]. Recently, an android head robot was introduced by Hyung et al. [101]. They built a head with 13 DoF, namely, 3 DoF for the left and right eyes and 10 DoF for the mouth. To create natural behaviors for the robot, the authors used a face model for face recognition and for locating 13 action units, so that the robot can generate facial expressions such as sadness, fear, and anger. They used 16 actuators to control 13 points identified from a probabilistic model that represents the backward mapping of the robot’s expressions. The android itself is not modeled, and no forward mapping is used for control. In addition, muscles in other locations of the face should also be considered in these studies to create realistic expressions for the robot. In their earlier research [102], Huang et al. described a more sophisticated control strategy that models both the forward and backward mappings, in which the next state is calculated from the previous value. A recurrent neural network (RNN) model was trained on the available data to predict the next value. However, the authors used only 7 motors for the head, so the states the android can express may be limited. A study by Choi et al. [103] proposed an android head robot with 30 DoF driven by 30 servo motors to express 13 basic emotions. The motors are arranged in 2 layers to optimize space and to control the skin layer easily. The emotions are expressed quite clearly by the robot, but the interest state is not expressed optimally. In 2011, Endo and Takanishi [104] introduced an emotionally expressive humanoid robot called Kobian, which was upgraded from a previously presented robot head study [105] and is intended to support daily activities. Kobian can walk on two legs and express emotions through gestures. The authors limited the emotions on the robot’s head to only 2, happiness and confusion, produced by 4 DoF in the head out of a total of 48 DoF. This design choice minimizes the structure of the robot and optimizes the position of the motors, which makes the robot neater and more flexible in its movements. Kobian has a human-like structure with main parts such as arms and legs that can move; in particular, the robot’s hands are designed to imitate human hands with soft materials, which helps the robot communicate with people. Earlier, a published study presented a Hubo robot designed in the likeness of Einstein [79]. The authors present a robot with an Einstein-like head with 28 DoF and a silicone outer skin. The Hubo robot has 66 DoF for all robot movements, which makes it an android robot that can move on its own.
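The recurrent one-step prediction mentioned above for the 7-motor android head [102] can be illustrated with a minimal Elman-style RNN in NumPy. This is only a sketch under stated assumptions: the hidden size, the random and untrained weights, the normalized motor range (0 to 1), and the names `init_params`, `rnn_step`, and `rollout` are hypothetical and do not reproduce the model or training data of the cited study.

```python
import numpy as np

NUM_MOTORS = 7    # number of head motors reported for the android in [102]
HIDDEN_SIZE = 16  # hypothetical hidden-state size


def init_params(seed: int = 0) -> dict:
    """Create small random weights; a real model would be trained on motor data."""
    rng = np.random.default_rng(seed)
    return {
        "W_xh": rng.normal(0.0, 0.1, (HIDDEN_SIZE, NUM_MOTORS)),
        "W_hh": rng.normal(0.0, 0.1, (HIDDEN_SIZE, HIDDEN_SIZE)),
        "b_h": np.zeros(HIDDEN_SIZE),
        "W_hy": rng.normal(0.0, 0.1, (NUM_MOTORS, HIDDEN_SIZE)),
        "b_y": np.zeros(NUM_MOTORS),
    }


def rnn_step(params: dict, x: np.ndarray, h: np.ndarray):
    """Predict the next motor values from the current values and hidden state."""
    h_next = np.tanh(params["W_xh"] @ x + params["W_hh"] @ h + params["b_h"])
    logits = params["W_hy"] @ h_next + params["b_y"]
    y = 1.0 / (1.0 + np.exp(-logits))  # sigmoid keeps outputs in (0, 1)
    return y, h_next


def rollout(params: dict, x0, steps: int) -> np.ndarray:
    """Feed each prediction back as the next input to generate a trajectory."""
    x = np.asarray(x0, dtype=float)
    h = np.zeros(HIDDEN_SIZE)
    trajectory = []
    for _ in range(steps):
        x, h = rnn_step(params, x, h)
        trajectory.append(x)
    return np.stack(trajectory)


if __name__ == "__main__":
    motors = rollout(init_params(), np.full(NUM_MOTORS, 0.5), steps=5)
    print(motors.shape)
```

Because each prediction depends on the previous value and on the hidden state, the generated motor trajectory changes smoothly over time, which is the property that makes recurrent models attractive for facial motion.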

An android robot called Nadine was applied in the study of intelligent interaction between humans and robots [106]. The Nadine robot was introduced in the study of Ramanathan et al. [107]; it was modeled after the external appearance of Professor Nadia Magnenat Thalmann, and its platform was developed from the previously proposed Eva robot [108]. Nadine’s control system processes the robot’s actions and combines them with online search algorithms and chatbots to handle different situations [109]. The robot can track and maintain eye contact with the person it interacts with, and gestures and facial expressions are performed by motors located inside the robot’s head. When designing Nadine, the authors tried to shape the robot’s external appearance so as to avoid the uncanny valley effect. Recently, a humanoid robot named Sophia was announced by the company Hanson Robotics in 2015. Sophia was granted citizenship by the Saudi government in 2017, which is considered to usher in a new era for robots [110]. Sophia is a robot that can communicate and express basic emotions on her face. According to the announcement from the production company, Sophia was created for jobs such as medical support, counseling services, and elderly care. Despite being equipped with many advanced technologies, Sophia has not convinced scientists in the field, some of whom think that it is just a chatbot system with limited information [111]. Moreover, a humanoid robot called Ameca was recently introduced by a company in the UK; it is capable of communicating and showing emotional behaviors on the face. However, the robot cannot move around and is operated by humans, and the information available about the Ameca robot is quite limited.

Mobility is also difficult for robots with pneumatic or hydraulic actuators because they need compressors to power these actuators. In another study [112], a cybernetic human, HRP-4C, was developed, which is a biped humanoid robot. According to the authors, this type of robot faces barriers such as low commercial value, high cost, and fragility. Because the stability of a robot walking on two legs depends strongly on the environment it moves in, the flexibility of the robot is limited. This is a very careful study: the authors investigated the characteristics and bone sizes according to the true size of the human skeleton, giving a total height and weight of 1.58 m and 43 kg, respectively. HRP-4C is used as a dancer or a presenter at events [113]. Another robot, called Ibuki [114], was designed as a childlike android with the ability to move on wheels. Ibuki is designed with a focus on the upper body, with mechanical structures in the head, neck, arm, and wrist joints. The entire Ibuki has 46 DoF, a mechanical height of 1200 mm, and a weight of 38.6 kg with batteries. With 15 DoF in the head, the robot is capable of expressing basic emotions through facial expressions. In a previous study [115], the authors expressed emotions on the Ibuki robot using gait-induced upper body motion. To the best of our knowledge, Ibuki has been carefully designed for use in studies on the expression of emotions by robots, and its wheeled design allows it to be more flexible during operation. Studies that evaluate and analyze the robot’s expressions in interactions with humans are expected to follow. In addition to proposals of complete humanoid or android robots, some authors study the behavior of individual parts of the human body. For example, Penčić et al. [116] proposed a model that simulates the eyes of a woman with a total of 7 DoF. The movements of the eyeballs and eyelids were simulated by the authors according to human parameters. The size and pressure angles of the model need to be reduced to optimize the parameters according to the anthropometric dimensions of the head. In later work, Penčić et al. [117] proposed a waist mechanism for humanoid robots with 3 DoF for the whole system to generate human-like waist movements. Finally, a study on detecting and correcting errors in the mechanism of the Sophia robot was done by Mayet et al. [118]. This is the first study on the automatic detection of mechanism failures in the head of an android robot.

6. Discussion

Humanoid robots are both an opportunity and a challenge for researchers. The study of android robots requires authors to understand the theories behind robot design, including the uncanny valley and the determinants of human emotions. These theories must be carefully considered to avoid creating robots that are intimidating to users. The most widely used hypothesis in the assessment of robot acceptance is the one stated by Mori. Currently, three theories have been put forward about the uncanny valley, as presented in Section 2. However, to date, no research has conclusively confirmed these hypotheses. Large-scale surveys with large, randomized samples are needed to support future studies in human-robot interaction. The conclusions of the studies surveyed cannot be generalized into a robust theory because most of them were carried out in only one or a few locations and for a hypothesis proposed by the authors themselves. Besides, methods of affective valence are considered tools for assessing affective perceptions [119]. Now that robots have reached the stage where they can be considered partners of humans, biological (ecological and technological) theories need to be considered to avoid serious design flaws. Reference [120], drawing on theological concepts, notes 4 factors: robots should not destroy natural relationships, should not compete with humans, should be consistent with social relationships, and should be accepted.

Android robots have a human-like appearance, which is an advantage in their applications as long as they do not fall into the uncanny valley. However, robots that look like humans and act independently like humans are something that people do not want. The potential applications of android robots are enormous, especially given the aging population in developed countries. Android robots can be used in the treatment of psychological conditions such as autism spectrum disorders, as studied by Kumazaki et al. [121, 122], for teaching, as published by Halbach et al. [9], in medical applications [123], and so on. The effectiveness of the android robot is clearly shown in the study [121], in which pathological indicators were reduced through training courses with the android robot. Since robots in this form can simulate humans to some extent, testing with larger sample sets is needed to check whether the effect of android robots is positive or negative. Design rules for humanoid robots need to be established to avoid adversely affecting human identity. Artificial skin is researched to give robots an artificial sense of touch for perceiving the surrounding environment and, especially, to ensure human safety around some industrial equipment. For the external shaping of android robots, materials such as silicone and other additives have also been studied extensively [124]. This review provides literature related to humanoid robots and can serve as a reference for research to improve and enhance the quality of robots.

7. Limitations of This Review and Recommendations

In this review, we tried to minimize omissions. However, some limitations remain, as we only considered publications written in English. The works reviewed also come from very different settings: some robots are made by large companies with many experts, and some are made by individuals or small groups. The Ameca robot is capable of performing human-like emotions, but it is a commercial product, so the documentation on this robot is limited. Besides, the conclusions about people’s perception of the robots are likely affected by intervention factors; the form of organization and the location of the surveys are also likely to affect the psychology of the participants who assessed the uncanny valley effect.

The population of developed countries is aging, and the development of robots for daily tasks is encouraged. Based on published studies, we suggest the following recommendations for future research. First, theories about the uncanny valley should be studied on a larger sample set, that is, on a random and widely distributed sample from many regions of the world. This is intended to strengthen the conclusions about the uncanny valley. Next, robot designs should focus on facial expressions because as many as 66 human emotions have been distinguished [125], and most of the robots’ emotional responses are not optimized. Moreover, costs should also be considered in the commercialization of robots. To our knowledge, most robots are commercialized at very high prices [126]. Finally, the role of governments needs to be considered carefully when creating robots that look and behave like humans.

8. Conclusions

Humanoid robots are increasingly attractive, and the development of robots is undeniable. As stated, most of the studies on human-robot interaction were performed in a few localities with different sampling methods, so the strength of their conclusions about the uncanny valley remains uncertain. The proposed uncanny valley theory has not been conclusively established, but Mori’s hypothesis has been cited and applied by most studies. Studies with large sample sizes and randomization are expected to strengthen these hypotheses. Robot skin is studied for the functions described, including protecting industrial workers and helping the robot sense the world around it. Future skin studies are expected to reproduce both the functions and the physical properties of human skin. Furthermore, humanoid robots are proposed with human-like functions and appearance. This can affect the identity of the person from whom the robot is copied, so robot design rules need to be established to avoid this situation. In this work, we have presented a summary of some concepts of emotion theory and the uncanny valley, followed by an updated review of android robots capable of expressing facial emotions. This study provides a systematic document for android robot design, from design theories to recently published works. The advantages and disadvantages of the robots are also discussed objectively based on the data published by their authors. We believe that although this review may not be exhaustive, it has succeeded in its goal of offering a concise yet appropriate conspectus on theories of emotion, the uncanny valley, and android robots.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research was funded by the University of Economics Ho Chi Minh City, Vietnam.