Abstract

Nowadays, there is a need for new applications that allow the definition and implementation of safe environments that attend to users' needs and increase their wellbeing. In this sense, this paper introduces the EJaCalIVE framework, which allows the creation of emotional virtual environments that incorporate agents, eHealth-related devices, human actors, and emotions, projecting them virtually and managing the interaction between all of these elements. In this way, the proposed framework allows the design and programming of intelligent virtual environments, as well as the simulation and detection of human emotions, which can be used to improve the decision-making processes of the developed entities. The paper also presents a case study that reinforces the need for this framework in common environments such as nursing homes or assisted living facilities. Concretely, the case study proposes the simulation of a residence for the elderly. The main goal is to have an emotion-based simulation to train an assistance robot while avoiding the complexity involved in working with real elderly people. The main advantage of the proposed framework is that it provides a safe environment, that is, an environment where users are able to interact safely with the system.

1. Introduction

Currently, considerable effort is being devoted to the areas of ubiquitous computing, social robotics, and wearable mobile devices. Advances in communication technologies such as Wi-Fi and Bluetooth have provided the means to deploy embedded systems in ordinary households. Furthermore, with the rise of the Internet of Things (IoT), every device can be connected to the Internet and transmit data [1]. With an array of these devices (which are, at their core, sensors and actuators), it becomes possible to build a sensing platform able to capture a wide range of information about the users. With this information, the environment can be adapted to the preferences of the people using the system, for instance, by changing the temperature and the lighting. Furthermore, it is possible to improve energy consumption by attending to factors that human users do not actively consider (such as preemptively closing window blinds).

These features are accomplished through machine learning algorithms, which provide the ability to learn the home users’ likes and preferences. Typically, in these systems, information about the users is introduced early on, and, by capturing the users’ interactions through sensors, the system adjusts those values to match personal preferences [2]. The user’s information is generated by performing computational operations on sensor data. These operations can be as simple as web service calls or as complex as mathematical functions run over sensed data. There is an issue with this approach: these systems become purely reactive and are unable to correctly anticipate what the user really wants. Much of the system behaves almost blindly; it does not matter whether the user is happy or sad, it reacts the same way to both emotions, even though they correlate with different needs.

The detection of human emotions is essential to give these platforms the ability to evolve over time and to reason over fuzzy states [3]. The detection and simulation of emotions can be considered a new type of interaction that allows the system to know the user’s emotional state and/or to simulate and express an emotion. The result of this type of interaction is the active response of the environment to the users’ emotional states.

The main goal is to provide tools that allow the definition and implementation of safe environments that attend to users' needs and increase their wellbeing [4]. Studies show that emotions have a direct impact on mental and physical health [5, 6]. Changes made by the environment may give users a sense of detachment and the impression that its decisions are not in their favor. This problem is mainly caused by different semantic interpretations of the messages exchanged between devices and/or human users. In order to overcome this problem, we propose the employment of social robots in these environments. A social robot is an autonomous or semiautonomous robot that interacts and communicates with humans by following the behavioral norms expected by the people with whom the robot is intended to interact [7].

Communication and interaction with humans are a critical point in this definition. Social robots increase the level of human-computer interaction and have social abilities like initiating a conversation and controlling our homes (BIG-i social home robot) (https://www.nxrobo.com/), or learning about the personality of the users (Jibo) (https://www.jibo.com/) or the user’s health condition and current treatments (Catalia Health) (http://www.cataliahealth.com/). These robots’ aim is to provide a human-like feeling to their interactions with the users. They should be able to respond appropriately to human affective and social cues in order to effectively engage in bidirectional communications [8].

There are relevant developments in the social robotics area, but some technological issues remain unsolved, such as the centralization of operations [9]. Robots are often unable to perceive the environment correctly due to limited sensors and processing capabilities. As a consequence, they typically serve only as a gateway to the rest of the platform. Furthermore, system scaling is typically a problem: these systems are designed for a single robot in a single home, the introduction of another robot is not considered, and, consequently, the outcomes of possible interactions remain unexplored.

Addressing these issues is critical; thus, it is necessary to have tools that manage the information in a distributed way. Our solution proposes the use of intelligent entities (agents) that are autonomous and decentralized. This way, homes or offices can be seen as autonomous entities working together in order to provide comfort and security to the inhabitants. The system must be able to cope with the dynamic introduction of new entities.

The proposed tool, called EJaCalIVE, is based on the multiagent system paradigm and gives users the ability to design and simulate emotional intelligent virtual environments (EIVE). Moreover, it incorporates elements of perception and action (machine learning algorithms, artificial vision, speech recognition, and communication with social robots and wearable devices), allowing the design and construction of EIVEs capable of interacting with human beings in a natural way. The main advantage is obtaining emotion-based simulations to test different configurations (such as training an assistance robot) while avoiding the complexity involved in working with real people.

To achieve this, the designed tool has two levels: a user level and a developer level. The developer level allows designing and simulating the EIVE, while the user level allows using and interacting with the IVE built by the developer.

The rest of the paper is structured as follows: Section 2 analyses previous works; Section 3 shows the proposed EJaCalIVE framework while Section 4 describes a case study based on that framework; finally, Section 5 explains some conclusions and future work.

2. Related Work

Ubiquitous computing and ambient intelligence (AmI) [10, 11] have changed the concept of the Smart Home to focus on the users’ quality of life. From these paradigms emerge platforms whose devices and software learn and adapt to the inhabitants’ tastes. There are already developments that address the most common concerns of Smart Home users, for example, energy consumption tracking [12] and safer environments for the elderly [13, 14]. AmI and IoT solutions may be used to address these concerns.

AmI and IoT projects establish environments where the users are surrounded by different kinds of technological elements [15] that help them with their daily tasks while remaining transparent to them [16]. A recent area that emerged from AmI and IoT is Ambient Assisted Living (AAL), whose goal is to provide assistive environments for elderly and disabled people. Due to their medical conditions, a regular home environment can be very challenging for these people. They need special assistance, as well as devices and services that help them perform activities of daily living (ADL).

In terms of related projects, there are recent developments in the AAL area as well as in intelligent virtual environments. From our observations, their common shortcomings are the lack of interoperability features and the neglect of emotion management. Some relevant projects in these areas are presented next.

The goal of the NACODEAL project [17] is to provide an augmented reality device that shows users information about the activity they should perform or are performing. This tool is very useful for people with cognitive problems or who need assistance when performing a complex task, as the device shows step-by-step instructions for those tasks. An implementation is presented in [18], where a system that uses inexpensive devices (projectors, cameras, and speakers) to guide a user through a home environment is described, showing physical directions and warnings about the surroundings. Internally, the system resorts to virtual environments to forecast the possible outcomes of the user's actions. This project changes the Smart Home paradigm by incorporating sensors in the devices that the user carries, thus having a mobile sensor system and requiring fewer home devices. Although this project presents a complex and interesting sensor system and an innovative visual interface, it lacks the ability to detect emotions and truly interpret the intentions of the users. Furthermore, the users cannot input their preferences or interact outside the predetermined actions.

The PersonAAL project [19] aims to provide a tool for elderly people that reduces their dependence on caregiving services. The tool is installed in each user's house and constantly monitors the users and their activities. The main idea is to use smartphones and wall displays to provide a virtual environment with useful information about the activity being performed, adjusting the visual interfaces to each user according to their taste and medical condition. The issue with this tool is the limited information and features it provides (it only displays information, without any action through actuators or robotic assistants).

The Active@Work project [20] addresses an often forgotten area: the workplace. The main goal is to help elderly workers in their workplace using a virtual assistant. This assistant receives information from biosensors and identifies possible risk situations by comparing readings with normal sensor values. Furthermore, the project aims to develop a visual interface that sensitively communicates possible problems to users and caregivers, along with a virtual environment for collaborative processes that load-balances the work and reassigns some of the elderly workers' tasks to younger employees. This project showcases the use of sensor systems to detect and address health issues, as well as emotional and stress levels. The issue is that it proposes only limited actions when critical levels are reached; thus it does not fully work in the best interest of the monitored people.

Finally, the Pepper robot [21], which has a humanoid aspect, was designed to identify the emotions of the users and select the behavior most appropriate to the situation. This robot takes into account the voice, facial expression, body movements, and oral expression, interpreting the emotion presented and offering the appropriate content. The robot is able to identify joy, sadness, anger, or surprise and responds to the mood of the moment, expressing itself through the color of its eyes, its tablet, or its tone of voice. Furthermore, it uses gamification procedures to engage with the users and keep them interested, and it follows the users, interacting with them as they move. Through its arms it is able to shake hands and express physical emotions. One issue with this robot is that it is, at its core, a companion (conversational) robot, unable to carry objects or assist with tasks, and, although it is connected to the Internet, the amount of information it is able to present (visually or audibly) is reduced to simple responses to vocal commands. Also, it is unable to anticipate actions or movements; thus it is not prepared to prevent critical situations and is only able to respond directly to the immediate emotions displayed by the users.

EJaCalIVE is designed to address the issues that these projects overlook, providing a platform and a robot able to attend to the users' issues and needs.

3. EJaCalIVE (Emotional Jason Cartago Implemented Intelligent Virtual Environment)

This section focuses on the presentation of the EJaCalIVE framework. This framework allows the design and programming of intelligent virtual environments, as well as the simulation and detection of human emotions, for the creation of IoT and ubiquitous computing (UC) applications.

EJaCalIVE is a tool that allows the design and programming of these new human-agent societies while incorporating the detection and simulation of emotions. EJaCalIVE is divided into two parts. The first one focuses on the design and programming of the intelligent virtual environment (IVE) and the second one on the detection and simulation of emotional states. For the design of IVEs, EJaCalIVE uses the MAM5 metamodel [22], which is based on the agent and artifact (A&A) metamodel [23, 24]. The A&A metamodel establishes that, within an environment, there are two types of entities: intelligent ones (Agents) and objects (Artifacts). Building on this idea, the MAM5 metamodel goes a step further for designing an IVE, distinguishing the entities (agents and artifacts) that have a physical representation in the virtual environment from those that do not. EJaCalIVE introduces different specializations of agents and artifacts, which give the designer a wide range of entities to realize their design. Figure 1 shows the different entities that can be represented in EJaCalIVE.

Going into more detail, the EJaCalIVE structure, inherited from the MAM5 model, divides the IVE into two kinds of workspaces: a Workspace inhabited by entities that do not have a virtual representation and an IVE_Workspace inhabited by entities with a virtual representation. In turn, within each of these workspaces, the entities are divided into two main classes, corresponding to agents and artifacts. Thus, in the first kind of workspace (no 3D representation), there are agents and artifacts just as defined in the A&A metamodel. In the IVE_Workspace, on the other hand, there are the entities with a virtual representation: the Inhabitant Agents and the IVE_Artifacts. Moreover, this second kind of workspace allows more specific subclasses: Smart Device Artifacts (SDA) [25] and Human-Immersed Agents. SDAs are specialized IVE_Artifacts that allow the developer to make a connection with the real world. This connection gives Agents and Inhabitant Agents the ability to interact with the real world, acquiring information through sensors or acting on the real world through actuators. The Human-Immersed Agent is an Inhabitant Agent that serves to immerse a human in the system: for the rest of the system it is an Inhabitant Agent, but it is also a communication bridge between the virtual world and the real world.
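
To make this taxonomy more concrete, the following is a minimal, purely illustrative sketch of how the MAM5-style entity hierarchy could be organized. EJaCalIVE itself is built on Jason and CArtAgO, so these Python class names are assumptions used only to convey the structure, not the framework's actual API.

```python
# Illustrative sketch of the MAM5-style entity hierarchy (names are assumptions).

class Agent:
    """Intelligent entity without a virtual (3D) representation."""

class Artifact:
    """Passive object without a virtual (3D) representation."""

class InhabitantAgent(Agent):
    """Agent with a virtual representation inside the IVE_Workspace."""
    def __init__(self, position):
        self.position = position  # 3D position in the virtual environment

class IVEArtifact(Artifact):
    """Artifact with a virtual representation (e.g., a piece of furniture)."""
    def __init__(self, position):
        self.position = position

class SmartDeviceArtifact(IVEArtifact):
    """IVE_Artifact bound to a real sensor or actuator (bridge to the real world)."""
    def read_sensor(self):
        raise NotImplementedError  # e.g., return a reading from a physical device

class HumanImmersedAgent(InhabitantAgent):
    """Inhabitant Agent that represents a real human and bridges both worlds."""
```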

The introduction of emotions into the IVE design, as a new form of communication and interaction between entities, together with the above-mentioned possibility of accessing not only the virtual environment but also the real world (as in augmented reality applications), opens the door to the creation of new IoT and UC applications and complex simulations. This new emotional component gives the different entities the ability to simulate and/or detect human emotions. For this, it is necessary to introduce different emotional models that are commonly used in psychology and widely used in computer science. Among these models we can find the big-five personality model (OCEAN), the PAD emotional model [26], and the Circumplex Model [27]. EJaCalIVE uses different communication channels (image processing through cameras, text, voice, body gestures, or biosignals) for the detection of emotions. This input data is processed with machine learning algorithms, such as SVMs or neural networks, to detect and classify emotions.

Figure 2 shows the main structure of the EJaCalIVE framework along with the different modules it is based on. As can be observed in this figure, EJaCalIVE is supported by four engines: cognitive, artifact, physical, and emotion. Together, these engines allow the developer to design and program an Emotional Intelligent Virtual Environment (EIVE). The Cognitive Engine is in turn supported by Jason, the agent platform, which schedules each of the agents' behaviors. The Artifact Engine is supported by CArtAgO, which allows creating the various objects inside the EIVE. The Physical Engine is supported by JBullet, which allows introducing physical restrictions (gravity, IVE_Artifact position, speed, and acceleration, among others) that are governed by the IVE_Workspace. Finally, the Emotion Engine is responsible for simulating and classifying human emotions, as well as for calculating the social emotion and the emotional dynamics of the human-agent society.

Each one of the engines is defined by the developer through an XML file, which is later interpreted by EJaCalIVE to create the different templates for agents, artifacts, and workspace data.

The following subsection describes the Emotion Engine included in EJaCalIVE (the other engines are described in more detail in [28, 29]).

3.1. EJaCalIVE Emotion Engine

The Emotion Engine plays an important role within the framework, as it is responsible for incorporating the different human emotions. This engine can be used by all agents to detect, process, and simulate emotions. It also allows the developer to use all simulated or detected emotions to calculate a social emotion.

The processes of detection and simulation of emotions are described below.

Emotion detection: EJaCalIVE’s Emotion Engine detects emotions using artifacts designed to perform this task. However, EJaCalIVE allows the developer to connect any other hardware that can perform such detection. By default, EJaCalIVE incorporates two ways of performing the detection: the first is through image processing and the second uses an Emotional Wristband.

For the detection of emotions through image processing, the system uses the Face Landmark Algorithm, which extracts the characteristic points of the face from the images [30]. An example of image-based detection can be found in [31]. To characterize a face image, a feature vector is created, storing not only the characteristic points but also the Euclidean distances between those points. This feature vector serves as the input for a neural network that outputs the corresponding emotion expressed by the face. The network has been previously trained with similar vectors calculated from a database of face images representing different emotions. Although EJaCalIVE uses neural networks to perform the classification, it allows developers to use their own classifying scripts. This is done through an artifact dedicated to this task (scripts can be written in Python or Node.js).

To perform the detection of emotions through the Emotional Wristband, we employ a design described in more detail in [32]. The band is worn by a human; an agent embedded in the band perceives variations in the resistance of the skin. These variations are preprocessed to extract a feature vector, which is used by a neural network to classify the emotion (in a similar way as in the image-based emotion detection).

Emotion simulation: if we want to simulate the emotions of humans, we first need to obtain the personality values of the involved humans. These values are obtained by having these humans take the big-five personality test (https://personality-testing.info/printable/big-five-personality-test.pdf). This test models the personality of an individual through five factors: factor O (openness to new experiences), factor C (conscientiousness or responsibility), factor E (extraversion or extroversion), factor A (agreeableness or kindness), and factor N (neuroticism or emotional instability). If the agent is simulated, we need to determine its personality. To do this, it is necessary to assign values to each of the OCEAN components as shown in Table 1 [33].
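
As an illustration of the image-based detection pipeline described above, the following sketch builds the feature vector (landmark coordinates plus pairwise Euclidean distances) and feeds it to a neural network classifier. The use of scikit-learn and a 68-point landmark detector are assumptions made only for illustration; the paper does not state which libraries are used.

```python
# Sketch of the image-based emotion pipeline (library choices are assumptions).
import numpy as np
from itertools import combinations
from sklearn.neural_network import MLPClassifier

def feature_vector(landmarks):
    """Landmark coordinates plus pairwise Euclidean distances between them."""
    pts = np.asarray(landmarks, dtype=float)  # shape (n_points, 2), e.g. 68 face points
    dists = [np.linalg.norm(pts[i] - pts[j])
             for i, j in combinations(range(len(pts)), 2)]
    return np.concatenate([pts.ravel(), dists])

# Training uses vectors computed from a labeled face-image database:
# clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500).fit(X_train, y_train)
# Classification of a new face:
# emotion = clf.predict([feature_vector(new_landmarks)])[0]
```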

It is then possible to use these personality values to calculate the emotion that the agent has at the moment of initiating the simulation. For this, we use the equation proposed in [34], which determines the initial emotion in terms of PAD (Pleasure, Arousal, Dominance).
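
The equation itself is not reproduced here. For reference, the OCEAN-to-PAD mapping most commonly cited in the literature (attributed to Mehrabian) has the following form; the exact coefficients used in [34] may differ, so this should be read as an assumed illustration:

Pleasure = 0.21·Extraversion + 0.59·Agreeableness + 0.19·Neuroticism
Arousal = 0.15·Openness + 0.30·Agreeableness - 0.57·Neuroticism
Dominance = 0.25·Openness + 0.17·Conscientiousness + 0.60·Extraversion - 0.32·Agreeableness

where the big-five factors are normalized values and the resulting Pleasure, Arousal, and Dominance values define the agent's initial PAD emotion.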

Once these first PAD values have been calculated, emotions can vary during the simulation process. This variation is produced through the agent's perceptions and depends on the scenario being simulated (e.g., when music is used to modify the mood of people in a pub, the agents perceive the music and their emotions evolve according to their current emotion and the kind of music they like [31]).

As commented above, the Emotion Engine allows the developer to use all simulated or detected emotions in order to calculate a social emotion. The concept of social emotion allows the developer to know the emotional state of a group composed of humans and agents. This social emotion is represented as a triplet composed of the Central Emotion (the average of the emotions of the individuals in the group), the maximum distance between the Central Emotion and the individuals' emotions, and the dispersion of the emotions around the Central Emotion (see [29] for more details).
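
A minimal sketch of how this triplet could be computed from the individual PAD emotions is given below. The exact dispersion measure used in [29] is not reproduced here, so the per-axis standard deviation is assumed purely for illustration.

```python
import numpy as np

def social_emotion(pad_vectors):
    """Social-emotion triplet from individual PAD emotions.
    pad_vectors: array-like of shape (n_individuals, 3) holding (P, A, D) values."""
    emotions = np.asarray(pad_vectors, dtype=float)
    central = emotions.mean(axis=0)                      # Central Emotion (group average)
    distances = np.linalg.norm(emotions - central, axis=1)
    max_distance = distances.max()                       # farthest individual from the centre
    dispersion = emotions.std(axis=0)                    # assumed dispersion measure
    return central, max_distance, dispersion

# Example: emotions of three simulated elderly agents.
central, max_dist, disp = social_emotion([(0.4, 0.2, 0.1), (-0.3, 0.5, 0.0), (0.1, -0.2, 0.3)])
```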

Working with social emotions allows comparing two groups of individuals through the distance between their social emotions, in order to know how close or far apart they are from an emotional perspective, or even comparing the social emotion of a group with a target emotion.

It is possible to modify the individual emotions of each person, causing the social emotion to change and thus making the distance between social emotions increase or decrease. However, emotions are dynamic, as are the interactions between agents and humans. Moreover, emotions can spread among the individuals being simulated. For this emotional contagion to be modeled, it is necessary to take into account not only personality traits, such as the empathy commented above, but also the affinity between individuals. In this sense, EJaCalIVE incorporates a dynamical model that allows the designer to model emotional contagion between individuals.
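
The concrete contagion equations belong to the cited dynamical model and are not reproduced here. Purely as a hypothetical illustration of the kind of update such a model performs, each individual's emotion could be pulled towards the emotions of the others, weighted by pairwise affinity and by the receiver's empathy:

```python
import numpy as np

def contagion_step(emotions, affinity, empathy, rate=0.1):
    """Hypothetical emotional-contagion update (not EJaCalIVE's actual model).
    emotions: (n, 3) PAD vectors; affinity: (n, n) matrix with values in [0, 1];
    empathy: (n,) how strongly each individual absorbs the others' emotions."""
    emotions = np.asarray(emotions, dtype=float)
    # Weighted pull of each individual towards the others: sum_j a_ij * (e_j - e_i)
    pulls = affinity @ emotions - affinity.sum(axis=1, keepdims=True) * emotions
    return emotions + rate * empathy[:, None] * pulls
```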

4. Case Study

In this section, we present a case study that uses EJaCalIVE to simulate a residence for the elderly. Due to the complexity involved in working with these people, an emotion-based simulation is proposed to train an assistance robot. The robot interacts with the agents, detecting their emotions and communicating with the caregiver if there is any variation. This emotional variation allows the caregiver to decide whether the activities need to be changed in order to modify the people's emotions.

The simulation presented in this section is divided into a virtual component and a real component. The virtual component is responsible for simulating the elderly; to do this, we use three emotional agents. These agents have different personalities and a list of characteristics that distinguish them (features such as affinity, empathy, and activity tastes). These different characteristics make each agent's emotion be affected in a different way by environmental stimuli. The real component is performed by the robot, which can detect emotional states and change its behaviour depending on the emotion detected. In this way, the simulation not only trains the mechanisms used by the robot to detect (and act to try to modify) human emotions but also provides an environment where the robot can interact with the elements of an IVE. For this reason, other elements, such as furniture (chairs, sofa, table, and floor), have been simulated as IVE_Artifacts. These elements act as virtual obstacles to be avoided by the robot (in the real world). We have also integrated a human into our simulation, modeled as an Emotional Human-Immersed Agent. This agent is a virtual representation of a caregiver and allows us to test the emotion detection mechanism using the camera and the interaction through different actions. Figure 3 shows in the upper left corner the metamodel design of the case study system, formed by 4 IVE_Artifacts (modelling the table, chair, sofa, and TV), 4 Emotional Inhabitant Agents (modelling the assistance robot and the 3 elderly people in the residence), and one Emotional Human-Immersed Agent (modelling the caregiver). EJaCalIVE compiles this metamodel and automatically builds the skeleton files for these agents, artifacts, and the IVE_Workspace (seen in the upper right corner of Figure 3). The last part of the figure shows a view of the resulting simulation.

4.1. Robot Description

The robot is responsible for interacting with the elderly; this interaction is performed through emotion detection and the proposal of activities. As the robot moves in the real world, it is important to provide it with a series of sensors that help it navigate, perceive human emotions, and interact with people. For this reason, the robot architecture has been divided into two levels.

The first level provides the robot with the capability of controlling the motors and gives access to different sensors that allow it to perceive the environment; we employ sensors such as ultrasound, magnetometers, and gyroscopes. This control has been developed using an Arduino Mega (https://www.arduino.cc/en/Main/arduinoBoardMega). The data acquired by these sensors and processed by the Arduino are sent to the agent; this information constitutes the agent's knowledge of the environment (inclination angle, long-distance obstacle detection). In addition, this low-level control has a reactive behaviour that allows it to react to external events without requiring reasoning by the agent (see Figure 4).
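
The paper does not detail the link between the two levels. Assuming, for illustration, a USB serial connection between the Arduino Mega and the cognitive level, the sensor stream could be read with a sketch like the following (device path, baud rate, and message format are assumptions):

```python
import serial  # pyserial

# Assumed USB serial link between the Arduino Mega and the Raspberry Pi cluster.
link = serial.Serial("/dev/ttyACM0", 115200, timeout=1)

def read_low_level():
    """Parse one line of low-level sensor data, e.g. 'inclination=2.3;ultrasound=87'
    (the message format is a hypothetical example)."""
    line = link.readline().decode("ascii", errors="ignore").strip()
    if not line:
        return {}
    return {key: float(value)
            for key, value in (pair.split("=") for pair in line.split(";") if "=" in pair)}
```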

The cognitive level was developed with a Raspberry Pi 3 (https://www.raspberrypi.org/). This level is responsible for recognizing faces and detecting emotions through image processing using a camera. At the same time, this level is in charge of controlling the robot's movement. The robot includes an LCD touchscreen through which users can interact, as well as speakers and microphones. All the described processes consume a large amount of resources, which is the main reason why the robot has three Raspberry Pi boards connected as a cluster (see Figure 5).

The cluster configuration allows distributing the information and the different processes to be carried out. In the robot, each node hosts an agent with a specific task and different resource needs:

(i) "node_1" controls the LCD touchscreen. It executes the person identification and emotion classification behaviors. At the same time, this node is in charge of displaying the emoticons corresponding to the detected emotions.

(ii) "node_2" incorporates the robot's behaviors in charge of acquiring data from the temperature, CO2, and relative humidity sensors. This information is used by the robot to determine whether the environmental conditions where people are located are adequate, that is, whether the temperature is adequate, the humidity level is right, and the CO2 level is suitable. These values are acquired using a sensor HAT on the Raspberry Pi. At the same time, this agent is in charge of carrying out speech recognition and text-to-speech conversion.

(iii) Finally, "node_3" incorporates the behaviors in charge of communicating with the low-level control. For instance, it sends the values that make the robot move within the environment and receives the information sent by the sensors located in the low-level control, that is, velocity values, rotation angles, ultrasonic distances, motor positions, and so forth.

All the agents are interconnected through the SPADE platform (http://spade.gti-ia.dsic.upv.es). The agents located in the different nodes communicate through messages over the cluster's network switch, distributing the information among themselves.
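
As an illustration of this message-based coordination, the sketch below uses the SPADE Python library to have one node relay an environmental reading to another. The agent identifiers, the message body, and the use of the current SPADE 3 API are assumptions; the version deployed on the robot may differ.

```python
# Sketch of inter-node messaging with SPADE (identifiers and API version assumed).
from spade.agent import Agent
from spade.behaviour import CyclicBehaviour
from spade.message import Message

class EnvironmentAgent(Agent):
    class RelayReadings(CyclicBehaviour):
        async def run(self):
            msg = Message(to="node_1@robot.local")   # hypothetical receiver JID
            msg.set_metadata("performative", "inform")
            msg.body = "temperature=22.5;co2=410;humidity=45"
            await self.send(msg)

    async def setup(self):
        self.add_behaviour(self.RelayReadings())

# Typical startup (JID and password are placeholders):
# agent = EnvironmentAgent("node_2@robot.local", "secret")
# agent.start()
```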

The personality of the robot was defined using the OCEAN values, as defined in Table 1, taking into account that, since the robot has to be at the service of humans, the OCEAN values are set high so that the robot has the following characteristics:

(i) Agreeableness: tendency to be compassionate and cooperative towards others.

(ii) Conscientiousness: tendency to act in an organized or thoughtful way.

(iii) Extraversion: tendency to seek stimulation in the company of others.

(iv) Neuroticism: the extent to which emotions are sensitive to the individual's environment.

(v) Openness: tendency to be open to experiencing a variety of activities.

Since the robot is real and the older people are simulated through agents, we have defined two restrictions in this simulation. The first one is that the agents communicate their emotional states through messages, since there is no emotional representation through on-screen avatars. The second one is that the robot knows the emotions of all the people. This allows us to calculate the social emotion and also determine the emotional dynamics of the group. Using the emotional dynamics, the robot can determine which person has the most influence over the others and, in particular, which person causes the group's emotion to fall. This information is used by the caregiver to design actions focused on the people who make the group's emotion fall.

5. Conclusions and Future Work

This paper has presented the EJaCalIVE framework, an intelligent virtual environment framework that implements the concept of emotions and allows integral interaction with human beings. The main goal of this project is to build a robust virtual environment able to capture emotions and reproduce them in its agents and artifacts.

The social aspect of this project is to provide assistance to an elderly community by reading each person's emotional status and interacting through environmental changes. The robot developed (although it is still an initial build) is used as a humanoid interface to ease the interaction with the people in its environment. Furthermore, its mobility helps in terms of moving the screen and sensors, meaning that, instead of requiring several sensor systems and forcing users to move to locations with interaction interfaces, we are able to bring the interface to them. We believe that this approach has a smaller visual impact and is less intrusive than distributing fixed sensors and displays throughout the environment. Previous studies show that robots are well accepted by elderly people.

The combination of the robot and the agent projections of the elderly community has provided interesting preliminary results. Unfortunately, due to the fragility of the elderly people and the alpha version of the robot, tests in a real environment with real people were not yet possible. However, that restriction led to enhancements of EJaCalIVE [25], with the introduction of extended emotion representation, artifacts (and their specializations such as smart devices), and social emotion representation. In fact, these developments are now part of the core ecosystem operation; that is, EJaCalIVE will be used even when real users and the robot are interacting with each other. In terms of future work, we aim to build a user-safe version of the robot, deploy it in an environment with real users, and capture the interactions with them, which may result in a validation scenario. Furthermore, we aim to develop robust quality-of-information methods able to assess and quantify scenarios with a lack of information, which correspond to most real-life events.

Finally, the trained robot will be tested by the elderly of a daycare center in the northern area of Portugal, the Centro Social Irmandade de S. Torcato. After the tests, a validation will be performed through a questionnaire given to the caregivers (registered nurses and medical personnel), aiming to identify the results obtained and the problems detected.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work is partially supported by the MINECO/FEDER TIN2015-65515-C4-1-R and the FPI Grant AP2013-01276 awarded to Jaime-Andres Rincon. This work is also supported by COMPETE: POCI-01-0145-FEDER-007043 and Fundação para a Ciência e Tecnologia (FCT) within the projects UID/CEC/00319/2013 and Post-Doc scholarship SFRH/BPD/102696/2014 (A. Costa).