Advances in Human-Computer Interaction
Volume 2014 (2014), Article ID 386928, 10 pages
Research Article

Encoding Theory of Mind in Character Design for Pedagogical Interactive Narrative

1Cognitive Science Department, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180, USA
2Institute for Creative Technologies, University of Southern California, 12015 Waterfront Drive, Playa Vista, CA 90094-2536, USA

Received 31 October 2013; Revised 29 May 2014; Accepted 14 June 2014; Published 23 October 2014

Academic Editor: Stefan Kopp

Copyright © 2014 Mei Si and Stacy C. Marsella. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Computer aided interactive narrative allows people to participate actively in a dynamically unfolding story, by playing a character or by exerting directorial control. Because of its potential for providing interesting stories as well as allowing user interaction, interactive narrative has been recognized as a promising tool for providing both education and entertainment. This paper discusses the challenges in creating interactive narratives for pedagogical applications and how the challenges can be addressed by using agent-based technologies. We argue that a rich model of characters and in particular a Theory of Mind capacity are needed. The character architecture in the Thespian framework for interactive narrative is presented as an example of how decision-theoretic agents can be used for encoding Theory of Mind and for creating pedagogical interactive narratives.

1. Introduction

Story has been a central part of the human experience, both as entertainment and as a powerful tool for providing pedagogy. We watch movies, read novels, and tell stories as a regular part of our lives. With the rapid development of computer technologies, a new form of media—computer aided interactive narrative—has received increasing attention in recent years. Interactive narrative allows a user to play a role in a story and interact with digital characters driven by a computer system. Unlike reading a novel or watching a movie, the users can actively participate in interactive narratives. They can explore the story world and see the effect of their actions, both physically and socially, in the immediate and longer terms. If the user is unsatisfied with how the story ends, he or she can start over and try alternative options.

When applied to training cultural and social skills, interactive narrative provides a unique platform that combines the pedagogical power of narrative with an active learning experience. The ability of narrative to captivate the audience and the direct link between actions and outcomes ideally engage users, motivate them to spend more time learning (e.g., to explore alternative paths in the story), and appropriately contextualize the experience. Moreover, interactivity and the user’s experience of agency can promote intrinsic motivation in learning [1] and support learning in context and replay [2–7].

Interactive narrative has been recognized as a promising teaching tool in a wide variety of social interaction scenarios, such as HIV prevention [8], PTSD intervention [9], language and cultural skills training [10], antibullying interventions for young children [11], negotiation and communication skills training [12, 13], and social learning (see [14] for a review). Outside of social interaction training, interactive narrative-based games have also been used for science education [15, 16], military decision-making training [17], and health education for children such as Lungtropolis (2012) where children learn about asthma.

The enthusiasm for interactive narratives goes beyond its use in training. In recent years, games for entertainment that emphasize the social and narrative aspects of the player’s experience have become increasingly popular. This is evidenced in recent major titles such as Heavy Rain, Mass Effect, and Bioshock. Game designers have been looking into ways to use rich characters and narratives to engage the player and to provide the central experience of the game. For example, Wang et al. argue that “in order to be a highly enjoyable game that is often deliberately selected and played over a longer period—a game must utilize both narrativity and social interaction to promote a player’s emotional engagement and elevate the level of pleasure in game play” [18]. In fact, the rapid growth of interest in interactive narrative is in part motivated by the explosion of video games in recent years.

In this paper, we discuss key challenges in creating narrative-based pedagogical systems. We argue that a rich character model and, in particular, a Theory of Mind capacity are needed for addressing these challenges. The Thespian framework for authoring and simulating interactive narratives is discussed as an example of a computational system capable of modeling rich characters with a Theory of Mind. We discuss examples of using the Thespian framework to author pedagogical interactive narratives followed by a discussion of future work.

2. Design Goals for Pedagogical Interactive Narrative

Our goal is to support the creation of interactive narratives for social interaction training, such as language and culture skill training, and negotiation training. Central to many social interactions is the participants’ ability to understand and predict others’ behaviors, which requires people to have a Theory of Mind—beliefs about self and others. By giving characters a capacity for Theory of Mind, we seek to replicate this key aspect of human social interaction in dynamically unfolding interactive narratives. We next illustrate Theory of Mind as the basis for other desiderata of social characters.

2.1. Narrative Coherence and Experience of “Presence”

The basic goal of interactive narrative is to immerse the user in a narrative experience in which his or her behavior has an impact on how the narrative unfolds. The sense of presence, which is often described as “feeling as if being there” [19–22], has been shown to be an important factor both for providing entertainment and for ensuring the efficacy of virtual training and the subsequent transfer of knowledge and skills to the real world [23]. Many factors contribute to the experience of presence. In order for the user to feel presence in a virtual environment, either with or without a narrative component, the content of the virtual environment needs to be meaningful [24]. The user should be able to form a mental model of the virtual world [25–28]. Interactivity and the sense of agency have been shown to affect the experience of presence [29]. In particular, the user should know his/her own possible actions and be able to anticipate the results of his/her actions as well as the actions and movements of other characters and objects [30]. In other words, the user must be able to make sense of the experience.

In the context of narrative design, this sense-making requirement relates to ensuring coherent narrative, which is defined as “the semiotic representation of a series of events which are meaningfully connected in a temporal and causal way” [31]. (Coherent) narrative has been shown to be an important way for people to organize and make sense of their experience [32]. In an interactive environment, the story each user goes through needs to be coherent, and user interaction should not be allowed to break the narrative. Coherent narrative is particularly important for interactive narratives designed for pedagogical purposes because we want the user to understand his/her experience in the way intended by the designers.

Theory of Mind, we argue, is a powerful generative approach to creating characters whose behaviors make sense to a user. Theory of Mind allows the characters to act in well-motivated ways in dynamic social contexts. Using Theory of Mind, the characters can form expectations about others’ emotional and physical responses to their potential actions and then decide what to do based on the expectations. Of course, characters’ being well-motivated does not mean the user should always be able to understand the characters’ intentions during the interaction. In fact, a certain level of unexpectedness or confusion is welcomed both for pedagogical purposes—allowing the user to pause and think deeper about the characters and the story—and for entertainment purposes, as we will discuss later. Discrepancies among the user’s and the characters’ beliefs and expectations are often the source for creating dramatic effects. The “truth” about the characters—their motivations, beliefs, and reasoning processes—can be revealed to the user after the interaction if needed.

2.2. Affordance for Social Interaction

Secondly, characters must also support and maintain social interaction with the user. For this purpose, the characters need to understand social norms and follow the norms unless they are motivated to violate them. By doing so, the characters become reliable and provide an incentive for people to interact with them.

Again, we argue that characters’ having Theory of Mind is important for realizing this function. Norms describe the general behavioral patterns that are expected to emerge during social interaction. Some norms can be expressed as simple reaction rules, for example, that a person should return a greeting when greeted. Other norms involve understanding another’s intentions and situation, such as not asking improper questions. Enabling characters to “understand” social norms in context and to make tradeoffs between following norms and addressing their other high-priority goals requires them to have mental models of others, including other characters’ mental models about themselves. The mental models enable the characters to understand what others expect of them.

Furthermore, to create the experience of interacting with an intelligent social character, the characters should be designed with a “mind” that gives them characteristics [33–35]. Breazeal studied the requirements to “promote the illusion of a socially aware robotic creature” and argued that “to socially engage a human, its behavior must address issues of believability such as conveying intentionality, promoting empathy, being expressive, and displaying enough variability to appear unscripted while remaining consistent” [33]. Similar arguments have been given for creating digital companions [36] and for assistant robots [37]. It has been found that even when the assistant robot does not need to represent a social character, having a “personality” helps the user to understand and predict its behaviors. Of course, having Theory of Mind alone does not guarantee the believability of the characters. However, Theory of Mind provides a good foundation for representing characters’ intentions and emotions. Many emotions and, in particular, social emotions are determined by what the person’s expectations are [38–40].

2.3. Dramatic and Pedagogical Goals

In addition to the basic goals for creating a believable social interaction environment, we want to give the designers of interactive narratives some ability to predict and affect the user’s emotional experience. The reason for this goal is twofold. First of all, the engaging power of narrative is often related to the audience’s subjective emotional experiences. For example, Tan argues that the central purpose of entertainment films is to evoke emotions in the viewer and that films are “emotion machines” that create specific patterns of tension and relaxation over the course of the experience, arouse interest, and induce the facsimile of fear, and so forth [41]. Secondly, the goal of some pedagogical systems is to invoke the user’s emotions or allow the user to practice his/her decision-making under certain emotional states. For example, FearNot! is an interactive narrative game that is aimed at helping children practice how to deal with a school bully [11] and is aimed at invoking empathy from the user.

Finally and most critically, we want to be able to encode pedagogy. In other words, there must be some way for the learner to interact with the system and learn on the basis of that interaction. This is of course a very general goal. As an authoring system for helping authors design interactive narratives, ideally we want the system to give some feedback in terms of how such pedagogical goals are related to the characters’ designs and whether there are potential conflicts among the characters’ designs, the dramatic goals, and the pedagogical goals.

To achieve dramatic and pedagogical goals, the system needs not only to model characters with a Theory of Mind but also to have a Theory of Mind about the user. Neither dramatic nor pedagogical effects can be defined independently from the user’s beliefs, actions, and experiences.

2.4. Authoring Challenge

Unlike traditional narratives, in which a fixed line of story is presented to the audience, the support for interactivity results in a huge, if not unlimited, number of possible story paths in interactive narratives. Each of the paths needs to be designed properly: it must be true to the individual characters’ motivations, beliefs, and circumstances and must reflect the designer’s dramatic and pedagogical goals. The complexity of designing these paths increases exponentially as more interactivity is allowed. Moreover, compared to interactive narratives for games, pedagogical applications often have stricter requirements for the design of the interactive experience. The applications are designed for the users to learn from their interactions with the virtual characters/world, and the designers must avoid teaching the wrong lesson accidentally.

This work takes a generative, artificial intelligence (AI) based approach to facilitate the creation of interactive narratives, whereby stories are represented using an internal computational model. New stories are generated automatically based on the user’s choices of actions and the current status of the story. One fundamental issue with using a generative approach is how to represent information about the story. The AI techniques on which the computational model can be based range from simple and explicit ones, such as rule-based systems, finite state machines, and decision trees, to more sophisticated ones such as hierarchical task networks, planners, MDPs, and POMDPs. These techniques provide different modeling power and allow different types of character and narrative models to be built on top of them. The disadvantage of relying on simple and explicit modeling techniques such as rule-based systems is clear. Such models, though easy to start working with, do not scale up easily and therefore cannot represent rich character behaviors in a dynamic environment. Systems using more sophisticated computational approaches have advantages in dealing with user interactivity because they provide an abstract model of the interactive story or of the characters’ internal states, which drives the characters’ behaviors. Theory of Mind has been encoded in many systems for character/story generation, which demonstrates its importance. In the next section, we will review related work that incorporates Theory of Mind in intelligent agents.

3. Related Work

A body of research has been conducted on human-avatar interaction, virtual reality, narrative, game, and serious game. Modeling Theory of Mind in characters can be traced back to as early as Meehan’s Tale-Spin system which was created in 1976 [42]. In Tale-Spin, the story unfolds as the characters look for ways to satisfy their goals using the mental models of each other and the world.

In more recent works, the use of Theory of Mind has been explored for realizing various aspects of character and narrative creation. For example, Peters shows that, to initiate conversations at the appropriate moment, it is important to form Theory of Mind about others based on their nonverbal behaviors [43]. Hodhod et al. argue that shared mental models are important for successful improvisational agents [44]. O’Neill and Riedl’s model of narrative suspense is built upon the user’s mental model of the characters and the story [45].

Emotion plays an important role in social interaction. In modeling emotions for social characters, the appraisal theories of emotion [38–40, 46] are often used. The appraisal theories view emotions as the results of one’s evaluation of the person-environment relationship along several dimensions, such as how good/bad the current situation is, who caused the situation, and whether the situation is changeable. In simulating appraisals, Theory of Mind is often created for the agents. For example, Aylett and Louchart’s work on double appraisal enables characters to “understand” others’ feelings by simulating other agents’ conditions [47]. Belkaid and Sabouret create a logic based model of Theory of Mind for emotion [48]. Within Thespian, Si et al. model appraisal theories in decision-theoretic goal-based agents with Theory of Mind [49].

Agents with Theory of Mind capacity have often been evaluated as being more realistic or more intelligent. Dias et al. compare agents with a two-level Theory of Mind (that is, agents that have a model of “my beliefs about your beliefs about me”) to agents with a single-level Theory of Mind (that is, agents that only have beliefs about others) and find that the first type of agent is perceived as being more socially intelligent [50]. Laird’s work shows that having a Theory of Mind about the user can improve an AI bot’s performance against human players in Quake II [51]. Harbers et al. argue that agents for training systems should have a Theory of Mind about the user [52]. Doirado and Martinho demonstrate that being able to detect the user’s intention makes an agent’s behavior more believable [53].

As we have discussed in Section 2, characters having Theory of Mind addresses key, desired properties of interactive narratives, including narrative coherence and characters following social norms. In interactive narrative systems where Theory of Mind is not explicitly modeled, the characters’ behaviors often still suggest their Theory of Mind. Such effects are defined conjointly with the characters’ other behaviors. For example, in Façade [54], characters’ beliefs are implicitly encoded in the design of the beats and the beat selection process.

We argue that an agent architecture with built-in support for Theory of Mind is preferable for our purpose. Such modeling makes the characters’ behaviors computationally interpretable, which means the system can automatically generate interpretations of a character’s behaviors in the context of its environment, motivations, and beliefs. For simulations that are aimed at training the users’ ability to pick up social cues or the users’ social interaction skills, it is important to help the users understand their experiences. It is through this procedure that the users learn about social rules and what they have interpreted or done incorrectly in the previous sessions. To support this function, the interactive narrative system must provide a way to automatically retrieve information about the characters’ intentions and beliefs behind their behaviors. Otherwise, the designers will have to supply such information manually, which would be a significant undertaking. In this regard, we argue that any system that is not able to represent characters’ motivations, Theory of Mind, and the link between their internal reasoning processes and behaviors is less suitable for creating rich characters for social skill training.

In the next section, we present an example of a computational framework that implements the desired properties discussed in Section 2. In this system, Theory of Mind is not just used for modeling some aspects of the characters. Instead, it is the core for the characters’ decision-making, norm reasoning, and emotional processes. The name of the framework is Thespian.

4. Thespian Framework for Interactive Narratives

Thespian is a multiagent framework we developed for authoring and simulating interactive narratives [10, 49, 55–59]. Thespian’s approach seeks to ensure the coherence of narrative and supports both creating rich characters and managing the development of the story during the interaction for realizing the author’s dramatic or pedagogical goals.

Thespian models characters’ motivations as the goals of decision-theoretic agents. Characters also have models of other characters and their behaviors are mediated by how they expect others to respond. In addition, the virtual characters’ behaviors are subject to social norms—the characters have motivations for obeying social norms. These features ensure there is an interpretation of social interactions that happen among the characters, including the user.

Thespian is built based on PsychSim [60, 61], a multiagent framework for social simulation. In PsychSim, each agent is modeled based on partially observable Markov decision problems (POMDPs) [62] and incorporates algorithms for belief update and decision-making.

4.1. Example Authoring Scenarios

Thespian has been applied to author dozens of virtual characters in various interactive narratives. The first interactive narrative to incorporate Thespian is the mission practice environment of the tactical language training system (TLTS) [63], which is aimed at providing rapid language and culture training. Thespian has also been used to model fables such as Little Red Riding Hood and the Fisherman and His Wife. We describe below two previous example domains of Thespian to illustrate the important role Theory of Mind plays in social interaction.

4.1.1. Tactical Language Training System (TLTS)

The tactical language training system is a large-scale (six to twelve scenes each for three languages) project funded by the US military for rapid language and culture training. Thespian was used together with Unreal 2003 to author and simulate the interactive narratives for the system’s mission practice environment. The system has been used by thousands of military personnel and shown to be effective [64].

The user takes the role of a male army sergeant (Sergeant Smith) who is assigned to conduct a civil affairs mission in a foreign (e.g., Pashto, Iraqi) town. The human user navigates in the virtual world using mouse and keyboard. The user can interact with the virtual characters using spoken language and gestures. An automated speech recognizer identifies the utterance, and the mission manager, which is a component outside of Thespian, converts the utterance and gestures into a dialogue act representation that Thespian takes as input. Output from Thespian consists of similar dialogue acts that instruct virtual character bodies to speak and behave.

The story in TLTS consists of multiple scenes. A typical scene contains two or three main characters and up to six supporting characters. The main characters usually have 10 to 20 different actions including their dialogue acts, and the supporting characters typically have fewer than 5 actions. An example scene is provided below.

As shown in Figure 1, the story happens in a village café. The user’s aim in the scene is to find the senior official in the town to discuss providing aid to the locals. There are a variety of actions he can perform, including moving around the town, greeting people, introducing himself, asking questions, and using gestures. The user interacts with a range of characters in the scene, most notably an old man and a young man. These two locals have different personalities. The old man is cooperative. The young man worries more about the safety of the town and may accuse the sergeant of being a CIA agent if the user does not establish trust. Table 1 shows an excerpt from this story.

Table 1: Sample Dialogue in Tactical Language Training System.
Figure 1: Tactical language training system.

In this example, Theory of Mind plays an important role in driving the characters’ behaviors. The young man character interrupts the user’s and the old man’s conversation and asks the user a question. By doing so, he violates norms and appears rude. He decides to do so because, based on his mental model about the old man, he expects that the old man will answer the user’s question if he does not do anything, and this will threaten his sense of safety. From the user’s perspective, this interruption should be unexpected. The user is instructed to establish basic trust with the locals before asking any questions. At this moment, the user should realize that what he/she has done for establishing trust is not enough and should try to make it up in later conversations. If the user does not, the scene will continue in an unsuccessful direction. Using Thespian, the characters’ motivations and decision-making processes can be revealed to the user in the debriefing phase.

4.1.2. Little Red Riding Hood Story

This story contains four main characters: Little Red Riding Hood, Granny, the hunter, and the wolf. The story starts as Little Red Riding Hood (Red) and the wolf meet each other on the outskirts of a forest, while Red is on her way to Granny’s house. The wolf has a mind to eat Red, but it dares not because there are some woodcutters close by. At this point, they can either have a conversation or choose to walk away. The wolf will have a chance to eat Red at other locations where nobody is close by. Moreover, if the wolf hears about Granny from Red, it can even go and eat her. Meanwhile, the hunter is searching the woods for the wolf to kill it. Once the wolf is killed, people who were eaten by it can escape. In our modeling of the story, the numbers of possible actions for Red, Granny, the hunter, and the wolf are 14, 2, 4, and 10, respectively. The role of either Red or the wolf can be taken by the user. The user can interact with virtual characters through a text-based interface.

In this story, Theory of Mind again plays an important role, specifically in modeling the characters’ correct and incorrect beliefs about others, which drive how the story unfolds. It is Red’s mistaken belief about the wolf that leads to the later tragedy. The wolf, on the other hand, has quite accurate beliefs about others. This allows it to find the right strategies to deal with Red, the woodcutter, Granny, and the hunter, respectively. The pedagogy of the original story is well known. However, in an interactive version of the story, the user can experience different characters’ perspectives. Tweaking the characters’ initial beliefs can make the story deliver a different type of lesson.

We now turn our attention to how Thespian enables the designers to create such virtual characters as shown in the above examples.

4.2. Two-Layer Runtime System

Egri has strongly argued for the importance of characters in traditional narratives [65]. His view of narrative—of rich, well-motivated, autonomous characters that serve as a creative spark to the author but are nevertheless constrained by the author’s goals for the plot—serves as inspiration to the approach taken in this work. Specifically, Thespian utilizes autonomous agents for well-motivated and socially aware characters and multiagent coordination to realize story plots.

At the base is a multiagent system comprised of goal-oriented autonomous agents (Thespian agents) that realize the characters of the story. The agents in this layer autonomously interact with each other and the character controlled by the user, thereby generating the story.

The user is also modeled using a Thespian agent based on the role the user is playing. In modeling the user, not only are the goals of the user’s character considered but also the goals associated with game play (see [55, 57] for more details). This model allows other agents to form mental models about the user in the same way they do about other characters, and it allows the director agent to reason about the user’s beliefs and experience. It can thus be thought of as the system’s Theory of Mind about the user.

Above this layer is a director agent that proactively directs the characters for realizing the author’s plot design, which can be seen as group goals for the multiagent system [58, 59].

4.2.1. Thespian Agent

Decision-theoretic, goal-based agents are used for modeling the characters in the story. Each agent is composed of its state, actions, dynamics, goals, beliefs, and policy. Objects, such as a cake or a house, in the story can be represented as special Thespian agents that have only state. This allows characters to reason about the state of an object in the same way as that of a character.

State. State contains information about an agent’s current status in the world. An agent’s state is defined by a set of state features, such as the name and age of the character, and the relation between that character and other characters, for example, affinity. Values of state features are represented as real numbers within a bounded range.

Actions/Dialogue Acts. Each agent has a set of actions to choose from during its interaction. Minimally, the definition of an action includes an actor, the agent who performs the action, and an action type. For example, the agent that models the wolf in the Little Red Riding Hood story can have an action of “wolf-run.” The definition may also include an object, which is the target of the action, such as “wolf-greet-Red.” Thespian does not differentiate between physical actions and dialogue acts. They are represented and reasoned about by the agents in the same way.
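As an illustration only, the action definition above can be sketched as a small record with an actor, an action type, and an optional object. The class name `Action`, its field names, and the `label` helper are hypothetical and are not taken from Thespian.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Action:
    """Hypothetical record for an action or dialogue act (not Thespian's API)."""
    actor: str                 # the agent who performs the action
    verb: str                  # the action type
    obj: Optional[str] = None  # optional target of the action

    def label(self) -> str:
        # Join the parts in the "actor-type-object" style used in the text.
        parts = [self.actor, self.verb] + ([self.obj] if self.obj else [])
        return "-".join(parts)


# Physical actions and dialogue acts share the same representation.
run = Action("wolf", "run")
greet = Action("wolf", "greet", "Red")
```

Because both kinds of action share one representation, any reasoning code written against such a record applies to physical actions and dialogue acts alike.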

Dynamics. Action dynamics define how actions, either initiated by self or others, change the values of the state features. Some of the definitions are straightforward, such as the fact that being killed brings one’s life value to zero, whereas others require sophisticated modeling, such as conversational norms and appraisal dimensions for emotions (see [49, 56] for examples).
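A minimal sketch of the straightforward case mentioned above, in which being killed brings a character's life value to zero. The feature names (such as `wolf.life`), the starting value of 1.0, and the lookup-table design are assumptions made for illustration; they do not describe how Thespian stores dynamics.

```python
from typing import Callable, Dict, Tuple

State = Dict[str, float]
ActionKey = Tuple[str, str, str]  # (actor, action type, object)
Dynamics = Callable[[State], State]


def kill_effect(target: str) -> Dynamics:
    """Return a dynamics function that sets the target's life value to zero."""
    def apply(state: State) -> State:
        new_state = dict(state)  # copy so the original state is left unchanged
        new_state[f"{target}.life"] = 0.0
        return new_state
    return apply


# Hypothetical table mapping actions to their effects on state features.
DYNAMICS: Dict[ActionKey, Dynamics] = {
    ("hunter", "kill", "wolf"): kill_effect("wolf"),
}


def step(state: State, action: ActionKey) -> State:
    """Apply the dynamics registered for an action; unknown actions change nothing."""
    effect = DYNAMICS.get(action)
    return effect(state) if effect else dict(state)


before = {"wolf.life": 1.0, "Red.life": 1.0}
after = step(before, ("hunter", "kill", "wolf"))
```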

Goals. Goals are defined as a combination of goal items where each item describes the agent’s desire to maximize or minimize a state feature, such as safety, or to maximize or minimize the frequency of an event, such as being praised. The goal items motivate the agent to take corresponding actions. Which action to take is decided both by the agent’s current state and beliefs and by the relative importance each goal item has. Thus, the agent can balance multiple and potentially competing goals.

For modeling social relations and social support, agents can have goals about others’ states or actions. For example, the woodcutter in the Little Red Riding Hood story has the goal of ensuring Red is alive. He does not have the goal of killing the wolf. However, if the wolf eats Red in front of him, he will kill the wolf to bring Red back to life.
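The balancing of competing goal items can be sketched as follows. The text does not state how goal items are combined, so the weighted sum used here is an assumption, and the feature names, weights, and example states are hypothetical.

```python
from typing import Dict, List, Tuple

# Each goal item: (state feature, direction, weight).
# direction is +1.0 to maximize the feature and -1.0 to minimize it.
GoalItem = Tuple[str, float, float]


def utility(state: Dict[str, float], goals: List[GoalItem]) -> float:
    """Score a state as a weighted sum over goal items (assumed combination rule)."""
    return sum(direction * weight * state.get(feature, 0.0)
               for feature, direction, weight in goals)


# Hypothetical wolf: wants to reduce hunger but weighs safety more heavily.
wolf_goals = [("wolf.hunger", -1.0, 0.25), ("wolf.safety", +1.0, 0.75)]

# Two hypothetical outcomes: eating with woodcutters nearby, or walking away.
eat_near_woodcutters = {"wolf.hunger": 0.0, "wolf.safety": 0.0}
walk_away = {"wolf.hunger": 1.0, "wolf.safety": 1.0}

preferred = max([eat_near_woodcutters, walk_away],
                key=lambda s: utility(s, wolf_goals))
```

With these illustrative weights the safer outcome scores higher. Changing the relative weights changes which outcome the character prefers, which is the sense in which relative importance shapes behavior.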

Beliefs (Theory of Mind). In Thespian, characters are modeled with a Theory of Mind—recursive beliefs about self and others including others’ beliefs about self and others, as shown in Figure 2. An agent’s subjective view (mental model) of itself or another agent includes every component of that agent, that is, its state, beliefs, policy, and so forth. The “Theory of Mind” capacity enables the agents to reason about others when making their own decisions and thus makes them “social characters.”

Figure 2: Theory of Mind.

To represent the uncertain nature of social beliefs, each agent has a mental model of self and one or more mental models of other agents. The agent’s belief about another agent is a probability distribution over alternative mental models. For example, in the Red Riding Hood story, Red has two mental models of the wolf: in the first, the wolf does not have a goal of eating people, and in the second, it does. Initially, Red may believe that there is a 90% chance that the first mental model is true and a 10% chance that the second mental model is true. This probability distribution will change if Red sees or hears about the wolf eating people.

In general, upon observation of an event—something happens in the virtual environment—each agent updates its beliefs based on the observation and then makes decisions based on the updated beliefs. In particular, the relative probability of alternative mental models is adjusted. Each observation serves as evidence for the plausibility of alternative mental models, that is, how consistent the observation is with the predictions from the mental models. Using this information, the probabilities of the mental models are updated based on Bayes’ theorem [66].
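The belief update described above can be sketched as a direct application of Bayes' theorem. The 90%/10% prior comes from the Red Riding Hood example; the likelihood values and the function name `update_beliefs` are assumptions made purely for illustration, and in Thespian the likelihoods would be derived from each mental model's predictions rather than written by hand.

```python
def update_beliefs(prior, likelihood):
    """Bayes' rule over alternative mental models.

    prior:      dict mapping model name -> P(model)
    likelihood: dict mapping model name -> P(observation | model)
    """
    unnormalized = {m: prior[m] * likelihood[m] for m in prior}
    total = sum(unnormalized.values())
    if total == 0:
        # The observation is impossible under every model; keep the prior.
        return dict(prior)
    return {m: p / total for m, p in unnormalized.items()}


# Red's initial belief about the wolf (from the text).
prior = {"wolf_harmless": 0.9, "wolf_eats_people": 0.1}

# Hypothetical likelihoods of the observation "Red hears that the wolf ate someone".
likelihood = {"wolf_harmless": 0.05, "wolf_eats_people": 0.8}

posterior = update_beliefs(prior, likelihood)
```

With these assumed numbers, the probability of the second mental model rises from 0.10 to 0.64 after a single observation.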

Decision-Making Process. In Thespian, all agents use a bounded lookahead policy by default. When an agent has multiple mental models about other agents, by default, it uses the most probable mental models to predict other agents’ future actions, though the expected states/utilities of all alternative mental models are calculated for the purpose of belief revision.

Each agent has a set of candidate actions to choose from when making decisions. When an agent selects its next action, it projects into the future to evaluate the effect of each option on the states and beliefs of other entities in the story. The agent considers not only the immediate effect but also the expected responses of other characters using its mental models about the characters and, in turn, the effects of those responses, its reaction to those responses, and so on. The agent evaluates the overall effect with respect to its goals and then chooses the action that has the highest expected reward. Thus, the agents’ actions are driven by their motivations, taking into account the status of the interaction. For example, in the “Little Red Riding Hood” story, the wolf will react to Red differently depending on whether there is somebody else close by and who that is. The wolf will choose different actions when the hunter is near or when the woodcutter is near, because the wolf has different mental models about these two characters. During the interaction, the agents do not necessarily need to do the lookahead reasoning online; rather, they can use compiled policies which are precomputed offline [67].
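A bounded lookahead of this kind can be sketched as a small recursive search. The sketch below is a toy under stated assumptions, not Thespian's algorithm: the state features, action names, reward numbers, and the two hand-written policies standing in for the wolf's mental models of the hunter and the woodcutter are all hypothetical.

```python
def most_probable(belief):
    """Return the mental model with the highest probability."""
    return max(belief, key=belief.get)


def best_action(state, horizon, my_actions, step, my_reward, other_policy):
    """Return (action, value) maximizing summed reward over `horizon` of my moves.

    After each of my moves, the other character responds as predicted by
    `other_policy`, the policy of the most probable mental model.
    """
    best, best_value = None, float("-inf")
    for action in my_actions:
        s = step(state, action)       # immediate effect of my action
        s = step(s, other_policy(s))  # predicted response of the other character
        value = my_reward(s)
        if horizon > 1:               # my reaction to that response, and so on
            value += best_action(s, horizon - 1, my_actions, step,
                                 my_reward, other_policy)[1]
        if value > best_value:
            best, best_value = action, value
    return best, best_value


# Toy domain (all names and numbers are illustrative assumptions).
def step(state, action):
    s = dict(state)
    if not s["wolf_alive"]:
        return s
    if action == "flee":
        s["wolf_present"] = False
    elif action in ("shoot", "attack"):
        s["wolf_alive"] = False
    elif s["wolf_present"] and action == "eat_red":
        s["red_eaten"] = True
    elif s["wolf_present"] and action == "talk":
        s["info"] = True
    return s


def wolf_reward(s):
    return 1.0 * s["red_eaten"] + 0.2 * s["info"] - 5.0 * (not s["wolf_alive"])


def hunter_policy(s):      # assumed model: the hunter shoots the wolf on sight
    return "shoot" if s["wolf_present"] and s["wolf_alive"] else "none"


def woodcutter_policy(s):  # assumed model: the woodcutter attacks only if Red is eaten
    return "attack" if s["red_eaten"] and s["wolf_alive"] else "none"


MODELS = {"hunter": hunter_policy, "woodcutter": woodcutter_policy}
WOLF_ACTIONS = ["eat_red", "talk", "flee"]
START = {"wolf_alive": True, "wolf_present": True, "red_eaten": False, "info": False}


def wolf_decides(belief, horizon=2):
    policy = MODELS[most_probable(belief)]
    return best_action(START, horizon, WOLF_ACTIONS, step, wolf_reward, policy)[0]


near_hunter = wolf_decides({"hunter": 0.8, "woodcutter": 0.2})
near_woodcutter = wolf_decides({"hunter": 0.2, "woodcutter": 0.8})
```

Under these assumptions the wolf flees when it believes the nearby character is most likely the hunter and talks to Red when it believes the nearby character is most likely the woodcutter, which mirrors the qualitative behavior described above.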

Encoding Social Norms. Unlike most interactive narrative frameworks, Thespian explicitly models norms in face-to-face communication using a domain-independent model built within a decision-theoretic context. In general, actions by one agent can impose a type of obligation on another, and a certain set of responding actions will satisfy the obligation to some degree. We currently use these obligations to encode a broad set of social norms as pairs of initiating and responding actions: greeting and greeting back, introducing oneself and introducing oneself back, conveying information and acknowledging, inquiring and informing, thanking and saying you are welcome, offering and accepting/rejecting, requesting and accepting/rejecting, and so forth.

Thespian agents have explicit goals to follow norms [56]. By giving the agents goals to satisfy any such outstanding obligations, we give them an incentive to follow the encoded social norms while also considering tradeoffs between satisfying different goals. Thus, the agents can reason about their decision-making and norm-following behaviors using a unified framework. The relative priorities among all of these goals reflect the value that the character places on the corresponding social norms. For example, using Thespian, the author can model two agents: one regards following norms as an important goal and one does not. When the agents are in a hurry, they will behave in the same way, both ignoring norm-related goals. However, they will behave differently in a different context.
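The obligation mechanism and the two-agent example can be sketched as follows. This is an illustrative simplification under stated assumptions: obligations are stored as (debtor, creditor, initiating action) tuples, the norm goal is a penalty proportional to the number of outstanding obligations, and the names `NORM_PAIRS`, `update_obligations`, `norm_reward`, and `respond`, together with all weights, are hypothetical.

```python
# Initiating action -> set of responding actions that satisfy the obligation.
NORM_PAIRS = {
    "greet": {"greet_back"},
    "inquire": {"inform"},
    "thank": {"say_welcome"},
    "offer": {"accept", "reject"},
    "request": {"accept", "reject"},
}


def update_obligations(outstanding, speaker, hearer, action):
    """Return the new list of (debtor, creditor, initiating_action) tuples."""
    remaining = list(outstanding)
    # A matching response discharges one obligation the speaker owes the hearer.
    for ob in remaining:
        debtor, creditor, init = ob
        if debtor == speaker and creditor == hearer and action in NORM_PAIRS[init]:
            remaining.remove(ob)
            break
    # An initiating action imposes a new obligation on the hearer.
    if action in NORM_PAIRS:
        remaining.append((hearer, speaker, action))
    return remaining


def norm_reward(outstanding, agent, weight):
    """Penalty for each obligation the agent still owes (assumed linear)."""
    return -weight * sum(1 for debtor, _, _ in outstanding if debtor == agent)


def respond(agent, norm_weight, hurry, outstanding, other):
    """Choose between answering the greeting and leaving immediately."""
    def utility(action):
        after = update_obligations(outstanding, agent, other, action)
        return norm_reward(after, agent, norm_weight) + (hurry if action == "leave" else 0.0)
    return max(["greet_back", "leave"], key=utility)


# Red greets the agent, which creates an obligation to greet back.
obligations = update_obligations([], "red", "agent", "greet")

polite_relaxed = respond("agent", 1.0, 0.5, obligations, "red")
careless_relaxed = respond("agent", 0.1, 0.5, obligations, "red")
polite_hurried = respond("agent", 1.0, 2.0, obligations, "red")
careless_hurried = respond("agent", 0.1, 2.0, obligations, "red")
```

With these assumed weights, only the norm-respecting agent greets back when there is little time pressure, while both agents leave without responding when the pressure to leave is high.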

4.2.2. Directorial Control

The director agent guides the characters’ interactions with the user, based on the author’s directorial control goals which are expressed as partial order or temporal constraints on the characters’ and user’s beliefs and actions, such as “the wolf should not be found by the hunter until it finds out where Granny is” and “Red should be eaten by the wolf within 10 steps after Granny has been eaten.”

During the interaction, the director agent proactively estimates the future developments of the story and fine-tunes the characters if necessary to achieve the goals. The director agent has access to models of the agents and the user. It uses these models to assess whether plot goals will be achieved as well as redirect the characters when needed. For redirecting the characters, the director agent takes the least commitment approach. It maintains a space of character configurations, that is, their goals and beliefs consistent with the characters’ prior behavior. All of these configurations are equally valid in the sense that they will drive the character to act in the same way up to the current point of the interaction. When the director agent foresees a violation of the author’s plot design goals, it constrains that space so that the rest of the configurations will drive the agent to act in a way that eliminates the violation (see [58] for more details). Thus, from the user’s perspective, the characters are always well motivated, and the user can interact freely with them.
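The least commitment idea can be sketched as filtering a set of candidate character configurations. The sketch assumes a finite set of configurations, a toy wolf policy driven by two goal weights, and a plot constraint checked by a single predicted future action; these choices, and the names `policy`, `consistent`, and `redirect`, are hypothetical and much simpler than the procedure in [58].

```python
def policy(config, danger):
    """Hypothetical wolf policy: eat only if hunger outweighs caution * danger."""
    return "eat" if config["hunger"] > config["caution"] * danger else "wait"


def consistent(config, history):
    """A configuration is valid if it reproduces every action observed so far."""
    return all(policy(config, danger) == action for danger, action in history)


def redirect(configs, history, violates):
    """Keep configurations that match past behavior and avoid the foreseen violation.

    An empty result means no consistent configuration can satisfy the plot goal.
    """
    candidates = [c for c in configs if consistent(c, history)]
    return [c for c in candidates if not violates(c)]


# Candidate goal weights for the wolf (illustrative numbers).
configs = [
    {"hunger": 1.0, "caution": 0.5},
    {"hunger": 1.0, "caution": 1.5},
    {"hunger": 1.0, "caution": 3.0},
]

# So far the wolf has waited once in a high-danger situation.
history = [(1.0, "wait")]

# Assumed plot goal: the wolf must eat Red in the upcoming medium-danger situation.
def violates(config):
    return policy(config, 0.5) != "eat"


remaining = redirect(configs, history, violates)
```

In this toy case the first configuration is ruled out because it contradicts the wolf's past behavior, and the third because it would violate the plot goal, so the director is left with the second one.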

4.3. Embed Pedagogical Goals

Thespian supports three approaches to encode learning goals into interactive narratives.

First, learning goals can be embedded in the world’s dynamics and the characters’ goals. For example, one of the pedagogical goals in the tactical language project is for the learner to learn to establish a relationship with the local people, in particular a relationship of trust. We can encode this pedagogical goal into the action dynamics by ensuring that failure to establish trust will have consequences. At its most severe, distrust can cause irreparable breakdowns in social interactions. Specifically, in this scene if a learner fails to achieve even the minimal requirement for this trust goal, the young man will accuse him of being a CIA agent, and all characters will refuse to talk to him.

Secondly, Thespian can provide characters with the explicit intention that the learner learns. In this approach to encoding the pedagogy, characters have a goal that the learner acquires skills specified by the pedagogy. In other words, the characters could have the intention that the learner learns (or does not learn for that matter). A character could then use its mental model of the learner as a user model to measure the degree to which the pedagogical goals are achieved.

The Theory of Mind embedded within Thespian forms a subjective view of the world that includes beliefs about the learner’s knowledge and capabilities. A character, for example, could have the explicit goal that the learner practices gaining trust. Having encoded such a goal, the character could now evaluate a possible action choice using its mental model of the learner’s goals to assess the effect on the learner and, in turn, on the pedagogical goals so encoded. Note that this is different from setting the character to trust the learner at the beginning. This goal setting will drive the character to deliberately behave in a fashion that would elicit behavior from the learner that increases trust. Although it is not an explicit intention of the character, its behavior does assist the learner. Again, because we have priorities regarding the goals, we can choose how much a particular character is driven by pedagogical goals for the learner in relation to its personal goals.

Finally, a third way to encode the pedagogy is through the director agent. By setting up directorial control goals, we can explicitly encode the intention to teach in the overall system. When the directorial goals cannot be reached without violating the constraint of characters having consistent motivations, Thespian will give up the directorial goals unless the author has indicated otherwise, so that the characters’ behaviors are interpretable to the user, and the user is more likely to have a coherent narrative experience.

These three approaches to encode pedagogy—in the world’s dynamics, in the character’s intentions, and in the system’s intention—provide Thespian with a rich framework for realizing pedagogical interactive narratives.

5. Discussion

We have presented Thespian’s agent architecture and illustrated in the above examples how Thespian can be used to model two different types of stories and the characters in them. A natural question is whether Thespian can model other types of stories and characters.

As a modeling tool, Thespian does not accurately capture all human social and psychological phenomena. For example, do people always explicitly perform forward reasoning when making a decision? Even when they do, do they reason at the same level of detail as Thespian agents? How does a person’s current emotional state affect their decision-making process and even their memories of past events? In future work, we plan to address some of these issues by extending Thespian’s models of human decision-making, opinion-forming, and emotional processes.

On the other hand, even at its current stage, Thespian is a sophisticated and useful tool for modeling and simulating characters for social interaction and culture skills training purposes. The most important goal for such training applications is to help the users understand social interaction and, further, practice their social interaction skills in context. The users may lack the ability to pick up social cues, to think about the outcomes of their actions thoroughly, or to switch roles and think from another person’s perspective. Thespian agents are designed to make the reasoning process transparent and also explicitly incorporate Theory of Mind in reasoning. Thus, the users can learn from the agents how to make social decisions in their own social lives. Moreover, the system can reveal the agents’ motivations and reasoning processes behind their actions and thus help the user practice picking up cues in social interactions and interpreting others’ intentions. Of course, by simply providing such capacity, Thespian cannot ensure the user learns. In this regard, Thespian is incomplete as a standalone tutoring system. The design and evaluation of the pedagogical goals both rely on the designer of the interactive narrative. Nevertheless, we believe Thespian provides a good foundation for supporting the designers.

6. Future Work

We plan to extend the existing Thespian models of decision-making and emotion. Currently, characters are modeled as goal-based agents and use a deliberate lookahead reasoning process to decide their actions. This is not the only way people make decisions. People also make decisions through shallower processes, such as retrieval of similar past experiences or reasoning with only the most salient information.

Similarly, Thespian’s current model of emotion is based on appraisal theory, which defines emotion as the character’s evaluation of its person-environment relationship. However, other factors also affect people’s emotions, such as their prior emotional state, memory of past experiences, and even the emotions of other people around them. In later versions of Thespian, we want to establish a unified computational model for emotion that considers more of these factors.

Finally, Thespian so far has only been used for creating digital avatars which are displayed on either computer monitors or projector screens. Computing technologies, in particular Internet technologies and mobile computing technologies, have advanced rapidly over the past decade. There has been an increased interest in developing personal assistance robots, combining an intelligent agent with a moveable platform. These new platforms present unique challenges as well as an opportunity for an intelligent conversational agent and interactive narrative. For example, the agent may have access to the user’s physical location, and this can influence how it interacts with the user. We are interested in extending Thespian to work on these new platforms.

7. Conclusion

Computer aided pedagogical interactive narrative represents a new interdisciplinary research and application field. In this paper, we analyzed the design challenges faced by the designers of pedagogical interactive narratives for social interaction training and identified the key requirements for characters in these interactive narratives as well as for the authoring framework for creating such characters. We argue that a rich model of characters with Theory of Mind capacity is needed, and, for pedagogical purposes, it is preferred that the characters’ motivations and beliefs behind their behaviors can be explained to the user after the interaction. The Thespian framework for interactive narratives is presented, together with examples of applying it to model characters with Theory of Mind.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References


  1. S. Park and K. Kim, “The use of pedagogical agent as a tool to improve learning interest: based on the distinction between individual interest and situational interest,” in Proceedings of the Society for Information Technology and Teacher Education International Conference, R. McFerrin, R. Weber, R. Carlsen, and D. A. Willis, Eds., pp. 2777–2781, AACE, Chesapeake, Va, USA, 2008.
  2. R. Moreno, “Multimedia learning with animated pedagogical agents,” in Handbook of Multimedia Learning, R. Mayer, Ed., pp. 507–524, Cambridge University Press, New York, NY, USA, 2005.
  3. R. Moreno, R. E. Mayer, H. A. Spires, and J. C. Lester, “The case for social agency in computer-based teaching: do students learn more deeply when they interact with animated pedagogical agents?” Cognition and Instruction, vol. 19, no. 2, pp. 177–213, 2001.
  4. C. Evans and N. J. Gibbons, “The interactivity effect in multimedia learning,” Computers & Education, vol. 49, no. 4, pp. 1147–1160, 2007.
  5. R. K. Atkinson and A. Renkl, “Interactive example-based learning environments: using interactive elements to encourage effective processing of worked examples,” Educational Psychology Review, vol. 19, no. 3, pp. 375–386, 2007.
  6. A. Wong, N. Marcus, P. Ayres et al., “Instructional animations can be superior to statics when learning human motor skills,” Computers in Human Behavior, vol. 25, no. 2, pp. 339–347, 2009.
  7. R. Azevedo, “Computer environments as metacognitive tools for enhancing learning,” Educational Psychologist, vol. 40, no. 4, pp. 193–197, 2005.
  8. L. C. Miller, R. P. Appleby, J. L. Christensen et al., “Virtual agents and virtual sexual decision-making: Interventions for on-line applications that change real-life risky sexual choices,” in Interactive Health Communication Technologies: Promising Strategies for Health Behavior Change, S. Noar and N. Harrington, Eds., Lawrence Erlbaum Associates, Mahwah, NJ, USA, 2011.
  9. A. Rizzo, B. Newman, T. Parsons et al., “Development and clinical results from the virtual Iraq exposure therapy application for PTSD,” in Proceedings of the IEEE Explore: Virtual Rehabilitation, Haifa, Israel, 2009.
  10. M. Si, S. C. Marsella, and D. V. Pynadath, “THESPIAN: an architecture for interactive pedagogical drama,” in Proceedings of the Conference on Artificial Intelligence in Education: Supporting Learning through Intelligent and Socially Informed Technology (AIED '05), pp. 595–602, 2005.
  11. A. Paiva, J. Dias, D. Sobral et al., “Caring for agents and agents that care: building empathic relations with synthetic agents,” in Proceedings of the 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS '04), pp. 194–201, July 2004.
  12. D. Traum, J. Rickel, J. Gratch, and S. Marsella, “Negotiation over tasks in hybrid human-agent teams for simulation-based training,” in Proceedings of the 2nd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS '03), pp. 441–448, Melbourne, Australia, July 2003.
  13. J. Kim, R. W. Hill, P. Durlach et al., “BiLAT: a game-based environment for practicing negotiation in a cultural context,” International Journal of Artificial Intelligence in Education, vol. 19, no. 3, pp. 289–308, 2009.
  14. G. Pereira, A. Brisson, R. Prada et al., “Serious games for personal and social learning & ethics: status and trends,” in Proceedings of the 4th International Conference on Games and Virtual Worlds for Serious Applications (VS-GAMES '12), pp. 53–65, 2012.
  15. J. Lester, E. Young Ha, S. Lee, B. Mott, J. Rowe, and J. Sabourin, “Serious games get smart: intelligent game-based learning environments,” AI Magazine, vol. 34, pp. 31–45, 2013.
  16. B. C. Nelson, “Exploring the use of individualized, reflective guidance in an educational multi-user virtual environment,” The Journal of Science Education and Technology, vol. 16, no. 1, pp. 83–97, 2007.
  17. A. Zook, S. Lee-Urban, M. O. Riedl, H. K. Holden, R. A. Sottilare, and K. W. Brawner, “Automated scenario generation: toward tailored and optimized military training in virtual environments,” in Proceedings of the 7th International Conference on the Foundations of Digital Game (FDG '12), pp. 164–171, Raleigh, NC, USA, June 2012.
  18. H. Wang, C. Shen, and U. Ritterfeld, “Enjoyment of digital games: what makes them seriously fun?” in Serious Games: Mechanisms and Effects, U. Ritterfeld, M. Cody, and P. Vorderer, Eds., Routledge, New York, NY, USA, 2009.
  19. C. Heeter, “Being there: the subjective experience of presence,” Presence: Teleoperators and Virtual Environments, vol. 1, no. 2, pp. 262–271, 1992.
  20. T. B. Sheridan, “Further musings on the psychophysics of presence,” Presence: Teleoperators and Virtual Environments, vol. 5, no. 2, pp. 241–246, 1996.
  21. M. Slater, M. Usoh, and A. Steed, “Depth of presence in virtual environments,” Presence: Teleoperators and Virtual Environments, vol. 3, pp. 130–144, 1994.
  22. J. Steuer, “Defining virtual reality: dimensions determining telepresence,” Journal of Communication, vol. 42, no. 4, pp. 72–93, 1992.
  23. W. Winn, “A conceptual basis for educational applications of virtual reality,” Tech. Rep. HITL-TR-93-9, Human Interface Technology Laboratory, Seattle, Wash, USA, 1993.
  24. H. G. Hoffman, J. D. Prothero, M. J. Wells, and J. Groen, “Virtual chess: the role of meaning in the sensation of presence,” International Journal of Human-Computer Interaction, vol. 10, pp. 251–263, 1998.
  25. T. Schubert, F. Friedmann, and H. Regenbrecht, “The experience of presence: factor analytic insights,” Presence: Teleoperators and Virtual Environments, vol. 10, no. 3, pp. 266–281, 2001.
  26. R. M. Held and N. I. Durlach, “Telepresence,” Presence: Teleoperators and Virtual Environments, vol. 1, no. 1, pp. 109–112, 1992.
  27. D. W. Schloerb, “A quantitative measure of telepresence,” Presence: Teleoperators and Virtual Environments, vol. 4, pp. 64–80, 1995.
  28. F. Biocca, “The Cyborg's dilemma: progressive embodiment in virtual environments,” Journal of Computer-Mediated Communication, vol. 3, no. 2, 1997.
  29. F. Biocca, C. Harms, and J. K. Burgoon, “Toward a more robust theory and measure of social presence: review and suggested criteria,” Presence: Teleoperators and Virtual Environments, vol. 12, no. 5, pp. 456–480, 2003.
  30. P. Zahorik and R. L. Jenison, “Presence as being-in-the-world,” Presence: Teleoperators and Virtual Environments, vol. 7, no. 1, pp. 78–89, 1998.
  31. E. Ochs and L. Capps, Living Narrative: Creating Lives in Everyday Storytelling, Harvard University Press, Cambridge, Mass, USA, 2001.
  32. T. Wilkens, A. Hughes, B. M. Wildemuth, and G. Marchionini, “The role of narrative in understanding digital video: an exploratory analysis,” in Proceedings of the Annual Meeting of the American Society for Information Science, pp. 323–329, 2003.
  33. C. Breazeal, Sociable machines: expressive social exchange between humans and robots [Ph.D. thesis], Massachusetts Institute of Technology, 2000.
  34. H. Knight, “Eight lessons learned about non-verbal interactions through robot theater,” in Proceedings of the 3rd International Conference on Social Robotics (ICSR '11), pp. 42–51, Amsterdam, The Netherlands, November 2011.
  35. Y. Wilks, Close Engagements with Artificial Companions: Key Social, Psychological, Ethical and Design Issues (Natural Language Processing), John Benjamins, 2010.
  36. T. W. Bickmore and R. W. Picard, “Establishing and maintaining long-term human-computer relationships,” ACM Transactions on Computer-Human Interaction, vol. 12, no. 2, pp. 293–327, 2005.
  37. K. Severinson-Eklundh, A. Green, and H. Hüttenrauch, “Social and collaborative aspects of interaction with a service robot,” Technical Report, Royal Institute of Technology (KTH), Stockholm, Sweden, 2003.
  38. R. S. Lazarus, Emotion & Adaptation, Oxford University Press, New York, NY, USA, 1991.
  39. I. J. Roseman and C. A. Smith, “Appraisal theory: overview, assumptions, varieties, controversies,” in Appraisal Processes in Emotion: Theory, Methods, K. Scherer, A. Schorr, and T. Johnstone, Eds., Oxford University Press, Oxford, UK, 2001.
  40. A. Ortony, G. L. Clore, and A. Collins, The Cognitive Structure of Emotions, Cambridge University Press, Cambridge, UK, 1998.
  41. S. Tan, Emotion and the Structure of Narrative Film: Film as an Emotion Machine, Lawrence Erlbaum Associates, Mahwah, NJ, USA, 1996, Translated by B. Fasting.
  42. J. R. Meehan, “TALE-SPIN, an interactive program that writes stories,” in Proceedings of the 5th International Joint Conference on Artificial Intelligence (IJCAI '77), vol. 2, pp. 91–98, 1977.
  43. C. Peters, “Foundations of an agent theory of mind model for conversation initiation in virtual environments,” in AISB'05 Convention: Social Intelligence and Interaction in Animals, Robots and Agents - Joint Symposium on Virtual Social Agents: Social Presence Cues for Virtual Humanoids Empathic Interaction with Synthetic Characters Mind Minding Agents, pp. 163–170, April 2005.
  44. R. Hodhod, A. Piplica, and B. Magerko, “A formal model of shared mental models for computational improvisational agents,” in Proceedings of the 12th Annual Conference on Intelligent Virtual Agents, 2012.
  45. B. O'Neill and M. Riedl, “Dramatis: a computational model of suspense,” in AAAI, 2014.
  46. K. R. Scherer, “Appraisal considered as a process of multilevel sequential checking,” in Appraisal Processes in Emotion: Theory, Methods, K. Scherer, A. Schorr, and T. Johnstone, Eds., Oxford University Press, Oxford, UK, 2001.
  47. R. Aylett and S. Louchart, “If i were you: double appraisal in affective agents,” in Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems, vol. 3, pp. 1233–1236, International Foundation for Autonomous Agents and Multiagent Systems, 2008.
  48. M. Belkaid and N. Sabouret, “A logical model of theory of mind for virtual agents in the context of job interview simulation,” in Proceedings of the 2nd International Workshop on Intelligent Digital Games for Empowerment and Inclusion at IUI 2014, pp. 83–90, 2014.
  49. M. Si, S. C. Marsella, and D. V. Pynadath, “Modeling appraisal in theory of mind reasoning,” Autonomous Agents and Multi-Agent Systems, vol. 20, no. 1, pp. 14–31, 2009.
  50. J. Dias, R. Aylett, H. Reis, and A. Paiva, The Great Deceivers: Virtual Agents and Believable Lies, Cognitive Science, New York, NY, USA, 2013.
  51. J. E. Laird, “It knows what you're going to do: adding anticipation to a Quakebot,” in Proceedings of the 5th International Conference on Autonomous Agents, pp. 385–392, Montreal, Canada, June 2001.
  52. M. Harbers, K. Van den Bosch, and J.-J. Meyer, “Agents with a theory of mind in virtual training,” in Multi-Agent Systems for Education and Interactive Entertainment: Design, Use and Experience, M. Beer, M. Fasli, and D. Richards, Eds., chapter 9, pp. 172–187, IGI Global, Hershey, Pa, USA, 2011.
  53. E. Doirado and C. Martinho, “I mean it !: detecting user intentions to create believable behaviour for virtual agents in games,” in Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS '10), vol. 1, pp. 83–90, Toronto, Canada, May 2010.
  54. M. Mateas and A. Stern, “Integrating plot, character and natural language processing in the interactive drama Façade,” in Proceedings of the International Conference on Technologies for Interactive Digital Storytelling and Entertainment, 2003.
  55. M. Si, S. C. Marsella, and D. V. Pynadath, “Thespian: using multi-agent fitting to craft interactive drama,” in Proceedings of the 4th International Conference on Autonomous Agents and Multiagent Systems (AAMAS '05), pp. 21–28, July 2005.
  56. M. Si, S. C. Marsella, and D. V. Pynadath, “Thespian: modeling socially normative behavior in a decision-theoretic framework,” in Proceedings of the 6th International Conference on Intelligent Virtual Agents (IVA '06), pp. 369–382, Marina Del Rey, Calif, USA, August 2006.
  57. M. Si, S. C. Marsella, and D. V. Pynadath, “Proactive authoring for interactive drama: an author's assistant,” in Proceedings of the 7th International Conference on Intelligent Virtual Agents (IVA '07), pp. 225–237, 2007.
  58. M. Si, S. C. Marsella, and D. V. Pynadath, “Directorial control in a decision-theoretic framework for interactive narrative,” in Proceedings of the International Conference on Interactive Digital Storytelling, pp. 221–233, Guimarães, Portugal, 2009.
  59. M. Si, S. C. Marsella, and D. V. Pynadath, “Evaluating directorial control in a character-centric interactive narrative framework,” in Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS '10), Toronto, Canada, May 2010.
  60. S. C. Marsella, D. V. Pynadath, and S. J. Read, “PsychSim: agent-based modeling of social interactions and influence,” in Proceedings of the International Conference on Cognitive Modeling, pp. 243–248, 2004.
  61. D. V. Pynadath and S. C. Marsella, “PsychSim: modeling theory of mind with decision-theoretic agents,” in Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI '05), pp. 1181–1186, August 2005.
  62. R. D. Smallwood and E. J. Sondik, “The optimal control of partially observable Markov processes over a finite horizon,” Operations Research, vol. 21, no. 5, pp. 1071–1088, 1973.
  63. W. Lewis Johnson, C. Beal, A. Fowles-Winkler et al., “Tactical language training system: an interim report,” in Proceedings of the 7th International Conference on Intelligent Tutoring Systems, pp. 336–345, 2004.
  64. C. Beal, W. L. Johnson, R. Dabrowski, and S. Wu, “Individualized feedback and simulation-based practice in the tactical language training system: an experimental evaluation,” in Proceedings of the 12th International Conference on Artificial Intelligence in Education (AIED '05), July 2005.
  65. L. Egri, The Art of Dramatic Writing: Its Basis in the Creative Interpretation of Human Motives, Simon & Schuster, 2004.
  66. J. Y. Ito, D. V. Pynadath, and S. C. Marsella, “A decision-theoretic approach to evaluating posterior probabilities of mental models,” in Workshop on Plan, Activity, and Intent Recognition (AAAI '07), pp. 60–65, July 2007.
  67. D. V. Pynadath and S. C. Marsella, “Fitting and compilation of multiagent models through piecewise linear functions,” in Proceedings of the 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS '04), pp. 1197–1204, New York, NY, USA, July 2004.