We present concurrent theoretical work from HCI and Education that reveals a convergence of trends focused on the importance of three themes: embodiment, multimodality, and composition. We argue that there is great potential for truly transformative work that aligns HCI and Education research, and posit that there is an important opportunity to advance this effort through the full integration of the three themes into a theoretical and technological framework for learning. We present our own work in this regard, introducing the Situated Multimedia Arts Learning Lab (SMALLab). SMALLab is a mixed-reality environment where students collaborate and interact with sonic and visual media through full-body, 3D movements in an open physical space. SMALLab emphasizes human-to-human interaction within a multimodal, computational context. We present a recent case study that documents the development of a new SMALLab learning scenario, a collaborative student participation framework, a student-centered curriculum, and a three-day teaching experiment for seventy-two earth science students. Participating students demonstrated significant learning gains as a result of the treatment. We conclude that our theoretical and technological framework can be broadly applied in the realization of mixed reality, student-centered learning environments.

1. Introduction

Emerging research from Human-Computer Interaction (HCI) offers exciting new possibilities for the creation of transformative approaches to learning. Current sensing, modeling, and feedback paradigms can enrich collaborative learning, bridge the physical/digital realms, and prepare all students for the dynamic world they face. When grounded in contemporary research from the learning sciences, HCI approaches have great promise to redefine the future of learning and instruction through paradigms that cultivate the students’ sense of ownership and play in the learning process.

A convergence of recent trends across the Education and HCI research communities points to the promise of new learning environments that can realize this vision. In particular, many emerging technology-based learning systems are highly inquiry based, with the most effective being learner centered, knowledge centered, and assessment centered [1]. These systems are broadly termed student-centered learning environments (SCLEs). Looking to the future of learning, we envision a new breed of SCLE that is rooted in contemporary Education and HCI research and is tightly coupled with appropriate curriculum and instruction design. Our research is focused on three concepts in particular: embodiment, multimodality, and composition, which we define in Section 2.

We begin with a discussion of these key concepts and situate them in the context of both HCI and Education research. We present prior theoretical work and examples of the application of these three concepts in a variety of learning contexts. We then present our own work in the design and implementation of a new platform for learning, the Situated Multimedia Arts Learning Lab (SMALLab). SMALLab (Figure 1) is a mixed-reality environment where students collaborate and interact with sonic and visual media through vocalization and full-body, 3D movements in an open, physical space. SMALLab emphasizes human-to-human interaction within a computational multimodal feedback framework that is situated within an open physical space. In collaboration with a network of school and community partners, we have deployed SMALLab in a variety of informal and formal educational settings and community-based contexts, impacting thousands of students, teachers, and community members, many from underserved populations. We have developed innovative curricula in collaboration with our partner institutions. We summarize past deployments along with their supporting pilot studies and present two recent examples as case studies of SMALLab learning. Finally, we present conclusions and describe our ongoing work and future plans.

2. Prior Work

Recent research spanning Education and HCI has yielded three themes that inform our work across learning and play: embodiment, multimodality, and composition. Here, we define the scope of these terms in our research and discuss their theoretical basis before presenting examples of prior related applications.

2.1. Embodiment
2.1.1. Learning Sciences

By embodiment we mean that SMALLab interactions engage students both in mind and in body, encouraging them to physically explore concepts and systems by moving within and acting upon an environment.

A growing body of evidence supports the theory that cognition is “embodied,” that is, grounded in the sensorimotor system [2–5]. This research reveals that the way we think is a function of our body, its physical and temporal location, and our interactions with the world around us. In particular, the metaphors that shape our thinking arise from the body’s experiences in our world and are hence embodied [6].

A recent study of the development of reading comprehension in young children suggests that when children explicitly “index” or map words to the objects or activities that represent them, either physically or imaginatively, their comprehension improves dramatically [7]. This aligns well with the notion, advanced by Fauconnier and Turner [4], that words can be thought of as form-meaning pairs. For example, when a reader encounters the lexical form “train” in a sentence, he can readily supply the sound form (trān). If he then maps it to the image of a train (a locomotive pulling cars situated on a track), we have a form-meaning pair that activates the student’s mental model of trains, which he can then use to help him understand and interpret the sentence in which the word “train” appears [6].

SMALLab is a learning environment that supports and encourages students in this meaning-making activity by enabling them to make explicit connections between sounds, images, and movement. Abstract concepts can be represented, shared, and collaboratively experienced via physical interaction within a mixed-reality space.

2.1.2. HCI

Many emerging developments in HCI also emphasize the connections between physical activity and cognition [8–14], and the intimately embedded relationship between people and other entities and objects in the physical world [15–17]. The embodied cognition perspective [10, 14] argues, based on strong empirical evidence from psychology and neurobiology [7, 18], that perception, cognition, and action, rather than being separate and sequential stages in human interaction with the physical world, in fact occur simultaneously and are closely intertwined. Dourish [8, 9] in particular emphasizes the importance of context in embodied interaction, which emerges from the interaction rather than being fixed by the system. As such, traditional HCI frameworks such as desktop computing (i.e., mouse/keyboard/screen) environments, which facilitate embodied interaction in a limited sense or not at all, risk binding the user to the system context, restricting many of his/her capacities for creative expression and free thought which have proven so essential in effective learning contexts. From cognitive, ecological, and design psychology, Shepard [17], Gibson [15], Norman [16], and Galperin [19] further emphasize the importance of the embedded relationship between people and things, and the role that manipulating physical objects has in cognition. Papert, Resnick, and Harel (see [20–23]) extend these approaches by explicitly stating their importance in educational settings. Design-based learning methodologies such as Star Logo, Lego Mindstorms, and Scratch [21, 24, 25] emphasize physical-digital simulation and thinking. These have proven quite popular and effective in fostering and orienting students' innate creativity toward specific learning goals.

In order for these tools to extend further into the physical world and to make use of the important connections provided by embodiment, they must include physical elements that afford embodied interactions. Ishii has championed the field of tangible media [26] and coined the term tangible user interfaces (TUIs versus GUI: graphical user interfaces). His Tangible Media group has developed an extensive array of applications that pertain to enhancing not only productivity (e.g., Urban Simulation, SandScape) but also artistic expression and playful engagement in the context of learning (e.g., I/O Brush, Topobo, and Curlybot) [27].

Some prior examples of HCI systems that facilitate elements of embodiment and interaction with immersive environments include the Cave Automatic Virtual Environment (CAVE) [28]. CAVEs typically present an immersive environment through the use of 3D glasses or some other head-mounted display (HMD) that enables a user to engage through a remote control joystick. A related environment, described as a step toward the holodeck, was developed by Johnson at USC to teach topics ranging from submarine operation to Arabic language training [29]. In terms of extending physical activity through nontraditional interfaces and applying them to collaboration and social engagement, the Nintendo Wii’s recent impact on entertainment is the most pronounced. The Wii amply demonstrates the power of the body as a computing interface. Some learning environments that have made strides in this area include Musical Play Pen, KidsRoom, and RoBallet [30–32]. These interfaces demonstrate that movement-based HCI can greatly impact instructional design, play, and creativity.

2.1.3. Example

A particularly successful example of a learning environment that leverages embodiment in the context of instructional design is River City [33–36]. River City is a multiuser, online desktop virtual environment that enables middle school children to learn about disease transmission. The virtual world in River City embeds a river in various types of terrain which influence water runoff and other environmental factors that in turn influence the transmission of disease through water, air, and/or insect populations. The factors affecting disease transmission are complex and have many causes, paralleling conditions in the physical world. Student participants are virtually embodied in the world, enabling exploration through avatars that interact with each other, with facilitators’ avatars, and with the auditory and visual stimuli comprising the River City world. Participants can make complex decisions within this world by, for example, using virtual microscopes to examine water samples, and sharing and discussing their proposed solutions. In several pilot studies [33, 34], the level of motivation, the diversity and originality of participants’ solutions, and their overall content knowledge were found to increase with River City as opposed to a similar paper-based environment. Hence, the River City experience provides at least one successful example of how social embodiment through avatars in a multisensory world can result in learning gains.

However, a critical aspect of embodiment not addressed by River City is the bodily-kinesthetic sense of the participant. Physically, participants interact with River City using a mouse and keyboard, and view 2D projections of the 3D world on a screen. The screen physically separates users’ bodies from the environment, which implies that perception and bodily action are not as intimately connected as they are in the physical world, resulting in embodiment in a lesser sense [10]. In SMALLab, multiple participants interact with the system and with each other via expressive, full-body movement, and there is no physical barrier between the participants and the audiovisual environment they manipulate. It has long been hypothesized [37] that bodily-kinesthetic modes of representation and expression are an important dimension of learning and severely underutilized in traditional education. Thus, it is plausible that an environment that affords full-body interactions in the physical world can result in even greater learning gains.

2.2. Multimodality
2.2.1. Learning Sciences

By multimodality we mean interactions and knowledge representations that encompass students’ full sensory and expressive capabilities including visual, sonic, haptic, and kinesthetic/proprioceptive. Multimodality includes both student activities in SMALLab and the knowledge representations it enables.

The research of Jackendoff in cognitive linguistics suggests that information that an individual assimilates is encoded either as spatial representations (images) or as conceptual structures (symbols, words or equations) [38]. Traditional didactic approaches to teaching strongly favor the transmission of conceptual structures, and there is evidence that many students struggle with the process of translating these into spatial representations [6]. By contrast, information gleaned from the SMALLab environment is both propositional and imagistic as described above.

Working in SMALLab, students create multimodal artifacts such as sound recordings, videos, and digital images. They interact with computation using innovative multimodal interfaces such as 3D physical movements, visual programming interfaces, and audio capture technologies. These interfaces encourage the use of multiple modes of representation, which facilitates learning in general [39, 40], is robust to individual differences in students’ optimal learning styles [37, 41], and can serve to motivate learning [1].

2.2.2. HCI

Many recent developments in HCI have emphasized the role of immersive, multisensory interaction through multimodal (auditory, visual, and tactile) interface design. This work can be applied in the design of new mixed-reality spaces. For example, in combining audio and video in perceptive spaces, Wren et al. [42] describe their work in the development of environments utilizing unencumbered sensing technologies in situated environments. The authors present a variety of applications of this technology that span data visualization, interactive performance, and gaming. These technologies suggest powerful opportunities for the design of learning scenarios, but they have not yet been applied for this purpose.

Related work in arts and technology has influenced our approach to the design of mediated learning scenarios. Our work draws from extensive research in the creation of interactive sound environments [43–45]. While much of this work is focused on applications in interactive computer music performance, the core innovations for interactive sound can be directly applied in our work with students. In addition, we are drawing from the 3D visualization community [46] in considering how to best apply visual design elements (e.g., color, lighting, spatial composition) to render content in SMALLab.

There are many examples where HCI researchers are extending the multimodal tool set and applying it to novel technologically mediated experiences for learning and play. Ishii’s Music Bottles offer a multimodal experience through sound, physical interaction, and light color changes as different bottles are uncorked by the user to release sounds. The underlying sensor mechanism is a resonant RF coil that is modulated by an element in the cork. Edmonds has chronicled the significant contribution physiological sensors have made to the interactive computational media arts [47]. RoBallet uses laser beam-break sensors, such as those found in some elevators and garage doors, along with video and sonic feedback to engage students in interactive choreography and composition. Cavallo argues that this system would enable new forms of teaching not only music but math and programming as well [32]. The work described in this paper builds upon this prior work and is similarly extending the tools and domains for multimodal HCI interfaces as they apply to learning and play.

2.2.3. Example—the Mediate Environment

One example of an immersive, multisensory learning environment which emphasizes multimodality is MEDIATE, an environment designed to foster a sense of agency and a capacity for creative expression in people on the autistic spectrum (PAS). Autism is a variable neurodevelopmental disorder in which PAS are overwhelmed by the excessive stimuli, the noises and colors that characterize interaction in the physical world [48–50]. Perhaps as a result (although exact mechanisms and causes are unknown), PAS withdraw into their own world. They often find self-expression and even everyday social interaction difficult. MEDIATE, designed in collaboration with PAS, sets up an immersive 3D environment in which stimuli are quite focused and simplified, yet at the same time dynamic and engaging, capable of affording a wide range of creative expression. The MEDIATE infrastructure consists of a pair of planar screens alternating with a pair of tactile interface walls and completely surrounds the participant. On the screens are projected particle grids, a dynamic visual field which responds to the participant’s visual silhouette, his/her vocalizations and other sounds, and his/her tactile interactions [49]. A specially designed loudspeaker system provides immersive audio feedback that includes the subsonic range, and interface walls provide vibrotactile feedback.

Multimodality in MEDIATE is achieved through the integration of sonic, visual, and tactile interfaces in both sensing and feedback. The environment is particularly impressive in that it can potentially supplant the traditional classroom space with one that is much more conducive to learning in the context of PAS. However, MEDIATE remains specialized as a platform for PAS rehabilitation and has not been generalized for use in everyday classroom instruction. By contrast, SMALLab emphasizes multimodality in the context of real-world classroom settings, where the immersive media coexists in the realm of everyday social interactions. SMALLab enables students and teachers to work together, physically interacting, face-to-face with one another and digital media elements. Thus, it facilitates the emergence of a natural zone of proximal development [51] where, on an informal basis, facilitators and student peer experts can interact with novices and increase what they are able to accomplish in the interaction.

Although MEDIATE was designed in collaboration with PAS [48], participants are not able to build in new modes of interaction or further customize the interface. This idea of composition, which comes from building, extending, and reconfiguring the interaction framework, is essential to engaging participants in more complex and targeted learning situations and has been integral to the design of SMALLab.

2.3. Composition
2.3.1. Learning Sciences

Composition refers to reconfigurability, extensibility, and programmability of interaction tools and experiences. Specifically, we mean composition in two senses. First, students compose new interaction scenarios in service of learning. Second, educators and mentors can extend the toolset to support new types of learning that are tailored to their students’ needs.

In our design of the SMALLab learning experience, we proceed from the fundamentally constructivist view that knowledge must be actively constructed by the learner rather than passively received from the environment, and that the prior knowledge of the learner shapes the process of new construction [52]. Drawing on the views of social constructivists (i.e., Vygotsky, Bruner, Bandura, Cobb, Roth) we view the learning process as socially mediated, knowledge as socially and culturally constructed, and reality as not discovered but rather socially “invented” [40, 53, 54]. We venture beyond constructivism in subscribing to the notion that teaching and learning should be centered on the construction, validation, and application of models—flexible, reusable knowledge structures that scaffold thinking and reasoning [3, 55]. This constructive activity of modeling lies at the heart of student activity in SMALLab.

In their seminal work describing the situated nature of cognition, Brown et al. [56] observed that students in a classroom setting tend to acquire and use information in ways that are quite different from “just plain folks” (JPFs). They further revealed that the reasoning of experts and JPFs was far closer to one another than that of experts and students. They concluded that the culture of schooling, with its passive role for students and rule-based structure for social interactions, promotes decontextualization of information that leads to narrow procedural thinking and the inability to transfer lessons learned in one context to another. This finding highlights the importance of learning that is situated, both culturally and socially. SMALLab grounds students in a physical space that affords visual, haptic, and sonic feedback. The abstraction of conceptual information from this perceptual set is enabled through guided reflective practice of students as they engage in the modeling process.

Student engagement in the SMALLab experience is motivated both by the novelty of a learning environment that affords them some measure of control [57] and by the opportunity to work collaboratively to achieve a specific goal, where the pathway they take to this goal is not predetermined by the teacher or the curriculum. Hence, the SMALLab environment rewards originality and creativity with a unique digital-physical learning experience that affords new ways of exploring a problem space.

2.3.2. HCI

Compositional interfaces have a rich history in HCI, as evidenced by Papert and Minsky’s Turtle Logo, which fosters creative exploration and play in the context of a functional, Lisp-based programming environment [24]. More recent examples of HCI systems that incorporate compositional interfaces include novice-level programming tools such as Star Logo, Scratch, and Lego Mindstorms. Resnick extends these approaches through the Playful Invention and Exploration (PIE) Museum Network and the Intel Computer Club Houses [58], thus providing communities with tools for creative composition in rich, informal sociocultural contexts. Essentially, these interfaces create a “programming culture” at community technology centers, classrooms, and museums. There has been extensive research on the development of programming languages for creative practitioners, including graphical programming environments for musicians and multimedia artists such as Max/MSP/Jitter, Reaktor, and PD. This research has made significant contributions toward improving the impact and viability of programming tools as compositional interfaces.

Embedding physical interactions into objects for composition is a strategy for advancing embodied multimodal composition. Ryokai’s I/O Brush [59] is an example of a technology that encourages composition, learning, and play. This system enables capture from the physical world through a camera in the end of a paint brush that allows individuals to capture colors and textures from the physical world and compose with them in the digital world. It can even take video sequences such as a blinking eye that can then become part of the user’s digital painting. Composition is a profoundly empowering experience and one that many learning environments are also beginning to emphasize to a greater extent.

2.3.3. Example—Scratch

The Scratch programming environment [60] emphasizes the power of compositional paradigms for learning. Scratch enables students to create games, interactive stories, animations, music and art within a graphical programming environment. The interface extends the metaphor of LEGO bricks where programming functions snap together in a manner that prohibits programming errors and thus avoids the steep learning curve that can be a barrier to many students in traditional programming environments. The authors frame the goal of Scratch as providing “tinkerability” for learners that will allow them to experiment and redesign their creations in a manner that is analogous to physical elements, albeit with greater combinatorial sophistication.

Scratch has been deployed in a number of educational settings [25, 61]. In addition to focused research efforts to evaluate its impact, a growing Scratch community website, where authors can publish their work, provides mounting evidence that it is a powerful tool for fostering meaningful participation for a broad and diverse population.

Scratch incorporates multimodality through the integration of sound player modules within the primarily visual environment. However, it provides only a limited set of available tools for sound transformation (e.g., soundfile playback, speech synthesis) and as a consequence, authors are not able to achieve the multimodal sophistication that is possible within SMALLab. Similarly, Scratch addresses the theme of embodiment in the sense that authors and users can represent themselves as avatars within the digital realm. However, Scratch exists within the standard desktop computing paradigm and students cannot interact through other more physically embodied mechanisms.

2.4. Defining Play

With a focus on play in the context of games, Salen and Zimmerman [62] summarize a multitude of definitions. First they consider the diverse meanings and contexts of the very term “play.” They further articulate multiple scopes for the term, proposing a hierarchy comprised of three broad types. The most open sense is “being playful,” such as teasing or wordplay. Next is “ludic activity,” such as playing with a ball, but without the formal structure of a game. The most focused type is “game play,” where players adhere to rigid rules that define a particular game space.

Play and game play in particular have been shown to be an important motivational tool [63], and as Salen and Zimmerman note, play can be transformative as, “it can overflow and overwhelm the more rigid structure in which it is taking place, generating emergent, unpredictable results.” Our work is informed by these broad conceptions of play that are applied to the implementation of game-like learning scenarios for K-12 content learning [62].

Jenkins offers an expansive definition of play as “the capacity to experiment with one’s surroundings as a form of problem-solving” [64]. Students engaged in this type of play exhibit the same transformative effects as described by Salen and Zimmerman. We apply this definition of play as collaborative problem solving in our work with students in formal learning contexts.

2.5. Toward a Theoretical and Technological Integration

As described above, there has been extensive theoretical and practice-based research across Education and HCI that is aimed at improving learning through the use of embodiment, multimodality, and compositional frameworks. We have described examples of prior projects, each of which strongly emphasizes one or two of these concepts. This prior work has yielded significant results that demonstrate the powerful impact of educational research that is aligned with emerging HCI practices. However, while there are some prior examples of interactive platforms that integrate these principles [65], there are few prior efforts to date that do so while leveraging the powerful affordances of mixed reality for content learning. As such, there is an important opportunity to improve upon prior work.

In addition, many technologically driven efforts are limited by the use of leading edge technologies that are prohibitively expensive and/or too fragile for most real-world learning situations. As a consequence, many promising initiatives do not make a broad impact on students and cannot be properly evaluated owing to a failure to address the practical constraints of today’s classrooms and informal learning contexts. Specifically, in order to see large-scale deployment on a two- to five-year horizon, learning environments must be inexpensive, mindful of typical site constraints (e.g., space, connectivity, infrastructure support), robust, and easily maintainable. It is essential to reach a balance between reliance upon leading-edge technologies and consideration of the real-world context in order to collect longitudinal data over a broad population of learners that will demonstrate the efficacy of these approaches.

Our own efforts are focused on advancing research at the intersection of HCI and Education. We next describe a new mixed-reality environment for learning, a series of formative pilot studies, and two recent in-school programs that illustrate the implementation and demonstrate the impact of our work.

3. SMALLab: Integrated HCI for Learning

SMALLab represents a new breed of student-centered learning environment (SCLE) that incorporates multimodal sensing, modeling, and feedback while addressing the constraints of real-world classrooms. Figure 2 diagrams the full system architecture, and here we detail select hardware and software components.

Physically, SMALLab consists of a portable, freestanding media environment [66]. A cube-shaped trussing structure frames an open physical architecture and supports the following sensing and feedback equipment: a six-element array of Point Grey Dragonfly firewire cameras (three color, three infrared) for vision-based tracking, a top-mounted video projector providing real time visual feedback, four audio speakers for surround sound feedback, and an array of tracked physical objects (glowballs). A networked computing cluster with custom software drives the interactive system.

The open physical architecture of the space is designed to encourage human-to-human interaction, collaboration, and active learning within a computational framework. It can be housed in a large general-purpose classroom without the need for additional specialized equipment or installation procedures. The use of simple, unencumbered sensing technologies ensures that there is a minimal learning curve for interaction, and the system has been used in diverse educational contexts including schools and museums.

With the exception of the glowballs, all SMALLab hardware (e.g., audio speakers, cameras, multimedia computers, video projector, support structure) is readily available off-the-shelf. This ensures that SMALLab can be easily maintained throughout the life of a given installation as all components can be easily replaced. Furthermore, the use of readily available hardware contributes to the overall low cost of the system. We have custom developed all SMALLab software which is made freely available to our partner educational institutions.

SMALLab can be readily transported and installed in classrooms or community centers. We have previously disassembled, transported to a new site, reinstalled, and calibrated a functioning SMALLab system within one day’s time.

3.1. Multimodal Sensing

Groups of students and educators interact in SMALLab together through the manipulation of up to five illuminated glowball objects and a set of standard HID devices including wireless gamepads, Wii Remotes [67, 68], and commercial wireless pointer/clicker devices. The vision-based tracking system senses the real-time 3D position of these glowballs at a rate of 50–60 frames per second using robust multiview techniques [69]. To address interference from visual projection, each object is partially coated with a tape that reflects infrared light. Reflections from this tape can be picked up by the infrared cameras, while the visual projection cannot. Object position data is routed to custom software modules (described below) that perform various real-time pattern analyses on this data, and in response, generate real-time interactive sound and visual transformations in the space. With this simple framework we have developed an extensible suite of interactive learning scenarios and curricula that integrate the arts, sciences, and engineering education.
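As an illustration of the kind of real-time analysis that could be applied to tracked position data, the following minimal Python sketch estimates the speed of one glowball from two consecutive samples and maps it to a normalized feedback gain. It is not the SMALLab implementation: the `Sample` structure, the function names, the position units, and the `v_max` clamp value are all assumptions introduced here for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One tracked glowball position (hypothetical structure, not SMALLab's)."""
    t: float  # timestamp in seconds
    x: float
    y: float
    z: float


def speed(prev: Sample, curr: Sample) -> float:
    """Average speed between two samples, in position units per second."""
    dt = curr.t - prev.t
    if dt <= 0:
        raise ValueError("samples must be in increasing time order")
    distance = math.dist((prev.x, prev.y, prev.z), (curr.x, curr.y, curr.z))
    return distance / dt


def speed_to_gain(v: float, v_max: float = 3.0) -> float:
    """Map a speed onto a 0..1 feedback gain, clamped at v_max (assumed value)."""
    return max(0.0, min(v / v_max, 1.0))


# Two frames 1/60 s apart (within the 50-60 fps range reported above),
# with the ball moving 0.02 units along x.
a = Sample(t=0.0, x=1.00, y=1.0, z=1.5)
b = Sample(t=1 / 60, x=1.02, y=1.0, z=1.5)
gain = speed_to_gain(speed(a, b))
```

A gain of this kind could then drive, for example, the loudness of a sound or the brightness of a projected element; the actual mappings used in SMALLab scenarios are defined by scenario authors rather than fixed in code like this.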

3.2. Rich Media Database

SMALLab features an integrated and extensible rich media database that maintains multimodal content provided by students, teachers, and researchers. This is an important tool in support of multimodal knowledge representation in SMALLab. It manages audio, video, images, text, and 3D objects and enables users to annotate all media content with user-specific metadata and typed links between elements. The SCREM interface (described below) tightly integrates search and navigation tools so that scenario authors and students can readily access this media content.

3.3. SCREM

We apply the notion of composition at two levels. First, we have conceived of SMALLab as a modular framework to ensure that educators and administrators can continuously extend and improve it through the design and implementation of new scenarios. In this regard, SMALLab is not a one-size-fits-all solution, but rather, it enables an educator- and community-driven learning environment. Second, many SMALLab curricula emphasize learning through collaborative problem solving and open-ended design challenges. These approaches demand that students are able to readily design and deploy new interactive scenarios through the manipulation of powerful, yet easy to use interfaces—interfaces that provide both depth and breadth.

To this end we have developed an integrated authoring environment, the SMALLab core for realizing experiential media (SCREM). SCREM is a high-level object oriented framework that is at the center of interaction design and multimodal feedback in SMALLab. It provides a suite of graphical user interfaces to either create new learning scenarios or modify existing frameworks. It provides integrated tools for adding, annotating, and linking content in the SMALLab Media Content database. It facilitates rapid prototyping of learning scenarios, enables multiple entry points for the creation of scenarios, and provides age and ability appropriate authoring tools.

SCREM supports student and teacher composition at three levels. First, users can easily load and unload existing learning scenarios. These learning scenarios are stored in an XML format that specifies interactive mappings, visual and sonic rendering attributes, typed media objects, and metadata including the scenario name and date. Second, users can configure new scenarios through the reuse of software elements that are instantiated, destroyed, and modified via a graphical user interface. Third, developers can write new software code modules through a plug-in type architecture that are then made available through the high-level mechanisms described above. Depending on developer needs, low-level SMALLab code can be written in a number of languages and media frameworks including Max/MSP/Jitter, Javascript, Java, C++, Objective C, Open Scene Graph, and VR-Juggler.
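To make the first level concrete, the following is a minimal sketch of loading a stored scenario with Python's standard `xml.etree.ElementTree` module. The actual SCREM schema is not described here, so every element and attribute name (`scenario`, `mapping`, `media`, `source`, `target`, `uri`) and every value, including the date, is a hypothetical placeholder chosen only to mirror the categories listed above.

```python
import xml.etree.ElementTree as ET

# Hypothetical scenario document; the real SCREM schema is not given in the text.
SCENARIO_XML = """
<scenario name="layer-cake-builder" date="2008-03-01">
  <mapping source="glowball1.position" target="sediment.select"/>
  <mapping source="wiimote.acceleration" target="fault.tension"/>
  <media type="image" uri="environments/desert.png"/>
  <media type="audio" uri="environments/desert.wav"/>
</scenario>
"""

def load_scenario(xml_text):
    """Return scenario metadata, interactive mappings, and typed media objects."""
    root = ET.fromstring(xml_text.strip())
    return {
        "name": root.get("name"),
        "date": root.get("date"),
        "mappings": [(m.get("source"), m.get("target")) for m in root.findall("mapping")],
        "media": [(m.get("type"), m.get("uri")) for m in root.findall("media")],
    }

scenario = load_scenario(SCENARIO_XML)
print(scenario["name"], len(scenario["mappings"]), "mappings")
```

Keeping scenarios as declarative documents of this kind is what would allow them to be loaded and unloaded without touching the lower-level plug-in code.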

3.4. SLink Web Portal

The SMALLab Link, or SLink, web portal [66] provides an online interface that enables teaching and learning to seamlessly span multiple SMALLab installations and to extend from the physical learning environment into students’ digital realms. It serves three functions: (1) a supportive technology, (2) a research tool, and (3) an interface to augment SMALLab learning.

As a supportive technology, SLink acts as a central server for all SMALLab media content and user data. It provides functionality to sync media content that is created at a given SMALLab site to all other sites while preserving unique metadata. Similarly, SLink maintains dynamic student and educator profiles that can be accessed by teachers and researchers online or in SMALLab.

SLink is a research tool and an important component of the learning evaluation infrastructure. Through a browser-based interface, educational researchers can submit, search, view, and annotate video documentation of SMALLab learning. Multiple annotations and annotator metadata are maintained for each documentation element.

SLink also serves as a tool for students, who can access or contribute media content from any location through the web interface. This media content and its metadata sync to all SMALLab installations. In ongoing work, we are expanding the SLink web interface to provide greater functionality for students. Specifically, we are developing tools to search and render 3D SMALLab movement data through a browser-based application. Student audio interactions can be published as podcasts, and visual interactions presented as streaming movies. In these ways, SLink extends into the web our paradigms of multimodal interaction and learning through composition.

3.5. Experiential Activity Archive

All glowball position data, path shape quality information, SCREM interface actions, and projected media data are streamed in real time to a central archive application. Incoming data is timestamped and inserted into a MySQL database where it is made available in three ways. First, archived data can be replayed in real time such that it can be rerendered in SMALLab for the purpose of supporting reflection and discussion among students regarding their interactions. Second, archived data is made available to learners and researchers through the SLink web interface. Third, archived data can be later mined for the purposes of evaluation and assessment of SMALLab learning. We are currently developing a greatly expanded version of the activity archive that will include the archival of real-time video and audio streams, interfaces to create semantic links among entries, and tools to access the data from multiple perspectives.
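The timestamp-then-replay idea can be illustrated with a short sketch. It uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for MySQL, and the table layout, column names, and stream label are assumptions made for illustration, not the archive's actual schema.

```python
import sqlite3

def open_archive():
    """Create an in-memory archive table (SQLite stands in for MySQL here)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE samples (t REAL, stream TEXT, x REAL, y REAL, z REAL)")
    return db

def record(db, t, stream, position):
    """Insert one timestamped 3D position sample."""
    x, y, z = position
    db.execute("INSERT INTO samples VALUES (?, ?, ?, ?, ?)", (t, stream, x, y, z))

def replay_schedule(db, stream):
    """Return (delay_since_previous_sample, position) pairs in time order."""
    rows = db.execute(
        "SELECT t, x, y, z FROM samples WHERE stream = ? ORDER BY t", (stream,)
    ).fetchall()
    schedule, previous = [], None
    for t, x, y, z in rows:
        delay = 0.0 if previous is None else t - previous
        schedule.append((delay, (x, y, z)))
        previous = t
    return schedule

db = open_archive()
# Samples may be inserted out of order; the timestamps restore the original timing.
record(db, 0.50, "glowball1", (0.2, 1.0, 0.4))
record(db, 0.00, "glowball1", (0.0, 1.0, 0.0))
record(db, 0.25, "glowball1", (0.1, 1.0, 0.2))
schedule = replay_schedule(db, "glowball1")
```

A replayer would wait for each delay before handing the position back to the renderer, which is what makes playback run at the original speed.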

4. Case Study: Earth Science Learning in SMALLab

Having presented a theoretical basis and described the development and integration of various HCI technologies into a new mixed-reality environment, we now focus on the application and evaluation of SMALLab for learning. This research is undertaken at multiple levels including focused user studies to validate subcomponents of the system [70, 71], and perception/action experiments to better understand the nature of embodied interaction in mixed-reality systems such as SMALLab [72]. Over the past several years we have reached over 25,000 students and educators through research and outreach in both formal and informal contexts that span the arts, humanities, and sciences [73–75]. This prior work serves as an empirical base that informs our theoretical framework. Here we present a recent case study to illustrate our methodology and results.

4.1. Research Context

In Summer 2007 we began a long-term partnership with a large urban high school in the greater Phoenix, AZ metropolitan area. We have permanently installed SMALLab in a classroom and are working closely with teachers and students across the campus to design and deploy new learning scenarios. This site is typical of public schools in our region. The student demographic is 50% white, 38% Hispanic, 6% Native American, 4% African American, and 2% other. 50% of students are on free or reduced lunch programs, indicating that many students are of low socioeconomic status. 11% are English language learners and 89% of these students speak Spanish at home. In this study, we are working with 9th grade students and teachers from the school’s dedicated program for at-risk students. The program is a specialized “school within a school” with a dedicated faculty and administration. Students are identified for the program because they are reading at least two levels below their grade and have been recommended by their middle school teachers and counselors. After almost a year of classroom observation by our research team, it is evident that students are tracked into this type of program, not because they have low abilities, but because they are often underserved by traditional instructional approaches and exhibit low motivation for learning. Our work seeks to address the needs of this population of students and teachers.

Throughout the year, we collaborated with a cohort of high school teachers to design new SMALLab learning scenarios and curricula for language arts and science content learning. Embodiment, multimodality, and composition served as pillars to frame the formulation of new SMALLab learning scenarios, associated curricula, and the instructional design. In this context, we present one such teaching experiment. This case study illustrates the use of SMALLab for teaching and learning in a conventional K-12 classroom. It demonstrates the implementation of our theoretical framework around the integration of embodiment, multimodality, and composition in a single learning experience. Finally, we present empirical evidence of student learning gains as a result of the intervention.

4.2. Design and Teaching Experiment

The evolution of the earth’s surface is a complex geologic process that is impacted by numerous interdependent processes. Geologic evolution is an important area of study for high school students because it provides a context for the exploration of systems thinking [76] that touches upon a wide array of earth science topics. Despite the complex, dynamic nature of this process, geologic evolution is typically studied in a very static manner in the classroom. In a typical learning activity, students are provided with an image of the cross-section of the earth’s crust. Due to the layered structure of the rock formations, this is sometimes termed a geologic layer cake. Students are asked to annotate the image by labeling the rock layer names, ordering the layers according to which were deposited first, and identifying evidence of uplift and erosion [77]. Our partner teacher has many years of experience with conventional teaching approaches in his classroom. Through preliminary design discussions with him, we identified a deficiency of this traditional instructional approach: when students do not actively engage geologic evolution as a time-based, generative process, they often fail to conceptualize the artifacts (i.e., cross-sections of the earth’s surface) as the products of a complex, dynamic process. As a consequence, they struggle to develop robust conceptual models during the learning process.

For six weeks we collaborated with the classroom teacher, using the SMALLab authoring tools, to realize a new mixed-reality learning scenario to aid learning about geologic evolution in a new way. Our three-part theoretical framework guided this work: embodiment, multimodality, and composition. At the end of this process, the teacher led a three-day teaching experiment with seventy-two of his ninth-grade earth science students from the CORE program. The goals for the teaching experiment were twofold. First, we wanted to advance participating students’ understanding of earth science concepts relating to geologic evolution. Second, we wanted to evaluate our theoretical framework and validate SMALLab as a platform for mixed-reality learning in a formal classroom-learning environment.

We identified four content learning goals for students: (1) understanding of the principle of superposition—that older structures typically exist below younger structures on the surface of the Earth; (2) understanding geologic evolution as a complex process with interdependent relationships between surface conditions, fault events, and erosion forces; (3) understanding that geologic evolution is a time-based process that unfolds over multiple scales; (4) understanding how the fossil record can provide clues regarding the age of geologic structures. These topics are central to high school earth science learning and are components of the state of Arizona earth and space science standards [78]. We further stipulate that from the theoretical perspective of modeling instruction [79, 80] students should be able to apply a conceptual model of geologic evolution that integrates both descriptive and explanatory elements of these principles.

Our collaborative design process yielded three parts: (1) a new mixed-reality learning scenario, (2) a student participation framework, and (3) an associated curriculum. We now describe each of these parts, discussing how each tenet of our theoretical framework is expressed. We follow this with a discussion of the outcomes with respect to our goals.

4.2.1. Interactive Scenario: Layer-Cake Builders

Figure 3 shows the visual scene that is projected onto the floor of SMALLab. Within the scene, the center portion is the layer-cake construction area where students deposit sediment layers and fossils. Along the edges, students see three sets of images. At the bottom they see depictions of depositional environments. At the top are images that represent sedimentary layers. To the right they see an array of plant and animal images that represent the fossil record. Each image is an interactive element that can be selected by students and inserted into the layer-cake structure. The images are iconic forms that students encounter in their studies outside of SMALLab. A standard wireless game pad controller is used to select the current depositional environment from the five options. When one student makes a selection, other students will see the image of the environment and hear a corresponding ambient sound file. One SMALLab glowball is used to grab a sediment layer—by hovering above it—from five options and drop it onto the layer-cake structure in the center of the space. This action will insert the layer into the layer-cake structure at the level that corresponds with the current time period. A second glowball is used to grab a fossil from ten options and drop it onto the structure. This action embeds the fossil in the current sediment layer. On the east side of the display, students see an interactive clock with geologic time advancing to increment each new period. Three buttons on a wireless pointer device are used to pause, play, and reset geologic time. A bar graph displays the current fault tension value in real time. Students use a Wii remote game controller [67, 68], with embedded accelerometers, to generate fault events. The more vigorously a user shakes the device, the more the fault tension will increase. Holding the device still will decrease the fault tension. When a tension threshold is exceeded, a fault event (i.e., earthquake) will occur, resulting in uplift in the layer-cake structure. Fault events can be generated at any time during the building process. Subsequently, erosion occurs on the uplifted portion of the structure.
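The fault tension mechanic can be summarized in a small sketch. The threshold, gain, and decay constants, the notion of a scalar "shake intensity" derived from the accelerometers, and the reset of tension after an event are all assumptions made for illustration; the scenario's actual parameters are not reported here.

```python
from dataclasses import dataclass

@dataclass
class FaultTension:
    """Toy fault-tension accumulator; all constants are illustrative guesses."""
    threshold: float = 10.0   # tension above which a fault event fires
    gain: float = 1.0         # tension added per unit of shake intensity
    decay: float = 0.5        # tension lost per update while the device is still
    still_below: float = 0.1  # shake intensity treated as "holding still"
    tension: float = 0.0

    def update(self, shake_intensity: float) -> bool:
        """Advance one step; return True if a fault event (earthquake) occurs."""
        if shake_intensity < self.still_below:
            self.tension = max(0.0, self.tension - self.decay)
        else:
            self.tension += self.gain * shake_intensity
        if self.tension > self.threshold:
            self.tension = 0.0  # assumption: the earthquake releases the tension
            return True
        return False

fault = FaultTension()
# Two shakes, two still frames, then two harder shakes that cross the threshold.
events = [fault.update(s) for s in [3.0, 4.0, 0.0, 0.0, 2.0, 5.0]]
```

In this sketch the value of `fault.tension` after each update is what the bar graph would display.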

Figure 4 illustrates that in addition to the visual feedback present in the scene, students hear sound feedback with each action they take. A variety of organic sound events including short clicks and ticks accompany the selection and deposit of sediment layers and fossils. These events were created from field recordings of natural materials such as stones. This feedback contributes to an overarching soundscape that is designed to enrich students’ sense of immersion in the earth science model. In addition, key earth science concepts and compositional actions are communicated to the larger group through sound. For example, the selection of a depositional environment is represented visually through an image, and sonically through looping playback of a corresponding sound file. If a student selects the depositional environment of a fast moving stream, all students will see an image of the stream, and hear the sound of fast moving water. The multimodal display first alerts all students to be aware of important events in the compositional process. In addition, the dynamic nature of the fast moving water sound communicates important features of the environment itself that are not necessarily conveyed through image alone. Specifically, a fast moving stream is associated with the deposition of a conglomerate sediment layer that contains a mixture of large and small particles. The power of water to move large rocks and even boulders is conveyed to students through sound.

While students are engaged in the compositional process, sound is an important component of how they parse the activity and cue their own actions. Here we present a transcript from a typical layer-cake build episode, demonstrating how sound helps students to orient themselves in the process. In the transcription T is the teacher and FS indicates a member of the fossil selection team:

(The student holding the controller from the depositional environment group selects an environment and the sound of ocean waves can be heard. Responding to the sound cue without even looking up at the image of the depositional environment highlighted, the student controlling the glowball for the sediment layer team moves to select limestone.)

FS1: Shallow ocean.

FS2: Wait, wait, wait.

(As the student holding the fossil glowball moves to make his selection, a fossil team member tells the boy with the glowball to wait because he could not see what sediment layer had been selected. After the sediment group and the fossil group make their selections, someone from the depositional environment team changes their selection. When the sound of a new environment is heard, the fossil team selector student (FS1) looks at the new environment and sees that the fossil he deposited is no longer appropriate for this environment. He picks up an image of a swimming reptile but then pauses uncertainly before depositing it.)

FS2: Just change it.

T: Just change it to the swimming reptile.

(The clock chimes the completion of one cycle at this point. The depositional environment team shifts their choice to desert and a whistling wind sound can be heard. Again, without even looking at the depositional environment image, the fossil group selector, FS2, quickly grabs a fossil and deposits it while the sediment layer girl runs back and forth above her 5 choices trying to decide which one to choose. She finally settles on one, picks and deposits it and then hands off the glowball and sits down. The next two selector students stand at the edge of the mat waiting for the clock to complete another cycle. The assessment team is diligently taking notes on what has been deposited. Another cycle proceeds as the sound of ocean waves can be heard. Students controlling the glowballs move quickly to make their selections without referring to the highlighted depositional environment.)

As shown in Figure 5, during the learning activities, all students are copresent in the space, and the scenario takes advantage of the embodied nature of SMALLab. For example, the concept of fault tension is embodied in the physical act of vigorously shaking the Wii Remote game controller. In addition this gesture clearly communicates the user’s intent to the entire group. Similarly, the deliberate gesture of physically stooping to select a fossil and carrying it across the space before depositing it in the layer-cake structure allows all students to observe, consider and act upon this decision as it is unfolding. Students might intervene verbally to challenge or encourage such a decision. Or they might coach a student who is struggling to take action. Having described the components of the system, we now narrate and discuss the framework that enables a class of over twenty students to participate in the scenario.

4.2.2. Participation Framework

The process of constructing a layer cake involves four lead roles for students: (1) the depositional environment selector, (2) the sediment layer selector, (3) the fossil record selector, and (4) the fault event generator. In Figure 4, we diagram the relationship between each of these participant roles (top layer) and the physical interaction device (next layer down). The teacher typically assumes the role of geologic time controller.

In the classroom, approximately twenty to twenty-five students are divided into four teams of five or six students each. Three teams are in active rotation during the build process, such that they take turns serving as the action lead with each cycle of the geologic clock. These teams are the (1) depositional environment team and fault event team, (2) the sediment layer team, and (3) the fossil team. The remaining students constitute the evaluation team. These “evaluator” students are tasked to monitor the build process, record the activities of action leads, and to steer the discussion during the reflection process. Students are encouraged to verbally coach their teammates during the process.

There are at least two ways in which the build process can be structured. On the one hand, the process can be purely open ended, with the depositional environment student leading the process, experimenting with the outcomes, but without a specific constraint. This is an exploratory compositional process. Alternatively, the students can reference an existing layer-cake structure as a script such as the one pictured in Figure 6. This second scenario is a goal directed framing where only two students have access to the original script, but all participants must work together to reconstruct the original. At the end of the build cycle, students compare their structure against the original. In this discussion we narrate the goal-directed build process.

At the beginning of each geologic period, the lead “depositional environment” student examines the attributes of the source structure (e.g., Figure 6) and selects the appropriate depositional environment or surface condition on the earth. All students see an image and hear a sonic representation of the depositional environment. Based on that selected condition, another student grabs the appropriate sedimentary rock, and drops it onto the structure. While considering the current evolutionary time period and the current depositional environment, another student grabs a fossilized animal and lays it into the sedimentary layer. To address any potential student misconceptions, the teacher initially leads a discussion to clarify that fossilization is yet another example of a geologic process that students should be aware of, despite the fact that it is not a focus of this particular activity. If a student changes their mind, sediment and fossil layers can be replaced by another element within a given geologic time period. As the geologic clock finishes a cycle, the next period begins. The action lead passes their interaction device to the next teammate, and these students collaborate to construct the next layer. The rotation continues in like fashion until the layer cake is complete. In this manner, the layer-cake build process unfolds as a semistructured choreography of thought and action that is distributed across the four action leads and their teammates. The teams rotate their roles each time a new layer cake is to be constructed. The fossil students become evaluators, while the evaluators become the sediment layer team and so forth.
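The two levels of rotation described above (action leads within a team at each geologic period, and team roles at each new layer cake) can be sketched as follows. The text only states that fossil students become evaluators and evaluators become the sediment layer team; the remaining steps of the cycle, the team labels, and the student names are assumptions for illustration.

```python
# A team holding ROLE_CYCLE[i] takes ROLE_CYCLE[i + 1] for the next layer cake.
# Only fossil -> evaluation -> sediment is stated in the text; the rest is assumed.
ROLE_CYCLE = ["fossil", "evaluation", "sediment", "environment_and_fault"]

def rotate_roles(assignment):
    """Return the team-to-role assignment for the next layer cake."""
    nxt = {}
    for team, role in assignment.items():
        i = ROLE_CYCLE.index(role)
        nxt[team] = ROLE_CYCLE[(i + 1) % len(ROLE_CYCLE)]
    return nxt

def action_lead(members, period):
    """Pick the action lead for a geologic period; the device passes to the next teammate."""
    return members[period % len(members)]

assignment = {
    "A": "environment_and_fault",
    "B": "sediment",
    "C": "fossil",
    "D": "evaluation",
}
next_assignment = rotate_roles(assignment)

team_c = ["Ana", "Ben", "Cal", "Dee", "Eli"]  # placeholder names
leads = [action_lead(team_c, p) for p in range(6)]
```

With five or six students per team, every student serves as action lead at least once in a layer cake of six or more periods under this scheme.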

From a compositional perspective, this process is open ended and improvisational. By open ended we mean that any combination of depositional environments, sediment layers, fossils, and fault events can occur without constraint from the technology itself. By improvisational we mean that it unfolds in real time, and each participant acts with a clearly defined role, yet independently of the other students. The participation framework is analogous to a group of improvising jazz musicians. Students have individual agency to think and act freely. Yet they are bound by a constrained environment and driven by the shared goal of producing a highly structured outcome. Composition is distributed across multiple students where each has a clearly defined role to play and a distinct contribution to be made toward the collective goal. Collective success or failure depends on all participants. This process unfolds in real time with the expectation that there will be continuous face-to-face communication between participants.

This interaction model affords rich opportunities for whole group action and discussion about the relationship between in-the-moment events and the consequence of these decisions in the final outcome. For example, “fault event” students are free to generate earthquake after earthquake and explore the outcomes of this activity pattern including its impact on students who are depositing sediment layers and fossils. Through this experimentation, students come to understand that in the real world, just as in the model, periods of numerous fault events are often interspersed with periods of little activity. This is a system-level understanding of geologic evolution that must be negotiated by teams of students over the course of numerous cycles.

The learning activity is a form of structured play in two senses. Following Salen and Zimmerman’s model, the layer-cake build process unfolds in a structured manner as defined by the interaction framework. However, the play activity can take different forms according to the metarules set by the teacher. For example, during the open-ended compositional process, play is akin to “ludic activity” where a clear game space is articulated in SMALLab, but there are not clearly defined start and end conditions. When the activity is structured with a reference layer-cake image and students are given the explicit goal to recreate that structure, the activity takes the form of goal oriented “game play.” Jenkins’ notion of play also frames the learning activity as he defines play to be “the capacity to experiment with one’s surroundings as a form of problem-solving.” Again, in both the open-ended and structured forms, the layer-cake build process is posed as a complex problem-solving activity that unfolds in real time. Importantly, individual participants must cooperatively integrate their thoughts and actions to achieve a shared success.

4.2.3. Curriculum

We collaborated with our partner teacher to design a curriculum that he implemented during a total of three forty-five-minute class periods across three consecutive days. The curriculum is informed by our overarching theoretical framework and is designed to foster student-centered learning. Student activity is structured around a repeating composition-reflection cycle. From a modeling instruction perspective [79, 80], this activity cycle supports students’ underlying cognitive process, which we term knowledge construction-consolidation. During the first phase of the cycle (i.e., activity = composition and cognitive process = knowledge construction), students construct a simple conceptual model of the evolution of the earth’s crust. Teams of students work together in real time to create a layer-cake representation of this model. By engaging in this hands-on, compositional activity, they continuously form, test, and revise their model. This phase is immediately followed by a second stage (i.e., activity = reflection and cognitive process = knowledge consolidation) in which students discuss their activities, analyze any flaws in decision making, make sense of the various aspects of the layer-cake structure, and challenge one another to justify their choices. This reflective activity leads to a consolidation of the conceptual model that was interactively explored during the first phase. With each iteration of this cycle, new elements are introduced and new knowledge is tested and consolidated, ultimately leading to a robust and coherent conceptual model of the process of geologic evolution.

As this was the first experience in SMALLab for most students, day one began with a brief introduction to the basic technology and an overview of the teacher’s expectations. The teacher then introduced the technological components of the learning scenario itself and students were divided into teams to begin creating layer-cake structures in an open-ended, exploratory fashion. During this first day, the teacher structured the interactions, frequently pausing the scenario and prompting students to articulate their thinking before continuing the interaction. For example, he first started the geologic clock and asked the depositional environment team to select an environment, leading a discussion of the images and sounds, and what they represent. Once an environment was selected he would stop the geologic clock and ask the sediment layer team to discuss the sediment icons and why a particular selection would be appropriate or not. Once geologic time was restarted, the team selected the best sediment layer and placed it in the layer-cake structure. Similar discussions and actions unfolded for the selection of an appropriate fossil. Over the course of the class period, the teacher intervened less and less as the students improved in their ability to coordinate their activities and reason through the construction process on their own. Figure 6 shows an example of the outcome of a layer-cake build cycle. During each reflection stage, we captured a screenshot of the layer-cake structure, then uploaded and annotated it in the SLink database for later reference.

During day two, the teacher introduced the fault event interface and teams assumed this role in a manner similar to the exploration of day one. Discussions regarding the selection of the fossil record grew more detailed as students were challenged to consider both the environmental conditions and the sequence of geologic time in their selection process. For example, students reasoned through an understanding of why mammalian fossils should not appear early in the fossil record due to their understanding of the biological evolution of species. Midway through the class, the teacher moved students to the structured build process. He provided the “depositional environment” team with source images that show geologic cross-sections of the earth’s crust such as the one pictured in Figure 6. These students had to interpret the sequence of sediment layers and uplift/erosion evidence to properly initiate the environments and fault events that would cause the actions that followed to reproduce the source image. Only the few students on the “depositional environment” team had access to this source image. Thus all others’ actions were dependent on their decision making. For example, the “sediment” selection team could potentially add a rock layer that did not align with the source image for a particular geologic period. While this could stem from a misunderstanding by their action lead, it might instead be due to the improper selection of a depositional environment. Or both the depositional environment and the sediment could be selected incorrectly, causing a chain of deviations that would have to be unraveled at the end of the build. Students continued iterating through the composition-reflection process, rotating roles with each cycle, structuring their successive interactions, and measuring their progress with the explicit goal of replicating the reference layer cake. The teacher at times guided this reflective process, but the student “evaluation” team members increasingly led these discussions.

On day three, the teacher led a summative assessment activity. Prior to the session, he worked in SMALLab to create a set of four layer-cake structures. He captured screenshots and printed images of these four structures. During class the students worked to recreate each of the structures in a similar manner as in day two. At the end of each build process, the “evaluation” team reported any deviations from the reference structure, and the build teams were given the opportunity to justify and defend their actions. The teacher assigned a grade to each student at the end of the class period. These grades were a measure of their mastery of the build process as indicated by their ability to effectively contribute to the replication of the source structure and/or justify any deviations. Similar to days one and two, team action leads rotated with each new geologic period, and teams rotated through the different roles each time a new script was introduced. During this class session the teacher made very few interventions as students were allowed to reason through the building and evaluation process on their own.

4.3. Outcomes

During the final in-class assessment activity on day three, all teams demonstrated an impressive ability to accurately reproduce the source structure. Collectively, the students composed fifteen layer cakes during day three. Eleven of the fifteen results were either a perfect match or within tolerable limits (e.g., only a slight deviation in the intensity of a fault event or no more than one incorrect sediment layer) of the source structure. Deviations typically stemmed from students’ selection of alternate fossils in circumstances where there was room for interpretation or minor deviations in the magnitude of fault events within a given geologic period. Students also exhibited improvement in their ability to justify their actions, developing arguments by the final day that suggest they quickly developed robust conceptual models of the underlying content.

For example, below is a transcript of the teacher and students in a typical cycle of composition reflection from day one of the treatment. The teacher is controlling geologic time during this episode. When the transcription begins, the students are in the middle of a layer-cake build process and have just completed a discussion about creating one layer. After his first comment, the teacher starts the geologic clock again, and the students commence constructing the next layer. In the transcriptions, T is the teacher, and students are identified by a first initial or by S if the exact voice could not be identified.

T: Alright, let’s go one more time.
(Sound of rushing water. The students with the glowballs pick a sediment layer (sandstone) and a fossil (fish) and lay them into the scenario. This takes less than 10 seconds. When they are done the teacher pauses the geologic clock to engage them in reflection.)
T: Alright, depositional environment: what are we looking at?
Ss: A river.
T: A river. Sandstone. Is that a reasonable choice for a type of rock that forms in a river? (Shrugs) Could be. Is there any other types of rock over there that form in a river. Chuck. What’s another rock over there that might form in a river?
C: In a river? I can’t find one...
T: In a river. (There is a pause of several seconds.)
S: Conglomerate.
T: Alright. Conglomerate is also an acceptable answer. Sandstone’s not a bad answer. Conglomerate is pretty good: big chunks of rock that wash down in the river. So, what kind of fossil did you put in?
S: A fish.
T: A fish, okay. A fish in a stream makes good sense. Let’s think about the fossils that we have in here. First we have a trilobite and then we had a jellyfish, then we had a fern and then we had a fish, alright? Is there anything wrong with the order of these animals so far?
S: They’re aging.
T: What do you mean, “they’re aging”?
S: Evolution?
T: It’s evolution, so which ones should be the older fossils? (Pause of several seconds.)
S: Trilobite?
T: Trilobite in this case. Why the trilobite in this case? How do we know the trilobite’s the oldest?
S: Because it’s dead.
T: Just look at the picture. How do we know that the trilobite is oldest?
S: Because it’s on the bottom?
T: We know that the oldest rocks are found...
S: On the bottom.
T: ...on the bottom. So that’s another thing that we want to make sure that we’re keeping in check: we don’t want to end up putting a whale on the bottom and a trilobite on top of a whale, because what kind of animal is a whale? (Pause) It’s a mammal, alright? Mammals are relatively recently evolved. So let’s pass off the spheres, guys. This next cycle I’m going to do a little different. I’m going to let two cycles go through without stopping you. Let’s see how well we can do with the two cycles.

Now we present a brief transcription of a typical episode from day 3. Here, students have just finished building a complete layer cake. One student team controlled the depositional environment and faulting events, another team controlled sediment layers, a third team controlled fossils, and a fourth team acted as evaluators, determining the plausibility of various elements used in the construction.

T: Alright, JR, what’s the first rock supposed to be?
JR: They got them all right.
T: All the rocks are correct?
JR: Yeah.
T: Ok. How about depositional environments, and Walt, you’re going to have to help her with this: do all the depositional environments match up with the rocks that were chosen?
W: Yeah.
T: All the rocks match up. What about the fossils, A (student)?
A: They actually had some differences...
T: It doesn’t have to exactly as it is on here. This is just a suggested order, right? What you need to do is figure out whether or not the ones they chose fit their environment.
W: Yeah. Well, except for...
S1: Except for the fern in there...
W: Yeah, number 9 was supposed to be a fish, but it was a fern.
T: Ok, well, like I said, it doesn’t necessarily have to be the fish that’s there. Is a fern possible as a formation of a fossil in a conglomerate, which is what type of depositional environment?
S1: Water.
S2: Stream.
S3: River.
T: A stream. Is it possible for a fern to form a fossil in stream environment?
Many voices: Yeah. No, no, no. Yeah.
T: Alright. Bill says there is. Let’s hear what you have to say, Bill.
B: I just said that it can be.
T: Okay. How. How would that happen?
D: Cuz he thinks he knows everything.
T: David. Talk to Bill. I think you have a potential valid argument here but I want to hear it so we can make our... so we can judge.
D: That’s cool.
B: Well like, ferns grow everywhere, and if it lives near a river it could fall in...
T: Do ferns grow everywhere?
S4: No, not deserts.
T: Where do they typically grow? What do they need to grow?
Ss: Water.
T: Water. Would a fern growing next to a river make sense?
Ss: Yeah.
T: Do you think over the course of millions of years that one fern could end up preserved in a river environment?
B: Yeah. Fern plants could.
T: So since you guys over here are judging the fossils, Andy, do you accept his answer for why there’s a fern there?
A: Yeah.
T: I would agree. I think that’s an acceptable answer. It doesn’t always have to be the way it pans out on the image here. Any other thing that you see? What about, Allen, did they put the earthquake at the right point.
A: No. They’re a little off.
T: How were they a little bit off?
A: They went, like, really long.
T: Could you be more specific. How did they go a little long?
A: She got excited. (Referring to the fact that she shook the Wii Remote hard for almost 10 seconds, causing multiple faulting events.)
S1: I told you to stop.
S2: It’s hard to do it right.
S1: Have an aneurism why don’t you.
T: Okay. Allow me to just work it out. I’m mediating. I’m backing you guys up, okay? So Allen, the important thing is, did it come at the right time?
A: Yeah. It was just too long.
T: Okay. That’s more important. Maybe when you use the Wii controller sometimes it’s hard to know when to stop.
S: Yeah.
T: Do you think that this is acceptable the way that they did it.
S: Yeah.
T: I would agree with you as well. So were there any points taken off for any decisions that were made in creating this geologic cross section?
S: No. Not really. It was all good.
T: It was all good. Alright, awesome.

These two transcripts demonstrate two important trends. First, there is a marked difference in the nature of the reflective discussion between the two days. The discussion in day 1 is exclusively led by the teacher as he prompts students to respond to direct questions. By day 3, while the teacher still moderates the discussion, he is able to steer the more free-flowing conversation in a way that encourages students to directly engage one another. Second, owing to the open-ended nature of the build process, students are by day 3 considering alternative solutions and deviations in the outcomes. They discuss the viability of different solutions and consider allowable tolerances. This shows that they are thinking of evolution as a complex process that can have multiple “acceptable” outcomes so long as those outcomes align with their underlying conceptual models.

To assess individual students’ content learning gains, we collaborated with the classroom teacher to create a ten-item pencil-and-paper test of students’ knowledge of earth science topics relating to geologic evolution. Each test item included a multiple-choice concept question followed by an open-format question asking students to articulate an explanation for their answer. The content for this test was drawn from topics covered during a typical geologic evolution curriculum and aligned with state and federal science standards. All test concepts were covered in the teacher’s classroom using traditional instructional methods in the weeks leading up to the experiment. As such, at the time of the pretest, students had studied (and learned) all of the test material to the full extent that would be typically expected. To be clear, the three-day teaching experiment did not introduce any new concepts but rather only reinforced and reviewed previously studied topics. This concept test was administered one day before and then one day after the SMALLab treatment. Every student in our partner teacher’s earth science classes participated in the teaching experiment, and thus we were not able to administer the test to a control group.

Table 1 shows the pre- and posttest scores for the seventy-two participating students. The summary is divided into two categories: the multiple-choice items and the corresponding open-answer explanation items. Open-answer questions were rated on a 0–2 scale, where a score of 0 indicates a blank or nonsensical response, a score of 1 indicates a meaningful explanation that is incorrect or only partially accurate, and a score of 2 indicates a well-formed and accurate explanation. We computed a percentage increase and the Hake gain for each category. A Hake gain is the actual percent gain divided by the maximum possible gain [81]. Participating students achieved a 22.6% overall percent increase in their multiple-choice question scores, a 48% Hake gain. They achieved a 40.4% overall percent increase in their explanation scores, a 23.5% Hake gain. These results reveal that nearly all students made significant conceptual gains as measured by their ability to accurately respond to standardized-type test items and articulate their reasoning.
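
To make the two summary statistics concrete, the following minimal Python sketch computes a percent increase and a Hake gain from mean pretest and posttest scores expressed as percentages, and converts a set of 0–2 rubric ratings into a percentage. The function names and all input values are illustrative assumptions; the numbers are not the Table 1 results, and the sketch is not the analysis code used in the study.

```python
def percent_increase(pre: float, post: float) -> float:
    """Relative increase of the posttest mean over the pretest mean, in percent."""
    if pre <= 0:
        raise ValueError("pretest score must be positive")
    return 100.0 * (post - pre) / pre


def hake_gain(pre: float, post: float, max_score: float = 100.0) -> float:
    """Hake gain: actual gain divided by the maximum possible gain (a fraction)."""
    if pre >= max_score:
        raise ValueError("pretest score leaves no room for gain")
    return (post - pre) / (max_score - pre)


def rubric_percent(ratings: list[int], max_per_item: int = 2) -> float:
    """Convert 0-2 rubric ratings for a set of explanation items to a percentage."""
    return 100.0 * sum(ratings) / (max_per_item * len(ratings))


# Hypothetical inputs for illustration only; NOT the values reported in Table 1.
pre_mean, post_mean = 50.0, 75.0
print(percent_increase(pre_mean, post_mean))           # 50.0
print(hake_gain(pre_mean, post_mean))                  # 0.5
print(rubric_percent([2, 1, 0, 2, 1, 1, 2, 0, 1, 2]))  # 60.0
```

Because the Hake gain normalizes by the room left for improvement, the same percent increase yields a larger Hake gain when the pretest score is already high. This is consistent with the multiple-choice items showing a smaller percent increase but a larger Hake gain than the explanation items.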

We also observed that the student-centered, play-based nature of the learning experience had a positive impact on students. All participants were part of the school’s CORE program for at-risk students. While many of these students are placed in the program due to low academic performance, after one year of observation, we see that this is often not due to a lack of ability, but rather to a lack of motivation to participate in the traditional culture of schooling. During our three-day treatment we observed high motivation from students. Many students who might otherwise disengage from or even disrupt the learning process emerged as vocal leaders in this context. These students appeared intrinsically motivated to participate in the learning activity and displayed a sense of ownership for the learning process that grew with each day of the treatment. As evidence of the motivating impact of play, we informally observed a group of students from outside the teacher’s regular classes. These students previously spoke with their peers about their in-class experience and subsequently visited SMALLab during their lunch hour to “play” in the environment. For nearly a full class period these students composed layer-cake structures, working together, unsupervised by any teacher.

5. Conclusions

We have presented theoretical research from HCI and Education that reveals a convergence of trends focused on embodiment, multimodality, and composition. While we have presented several examples of prior research that demonstrates the efficacy of learning in environments that align work in HCI and Education, there are few examples of large-scale projects that synthesize all three of these elements. We have presented our own efforts in this regard, using the integration of these three themes as a theoretical and technological framework that is informed by broad definitions of play. Our work includes the development of a new mixed-reality platform for learning that has been pilot tested and evaluated through diverse pedagogical programs, focused user studies, and perception/action experiments. We presented a recent high school earth science program that illustrates the application of our three-part theoretical framework in our mixed-reality environment. This study was undertaken with two primary goals: (1) to advance students’ knowledge of earth science content relating to geologic evolution, and (2) to evaluate our theoretical framework and validate SMALLab as a platform for mixed-reality learning in a formal classroom learning environment. Participating students demonstrated significant learning gains after only a three-day treatment and exhibited strong motivation for learning as a result of the integration of play in the scenario. This success demonstrates the feasibility of mixed-reality learning design and implementation in a mainstream formal school-based learning environment. Our preliminary conclusions suggest that there is great promise for the convergent themes of applied HCI and Educational research that are manifest in the SMALLab learning platform and our three-part theoretical base.

6. Future Work

We are currently working to increase the scope and scale of the SMALLab platform and learning programs. With regard to the technological infrastructure, we are actively pursuing augmented sensing and feedback mechanisms to extend the system. This research includes an integrated framework for robotics, outfitting the tracked glowballs with sensors and wireless transmission capabilities, and integrating an active RFID system that will allow us to track participant locations in the space. We are extending the current multimodal archive to include real-time audio and video data that is interleaved with control data generated by the existing sensing and feedback structures.

With regard to learning programs, we continue our collaboration with faculty and students at a regional high school. We are currently collecting data that will allow us to evaluate the long-term impact of SMALLab learning that is correlated across multiple content areas, grades, and instructional paradigms. Concurrently we are developing a set of computationally based evaluation tools that will identify gains in terms of successful SMALLab learning strategies and the attainment of specific performance objectives. These tools will be applied to inform the design of SMALLab programs, support student-centered reflection, and communicate to the larger HCI and Education communities our successes and failures in this research.


The authors gratefully acknowledge that this document is supported by the National Science Foundation CISE Infrastructure grant under Grant no. 0403428 and IGERT Grant no. 0504647. They extend their gratitude to the students, teachers, and staff of the Herberger College for Kids, Herrera Elementary School, Whittier Elementary School, Metropolitan Arts High School, Coronado High School, and ASU Art Museum for their commitment to exploration and learning.