Abstract

The paper presents results of a quasiexperiment where the three social classroom applications Post-It, WordCloud, and Categorizer were used in software architecture lectures. Post-It and WordCloud are applications that allow students to brainstorm or give comments related to a given topic. Categorizer is a puzzle game where the students are asked to place each of a number of terms in the correct one of two categories. The three applications are multimodal HTML5 applications that enable students to interact in a classroom using their own digital devices, while the teacher’s laptop is used to display progress and results on the large screen. The focus of this study was to evaluate how the differences between these applications and their integration into the lecture affected the students’ motivation, engagement, thinking, activity level, social interaction, creativity, enjoyment, attention, and learning. In addition, the study evaluated the usability and the technical quality of the applications. The results of the experiment show that the way such applications are integrated into a lecture highly affects the students’ attitude. The experiment also showed that the game-based application was on average better received among the students and that the students’ attitude was highly sensitive to the difficulty level of the game.

1. Introduction

The introduction of technology in classrooms has opened new ways of interacting in lectures. More and more classrooms are being equipped with smart boards or video projectors, and it is becoming common to have access to wireless networks throughout school campuses. Teachers usually have access to laptops or tablets used for teaching, and some schools even provide every student with a tablet or laptop. For schools where students are not provided with tablets or laptops, the Bring Your Own Device (BYOD) approach is an alternative. A survey from 2013 showed that more than 85 percent of 500 educational institutions in the UK and the US allowed some form of BYOD [1]. The survey also showed that the devices were increasingly being integrated into the classroom and learning experience. Introduction of technology into the classroom can provide many benefits if it is done correctly. Success depends on the teacher’s knowledge, skills, and motivation, what applications are being used, the managerial and technical infrastructure and support, and how the applications are implemented and integrated into lectures. Further, users’ perception factors such as environmental characteristics, environmental satisfaction, collaboration activities, learners’ characteristics, and environmental acceptance must be taken into account [2]. One example of successful use of classroom technology is student response systems. Student response systems have been found beneficial for both students and teachers in terms of improved student performance on exams and creating a more positive and active atmosphere in classrooms [3]. Similarly, research has shown that introduction of games in the classroom can provide positive results. Games have been found to be beneficial for academic achievement, motivation, and classroom dynamics in K-12 [4] as well as for higher education [5]. In recent years, a new kind of tool has been adopted in academia, which can be used beyond the classroom. Social software such as blogs, wikis, voice-over-IP, and social networking tools is used to a larger degree for learning and communication [6]. Using software for collaborative learning is not new; one example is the use of virtual classrooms to provide collaborative learning [7, 8].

The introduction of BYOD together with the needed technological infrastructure has now made it possible to enhance the lectures themselves through technology. This paper presents the results of a quasiexperiment where three social classroom applications were tested. The first two applications provide support for brainstorming or collecting student responses using virtual post-it notes and visualization of keywords in word clouds, respectively. The last application is a game where the goal is to relate various terms to two defined categories. All three applications are multimodal and provide one common screen for the whole class and one individual screen for each student. The focus of the quasiexperiment was to investigate how the differences between these three applications affected the students’ motivation, engagement, thinking, activity level, social interaction, creativity, enjoyment, attention, and learning. Further, we investigated how the integration of the applications in the lecture affected the results. In addition, this paper describes the results of evaluating the usability and the technical quality of the three applications.

The rest of this paper is organized as follows. Section 2 presents the background material and methods related to the study. Section 3 presents the results, discusses the results, and discusses the validity of the quasiexperiment. Section 4 concludes the paper.

2. Materials and Methods

This section introduces related work, the applications used in this study, research goal, research questions, research methods, and the quasiexperiment.

2.1. Related Work

The three social classroom applications described in this study can be characterized as multimodal as they provide a shared large screen and screens for individual users with a touch interface. The actions performed on the individual user interface will affect the output on both the individual and the shared screen. Multimodal user interfaces have become popular commercially through the Nintendo Wii U game console that provides a game controller with a touch screen interface in addition to the ordinary TV screen. Multimodal user interfaces are used to engage audiences through various interactive applications and games. Examples of such multimodal applications are a racing game where the sensors on the cellphone were used to control a car on a large screen [9], an application where users could use the accelerometers on cellphones to interact with multimedia content on large public displays [10], the MobiToss application where users can share multimedia art on their cellphones by “throwing” content on large public displays using a throwing gesture [11], and the MOOSES platform used for playing various arcade games in movie theaters using a cellphone as the game controller [12]. An evaluation of the MOOSES platform found that multimodal games provide a unique user experience, that having one public shared screen improved the social experience, that the individual screen provided useful input for the user, and that it can be a bit cumbersome to control a game using a cellphone [12].

In education, the most common multiuser applications used in classrooms are student response systems. Traditionally, special-purpose devices like clickers, key-pads, handsets, or zappers were used to provide input from the students [3]. Modern student response systems use the teacher’s laptop connected to a large screen together with the students’ own devices such as smart phones, tablets, or laptops. Most of these systems are web-based and provide various ways for the students to interact. One example is Socrative where the students can use their own devices to respond through multiple choice, true or false, or short answers [13]. The results of using Socrative in teaching physics showed that the students got more involved and engaged and interacted more with fellow students, and that it helped them realize what they knew. Another social classroom application is Quizlet, which is a web-based learning tool where the students can study various topics through Flashcards, a speller application, and various spelling tests and games [14]. Learning Catalytics is another student response system, which makes it possible for students to give numerical, algebraic, textual, or graphical responses [15]. An evaluation of the game-based student response system Lecture Quiz found that the tool had a positive impact on students’ attention and learning, that it made the lectures more fun, that it did not distract from the lecture, and that it increased the chance of lecture attendance [16, 17]. Surveys on usage of student response systems have found that students using such systems were twice as likely to work on a problem presented in class [18], student attendance rose to 80–90% [19], and 88% of students either “frequently” or “always” enjoyed using these systems in class [3].

Traditionally, computers in the classroom have been met with skepticism. However, the research field of computer-supported collaborative learning takes the opposite view that development of new software and applications brings learners together and offers creative activities of intellectual exploration and social interaction [20]. Social software not developed specifically for education can be used with great success to support students and staff within and beyond the classroom. Dickinson College uses general purpose social software such as blogs, wikis, and voice-over-IP tools for educational purposes [6]. The blogs and the wikis are used to support sharing individual experiences and knowledge within and beyond the classroom including international student interaction. Voice-over-IP technology such as Skype has proven to be very useful for language exchanges between students in different countries. The use of social technology has also proved to transform the classroom in that students who may avoid live class participation are leveraging new communication forms to become more active and “vocal” in a virtual class [21]. In a study on computer-mediated collaborative learning, the empirical evaluation showed that students who used a group decision support system in support of their group learning activities perceived higher levels of skill development, learning, and interest in learning relative to students who did not use the system [22]. The study also showed that the students who used the system were more positive about the classroom experience and the group learning activities relative to the students who did not use the system. In a literature review on the role of social software tools in education, Minocha investigated the benefits to students and educators of using social software methods and tools in learning and teaching [23]. The study shows that, from the teachers’ point of view, social software allows students to participate in collaborative work with a higher learning outcome as the quality of, for example, a group report may exceed the sum of its parts. The students involved benefit from peer recognition and peer review, and social software that supports group interaction can foster a greater sense of community. Social software methods and tools were also found to encourage more active learning and increased student motivation. In Liaw et al.’s survey on instructor and learner attitudes toward e-learning, it was found that instructors have highly positive attitudes toward e-learning that included perceived self-efficacy, enjoyment, and behavioral intention of use [24]. One finding relevant to our study was that multimedia instruction, as well as an instructor-led learning environment, was found to be a critical predictor of perceived enjoyment. In a case study where 424 university students were surveyed about how they perceived the Blackboard e-learning system, it was found that e-learning effectiveness can be influenced by multimedia instruction, interactive learning activities, and e-learning system quality [25]. When designing the three social classroom applications described in this paper, our emphasis was mainly on providing applications offering multimedia instruction and interactive learning activities that would motivate and engage the students.

Integrating technology into the classroom can be a challenge and there are more challenges than mere technical issues that must be taken into account. Hew and Brush identified the following common barriers of technology integration for K-12: lack of resources (technology, access to technology, time, and technical support), lack of teachers’ knowledge and skills, institutional barriers (leadership, school time-tabling structure, and school planning), conservative subject culture, teacher attitudes and beliefs toward technology, and limited time related to assessments [26]. None of these barriers were an issue in our experiment. However, if our three social classroom applications were to be used by teachers from other institutions, these issues would have to be dealt with. N. Bitner and J. Bitner have summarized eight keys to success in integrating technology into the classroom: the teacher’s fear of change, sufficient training in the basics, the teacher’s own skills using the technology, use of teaching models that include technology as a tool, learning as the overall goal, a climate that allows teachers to experiment without fear of failure, the teacher’s motivation to endure frustration, and ongoing and on-site support [27]. The teacher in our quasiexperiment had been involved in the development of the three social classroom applications and thus some of these keys were not relevant for our study. However, two of the keys listed were relevant: the use of teaching models and that learning had to be the overall goal. Baylor and Ritchie have published a study of 94 classrooms on the factors that affect student learning in classrooms using technology [28]. The results showed that the impact of technology on higher-order thinking was predicted by the degree of teacher openness to change (positively), the amount of technology used by students individually (negatively), and the level of constructivist modes of technology use. An interesting finding was the negative impact of using a computer in isolation, which suggests the importance of collaborative work for the development of higher-order thinking skills. How transparently the technology was blended into the lesson was found to be predicted by the teacher’s openness to change and the percentage of technology activities with others. Positive results were found for integrated lessons that provide students with greater challenge in the form of research, exploration, and expression compared to automated direct instruction such as computerized drill and practice. This study affirms the importance of how social classroom applications are integrated in lectures as well as the need for these applications to support collaboration among students.

2.2. The Applications

The three applications presented in this section were developed using the Framework for Interactive GameWall Applications (FIGA) [29]. The FIGA supports development of multimodal HTML5 web applications with several views. A typical FIGA application has a student view where the student can interact and view information on his own device and a teacher view that is typically displayed on the large screen in a classroom for the whole class to see. The underlying communication infrastructure in FIGA is based on Node.js and Socket.IO.

FIGA can be used to develop any kind of web-based multimodal application. The three applications used in our study are all educational applications. Two of the applications (Post-It and WordCloud) were developed to make it easier for all the students to share their ideas and answers with the teacher and the rest of the class. The third application, Categories, is a game where students are asked to sort several keywords correctly into two categories.

2.2.1. The Post-It Application

Post-its are commonly used to brainstorm ideas and collect opinions from participants in discussions and group work. Typically, all the participants in a brainstorm or feedback process receive a stack of post-it notes and are told to write down everything they relate to a specific topic. All post-its are then collected and placed on a large white board or black board for further processing. One example of such usage is postmortem analysis of student projects [30].

The Post-It application makes it possible to carry out the same procedure using digital devices. The students are asked to write post-its digitally using their own digital device (typically a smart phone, a tablet, or a laptop), and the result is shown on the large screen using the teacher’s computer connected to a video projector. The student’s view provides a user interface to enter and submit short sentences or keywords. The teacher’s view shows all the submitted post-its. The teacher can group the post-its into different categories using different colors, delete overlapping or unwanted post-its, and sort the post-its according to category. Figure 1 shows screenshots from the Post-It application (the student view is shown to the right). The QR-bar-code in the upper right corner can be scaled up to full screen and is used to make it easier for students to access the client web interface (to get the URL).

2.2.2. The WordCloud Application

WordCloud is another brainstorming application which is similar to Post-It. WordCloud is a collaborative tool used to map associations of various subjects. A word cloud or tag cloud is a visual representation of text data where the most frequently mentioned words become more prominent than other words. The keywords are usually distinguished by font size and color, based on how often they are mentioned relative to the other words. The WordCloud application makes it possible to dynamically create word clouds on the large screen in a classroom, as the students enter various words based on input from a teacher. The main difference between the WordCloud and the Post-It application is what the students can input and the way the information is displayed on the shared screen. The Post-It application makes it possible for the students to submit short sentences, while WordCloud processes single-word inputs. Thus, the brainstorming or association process using WordCloud is more quantitatively oriented and targeted at finding the pulse of the class in terms of common keywords. This makes WordCloud a useful tool to get the students’ associations related to a theme or a statement. Figure 2 shows screenshots from the WordCloud application (the student view is shown to the right).
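
The frequency-to-prominence mapping described above can be illustrated with a minimal sketch. It is not taken from the WordCloud application itself; the function name `font_sizes`, the linear scaling, and the size limits of 12 and 48 are assumptions chosen for illustration only.

```python
from collections import Counter


def font_sizes(words, min_size=12, max_size=48):
    """Map each submitted word to a font size based on its relative frequency.

    Hypothetical helper: the linear scaling and the size limits are
    illustrative assumptions, not the behavior of the actual application.
    """
    # Normalize submissions so that "Cloud" and "cloud " count as the same word.
    counts = Counter(w.strip().lower() for w in words if w.strip())
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    sizes = {}
    for word, count in counts.items():
        if high == low:
            # All words are equally frequent, so none should stand out.
            sizes[word] = max_size
        else:
            # Linear interpolation between the smallest and largest size.
            share = (count - low) / (high - low)
            sizes[word] = round(min_size + share * (max_size - min_size))
    return sizes


submitted = ["cloud", "Cloud", "test", "cloud", "api", "test"]
print(font_sizes(submitted))
```

With this scaling, the most frequently submitted word gets the largest size and the least frequent word the smallest, which mirrors the relative prominence described for word clouds.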

2.2.3. The Categories Application

The Categories application is different from the two other applications in that it is a game where the goal is to place a series of terms in the correct category. Before the lecture, the teacher must come up with a set of terms that can be sorted into two different categories and that are relevant to what is being taught in the lecture. Figure 3 shows screenshots from the Categories application. To the left in the figure, the student view is shown. The screen shown to the right is displayed on the large screen showing the progress of students completing the game. When the student is playing the game, he will see two categories represented in two different colors (green and blue). He must drag each term into one of the two categories. When the student has placed all terms in the correct categories, the puzzle has been solved, and a new set of terms must be categorized until all the puzzles have been solved. The goal is to solve all the puzzles as fast as possible.
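
The rule that a puzzle is solved once every term sits in its correct category can be sketched as follows. This is an illustrative sketch only: the function names and the example terms are hypothetical and are not taken from the Categories application or from the lectures.

```python
def misplaced_terms(solution, placement):
    """Return the terms that are not yet placed in their correct category.

    `solution` maps each term to its correct category and `placement` maps
    the terms a student has dragged so far to the chosen category. Both
    structures are assumptions made for this sketch.
    """
    return sorted(
        term
        for term, category in solution.items()
        if placement.get(term) != category
    )


def is_solved(solution, placement):
    """A puzzle is solved when no term is missing or misplaced."""
    return not misplaced_terms(solution, placement)


# Hypothetical puzzle with a green and a blue category.
solution = {
    "availability": "green",
    "modifiability": "green",
    "layered": "blue",
    "pipe-and-filter": "blue",
}
placement = {"availability": "green", "layered": "green"}

print(misplaced_terms(solution, placement))
print(is_solved(solution, solution))
```

A game loop built on this check would present the next set of terms whenever `is_solved` returns true, until all puzzles have been completed.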

2.3. Research Questions and Methods

The aim of the study in this paper was to investigate how various types of interactive classroom applications affect the students’ attitude and learning. In addition we wanted to investigate how the way these applications were used in the lecture affected the outcome. The research method used is based on the goal question metrics (GQM) approach where we first define a research goal (conceptual level), then define a set of research questions (operational level), and finally describe a set of metrics to answer the defined research questions (quantitative level) [31]. In our case, the metrics used to give answers to the research questions were a mixture of quantitative and qualitative data.

2.3.1. Research Goal and Research Questions

The research goal of this study was defined as follows using the GQM template.

The purpose of this study was to evaluate the impact of choice of interactive classroom application and how the application was integrated in a lecture on motivation, engagement, thinking, activity level, social interaction, creativity, enjoyment, attention, and learning from the point of view of a student in the context of a lecture.

The following research questions (RQs) were defined by decomposing the research goal.

(i) RQ1: how does the choice of interactive classroom application affect the students’ motivation, engagement, thinking, activity level, social interaction, creativity, enjoyment, attention, and learning?
(ii) RQ2: how does the way interactive classroom applications are integrated in the lecture affect the students’ motivation, engagement, thinking, activity level, social interaction, creativity, enjoyment, attention, and learning?

In addition to these two research questions, the usability and the technical quality of the social classroom applications were evaluated.

2.3.2. Data Sources and Metrics

To be able to provide answers to the research questions defined in the previous section, two data sources were chosen: observation by the research team and a questionnaire to be filled out by the students after the lecture. The questionnaire consisted of four parts. In the first part, the students had to fill out general information such as age, gender, educational program, mobile device, web browser, and connectivity used. The second part of the questionnaire was a System Usability Scale (SUS) form to assess the usability of an application [32]. The SUS form contains ten questions that can be answered on a scale from 1 to 5, where 1 represents strongly disagree and 5 represents strongly agree. Every other question is formulated negatively, making the user reflect upon the questions more carefully. The answers are combined into a single SUS score from 0 to 100, where a higher score indicates better usability.
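
As an illustration of how the ten answers are turned into a score between 0 and 100, the sketch below applies the standard SUS scoring rule: positively worded (odd-numbered) items contribute the answer minus 1, negatively worded (even-numbered) items contribute 5 minus the answer, and the sum is multiplied by 2.5. The function name `sus_score` is our own, and the sketch assumes that the form follows the standard item order.

```python
def sus_score(responses):
    """Compute a System Usability Scale score from ten answers on a 1-5 scale.

    Assumes the standard SUS item order, where items 1, 3, 5, 7, and 9 are
    positively worded and items 2, 4, 6, 8, and 10 are negatively worded.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten responses")
    if any(answer not in (1, 2, 3, 4, 5) for answer in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for index, answer in enumerate(responses):
        if index % 2 == 0:
            # Positively worded item: strong agreement gives the most points.
            total += answer - 1
        else:
            # Negatively worded item: strong disagreement gives the most points.
            total += 5 - answer
    # Each item contributes 0-4 points, so the sum is 0-40; scale it to 0-100.
    return total * 2.5


print(sus_score([3] * 10))  # a respondent answering "neutral" throughout
```

A respondent who answers 3 on every item thus ends up at the midpoint of the scale, while full agreement with the positive items and full disagreement with the negative items gives the maximum score.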

The third part of the questionnaire was a form to assess some technical considerations regarding the application. The form was based on technical assessment questionnaires described in [33, 34]. The technical questions were asked to map the user experience regarding the technical performance of the applications, such as latency, ease of installing, starting and setting up the application, and interoperability. The form consisted of five statements where the respondents could state whether they disagreed with, were neutral to, or agreed with the statement.

The fourth part of the questionnaire was a form consisting of eleven statements, where the students were asked to choose the one application that best fitted the description. This form is based on the EGameFlow framework, which identifies the eight EGameFlow factors concentration, goal clarity, feedback, challenge, autonomy, immersion, social interaction, and knowledge improvement [35]. Further, this form was inspired by our own previous evaluation of the learning game Lecture Quiz [17]. Table 1 shows the form used to rank the interactive learning applications according to eleven given statements (includes an illustration of how the form can be filled out).

2.3.3. The Quasiexperiment

A quasiexperiment was set up to give answers to the two research questions described in Section 2.3.1. The quasiexperiment was carried out at the Norwegian University of Science and Technology, where the applications Post-It, WordCloud, and Categories were used in two 45-minute lectures in a software architecture course. In the beginning of both lectures, the students got a short introduction to the three applications to be used, along with the agenda for the lecture. In both lectures, one application was used in the beginning of the lecture, one was used in the middle, and one was used at the end. At the end of the lecture, the students filled out the questionnaire described in the previous section. The main difference between the first and the second lecture in the experiment was how the applications were integrated into the lecture and what was being taught. Table 2 shows a summary of the set-up for the quasiexperiment and the differences between the two lectures. For lecture one, the applications were integrated in a shallow way, and the results of the Post-It and the WordCloud were only shown to the class and not discussed. Also for lecture one, the way the students were asked to use Post-It and WordCloud was formulated in more general terms, such as “Write keywords about….” In lecture one, the Categories game asked the students to categorize a set of general terms in software architecture into two categories. These general terms were not directly related to the topic of the lecture.

In lecture two, the applications were integrated into the lecture in a deeper way. The Post-It application was used to collect answers in a group exercise, while the WordCloud application was used to write down associations in the form of keywords from the teacher’s presentation of three slides. For both applications, the results collected were both shown to and discussed in the class. In lecture two the Categories game was used to summarize the lecture by asking the students to assign terms used in the lecture to one of two categories.

The set-up of the quasiexperiment was designed to analyze both how the students were affected by the use of different applications (research question one) and how the way the applications were integrated into the lecture affected the students (research question two). Descriptive statistics were used to compare the three applications and the difference between the two lectures.
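
As a hedged illustration of the descriptive statistics involved, the sketch below turns the answers to one statement of the fourth questionnaire part into the percentage of students who chose each application. The function name and the example answers are hypothetical and do not reproduce data from the study.

```python
from collections import Counter

APPLICATIONS = ("WordCloud", "Post-It", "Categories")


def choice_percentages(choices):
    """Return the percentage of respondents that picked each application.

    `choices` holds one application name per respondent for a single
    statement; this input format is an assumption made for the sketch.
    """
    counts = Counter(choices)
    unknown = set(counts) - set(APPLICATIONS)
    if unknown:
        raise ValueError(f"unknown application(s): {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        return {app: 0.0 for app in APPLICATIONS}
    # Counter returns 0 for applications nobody picked.
    return {app: round(100 * counts[app] / total, 1) for app in APPLICATIONS}


# Hypothetical answers to a single statement from four respondents.
answers = ["Categories", "Categories", "Post-It", "Categories"]
print(choice_percentages(answers))
```

Computing these percentages per statement and per lecture gives the kind of side-by-side comparison that bar charts of the three applications are built from.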

3. Results and Discussion

The subjects in the experiment were 30 graduate students, of whom 93% were male, and the average age was 24 years. The students used their own devices to interact with WordCloud, Post-It, and Categories: 78% of the students used their smart phones and the remaining 22% used their laptops.

3.1. Results of Experimental Lecture One

The first experimental lecture was held on March 12, 2013, and the topic of the lecture was “Testing and Implementation” seen from a software architectural point of view.

Figure 4 shows the results of how students evaluated the WordCloud, Post-It, and Categories applications in the first lecture. The numbers above the bars are percentages. The participants found the WordCloud application most engaging and creative. The Post-It application scored the best at being the most fun to use, making them contribute the most, and being the application they would like to use the most. The Categories application received the most decisive results: it was rated as giving the highest learning outcome, making them think the most, and being the most challenging. Both WordCloud and Post-It were used as individual exercises during the lecture. It is therefore interesting that WordCloud got a significantly higher score regarding social interaction than Post-It. It received almost as good results as Categories, where it was stressed to the participants that they could help each other if needed.

The general feedback about the use of the applications in this lecture was mixed. Some students found the applications useless and found that they were only a distraction and a waste of valuable lecturing time. Most students, however, found the applications educational and fun to use and said that the applications created variety in an otherwise monotonous lecture.

3.1.1. Observations from Using the Post-It Application in Lecture One

After the teacher had given a short introduction to the lecture as well as presenting the agenda and applications to be used in the lecture, the students were asked to write any keywords about challenges related to implementation and architecture using the Post-It application as shown in Figure 5.

There were no major technical challenges and nearly all students submitted one or a few post-its to the large screen. It was interesting that the students found the Post-It application the most fun to use as well as the application they preferred to use in lectures. The fun factor may be rooted in the fact that some students submitted post-its with humorous content that were not related to the subject. The teacher monitored the submitted content in real time and removed the nonrelated post-its as they came in. This filtering of post-its is probably better done while not connected to the large screen, to deny students the satisfaction of seeing their fun notes on the large screen. Two students even tried to hack the password to access the teacher client to take over control, but they did not succeed. This shows that security against hacking is an important issue for such lecturing tools. From the teacher’s point of view, it was useful to be able to get the prior knowledge of the students visualized and assessed before starting lecturing on the topic.

3.1.2. Observations from Using the WordCloud Application in Lecture One

The WordCloud application was launched after about 15 minutes of lecturing, where the students were asked to write any keyword about how software testing is related to software architecture (see Figure 6).

The WordCloud application scored well overall on engagement and creativity and well on social interaction and keeping the students active compared to the Post-It application. As with the Post-It application, some students also submitted nonrelated and humorous words for the WordCloud application. The WordCloud application did not have any user interface for removing these words directly, so the research team monitoring the server carried out this task. This is a major disadvantage of the WordCloud application, which makes it easy for students to sabotage the learning experience. A profanity filter can partly solve the problem, but there are always ways around such filters, and it is impossible to block all unrelated keywords in advance. It is recommended that the results of the WordCloud application not be shown live, to reduce this effect, and that an interface be added to quickly remove unwanted words before showing the word cloud to the class. For large classes, this could be a problem as there are so many incoming words to process.
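
The recommendation above, to combine a blocklist with manual approval before words reach the shared screen, can be sketched as follows. The class `ModeratedWordQueue` and its methods are hypothetical names introduced for illustration; they are not part of the WordCloud application or of FIGA.

```python
class ModeratedWordQueue:
    """Hold submitted words back until the teacher has approved them.

    Hypothetical sketch: a blocklist rejects known unwanted words at once,
    and everything else waits in `pending` until approved or rejected.
    """

    def __init__(self, blocklist):
        self.blocklist = {word.lower() for word in blocklist}
        self.pending = []   # submitted, not yet shown on the large screen
        self.approved = []  # safe to include in the displayed word cloud

    def submit(self, word):
        """Queue a word for review; return False if it is rejected outright."""
        cleaned = word.strip().lower()
        if not cleaned or cleaned in self.blocklist:
            return False
        self.pending.append(cleaned)
        return True

    def approve(self, word):
        """Move one pending occurrence of the word to the approved list."""
        cleaned = word.strip().lower()
        if cleaned in self.pending:
            self.pending.remove(cleaned)
            self.approved.append(cleaned)

    def reject(self, word):
        """Drop every pending occurrence of an unrelated or unwanted word."""
        cleaned = word.strip().lower()
        self.pending = [w for w in self.pending if w != cleaned]


queue = ModeratedWordQueue(blocklist={"spam"})
queue.submit("Spam")       # blocked by the list
queue.submit(" Testing ")  # waits for review
queue.submit("pizza")      # unrelated, but not on the blocklist
queue.reject("pizza")
queue.approve("testing")
print(queue.approved)
```

As noted above, such a review step scales poorly with class size, since every pending word still requires a decision by the teacher or an assistant.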

3.1.3. Observations from Using the Categories Application in Lecture One

At the end of the lecture, the students were asked to play the Categories game where the students were asked to categorize general terms about software architecture in two different categories (see Figure 7).

The teacher was taken by surprise that the challenge he thought was easy proved to be very hard for most students. He was surprised that the students actually did not know these terms this far into the semester. The feedback of the students showed that Categories got the highest scores on highest learning, made me think, social interaction (they could ask a fellow student for help), most challenging (74%; see Figure 4), and being most active. For a game, one would expect it to score higher on most fun to use, most engaging, and so forth, but this was not the case. Feedback from the students gave a clear indication that the task was too difficult. This shows how important it is to have the appropriate level of difficulty when designing a game or game content [36]. One solution to the difficulty problem could be to ask groups of students to play the Categories game instead of individual students. It is far more likely that a group of students would be able to find the correct answer than single students. The students were encouraged to collaborate with neighboring students to find the answer, but few students embraced this opportunity.

3.2. Results of Experimental Lecture Two

The second experimental lecture was held on April 11, 2013, and the topic of the lecture was “Architectures for the Cloud.” Figure 8 shows the results of how students evaluated the WordCloud, Post-It, and the Categories applications in the second lecture. In the second experimental lecture, the use of applications was more carefully planned to ensure that they were deeply integrated into what was being taught.

Unlike in the first lecture, the Categories game application turned out to be superior to the Post-It and WordCloud applications in almost all categories. The Categories application was outperformed in only two categories: most creativity, with 0%, and social interaction, with 22%. Figure 8 shows a marked difference between the Categories application and the others. If we consider only the Post-It and WordCloud applications, it is hard to find a clear pattern apart from the fact that WordCloud was considered more creative and Post-It was rated higher on social interaction.

3.2.1. Observations from Using the Post-It Application in Lecture Two

In experimental lecture two, the Post-It application was used to collect the results of a group exercise in which the students were asked to submit the five most important issues related to economy and cloud computing. Some of the results of this group exercise can be seen in Figure 9. Although the Post-It application did not get a very high rating compared to the two other applications apart from social interaction, the students were pleased with the usefulness of the tool for collecting the results of group tasks in this way. Normally the results of groups are collected orally, which takes a lot of time and makes some students uncomfortable, as they have to present the group results in front of the whole class. This time there was no problem with unrelated and humorous submissions, probably because the results came from groups and not from individuals. From a teacher perspective, the Post-It application worked very well as a tool for collecting results of a group task. One challenge found while using the Post-It application was the limit on the number of characters the students could enter in one virtual post-it note. Some of the submitted post-its were unclear and not well defined, and it was not possible to get the students who submitted these notes to explain them in more detail when they were asked to do so.

3.2.2. Observations from Using the WordCloud Application in Lecture Two

For the second lecture, the WordCloud application was used in a very different way compared to the first lecture. The students were asked to submit any associations (keywords) they got from the teacher’s lecturing of three slides on “cloud definition and service models.” After the lecturing of the three slides was completed, the resulting word cloud was shown on the large screen as presented in Figure 10.

This time there was no major problem with students entering obscure or unrelated keywords. The resulting word cloud gave a good summary of what had been taught. Although the students were told to write keywords in English, some students entered keywords in Norwegian. The word cloud was shared on the learning management system after the lecture. The only area in which the WordCloud application scored high in the evaluation was creativity. Another noticeable result was that WordCloud scored much lower than the two other applications on highest learning (9%), made me think (9%), social interaction (10%), most challenging (3%), and most active (3%). From the teacher’s perspective, the WordCloud application got a good response and made students pay better attention to what was taught during the lecturing of the three slides. The students seemed to be more engaged and submitted a lot of words to the WordCloud application.

3.2.3. Observations from Using the Categories Application in Lecture Two

Unlike in the first lecture, in the second lecture the terms the students had to categorize were related to what was being taught in the lecture (see Figure 11). This time the students were very positive toward the Categories game. The results in Figure 8 show that the Categories application was by far the preferred application in the areas most fun to use (43%), most engaging (60%), highest learning (68%), should be used (45%), made me think (63%), most challenging (83%), most active (64%), held my attention (67%), and made me contribute (50%). The only two areas in which the Categories application was beaten were most creativity (0%) and social interaction (22%). The teacher’s feedback was that Categories made the students very concentrated, and fewer students gave up the task of sorting the terms compared to the first lecture.

3.3. Usability and Technical Results

The usability and technical forms targeted the FIGA platform and not the individual applications. Table 3 shows the System Usability Scale (SUS) results for the Post-It, WordCloud, and Categories applications from both lectures.

The SUS score of 74 indicates that these applications are easy to use but have room for improvement. The questions that contributed most positively to the usability score were questions 4 and 10, which is natural as all the applications are simple and do not require much to understand. The questions where the users’ responses were most neutral were questions 1 and 5, related to how often they wanted to use the applications and how well the system was integrated.
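For readers unfamiliar with how a SUS score is derived, the sketch below applies the standard SUS scoring rule: ten items answered on a 1 to 5 scale, where odd-numbered items contribute (response − 1), even-numbered (negatively worded) items contribute (5 − response), and the sum is multiplied by 2.5 to give a score from 0 to 100. The function names are ours, and the study's raw responses are not reproduced here, so any input data is hypothetical.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one respondent's ten answers (1-5 each)."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for item, response in enumerate(responses, start=1):
        if item % 2 == 1:
            # Odd items are positively worded: higher agreement is better.
            total += response - 1
        else:
            # Even items are negatively worded: lower agreement is better.
            total += 5 - response
    return total * 2.5


def mean_sus(all_responses):
    """Average the per-respondent SUS scores (hypothetical helper)."""
    scores = [sus_score(r) for r in all_responses]
    return sum(scores) / len(scores)
```

Under this rule, questions 4 and 10 contribute positively when students disagree with them, which is consistent with the observation that the applications do not require much to understand.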

The results of the technical evaluation of the FIGA platform from both lectures are shown in Table 4. In general, the students did not have any major technical issues accessing or running the applications, the responsiveness of the system was acceptable, and they were very happy not having to install the applications (91% agreed). We acknowledge that there is room for improving the overall consistency of the technical quality of the platform and the applications. We also noticed that the students who reported unresponsiveness and latency were connected to the applications not through the wireless network in the lecture hall but through a telecom provider’s 3G network.

3.4. Discussion

This section discusses the results presented in the previous section as well as some threats to validity.

3.4.1. Discussion of the Results

We would now like to revisit the research questions defined in Section 2.3.1, where we asked how the choice of interactive classroom applications affects the students’ attitude (RQ1), as well as how their integration in the lecture affects the students’ attitude (RQ2). The differences between the results in experimental lectures one and two clearly show that the way such applications are used and integrated in a lecture is highly important. For instance, in lecture one Post-It was voted the most fun to use, as opposed to Categories in lecture two. Similarly, WordCloud was voted the most engaging in lecture one, as opposed to Categories in lecture two.

Figure 12 shows the results of the evaluation of the applications summarized for both experimental lectures one and two (the numbers are percentages of votes). If we look at the overall picture for both lectures, the Categories application was ranked number one in most areas. Categories got the most votes in both lectures in the areas of highest learning, made me think, most challenging, most active, and held my attention.

If we compare only WordCloud and Post-It for both lectures, the Post-It application seems to be ranked better in more areas than WordCloud. The main exception where there is a noticeable difference is creativity, where WordCloud got the most votes. If we compare the results of lecture one and lecture two, Post-It performed much better in lecture two. This seems to indicate that the way the applications were used is very important. Using Post-It to collect results of a group exercise seems to be an appropriate use of the application. For the second lecture, Post-It was ranked higher than WordCloud by the students in most areas apart from most fun to use, most engaging, and most creativity.

The difference in the feedback on Categories between lecture one and lecture two gives a clear indication of how much the difficulty level of a game can affect the students’ attitude. If the challenge is too hard, many students get frustrated and might give up (which was the case in lecture one). Similarly, it is important to avoid challenges that are too easy, as the students can get bored. Boredom in computer learning environments has been shown to be associated with poorer learning and problem behavior [37]. The same study also found that frustration was less associated with poorer learning. Malone stresses the importance of adjusting the difficulty of challenges in games to the appropriate level to make things fun to learn [36].

Regarding research question number one, the game-based interactive classroom application Categories produced the most positive results in the students’ motivation, engagement, thinking, activity level, enjoyment, attention, and learning. For creativity, WordCloud was found to give the best results. No clear conclusion was drawn regarding social interaction.

Regarding research question number two, the way interactive classroom applications are integrated in the lecture highly affects the students’ attitude towards motivation, engagement, thinking, activity level, social interaction, creativity, enjoyment, attention, and learning. As a result, it is very important to plan carefully how applications are used in lectures. The same application can be a great success in one lecture and produce mediocre results in another. We learned that careful planning was the key to getting better results when using applications in the lecture. There was a major observed difference in how the students stayed focused and engaged in the second lecture compared to the first for all three applications. The main difference between the two was that lecture two was carefully designed to utilize the strengths of the applications and to make use of the results collected from the tools. For example, we found that using Post-It to collect results of a group exercise was highly effective. It kept the students motivated and engaged as their group results were shown on the large screen, group discipline prevented the entry of unrelated or humorous post-its, little time was wasted on collecting student results, and it was easy for the teacher to group the results and discuss them with the students. Similarly, when the students were told to submit associations using the WordCloud application during the lecturing of three PowerPoint slides, the students paid more attention to what the teacher said and the engagement was much higher than in a normal lecture using slides. At the beginning of the second experimental lecture, the students were told that at the end of the lecture they should use what they had learned in the Categorizer game. The teacher sensed a higher focus and engagement during the whole lecture, and when the students got to play the Categorizer game at the end of the lecture they were very focused and engaged. The time it took students to complete the game also showed that they had learned the most basic terms from the lecture and were able to apply the knowledge efficiently.

3.4.2. Threats to Validity

This section addresses the most important threats to the validity of the results of the quasiexperiment described in this paper. The study must be regarded as a quasiexperiment and not a controlled experiment, as there are too many uncontrolled parameters. There are mainly three kinds of validity that must be discussed: internal, construct, and external.

The internal validity of an experiment is concerned with “the validity of inferences about whether observed covariation between A (the presumed treatment) and B (the presumed outcome) reflects a causal relationship from A to B as those variables were manipulated or measured” [38]. If changes in B have causes other than the manipulation of A, there is a threat to the internal validity. The first internal threat is that the sample of subjects in the experiment was not randomized. The students who participated in the experiment were software architecture students, mainly in their 3rd or 4th year at the university. The majority of the students were male. Another factor was that the students who participated in the experiment made up about 1/3 of the students taking the course. They were not selected; they were the students who showed up for the two lectures. Another internal threat is whether there were any differences in how the applications were used and differences between the lectures. In this study we looked at two different things. First, we looked at differences between the applications. Second, we looked at the differences in how the applications were used. We acknowledge that the use of the applications differed between the lectures, but we believe they were used similarly within the same lecture. In the first lecture, none of the applications were integrated into the lecture and the results produced by the applications were not discussed. In the second lecture, all applications were well integrated with the rest of the lecture, and the results were discussed.

Construct validity is concerned with the degree to which inferences are warranted, from (1) the observed persons, settings, and cause and effect operations included in a study to (2) the constructs that these instances might represent. The question, therefore, is whether the sampling particulars of a study can be defended as measures of general constructs [38]. Our approach for the quasiexperiment was to evaluate the use of three applications. We planned to use all applications the way they should be used, but not well integrated in one lecture and well integrated in the other. To measure the differences, we asked the students in a form to pick the best application in eleven different areas. Descriptive statistics were used to compare the results. To support the statistics, observations and textual comments from the students were used for the analysis. The form we used to evaluate the three applications was constructed based on the EGameFlow framework [35] and forms used to evaluate the learning game Lecture Quiz [34]. The focus of the study was on comparing the three applications WordCloud, Post-It, and Categories against eleven different criteria. Our approach was to have the students select the application that performed best in each category. An alternative approach would have been to design an evaluation form that rated each application against the eleven criteria and then to compare the results. The former approach was chosen as it was perceived as being easier to understand and faster to fill out for the students. The time aspect was important, as the quasiexperiment took place in actual lectures with limited time to fill out a questionnaire.
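The descriptive statistics described above amount to computing, for each criterion, the share of votes each application received. The sketch below illustrates that tally under our own assumptions: the ballot format (one dictionary per student mapping a criterion to the chosen application) and the function name `vote_shares` are hypothetical and do not reproduce the study's actual form or data.

```python
from collections import Counter

# The three applications compared in the study.
APPS = ("Post-It", "WordCloud", "Categories")


def vote_shares(ballots, criterion):
    """Return each application's percentage of valid votes for one criterion.

    `ballots` is a list of dicts mapping criterion name to the chosen
    application (assumed format); blank or invalid answers are ignored.
    """
    counts = Counter(
        ballot[criterion] for ballot in ballots if ballot.get(criterion) in APPS
    )
    total = sum(counts.values())
    if total == 0:
        # No valid votes for this criterion.
        return {app: 0.0 for app in APPS}
    return {app: 100.0 * counts[app] / total for app in APPS}
```

Because each student picks a single winner per criterion, the shares for a criterion sum to 100%, so a high score for one application necessarily lowers the scores of the other two. This is a property of the forced-choice design discussed above, not of the applications themselves.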

The issue of external validity is concerned with whether a causal relationship holds (1) for variations in persons, settings, treatments, and outcomes that were in the experiment and (2) for persons, settings, treatments, and outcomes that were not in the experiment [38]. It is hard to predict whether the results of this study can be generalized to other contexts. This study is highly sensitive to the applications used in the study. It cannot be claimed that the same results will hold for any interactive classroom application. However, the results should be applicable to applications used to brainstorm or share keywords or short sentences with similar characteristics to Post-It and WordCloud, or to games where the goal is to group terms in a social setting. The results on how applications are integrated with a lecture should be generalizable. As long as the classroom applications do not have major technical or usability issues, the way they are integrated with a lecture is likely to have a major impact on the students’ attitude.

4. Conclusions

In this paper we have evaluated the effect of three social classroom applications. All three are HTML5-based, offering a private user interface to the students on their own devices and a public common user interface, shared by the whole class, that runs on the teacher’s laptop and is displayed on a large screen. Post-It and WordCloud share very similar characteristics. The students enter keywords or short sentences that are visualized on the large screen as either post-it notes or a word cloud. The Categorizer application is a game where the students are asked to drag and drop terms into the correct category. The progress of the class is shown on the large screen.

Our study showed that the game-based application got the best general reception of the three. However, the results also showed that the students’ attitude was highly sensitive to the difficulty level of the game. Post-It and WordCloud performed to a large extent in a similar way and were also highly sensitive to how the applications were used. Our quasiexperiment showed clearly that it is very important to plan the use of applications in lectures carefully to get the best results. When such use of applications is well planned and well integrated, it will boost engagement, learning, creativity, focus, attention, and social interaction. We also found that the Post-It application was very useful and handy for collecting and discussing results of group exercises. Finally, we learned that the limitation of only allowing the students to enter single keywords in the WordCloud application was perceived as stimulating the students’ creativity.

We believe that there is a bright future for social classroom applications. This is an area that needs to be explored, and more experiments are needed to evaluate the applications themselves as well as how they are used.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors would like to thank the students from TDT4240 Software Architecture Class 2013. They would also like to thank Richard Taylor and Walt Scacchi at the Institute for Software Research (ISR) at the University of California, Irvine (UCI), for providing a stimulating research environment and for hosting a visiting researcher from Norway.