Abstract

The effect of any test or assessment on teaching and learning is termed “washback.” Empirical studies in this area are relatively recent, starting with the remarkable work of Alderson and Wall in 1993. Studies conducted thereafter inquired into different aspects of washback. In light of these studies, this critical review explores the stakeholders of washback and the factors outside the test itself that can affect its impact. It indicates that although a test’s washback remains at the core of the complex connection among classroom teaching, learning, and assessment, a test alone cannot direct classroom teaching and learning; rather, its effect is mediated by other agents, e.g., teachers, students, and the contexts shaped by them. Notwithstanding that, teachers are the ones who can take a leading role in generating positive washback of target tests. Finally, this article draws suggestions from the literature on what teachers should do to secure positive washback.

1. Introduction

Academic tests, especially standardized tests, influence teaching and learning. Researchers term this phenomenon washback [1, 2]. It is a relatively recent yet complex phenomenon explored in the areas of language and general education [1, 3]. Before the 1990s, the word “washback” could not be traced in the academic world [4]. With the remarkable work of Alderson and Wall [5] and the rise and prevalence of external standardized tests worldwide [3, 6], washback as a research topic has received increasing attention. Hence, several empirical studies on language testing have been carried out since the late 1980s.

The reason behind this increased attention is that external standardized tests are used for assessing the achievement of learners, instructors, and schools [7] and for employment purposes [3]. Hence, promoting and ensuring the positive effects of a test is of utmost importance to its stakeholders. However, in an environment where standardized tests are given growing importance in assessing students’, teachers’, and schools’ academic achievements, there is a risk that teachers, unaware of the extent to which tests govern their teaching, start teaching to the test [8], engage learners only in test-oriented tasks and activities, and eventually fail to achieve the curricular objectives [9]. Moreover, washback goes beyond the simple cause-and-effect hypothesis shown by Alderson and Wall [5]. Therefore, it is crucial that teachers and test administrators are made aware of the presence of test washback and are familiarized with the diverse stakeholders of washback and the different intervening agents and factors beyond the test itself.

2. Methods

To examine the stakeholders of washback and the intervening agents and factors, notable washback studies carried out from 1993 to 2022 were critically reviewed. The critical review helps to identify and examine the factors, and to recognize the stakeholders, that play crucial roles in shaping the effects of washback. The year 1993 was chosen because it was the year when one of the first washback studies, the ground-breaking work of Alderson and Wall, was published. Three types of research work on washback were examined: published studies (research articles, book chapters, and monographs), review papers, and published doctoral dissertations.

These were aggregated from several databases. “The Digital Commons Network,” which brings together free full-text scholarly articles directly from thousands of universities throughout the world, was used in this regard. It contains a growing pool of peer-reviewed articles, working papers, book chapters, conference proceedings, dissertations, and other academic work. In addition, Scopus, Web of Science, ScienceDirect, ERIC, PsycINFO, Google Scholar, and ResearchGate were used. Keywords such as standardized high-stakes tests, washback, backwash, impact, assessment, and testing were used in the search engines to identify relevant work. Research works were selected for this study after their titles and abstracts were examined against the keywords. The selected literature was then critically reviewed to address the research objectives of this study.

3. Washback

Since a test, especially a standardized high-stakes test, has power [10], it is used as an effective tool for changing language teaching and learning approaches [11]. Such tests can produce intended and unintended consequences [12–14]. These consequences or effects that a test has on classroom teaching and learning are commonly defined as washback [5, 6]. Washback is considered positive when it encourages “good” instructional practices ([15], p. 921) and fosters the achievement of curricular objectives, whereas it is considered negative when it produces unintended consequences [16, 17]. A test may thus produce intended or unintended effects [18], and researchers differ on which of these count as washback. Researchers like Spolsky [19] count only the intended effects of a test as washback. They believe that the core purpose of a test is to control the curricula. Other researchers (e.g., [18, 20]) agree that any influence or effect, whether negative or positive, unintended or intended, that a test may have on the teaching and learning of a language is washback.

Washback is an intrinsic quality of a test, and its ramifications are demarcated by the test’s stakeholders and contextual uses [21]. Alderson and Wall [5] argued that the washback of a test forces both teachers and learners to do things that they might not do if the test were not there. In contrast, some researchers [18, 20] remarked that high-stakes tests have wider effects on education and on test-takers’ lives than on the classroom alone. To specify these broader effects, Bachman and Palmer [20] used the term “test impact,” which indicates the influences tests usually have on teachers and students as individuals or on the society or community as a whole, along with the school and the different stakeholders involved in the process. However, researchers like Andrews et al. [22] and Rahman et al. [23] made no such distinction and remarked that both narrow and broader effects and influences could be encompassed under one term, washback.

3.1. Washback Complexity

Spratt [24] claimed that washback is complex and elusive rather than a direct and automatic effect of tests. It is “an interactive multi-directional process” encompassing a continual interplay of varying degrees of complexity among the diverse washback constituents ([25], p. 2). It does not happen of its own accord; rather, it comes into existence when teachers, learners, or others engage in the process of test-taking. Other researchers also concluded that washback is not a simple and uniform phenomenon but a complex, elusive, and multidimensional one [18, 26, 27]. The study on the washback of TOEFL conducted by Alderson and Hamp-Lyons [28] concluded that hypothesizing washback as a simple thing is too naïve: its effects on classroom dynamics are far more complex than unexamined beliefs about it allow. Washback is, hence, considered an intricate phenomenon that affects different facets of teaching and learning a language, is mediated by various factors, and needs to be discussed with regard to diverse features.

3.2. Washback Stakeholders

Washback is an outcome of an interrelation between all direct and indirect stakeholders. While many washback studies [29–31] focus on teachers and learners, studies on the other parties that may affect or are affected by the test are less widely conducted [32]. These other participants comprise test developers and advisors [33–35], materials developers and publishers [11, 36], curriculum planners and teacher educators [37, 38], principals, head teachers, and other administrators [29, 39], language inspectors [40], program administrators [31], end-users [37], and parents [32, 41].

Rea-Dickins [42] pointed out five different stakeholders: teachers, students, parents, official and government agencies, and the marketplace. In their study, Saville and Hawkey [43] also indicated a range of stakeholders involved at the macro level. The stakeholders they listed are quite similar to the categories shown by Rea-Dickins [42]: the teachers, test-takers (students), test users, parents, teacher trainers and educators, test administrators, teacher educators, government agencies, funding agencies and sponsors, different exam authorities, curriculum committees, members of working parties, and the public.

3.3. Areas Influenced by Washback

As a complex and multidimensional phenomenon, the washback of a standardized test can potentially affect different aspects of teaching and learning a language. Studies [18, 44–46] show that changes in high-stakes tests can affect several aspects of teaching. Washback can affect teachers’ perceptions, attitudes, and behavior [12, 47], methods and approaches of teaching [22, 48], teaching content [2, 49], teaching materials [50, 51], allocation of class time [52–54], and the status of the target language and the uses and importance of the test [55].

In her Sri Lankan study, Wall [49], for example, found that testing affected the content of teaching but not teachers’ methods and approaches, whereas, in a study in Israel, Ferman [56] found that a newly introduced English Oral Matriculation Test had an intense washback on classroom activities, with both teachers and learners concentrating on developing speaking skills.

Recent studies [3, 23, 54] also show the effects of testing on language learners’ learning styles. Washback affects students’ test preparation approaches [57], their focus on test-related materials, activities, and tasks [57, 58], their perspectives [57, 59, 60], their beliefs, context, and educational experience [54, 61], and their scores or grades [57, 62].

Jiang and Sharpling’s [63] study is a noteworthy example of how a test affects various aspects of language learning. They studied eight Chinese graduate students entering higher education in the United Kingdom, where they were in an English-speaking environment. Through interviews with these students, the researchers explored their reflections on the interrelation between their learning strategies and changes in assessment. They found that changing the assessment approach from summative to formative and altering the learning environment affected learners’ language-learning strategies, which were linked to the form of assessment. The students changed their strategies to match the formative assessment approach when they discovered that their instructor in the English-preparatory course assessed them mostly through formative means (e.g., pair work, group work, classroom participation, and assignments in lieu of a summative test). Instead of focusing on learning discrete vocabulary and grammar items, the learners concentrated more on learning how to write assignments effectively when they found that their instructors assessed them on written assignments. The factors that intervene in the washback process are discussed in more detail below.

3.4. Factors Affecting Washback

Although there is a limited number of empirical studies on factors affecting washback [32], scholars [6, 23, 24, 44] appear to agree that factors beyond the test itself also mediate the degree and direction of washback. For instance, in her study, Wall [27] could hardly differentiate the influences of the test from the effects of other variables in the setting where the test took place. In her review paper, Spratt [24] identified several factors explored by different empirical studies and classified them into the school factor, resource factor, teacher factor, and the test/exam itself. Wall [49] and Watanabe [64] classified washback factors into microcontext factors (i.e., student and teacher), factors associated with tests (i.e., contents and methods), and macrocontext factors (existing in the entire education system) to elucidate how these multiple issues shape washback effects.

Alderson and Hamp-Lyons [28] found that the nature of the test itself could not explain the influence of TOEFL on teachers. For instance, a large class size makes it difficult to create an interactive classroom. Textbook developers may also be held accountable for not specifying how classroom teachers can fully exploit the textbook and other associated materials. Above all, teachers could be one of the most influential factors since many might be unwilling to revamp the course content and materials. Furaidah et al. [65] observed several factors contributing to washback intensity in the Indonesian school context, including teachers’ viewpoints and approaches toward teaching, students’ competence, and school quality. Saif [66], on the other hand, advised resolving covert issues unrelated to the test (such as financial, political, and ethical issues) to bring about positive washback effects.

Shih [67] pointed out three kinds of factors (i.e., contextual, test, and teacher factors) influencing the intensity of test washback. Context-related factors involve administrative issues within schools, class size, the course objectives, the time when the course is on offer, the support system for teachers in the schools or from the designers of the test, other subject teachers’ noncooperation due to class schedules, and the presence of students with heterogeneous language competence [68]. Test factors include the test’s stakes, the status of the language, the skills tested, and the additional management issues necessitated by the test [68]. Teacher factors comprise teachers’ target language competence [69], teachers’ teaching experience [64], the amount of training received [59], teachers’ beliefs about successful methods of teaching and test preparation [59, 64], experience of learners learning the target language [64], teachers’ concerns for learners’ levels of language proficiency, teachers’ familiarity with different methods of teaching [48], teachers’ perceptions of the importance of the test, teachers’ perceptions of test quality [55], teachers’ commitment to their profession, and teachers’ enthusiasm and competency to innovate [35].

3.4.1. Teacher Factors

Reviewing various empirical washback studies, researchers [24, 70] found that teachers of a second or foreign language have a vital role in shaping the forms and severity of washback and are the main agents for promoting positive washback. Cheng [6] and Watanabe [64] found that this role depends on certain teacher-related factors. These include teachers’ attitudes, perceptions, feelings, beliefs, expectations, experience, and educational qualifications, among others.

In the case of newly introduced or revised tests, these teacher-related factors either intensify teachers’ stress and diminish their confidence and enthusiasm [71] or stimulate them to welcome new methods and techniques that are more communicative and humanistic [72]. The willingness of teachers to innovate and their personalities are also found to be mediating factors of washback [28].

Teachers’ perceptions of and attitudes toward tests also influence how they design and develop their teaching materials and classroom lessons [28]. In contrast to the belief of the authorities that tests can be used as an effective way to inspire teachers to teach, tests are often considered an intrusion by teachers [73]. Hence, the effect of tests is regarded as negative for teaching and learning [22]. Turner [74], however, found that teachers hold more positive attitudes toward a test if they are involved in the process of designing it.

Cheng [11], in her washback study, found that teachers were concerned about how learners, particularly the introverted ones, would face and pass the revised test. One teacher reported that she would feel embarrassed if she failed to acquaint her students with the content and format of the revised test. While investigating the effects of a test on teachers’ perceptions and materials design, Tsagari [40] noted similar findings. During the interviews, teachers said they felt worried and discomfited while attempting to teach the entire content and materials recommended in the syllabus. Spratt [24] commented that exam materials were extensively used in classrooms, especially when the exam approached. In another study, Tsagari [40] also observed that washback becomes more intense as the test date approaches. This intensity peaks in the weeks when the test is administered and even causes students to suffer from several extreme physical and mental reactions, e.g., fear, anxiety, headache, upset stomach, and sickness.

Shohamy [10] highlighted that the effects of tests on their stakeholders need to be investigated in terms of their uses, fairness, misuses, discrimination, and biases. Cheng et al. [12], in their Hong Kong study, examined students’ and their parents’ perceptions and beliefs regarding their role in a newly introduced high-stakes test. They found a consistent association between students’ perceptions of test-focused activities and tasks and their English-proficiency levels. The researchers also found that parents perceived their role as supporting their children in achieving good grades in the tests. The study finally remarks that parents’ perceptions of the newly introduced high-stakes examination are substantially linked with their children’s perceptions of the examination [12], which ultimately influence the teacher overtly or covertly in the classroom [75].

Moreover, the school authority predominantly pressures teachers, students, and parents to modify their teaching and learning styles to suit the test [23, 76]. This may lead teachers toward what Spratt ([24], p. 24) terms “a tension between pedagogical and ethical decisions.” Consequently, teachers’ professional knowledge and standing are diminished by the demands of tests, and they come under enormous indirect pressure to raise their students’ test scores, which ultimately produces feelings of anxiety, embarrassment, guilt, shame, and anger [77].

Conversely, Gregory and Burg [78] emphasize that although high-stakes examinations produce adverse effects, they can also have positive consequences for instruction. Wall [49] and Amengual-Pizarro [79] indicated that teachers had mixed yet mostly positive attitudes toward the examination, with most teachers appearing to hold positive perceptions of the test. Amengual-Pizarro [79] concluded that the test was assumed to be useful, reliable, and essential.

Thus, high-stakes examinations exercise considerable washback effects on the perceptions, attitudes, and feelings of teachers and students [80, 81]. However, the extent of these effects on effective teaching and learning is unclear. Hence, stakeholders’ perceptions (especially teachers and students) toward the test, their test anxiety, and its effects on language teaching and learning are worthy of further investigation.

Several studies [28, 47, 82, 83] examining the washback of tests report that teaching experience is also likely to mediate test washback. Other studies [11, 55, 64] report that teachers’ experience is one of the main factors that can help washback researchers explain why washback influences some teachers but not others. Onaiba [75] notes that more experienced teachers are capable of changing and adapting their teaching techniques and methods in response to a newly introduced test.

Shohamy et al. [55] found that experienced teachers were much more thoughtful about and sensitive to standardized high-stakes testing and hence were inclined to abide by the test requirements and apply them as guidance for their instructional practices. In the same way, Lam [84] found differences between experienced and novice teachers concerning negative and positive washback. He comments that teachers with more years of experience are less likely to be affected negatively by syllabus/curriculum innovation because their long experience makes them more settled in their ways. Moreover, they are more confident and realistic in judging what is practical in their professional context.

Cheng [11], on the other hand, remarked that experienced teachers might fail to change their teaching approaches as required by a change in the testing system since, with the passage of time, two of their important characteristics (i.e., their ability and skill to change) fade.

Several washback studies indicate that, besides teachers’ teaching experience, the methodological training they received [47], their training on approaching and dealing with specific tests and test-related materials and textbooks [35, 47], their preparedness to accept pedagogical or curricular changes [11, 39], and their awareness and recognition of changes in assessment and testing [49] are also important factors influencing the washback of a test.

Wall [49] comments that tests can hardly stimulate teachers if they lack the skills that would enable them to make the necessary adjustments to a newly introduced test. Wall [49] further commented that teachers in such cases would rely more on test-related materials while teaching in the classroom, taking them from previous exam papers, exam-focused support materials, or commercially produced test-oriented books.

Wall and Alderson [35] found no evidence of washback on methodology, i.e., teaching methodology remained the same. They commented that it happened because teachers needed more training in approaching and dealing with the specific tests and test-related textbooks and materials. They concluded that the test might only affect methodology if the teachers properly understand what the test is measuring. This finding and remark are quite similar to the findings of the studies conducted by Chapman and Snyder [85] and Wall [49]. These findings lead to the belief that the extent of washback of the test is attributable partially to how teachers perceive and understand the goals of the test and their awareness of the said test.

Teachers’ educational background and academic qualifications are another important teacher-related factor that can partially explain why and how washback takes place. Watanabe [64] remarks that teachers’ educational backgrounds and academic qualifications shape the instructional practices they adopt in response to the introduction or revision of exams. Onaiba [75] advises that future teachers should major in the subject they are interested in teaching and attend quality preservice and in-service training to enhance their theoretical and practical knowledge and understanding of the intended subject areas.

3.4.2. Student Factors

A number of studies [14, 40, 86] indicated that students’ attitudes toward and perceptions of learning, teaching, and testing play a significant role in creating washback. Tsagari [14] investigated the washback of a high-stakes EFL examination in Greece in relation to test-takers’ perceptions and to materials design and its application in the classroom. The findings indicated that the test influenced students’ feelings, perceptions, attitudes, and motivation toward language learning. Xie [86] examined Chinese students’ attitudes toward and perceptions of two changes made to a national English test and the effects of these on the students’ approaches to test preparation, their management of study time, and their test performance. The study found that those who held positive attitudes toward the test were more engaged in learning activities and test preparation, and thus the test had the potential to create positive washback.

Washback studies have also found that students’ attitudes toward testing may be mixed. Tsagari [40] carried out another washback study in which the respondents were 54 teachers and 98 students from two language schools. Both teachers and learners in the study believed the newly introduced examination significantly influenced English teaching and learning. The findings from the student questionnaires showed that most students found the examination very important and useful. It positively affected teaching and learning, materials, and the teacher’s perceived attitude. The study showed mixed results on the effects of the test on students’ attitudes: 44% of them thought the test had positive or strongly positive effects, and 36% reported negative or strongly negative effects. Similar to the study conducted by Shohamy et al. [55], this study also reported that most of the students (70%) found that the test caused them anxiety. In contrast, a 4-year large-scale longitudinal study conducted in Hong Kong by Cheng [11] found that washback on students’ learning in the English subject was superficial. Furthermore, students’ attitudes toward high-stakes public examinations remained essentially unchanged.

3.4.3. Contextual Factors

Watanabe [48] underscores the significance of contextual factors in mediating the washback process. These factors fall into two groups: “micro-context factors” (i.e., school or classroom settings) and “macro-context factors” (the society where the target test is administered) ([48], p. 22). Concerning macrocontext or societal factors (e.g., parents, media), studies found that pressure on teachers from external sources can explain the effects of high-stakes tests on instructional differences, mainly when students’ results are used to measure teachers’ professional success [12, 78], or when rewards or sanctions are attached to test scores, so that teachers of high achievers are rewarded while teachers of low achievers are reprimanded [71]. Therefore, the expectations of learners and their parents are likely to influence teachers [87]. Similarly, studies [32, 41] identified parents as an indispensable factor who are also likely to intervene in the washback of a language test by encouraging their children to learn to the test.

A recent study by Tsang and Isaacs [88] examined how learning is mediated by different micro-level agents and factors (e.g., home environment, courses, classrooms, and school) in the personal space of a learner. Broad macro-context factors, such as the environmental and cultural factors of schools (e.g., learning traditions), and the number of students and amount of time allocated to test preparation classes were also identified as mediating factors in generating test effects on teaching and learning in other washback studies [24, 49, 83, 88].

Class size is one of the factors that may interact with the examination to govern its influence on classroom teaching and learning [64]. In large classes, teachers tend to engage learners in test-related activities so that they can finish the intended syllabus with less time and effort. Onaiba [75] studied the washback of a revised EFL high-stakes public examination on the classroom teaching practices of Libyan school teachers and identified class size as one of the most influential factors. The larger the class, the fewer communicative and skill-based activities students do in the classroom [80, 81].

Another important factor to consider in a washback study is the grade teachers teach. Several washback studies [75, 89, 90] found that teachers of grades that face high-stakes public examinations or tests are more likely to experience washback effects. Teachers teaching in the higher grades are more likely to stick to test-oriented instruction and practices to suit the test requirements [5, 23].

Several washback studies identify the test itself as one of the factors influencing the degree and direction of washback. Test-related factors include the purpose of the test, the status and stakes of the test, the status of the target language, the formats the test applies, test proximity [5, 55], when the test was introduced, and teachers’ familiarity with the test [22]. All of these test-related factors play a substantial role in mediating the kinds, degrees, and directions of washback [14, 24, 27].

Spratt [24] also listed educational administration, geographical factors, and political factors as macrocontext factors. The administrative factors that affect test washback are how well the messages about changes in testing and assessment are transmitted to teachers and learners and how supportive and facilitative the administration is in executing the change. The geographical factors include the availability of infrastructural facilities (e.g., electricity and transport) and the location of the schools, while the political factors include “politically motivated decisions” ([90], p. 47), among others.

Other studies found that resources can be an intervening factor affecting washback. These include the availability to classroom teachers of modified materials, exam-related materials (e.g., specifications of the exam), and the required textbooks [39, 55, 83].

Thus, washback is not a simple “testing-teaching causal relationship” ([44], p. 2) or a systematic cause–effect reaction to examinations [75, 91]; rather, it involves several factors that are likely to mediate the influence of the test on language teaching and learning and thus either promote or inhibit washback. Spratt [24], in her concluding remarks, mentions that the answer to the question of which direction of washback these mediating factors lead to “would likely be: it depends” (p. 23).

4. Conclusion and Recommendations

To sum up, washback is not necessarily an automatic consequence of a test. The review implies that the stakeholders and factors governing the washback of a test vary across washback settings and cannot be isolated. Therefore, analyzing and measuring washback warrants careful interpretation of the context in which the test is set. The different factors actively involved in creating washback and their relative strengths should be identified so that favorable conditions can be created to ensure positive washback of a test. While there are diverse agents and factors mediating washback, teachers are the ones who could play a crucial role in bringing about positive washback [24]. Hence, teachers should be supported so that they teach to the curriculum rather than to the test. They need to understand the goals and objectives of the test and what it measures, and these should be clearly articulated to the students. Furthermore, developing awareness of parents’ involvement in their children’s learning practices is also important. Additionally, students’ sociocultural context needs to be understood. In this regard, this article has critically discussed both micro and macro issues related to washback. To conclude, there may be other important measures for stimulating beneficial washback, which can be identified through further research in the area of test washback.

Disclosure

This research is an independent research project organized by a research group.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

Kh Atikur Rahman and Prodhan Mahbub Ibna Seraj contributed to designing the work, collecting and interpreting data, and drafting the work. Mohammad Rukanudddin, Mst. Sabrina Yasmin Chowdhury, and Shaila Ahmed contributed to addressing the reviewers’ comments, drafting, and writing the conclusion and recommendations anew. Mohammad Rukanudddin contributed by adding new ideas.

Acknowledgments

This paper acknowledges the doctoral dissertation of Kh Atikur Rahman, which forms the basis for the initial conceptualization and preparation of the manuscript.