Manipulatives are concrete or virtual objects (e.g., blocks and chips) often used in elementary grades to illustrate abstract mathematical concepts. We conducted a systematic review to examine the effects of interventions delivered with manipulatives on the learning of children with mathematics learning disabilities (MLD). The outcomes observed in the sample (N = 38) were learning, maintenance, and transfer in a variety of mathematical domains. Interventions using manipulatives were reported to be effective for a range of learning objectives (e.g., conceptual understanding and computational fluency), but several methodological weaknesses were observed. Analyses also highlighted considerable heterogeneity in the studies reviewed in terms of participant characteristics, intervention approaches, and methodology. We discuss overall effects of interventions with manipulatives in the MLD population, the methodological quality across the sample, and implications for practice.

1. Introduction

According to the American Psychiatric Association in the Diagnostic and Statistical Manual of Mental Disorders, DSM-5 [1], a specific learning disorder can take the form of a deficit in the acquisition of reading, writing, arithmetic, or mathematical reasoning skills during formal years of schooling. Mathematics learning disabilities (MLD) in children are defined as a disorder that interferes with mathematics learning at school and in daily life activities, and its prevalence in the K-12 population runs from 1% to 10% [28]. Because mathematics is involved in many aspects of daily life, people with MLD can be marginalized and their social and professional integration can be affected [2, 9].

Studies have revealed that MLD is manifested by difficulties mastering number sense, number facts, or calculation, as well as difficulties with mathematical reasoning, and cannot be explained by intellectual disabilities, uncorrected visual or auditory acuity, other mental or neurological disorders, psychosocial adversity, or lack of proficiency in the language of academic instruction [1]. Furthermore, a diagnosis of MLD cannot be explained by inadequate instruction, as identifying a child with MLD occurs only after targeted interventions have been shown to be ineffective [10]. Nevertheless, specific and explicit instructional interventions have been shown to be beneficial for students with MLD. Increasing the mathematics achievement of schoolchildren with MLD thus necessitates the identification of effective instructional practices.

Our work focuses on the effects of using manipulatives in mathematics instruction on children’s learning and transfer. “Manipulatives” are concrete or virtual objects and are intended to reify central concepts in the mathematics curriculum. Students and teachers can configure and manipulate the objects, whether they are concrete or virtual, in ways that reflect the ideas at the heart of a lesson. Some research has indicated that manipulatives can be effective for the development of children’s conceptual and procedural knowledge of mathematics [11–14]. For example, in their meta-analysis of the literature on typically developing (TD) children, Carbonneau et al. [11] indicated that using concrete manipulatives in mathematics instruction produces a small-to-medium-sized effect on student learning when compared to instruction with no concrete materials. Moyer-Packenham and Westenskow [13] synthesized the research reporting the effects of virtual manipulatives on student mathematics achievement and showed large, moderate, and small effects for virtual manipulatives compared to physical manipulatives and text combined. Despite these findings, recent research has revealed that the mere presence of concrete objects in instruction does not guarantee learning [15–18]. Indeed, Carbonneau et al. [11] pointed to the inconsistencies in the manipulatives-based literature and revealed that the strength of the effect is dependent on other instructional variables, such as the perceptual richness of an object [19, 20], the level of guidance offered to students during the learning process (e.g., [21]), and developmental characteristics of the learner [22].

There is some evidence to suggest that children with MLD can benefit from instruction with manipulatives (e.g., [23]), but the extent of these benefits for children who struggle is unclear [24]. Furthermore, as is the case with TD children, the conditions under which manipulatives are beneficial are not well understood. In one of the few studies addressing such conditions, Luke [25] compared adults with MLD and children with MLD and TD children on mathematical problem solving, and examined the moderating effects of manipulative type (i.e., bland vs. perceptually rich). The participants in each group solved half of the problems with bland manipulatives and the other half with perceptually rich objects. Luke found no differences between the groups on the problems solved with perceptually rich manipulatives, but the performance of the children with MLD was significantly worse than that of the other two groups on problems solved with the bland manipulatives. Thus, it appears that the conditions under which learning occurs with manipulatives are important to investigate further and that these conditions may look different for MLD and TD children. Therefore, while it is critical to identify effective instructional characteristics for all populations of children, the effects of instruction with manipulatives on children with MLD represent a gap in the literature that needs to be examined in more depth.

The objective of this review is to evaluate the impact of using manipulatives—i.e., concrete materials such as blocks or plastic chips or virtual representations of similar objects—on the mathematics learning of children with MLD. We are not only interested in the effects of interventions that involve manipulatives, but also in the instructional contexts in which they are used. This research will contribute to current understandings of how external representations in mathematics could be beneficial for learning, maintenance, and transfer in this population. Moreover, the pedagogical implications for special educators are significant, as there is at present no consensus on the most effective ways to use concrete, or virtual, representations for students with MLD.

1.1. Instructional Interventions for Children with MLD

A handful of researchers have investigated the effectiveness of interventions for children with MLD and mathematics difficulties. Methe et al. [26], for example, reviewed case studies of interventions with children who struggle with mathematics in the domain of computation. Their results revealed moderate to large effects, but their focus was not on the nature of the instructional practices themselves. In contrast, in their review of the literature, Marita and Hord [27] showed that mathematics interventions that include explicit instruction (including instruction with manipulatives), instruction based on problem solving and discussions of student strategies, visual representations, or some combination of these factors were effective for secondary students with learning disabilities. Although Marita and Hord concluded that a variety of interventions appear to be effective for students who struggle in mathematics, their review does not allow for the identification of the variables, including manipulatives, that are directly responsible for different aspects of student learning.

In another review, Jitendra et al. [28] examined the literature on instructional interventions that use visual representations (e.g., schematic drawings of part-whole and comparison word problems; see Schema-Based Instruction, Fuchs et al. [29]) and concrete representations, such as manipulatives, to teach students with MLD the structure of mathematics word problems. Jitendra et al. [28] concluded that representation, whether visual alone or visual in combination with concrete objects, is effective for problem solving accuracy, but in half of the studies reviewed, teachers and students used both manipulatives and visual models, making it difficult to determine the influence of manipulatives in isolation from other representations. Similarly, Bouck et al. [30] reviewed the studies examining the effectiveness of concrete-to-representational-to-abstract (CRA) instruction for children with learning disabilities. CRA is an instructional technique that entails presenting students with representations of mathematics concepts that move from concrete to abstract in three stages (concrete-pictorial-formal symbols). Although the authors concluded that there is sufficient evidence to support the use of CRA with children who have learning disabilities, given the nature of the instructional approach, it is impossible to determine whether the manipulatives alone had an effect on learning, or whether there are interactive effects with other aspects of the instruction. In conclusion, none of these reviews (i.e., [26–28, 31]) allows for a systematic appraisal of interventions with manipulatives or the specific effects of manipulatives for children who struggle in mathematics (see also [32]).

Bouck and Park [33] is one of the few reviews, if not the only one, to focus specifically on the effect of interventions including manipulatives, both concrete and virtual, with children who struggle in mathematics. The authors assessed 36 studies and concluded that there is a paucity of research on manipulative use in this population and that most of the existing studies are low in scientific credibility. Nevertheless, their analysis prompted them to recommend the use of the CRA instructional sequence to practitioners, primarily because of its relatively long history with special educators. Our review extends that of Bouck and Park [33] in two essential ways. First, they focused on children with disabilities (including children with intellectual disabilities or with autism spectrum disorder, for example), whereas we specifically target children with MLD. Because there are a variety of reasons for children struggling in mathematics, the wide net cast by Bouck and Park in their study selection to include children with several different types of disabilities makes it difficult to ascertain the populations for which interventions with manipulatives are effective. As such, in the present study, our focus on a more homogeneous population (i.e., the MLD population) should allow for greater precision, and applicability, in our conclusions.

The second way in which our review differs from Bouck and Park’s review is that their analysis primarily targeted students’ immediate learning, whereas our analysis also included the outcome measures of maintenance and transfer. The goal of any instructional program in mathematics goes beyond the immediate replication of the content given during instruction; ultimately, the aim is for students to achieve long-term gains as well as the capacity to transfer new knowledge to other tasks and contexts. Indeed, it is believed that the ultimate test of conceptual understanding is the ability to use it to solve a novel problem (see [34, 35]) and that flexible use of mathematics knowledge is evidenced by the ability to abstract general principles that can be applied across a number of different contexts [36].

Furthermore, it is particularly important to examine maintenance and transfer effects in the MLD population. Children with MLD have persistent deficits that are resistant to intervention [1]. Moreover, because children with MLD often present with analogical reasoning deficits [37], they may struggle to notice conceptual similarities across problems and contexts, thus preventing them from transferring their knowledge to solve novel tasks. Indeed, it is all too common for immediate effects of instructional interventions to diminish or even fade out completely within a short period of time after the interventions are completed ([38], in press), even for children who do not struggle in mathematics.

Finally, we argue that it may be short-sighted to investigate learning without also taking the interrelated constructs of maintenance and transfer into account. When children learn such that they can transfer their knowledge, they are more likely to experience lasting effects of the instruction they were provided [38]. For children with MLD, however, the difficulties they experience transferring their learning suggest they have difficulties in other areas, such as conceptual understanding and maintenance. This suggests that examining all three outcomes (i.e., learning, maintenance, and transfer) provides a more comprehensive picture of instructional effects with this population.

1.2. The Present Study

In this paper, we present a systematic review of the research examining the impact of interventions with manipulatives on the mathematics learning of children with MLD. We included studies that examined a wide variety of instructional approaches, ranging from instruction in inclusive classrooms to more targeted interventions outside the classroom for students with MLD [39] (in the remainder of the paper, we use the term “intervention” to refer to different forms of instruction with MLD students). In addition, the criteria in the DSM for diagnosing MLD have changed over time, and many studies previous to the most recent publication of the manual [1] may have classified children with MLD as having more general mathematics difficulties or vice versa. Furthermore, researchers in different fields often use a variety of criteria to classify children with mathematics difficulties. For these reasons, we broadened our database search beyond the MLD population to ensure that we selected all studies that targeted children who were experiencing difficulties in mathematics, even if they were not identified as having MLD.

These procedures resulted in a large initial sample, which we then narrowed down using the following exclusion criteria to focus the review on interventions for students with MLD. We excluded from our sample any study that focused only on children with disabilities other than MLD (i.e., intellectual disabilities, autism spectrum disorder, and emotional disorders). Some studies placed children with MLD and intellectual disabilities together in one instructional group; we excluded these studies from our review because we were unable to isolate the children with MLD from those with intellectual disabilities. For single-case studies (we use the term “single-case study” to refer to both single-case and multiple-case studies), we focused the analyses only on the participants in the samples with MLD and not on those with other disabilities.

In the present review, we aimed to extend the results of Bouck and Park [33] by (a) focusing our review of the effects of interventions with manipulatives on the MLD population and (b) examining immediate effects as well as maintenance and transfer of learning with manipulatives, regardless of instructional technique. In particular, we examined the extent to which the practices documented in the literature can be considered “evidence-based,” which reflects the notion that “empirical evidence forms the basis for determining what important features, qualities, or outcomes are associated with an intervention or prevention program” ([40], p. 3). Our review also attends to more nuanced questions regarding the conclusions that can be drawn from the studies in our sample. That is, we also attended to whether the effects of interventions with manipulatives, or more specifically the effects of the manipulatives themselves, could be interpreted as causal in nature. For this, we determined whether the appropriate controls were in place (e.g., random assignment to conditions for group studies; design and sufficient baseline data for single-case studies). We also addressed the extent to which the manipulatives themselves added benefits above and beyond other elements of the intervention. This was determined by examining the research design to establish whether appropriate comparisons were made (i.e., comparing an intervention with manipulatives to the identical intervention without).

In sum, the review addressed the following three research questions:
(1) What are the instructional contexts for the interventions with manipulatives? Specifically, what skills were targeted by the interventions, what were the characteristics of the interventions, and what types of manipulatives were used?
(2) Can interventions that include manipulatives be considered evidence-based for children with MLD in terms of immediate learning, maintenance, and transfer?
(3) Do the research designs used in the studies allow us to conclude that the manipulatives themselves added value to the interventions and do the designs allow for causal conclusions to be drawn about the interventions with manipulatives?

We will address the second and third research questions by assessing the methodological quality (i.e., [41, 42]) of the research reviewed and specific aspects of the research designs.

2. Method

In conducting the review, we used the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [43, 44], which is a set of guidelines for reporting and conducting systematic reviews of health-care interventions. The PRISMA statement consists of a four-phase procedure for conducting systematic reviews, which entails searching and selecting studies and extracting and coding data.

2.1. Search Strategy and Study Selection

We conducted the search for studies between September 15 and November 20, 2017. The selection procedure is presented in Figure 1. The first author identified studies by conducting eight separate searches of the Cochrane, PubMed, PsycInfo, and ERIC electronic databases. Each of the terms “dyscalculia,” “math learning disabilit,” “math difficult,” and “low-math” was paired with the term “manipulatives” for the first four searches and then paired with the term “concrete” for the remaining four searches. We received e-mail alerts on any subsequent related publications.
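As an illustration only, the eight pairings described above can be enumerated programmatically. The sketch below assumes that the two terms in each search were combined with a Boolean AND and wrapped in quotation marks; the text does not specify the exact query syntax, and the name `build_queries` is hypothetical.

```python
from itertools import product

# Search terms exactly as quoted in the text (truncated forms kept as written).
POPULATION_TERMS = ["dyscalculia", "math learning disabilit", "math difficult", "low-math"]
OBJECT_TERMS = ["manipulatives", "concrete"]


def build_queries(population_terms, object_terms):
    """Pair every population term with every object term.

    The object term varies slowest, so the first four queries use
    "manipulatives" and the remaining four use "concrete".
    The AND operator and the quoting are assumptions, not the authors' syntax.
    """
    return [f'"{p}" AND "{o}"' for o, p in product(object_terms, population_terms)]


queries = build_queries(POPULATION_TERMS, OBJECT_TERMS)
for query in queries:
    print(query)
```

Generating the pairings this way makes it easy to confirm that no combination was skipped before the queries are run against each database.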

A total of 306 studies were identified through these searches. After excluding duplicates (n = 135), the titles and abstracts of 171 studies were screened for eligibility. The following criteria for inclusion were used for the screening. The studies (a) were conducted with participants who were either identified by the authors as having MLD, or whom we deemed likely to have MLD according to the DSM-5; (b) reported primary data; (c) assessed the effectiveness of an intervention delivered with manipulatives, regardless of delivery format (individually, in small groups, or in the whole class) or instructor (teacher, psychologist, speech-language pathologist, or researcher); and (d) focused on improving performance regardless of mathematical domain. We excluded studies if the sample consisted only of participants with intellectual disabilities or autism, if the outcome measures targeted general cognitive processes or science skills, or if the article was not reported in French or English. We did not set any limitations on publication date.

The process of determining eligibility resulted in 17 full-text articles, which were read by the first author in their entirety. The bibliographies of these articles were examined, and an additional 28 articles were found, which were also read by the first author. The same eligibility criteria were applied again to the 45 studies, and seven were excluded, resulting in a final sample of 38 articles.
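The counts reported for the selection procedure can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text; the function name `prisma_flow` and the dictionary keys are illustrative, and the number excluded at the title and abstract stage is derived here rather than reported by the authors.

```python
def prisma_flow(identified, duplicates, retained_after_screening,
                added_from_bibliographies, excluded_at_full_text):
    """Recompute the study counts at each stage of the selection procedure."""
    screened = identified - duplicates
    assessed_full_text = retained_after_screening + added_from_bibliographies
    return {
        "screened": screened,
        # Derived value: not stated explicitly in the text.
        "excluded_at_screening": screened - retained_after_screening,
        "assessed_full_text": assessed_full_text,
        "included": assessed_full_text - excluded_at_full_text,
    }


flow = prisma_flow(
    identified=306,
    duplicates=135,
    retained_after_screening=17,
    added_from_bibliographies=28,
    excluded_at_full_text=7,
)
for stage, count in flow.items():
    print(f"{stage}: {count}")
```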

2.2. Data Extraction and Analysis

Two members of the research team (the first and third authors) extracted information about participants and outcomes (Table 1) and characteristics of the intervention (Table 2) using a spreadsheet created specifically for this review. With respect to participants, information about sample size, grade level, and type of learning difficulty was extracted. The outcome measures identified were immediate learning, maintenance, and transfer of both cognitive and affective variables. With respect to the interventions described in the studies, data were extracted on the mathematics topic targeted, primary and secondary learning objectives, characteristics of instructional delivery, and type of manipulatives used. Furlong et al. [32] defined primary mathematics outcomes as those that pertain specifically to mathematical learning objectives and secondary outcomes as those that are not discipline-specific.

To determine the level of methodological quality of each study, we applied the quality indicators (QIs) outlined by Gersten et al. [82] to each group study and the QIs outlined by Horner et al. [83] to each single-case study. Gersten et al.’s quality indicators are subdivided into two main categories: essential and desirable. Essential indicators pertain to study design, analysis, and the disclosure of information about participants, procedures, and measures. Desirable indicators are similar to essential ones, but their absence results in flaws that are less fatal to the credibility of the research (see Gersten et al. for detailed information and descriptions regarding the QIs). The QIs presented by Horner et al. [83] are not divided into “essential” and “desirable” categories, but are similar to those of Gersten et al. [82] in that they address research design, analysis, and the disclosure of information about participants and procedures, as well as internal, external, and social validity.

The same coding procedures were used for group studies and single-case research. Using the appropriate set of QIs, we adopted Jitendra et al.’s [28] procedure of assigning a code to each QI on a scale from 1 to 3 (3 = indicator fully met, 2 = indicator partially met, and 1 = indicator not met) for each study. According to Gersten et al. [82], a group study would need to meet all but one of the essential QIs (i.e., 9 of the 10 indicators) and demonstrate at least four of the quality indicators listed as desirable (i.e., four of the eight) to be considered a “high-quality study.” A study would need to meet all but one of the essential QIs and demonstrate at least one of the QIs listed as desirable to be considered a study of “acceptable quality.” Because Horner et al. [83] did not provide any criteria for the number of QIs needed to determine the quality level for the single-case studies, we set criteria similar to those of Gersten et al. [82] for single-case studies. All but one of the QIs needed to be met (i.e., 20 of the 21 indicators) for a study to be considered high quality and at least 18 QIs to be considered of acceptable quality.
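The thresholds above can be expressed as a simple decision rule. The following is a minimal sketch, not the authors' procedure: it assumes that only a code of 3 (fully met) counts as an indicator being "met," which the text does not state explicitly, and the function names are hypothetical.

```python
def count_met(codes, met_code=3):
    """Count indicators coded as fully met (assumption: only code 3 counts)."""
    return sum(1 for code in codes if code >= met_code)


def rate_group_study(essential_codes, desirable_codes):
    """Rate a group study from its 10 essential and 8 desirable QI codes."""
    essential_met = count_met(essential_codes)
    desirable_met = count_met(desirable_codes)
    # "All but one" of the essential indicators, i.e., 9 of 10.
    if essential_met >= len(essential_codes) - 1:
        if desirable_met >= 4:
            return "high"
        if desirable_met >= 1:
            return "acceptable"
    return "not acceptable"


def rate_single_case_study(codes):
    """Rate a single-case study from its 21 QI codes."""
    met = count_met(codes)
    if met >= 20:
        return "high"
    if met >= 18:
        return "acceptable"
    return "not acceptable"


# Hypothetical codes for illustration only.
print(rate_group_study([3] * 9 + [2], [3] * 4 + [1] * 4))
print(rate_single_case_study([3] * 18 + [1] * 3))
```

If partially met indicators (code 2) were to count toward the thresholds, passing `met_code=2` to `count_met` would implement that alternative reading.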

Gersten et al. [82] proposed that an instructional practice can be considered evidence-based if there are at least four group studies of acceptable quality or two high-quality group studies that support the practice. In either case, a weighted effect size significantly greater than 0 is a second criterion for classifying a practice as evidence-based. In the case of single-case research, Horner et al. [83] proposed that a practice may be considered evidence-based when there is a minimum of five single-case studies that meet minimally acceptable methodological criteria, which we operationalized as either high or acceptable quality. Additional criteria were that the five or more studies are published in peer-reviewed journals, the studies are conducted by at least three different researchers across at least three different geographical locations, and the studies include a total of at least 20 participants.
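These criteria can be sketched as two Boolean checks. The code below is illustrative only: it assumes that high-quality group studies also count toward the "acceptable quality" tally, it treats the significance of the weighted effect size as an input rather than computing it, and the class, field, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SingleCaseStudy:
    quality: str          # "high", "acceptable", or "not acceptable"
    peer_reviewed: bool
    research_team: str    # identifier for the researchers who ran the study
    location: str         # geographical location of the study
    n_participants: int


def group_practice_is_evidence_based(qualities, effect_significantly_positive):
    """Apply the group-study criteria to a list of quality labels."""
    n_high = qualities.count("high")
    # Assumption: high-quality studies also count as at least acceptable.
    n_acceptable_or_better = n_high + qualities.count("acceptable")
    enough_studies = n_high >= 2 or n_acceptable_or_better >= 4
    return enough_studies and effect_significantly_positive


def single_case_practice_is_evidence_based(studies):
    """Apply the single-case criteria to a list of SingleCaseStudy records."""
    eligible = [s for s in studies
                if s.quality in ("high", "acceptable") and s.peer_reviewed]
    return (
        len(eligible) >= 5
        and len({s.research_team for s in eligible}) >= 3
        and len({s.location for s in eligible}) >= 3
        and sum(s.n_participants for s in eligible) >= 20
    )


# Hypothetical data for illustration only.
example = [
    SingleCaseStudy("high", True, "team A", "site 1", 4),
    SingleCaseStudy("acceptable", True, "team A", "site 1", 3),
    SingleCaseStudy("acceptable", True, "team B", "site 2", 5),
    SingleCaseStudy("high", True, "team C", "site 3", 6),
    SingleCaseStudy("acceptable", True, "team C", "site 3", 3),
]
print(single_case_practice_is_evidence_based(example))
```

Meeting these counts is necessary but says nothing about whether the study designs isolate the contribution of the manipulatives themselves, which is a separate question in this review.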

The two coders (i.e., the first and third authors) used the studies by Gersten et al. [82] and Horner et al. [83] to assign quality codes (i.e., from 1 to 3) to each study in the sample. We calculated interrater agreement between the two coders on the quality codes across all studies (i.e., group studies and single-case) using percent agreement and Cohen’s kappa. Percent agreement of 64% was obtained for group studies, which corresponds to a Cohen’s kappa of 0.287. Percent agreement of 62.2% was obtained for single-case studies, which corresponds to a Cohen’s kappa of 0.250. The two coders resolved the discrepancies through discussion before they independently coded all the studies a second time. The mean agreement for QIs across all studies was then 95.5%, which corresponds to a kappa of 0.924. All remaining discrepancies were resolved through discussion.
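For readers unfamiliar with the two agreement statistics, the sketch below shows how percent agreement and Cohen's kappa are computed from two coders' ratings. The ratings in the example are invented for illustration and are not the review's data; the function names are hypothetical.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Proportion of items on which the two coders assigned the same code."""
    return sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(coder_a)
    p_observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: product of the coders' marginal proportions per code.
    p_expected = sum(counts_a[code] * counts_b[code]
                     for code in set(coder_a) | set(coder_b)) / (n * n)
    if p_expected == 1:
        return 1.0  # both coders used a single identical code throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Invented quality codes (1 to 3) for ten indicators, for illustration only.
coder_a = [3, 3, 2, 1, 3, 2, 1, 3, 2, 3]
coder_b = [3, 2, 2, 1, 3, 3, 1, 3, 2, 3]
print(percent_agreement(coder_a, coder_b))
print(round(cohens_kappa(coder_a, coder_b), 3))
```

Because kappa subtracts the agreement expected by chance, it is always lower than raw percent agreement when the coders disagree on some items, which is why a percent agreement of 64% can correspond to a kappa below 0.3.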

3. Results

In this section, we begin with a description of the research methodology and characteristics of the participants across the sample. We then report the findings for each of the three research questions in turn.

3.1. Study Characteristics
3.1.1. Methodology

In the sample, we found 16 group studies (single-group designs or multiple-group studies with either experimental or quasiexperimental designs) and 22 single-case studies. Twenty-three studies in the sample incorporated inferential statistics to test effects (15 group studies and eight single-case studies), whereas the remaining 15 studies reported only descriptive statistics (one group study and 14 single-case studies). As shown in Tables 1 and 2, our analyses also highlighted considerable variability in the study characteristics that were targeted in our review: samples of children with MLD ranged in size from 3 to 259, and outcome measures assessed immediate learning, maintenance, and transfer on the cognitive level and interest and confidence on the affective level.

3.1.2. Participants

In total, 2250 children were tested across the 38 studies. Among these participants, 1131 were children with persistent mathematics difficulties, most of whom we classified as having an MLD either because the authors of the study used the DSM criteria in effect at the time the study was conducted or because we ourselves made the determination using the current criteria outlined in the DSM-5 [1] and the information provided about the sample. The remaining 1119 children were either typically developing (n = 1064) or had disabilities other than MLD (n = 55), neither of which was the focus of this review. In some single-case studies or studies in which the intervention was delivered to small groups of students or whole classes, students with emotional or intellectual disabilities were grouped with those students with MLD. In these cases, only data on the students with MLD were included in our analyses. In the case of randomized control studies and quasiexperimental group studies, our analyses included the data only for groups of children with MLD.

No study focused on children aged 5 years or below, 15 on children aged 6 to 9 years (first through third grades), 16 on children aged 10 to 12 years (fourth through sixth grades), and 16 on adolescents aged 13 years and older (seventh grade and above). In one study [76], the authors collected data from students at the elementary level, but did not indicate the specific ages or grades of the participants (the total number of studies in this count exceeds 38 because some of them included two or more age groups).

3.2. Research Question 1: Instructional Contexts
3.2.1. Targeted Skills

We found a variety of primary and secondary outcomes [32] in the sample. Four primary outcomes were targeted: (a) precursor skills, such as counting and number sense, (b) arithmetic computation, regardless of strategy, (c) word problem solving, and (d) advanced mathematical skills, which we operationalized as topics in the school curriculum in the fourth grade and above. The five secondary outcomes included internalizing problems (such as anxiety and depression), externalizing problems (such as aggression and defiance), hyperactivity or attention symptoms or both, user satisfaction, and costs and cost-effectiveness data.

With respect to the primary outcomes, seven of the studies in the sample targeted precursor skills (e.g., counting and number comparison in [51] and place-value in [80]), 16 studies targeted arithmetic computation (e.g., addition in [68], subtraction in [63], basic multiplication facts in [74], and division facts in [75]), eight studies targeted word problem solving, and 15 studies targeted advanced mathematical skills (e.g., fractions in [50], algebra in [53], and geometry in [66]) (these counts sum to more than 38 because some studies focused on more than one outcome). No study focused on transcoding (i.e., reading and writing numbers). Concerning the secondary outcomes, only one study examined internalizing problems, and Yang et al. [51] assessed the effect of an intervention on mathematics interest and confidence. No other secondary outcomes were evaluated in the sample.

3.2.2. Intervention Characteristics

Instructional delivery varied in terms of the length of instructional units, number of lessons, and the length of each lesson. Not all 38 studies reported data on the length of the interventions, however. The data reported here represent only the studies that contained sufficient information for an analysis of instructional delivery. For 15 studies, the length of the instructional units ranged from three days (e.g., [48]) to seven months [62], M = 99.0 days, SD = 149.1 days. For 34 studies, the number of lessons ranged from three [56] to 70 [68], with a mean number of lessons at 18.6 (SD = 13.7). In 29 studies, the length of each lesson ranged from 10 minutes [80] to 55 minutes (e.g., [61]), M = 29.1, SD = 12.7. Together, these data show that for the studies that reported sufficient information about intervention characteristics, the briefest intervention ran for a total of 105 minutes and the longest for 1400 minutes (M = 499.3, SD = 365.1).

Finally, 36 studies in the sample included information about instructional contexts and settings. The results indicated considerable variability here as well: 19 studies delivered targeted mathematics interventions to individual students outside the classroom, four offered interventions to students in small groups outside class, and 13 offered whole-class instruction.

3.2.3. Manipulative Type

Information on the type of manipulative used across the sampled studies is in the right-most column of Table 2. With regard to the types of manipulatives used, seven studies used virtual manipulatives (e.g., [46]), and 34 used concrete materials. Of the seven intervention studies that used virtual manipulatives, five used a computer program and two used an app on a mobile device. One of these studies reported the delivery of the VRA (virtual-to-representational-to-abstract) teaching sequence. Although some of the virtual manipulatives were identified (e.g., polyominoes and fraction tiles), information on how the digital platform allowed students to manipulate the objects was scarce. With respect to the type of concrete manipulative used, one of the 34 studies used Cuisenaire rods [73]; one used Rekenrek [63]; one used Geoboards [66]; seven used base ten blocks (e.g., [60]); one used plastic 5 frames, 10 frames, and double 10 frames with counters [63]; two used algebra manipulatives such as balance scales, chips, and canisters (e.g., [47]); four used fraction manipulatives such as fraction circles, fraction squares, and various other concrete objects with shaded regions (e.g., [50]); and 18 used various everyday materials that are not specially designed for mathematics, such as popsicle sticks, string, buttons, plastic chips, paper plates, and plastic discs (these counts sum to more than 34 because some studies used more than one type of object). One study used the TouchMath Addition Mastery Kit [68], but the authors did not identify the types of manipulatives used in the study. Five of the 34 studies did not provide any information on the type of materials used in the instruction (e.g., [67]).

3.3. Research Question 2: Evidence-Based Interventions for Immediate Learning, Maintenance, and Transfer

In this section, we report on the quality ratings of the group studies and single-case studies using the criteria laid out by Gersten et al. [82] and Horner et al. [83], respectively. These criteria allowed us to establish which of the interventions with manipulatives could be considered evidence-based for each of three outcomes of interest: immediate learning, maintenance, and transfer.

3.3.1. Quality Ratings across the Sample

According to Gersten et al.’s [82] standards, we found three high-quality group studies [52, 53, 59] and one of acceptable quality [57]. We deemed the remaining 12 studies as not acceptable. Using Horner et al.’s [83] standards, our analysis revealed seven high-quality single-case studies [31, 47, 54, 58, 66, 74, 81] and we judged 10 studies to be acceptable [45, 46, 49, 55, 56, 60, 61, 69, 70, 73]. We categorized the remaining five single-case studies as not acceptable.

Immediate performance was measured in all 16 group studies. Immediate learning was the only outcome targeted in seven of these; six additionally assessed maintenance, three additionally assessed transfer, and none assessed both maintenance and transfer. In terms of the studies based on single-case designs, all 22 assessed immediate performance as part of the experimental assessment. Of these, eight focused on immediate learning only, five assessed maintenance in addition to learning, two assessed transfer in addition to learning, and seven assessed both maintenance and transfer.
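The outcome coverage reported above can be cross-checked with a short tally. The following is only an illustrative sketch: the counts are taken from the paragraph above, whereas the dictionary layout and the helper names `total` and `assessed` are our own assumptions and not part of the original coding procedure.

```python
# Number of studies per combination of outcomes, as reported in the text.
# The dictionary keys and helper functions are illustrative (hypothetical) names.
group = {
    "learning only": 7,
    "learning + maintenance": 6,
    "learning + transfer": 3,
    "learning + maintenance + transfer": 0,
}
single_case = {
    "learning only": 8,
    "learning + maintenance": 5,
    "learning + transfer": 2,
    "learning + maintenance + transfer": 7,
}


def total(counts):
    """Total number of studies across all outcome combinations."""
    return sum(counts.values())


def assessed(counts, outcome):
    """Number of studies whose outcome combination includes `outcome`."""
    return sum(n for label, n in counts.items() if outcome in label)


print("group studies:", total(group))
print("single-case studies:", total(single_case))
for outcome in ("maintenance", "transfer"):
    print(outcome, assessed(group, outcome), assessed(single_case, outcome))
```

Under these counts, the tallies agree with the totals used in the sections that follow: 18 studies assessed maintenance (six group and 12 single-case) and 12 assessed transfer (three group and nine single-case).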

3.3.2. Immediate Learning

All 38 studies demonstrated immediate student learning, either statistically or descriptively. The 16 group studies demonstrating learning effects provide some evidence of positive change for interventions involving manipulatives, but claims about the benefits of manipulatives must be tempered because we judged only three of these as being of high quality and one as acceptable. Only the three high-quality group studies reported effect sizes (between 0.245 and 2.50). The designs in these three studies did not all permit conclusions about the differential effects of interventions with manipulatives relative to interventions without. In the study of Fuchs et al. [52], for example, the authors compared three groups that all received interventions with manipulatives. The independent variable was the interpretation of fractions provided (i.e., measurement division versus part-whole). The effects of interventions with manipulatives, therefore, cannot be concluded from this design. A similar design was used in Powell and Fuchs [59]. Therefore, in terms of immediate learning, we conclude from these data that there is insufficient evidence to identify evidence-based practices with manipulatives for students with MLD.

With respect to single-case studies in the sample (n = 22), manipulative use was associated with immediate learning in seven high and 10 acceptable quality studies by nine research teams in nine locations (i.e., nine states in the US) for 85 children with MLD. These results indicate that for immediate learning, the use of manipulatives can be considered evidence-based according to the criteria for single-case designs laid out in Horner et al. [83].

3.3.3. Maintenance

Maintenance of gains was measured using delayed tests in 18 of the 38 studies in the sample, consisting of six group and 12 single-case studies. Among these, the delay before the maintenance assessment ranged from a few days (e.g., [76]) to 11 weeks [61]. Only one group study (i.e., [53]) was classified as being of high quality and reported an effect size of 0.74. The remaining five group studies were judged as unacceptable. According to Gersten et al. [82], these maintenance data do not provide sufficient evidence to conclude that the interventions with manipulatives in our sample are evidence-based.

Of the 12 single-case studies that assessed maintenance, we found four of high quality and seven of acceptable quality by seven research teams in seven locations (i.e., seven states in the US) for 45 children with MLD. The 11 single-case studies of high and acceptable quality were all based on multiple-baseline designs, which are considered suitable for establishing a functional relation between the manipulation of the intervention and the dependent variable [41]. On the other hand, the follow-up assessments in these studies, delivered after the interventions were completed, are subject to several threats to internal validity, such as the nature of the measures administered, the amount of time between intervention and follow-up, and potential confounds related to any interventions delivered after the experiment is completed. As such, the evidence of maintenance in all 11 studies, although important, can only be considered descriptive or anecdotal, thereby limiting the authors’ claims. These findings therefore prevent us from concluding that the interventions in these single-case studies are evidence-based practices.

3.3.4. Transfer

Transfer was measured less often than immediate learning and even less often than maintenance. Twelve studies in the sample claimed to demonstrate transfer: three were group studies and nine were single-case studies. Of the three group studies, one was of high quality with an effect size of 1.06 (i.e., [59]), one was judged as acceptable, and one was judged as not acceptable. As such, according to Gersten et al. [82], the criteria for concluding that interventions with manipulatives are evidence-based for transfer are not met.

Of the nine single-case studies that assessed transfer, we classified two as high quality and seven as acceptable. Our analysis also revealed that these single-case studies were conducted by five different research teams in five geographic locations (i.e., five states in the US) for 38 children with MLD. As was the case with maintenance, however, a closer look at how transfer was assessed in these studies reduces the confidence one can place in the effects claimed by the authors. Transfer was measured by assessing performance on tasks that are to a greater or lesser extent different from those in the intervention. Desired performance on transfer tasks can be attributed to the student applying what was learned to contexts beyond the confines of the intervention. In seven of the nine single-case studies, however, the authors administered the transfer measures either immediately following the intervention or some time after its completion. Such data are an important component of assessing the impacts of an instructional intervention, but because they were not part of the experimental assessments, we are unable to conclude from these descriptive data alone that the interventions in these nine studies are evidence-based. In the remaining two studies [45, 47], the transfer tasks were administered during the intervention in the context of alternating treatment designs. In these cases, the units of analysis are not independent [84], thus compromising the conclusions that can be drawn.

In sum, only the single-case research in our sample allows us to conclude that interventions with manipulatives can be considered evidence-based for children with mathematics difficulties, and this only for immediate learning. Considerable methodological weaknesses prevent a comparable conclusion from being drawn from the group studies for all three outcomes targeted in this review. We also note that the wide variability in the outcome measures targeted in the single-case studies hinders our ability to draw conclusions about the effects of manipulatives for the learning of specific topics or learning outcomes.
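The reasoning behind this summary for the single-case studies can be laid out as a small worked check. The sketch below rests on stated assumptions: the numerical thresholds (at least five studies of acceptable quality, three research teams, three locations, and 20 participants) are our paraphrase of the commonly cited summary of Horner et al. [83] and are not spelled out in this section, and the `experimental` flag is a simplification of the design concerns described above. All names are hypothetical.

```python
# Assumed thresholds, paraphrasing the commonly cited summary of Horner et al. [83].
MIN_STUDIES = 5
MIN_TEAMS = 3
MIN_LOCATIONS = 3
MIN_PARTICIPANTS = 20

# Counts of high- and acceptable-quality single-case studies, as reported in the text.
# `experimental` is a simplifying flag: was the outcome assessed within the
# experimental design in a way that supports a functional relation?
evidence = {
    "immediate learning": dict(studies=17, teams=9, locations=9,
                               participants=85, experimental=True),
    "maintenance": dict(studies=11, teams=7, locations=7,
                        participants=45, experimental=False),
    "transfer": dict(studies=9, teams=5, locations=5,
                     participants=38, experimental=False),
}


def meets_thresholds(e):
    """Check only the numerical replication thresholds."""
    return (e["studies"] >= MIN_STUDIES
            and e["teams"] >= MIN_TEAMS
            and e["locations"] >= MIN_LOCATIONS
            and e["participants"] >= MIN_PARTICIPANTS)


def evidence_based(e):
    """Numerical thresholds alone are not enough; the design must also qualify."""
    return meets_thresholds(e) and e["experimental"]


for outcome, e in evidence.items():
    print(outcome, meets_thresholds(e), evidence_based(e))
```

In this sketch, all three outcomes clear the numerical thresholds, but only immediate learning also satisfies the design condition, which mirrors the conclusion stated above.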

3.4. Research Question 3: Value Added by Manipulatives and Causal Effects of Interventions with Manipulatives

To determine the value added by manipulatives themselves, one must assess the difference between the targeted intervention with manipulatives and the same intervention without them. Suitable comparison groups (or phases in the case of single-case studies) are thus required—treatment-as-usual comparisons leave various alternative explanations open regarding the reasons for the effects, whereas comparisons of identical interventions without the independent variable of interest (in our case, manipulatives) would serve to isolate the effects of manipulatives alone [85]. Therefore, we examined the interventions delivered in comparison (or control) groups in the group studies in our sample; in the single-case studies, we assessed whether the studies compared phases with identical treatment interventions with and without manipulatives.

Among the 38 studies, only five studies were designed to establish the value added by manipulatives. Three of the five studies were group studies and two were single-case. All studies assessed immediate learning, but only one was judged as high quality, one as acceptable, and three were judged as not acceptable (see Table 3 for details). In terms of maintenance, only one of the group studies made a comparison to isolate the effects of manipulatives alone, and in the case of transfer, only two single-case studies were designed to assess such value added. We propose, therefore, that there is little evidence in our sample to determine that manipulatives themselves provide benefits to children with MLD over and above comparable interventions without manipulatives.

To establish causal effects of interventions with manipulatives, group studies must incorporate random assignment of the unit of analysis to conditions (e.g., [42]). The criteria we used for single-case studies were (a) a baseline of at least three data points [83] before the intervention and (b) a design based on one of the three single-case designs for establishing experimental control: within-series, between-series, and combined-series (e.g., multiple-baseline) designs [84]. Regarding immediate learning, a total of 21 studies were designed to establish causal effects. That is, random assignment was present in two group studies and a suitable baseline was established in 19 single-case studies, all of which were based on a multiple-baseline design. Of these 21 studies, we judged nine to be of high quality, 10 of acceptable quality, and two were deemed not acceptable (see Table 3 for details). Regarding the outcome of maintenance, 12 studies (one group study and 11 single-case studies) were designed to establish causal effects; that is, the authors of the one group study randomly assigned participants to conditions, and all 11 single-case studies were based on multiple-baseline designs that incorporated sufficient baseline data for experimental control. Of all 12 studies that demonstrated causal effects, we judged five to be of high quality and seven of acceptable quality. With respect to transfer, 11 studies (two group studies and nine single-case studies) met sufficient methodological criteria to determine causal effects of interventions with manipulatives. Random assignment was employed in both group studies, and all nine single-case studies used appropriate designs for experimental control (two between-series and seven multiple-baseline designs) and provided sufficient baseline data. Of all 11 studies, we assessed three to be of high quality and seven to be of acceptable quality.
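The two design criteria described above can be expressed as a simple decision rule. The following is a minimal sketch under stated assumptions: the `Study` record, its field names, and the function `designed_for_causal_inference` are hypothetical and introduced only for illustration; they do not reproduce the coding instrument used in the review.

```python
from dataclasses import dataclass
from typing import Optional

# Single-case design families for establishing experimental control [84].
# A multiple-baseline design is treated here as a combined-series design.
EXPERIMENTAL_SINGLE_CASE_DESIGNS = {"within-series", "between-series", "combined-series"}
MIN_BASELINE_POINTS = 3  # baseline of at least three data points [83]


@dataclass
class Study:
    """Hypothetical record holding only the fields needed for the rule."""
    kind: str                       # "group" or "single-case"
    random_assignment: bool = False
    baseline_points: int = 0
    design: Optional[str] = None    # single-case design family, if applicable


def designed_for_causal_inference(study: Study) -> bool:
    """Return True if the study's design meets the criteria described in the text."""
    if study.kind == "group":
        # Group studies: random assignment of the unit of analysis to conditions.
        return study.random_assignment
    if study.kind == "single-case":
        # Single-case studies: sufficient baseline AND an experimental design family.
        return (study.baseline_points >= MIN_BASELINE_POINTS
                and study.design in EXPERIMENTAL_SINGLE_CASE_DESIGNS)
    raise ValueError(f"unknown study kind: {study.kind!r}")


# Usage with made-up studies (not drawn from the sample):
print(designed_for_causal_inference(Study("group", random_assignment=True)))
print(designed_for_causal_inference(
    Study("single-case", baseline_points=2, design="combined-series")))
```

Note that this rule concerns design only. A study can satisfy it and still be rated not acceptable on the quality indicators, which is the situation discussed in the next paragraph.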

Although these analyses provide evidence of causal effects of interventions with manipulatives for all three outcome measures, some of these studies were nevertheless of poor quality (i.e., not acceptable) according to our previous analysis (i.e., the findings for Research Question 2). In other words, in five studies (three assessing immediate learning, one assessing maintenance, and one assessing transfer), the experimental design for establishing cause was present, but too many quality indicators were absent for us to classify the studies as evidence-based. This information should be taken into consideration when interpreting the findings on causal effects in this section.

4. Discussion

Our aim in this chapter was to conduct a review of the literature to evaluate the impact of using manipulatives, either physical or virtual, on the mathematics learning, maintenance, and transfer in children with MLD. We used the frameworks established by Gersten et al. [82] and Horner et al. [83] to assess the methodological quality of the studies in our sample. Quality coding was then used to determine whether any of the instructional practices with manipulatives could be considered evidence-based. We also extended our analysis to assess whether any of the studies could claim causal effects of interventions with manipulatives and whether any evidence exists for the benefits of manipulatives themselves, over and above other instructional practices.

Our first research question addressed the instructional contexts in which the interventions were delivered, namely, the skills that were targeted, the characteristics of the interventions themselves, and the types of manipulatives used. The interventions varied considerably in total duration, the length of each session, and the number of sessions; the size of the groups receiving the intervention (one-to-one, small group, whole class) also varied from study to study. In addition, the types of manipulatives varied greatly as well, with some interventions using concrete materials, pictorial representations (as in the case of CRA), or virtual manipulatives. Interventions involving manipulatives also differed with respect to the mathematical domain targeted (i.e., precursor skills, arithmetic computation, word problem solving, and advanced mathematical skills).

Our second research question focused on whether interventions that include manipulatives can be considered evidence-based for children with MLD in terms of immediate learning, maintenance, and transfer. A quick glance at the findings in our sample may suggest that, overall, mathematics interventions with manipulatives are effective for children with MLD. Among the 38 studies, all showed immediate improvement, reported either statistically or descriptively. Applying the criteria established by Gersten et al. [82] and Horner et al. [83] to determine whether any of the practices in these studies can be considered evidence-based, however, a different picture emerges. For instance, an immediate learning effect was found in four group studies of high or acceptable quality, but two of these studies did not address the effects of interventions with manipulatives relative to those without (e.g., [52]). As a result, we were not able to draw conclusions about whether the instructional practices with manipulatives in the high-quality group studies are evidence-based.

In contrast, the studies using single-case designs credibly demonstrated immediate learning of such mathematical outcomes as arithmetic computation, word problem solving, and advanced mathematical skills. We note, however, that skills such as transcoding (e.g., the ability to read and write numerals) and the development of counting principles, such as cardinality and one-to-one correspondence, were not examined in the studies we reviewed. Given that such competencies are important predictors of school success in mathematics [86–88], we point out the omission as a suggestion for future research.

We were also interested in examining the effects of interventions with manipulatives on students’ maintenance and transfer. Learning that lasts over time has obvious benefits for both teachers and students and has surfaced as a particular challenge in responding to the needs of children with MLD. In addition, the goal of mathematics education goes beyond simply reproducing the material taught during instruction; the ultimate goal is for students to meaningfully apply (i.e., transfer) new knowledge to other tasks and contexts. The studies we reviewed, however, did not reveal credible effects of maintenance or transfer. A maintenance effect was found in only one high-quality group study, and only a small handful of group studies measured transfer, with just one of them judged as high quality. Relative to group studies, a larger number of studies based on single-case designs claimed maintenance and transfer effects, but our analysis of their methodologies produced disappointing conclusions. The maintenance and transfer outcomes assessed in the single-case studies were not incorporated into the between-series and multiple-baseline designs in ways that established functional relations between manipulations of the instructional interventions and the dependent variables.

Despite the promising findings with respect to immediate learning, we found too many instructional variations across the sample to develop prescriptive models for how to use manipulatives with children with MLD. A larger number of controlled studies that isolate specific instructional features, such as length of instruction and type of manipulative used for specific outcome measures, are required to draw more definitive conclusions. In a study that manipulated the length of instruction, for example, Kroesbergen and Van Luit [89] demonstrated that brief mathematical interventions had greater impacts than longer interventions, presumably because they would allow for more targeted focus on specific topics (see [90] for similar effects of phonemic awareness instruction). Furthermore, the perceptual features of an object can detract from students’ performance and negatively affect their problem solving [11, 25], but these effects are dependent on the outcome measured [20] and moderating variables, such as prior knowledge (Peterson and McNeil, 2013). Finally, the timing and sequencing of lessons with manipulatives appear to matter as well [21]. Given this context, we find it premature to draw definitive theoretical and practical conclusions on the use of manipulatives in mathematics interventions for children with MLD. We argue that, similar to typically developing populations, manipulatives could be used to great effect with children with MLD, but instructional nuances (e.g., the type of manipulative, the instructional techniques used, the mathematical topic, and cognitive factors) may be responsible for differential effects for the two types of learners. Clearly, this assumption needs to be verified in future studies.

Along the same lines, we observed positive effects of interventions involving manipulatives for students with a wide variety of individual differences, such as age (e.g., 6 to 17 years old, first to twelfth grade) and the type of difficulty described (e.g., children with MLD, children with mathematics difficulties at school but without an official or known diagnosis, and children at risk of developing MLD). Again, however, positive effects were found regardless of students’ age or specific learning challenges, which makes specific prescriptions for uses of manipulatives with the MLD population elusive. Furthermore, student characteristics related to general cognitive ability or executive functioning skills were not directly addressed or tested in any of the studies reviewed. This is a glaring omission, as relatively recent work has identified general cognitive factors as moderators of intervention effects with at-risk elementary students. Fuchs and her colleagues (i.e., [50, 52]), for example, investigated the role of individual differences in general cognitive skills (such as working memory) on the effects of interventions designed to improve at-risk fourth graders’ fraction knowledge. Fuchs et al. [52] first showed that intervention effects were moderated by domain-general abilities. Results of a follow-up study (i.e., [50]) then revealed that children with very weak working memory capacity learned better with activities focusing on concepts, but children with more adequate (but still weak) working memory learned better with activities that honed fluency skills. These two studies, however, did not address the role of individual differences as a function of the presence or absence of manipulatives. We thus recommend that more research examine the role that general cognitive ability and executive function play in students’ learning from interventions with manipulatives.

Prior knowledge is another student characteristic that can impact the conclusions drawn about the effects of mathematics interventions involving manipulatives. Peterson and McNeil (2013), for example, demonstrated that children’s counting performance was compromised if they had what the authors called “established knowledge” of the manipulatives they were counting. They speculated that the children were distracted by what they knew about the objects represented by the counters (e.g., their knowledge of zebras when they were counting with manipulatives that looked like zebras); in contrast, the students performed significantly better with objects that were unfamiliar to them and about which they had no prior knowledge. In another study, also with typically developing students, Osana et al. [22] found that second-graders’ prior knowledge of numeration was correlated with the students’ learning about the base-four positional system in an intervention that involved manipulatives, and their prior knowledge was also correlated with the ability to transfer the conceptual structure to novel problems.

Finally, students appear to benefit when they acquire what Uttal, Liu, and DeLoache [91] called “dual representation,” the understanding that manipulatives are objects with their own physical and perceptual features as well as objects that “stand for” something else, such as, in this case, quantities or mathematical ideas. Indeed, children’s dual representation of mathematics manipulatives has been shown to predict the extent to which they use the objects as representing intended mathematical quantities [92]. While a growing body of research suggests that children’s prior knowledge (either of the manipulatives themselves or of prerequisite mathematical concepts) and internal representations of manipulatives are predictive of learning and transfer, the effects of student variables have not been studied as comprehensively or as systematically in the MLD population. Furthermore, the severity of numerical deficits in children with MLD (such as cardinality or subitizing) could be an additional moderator of intervention effects, but no such moderator was considered in any of the studies in our sample. We thus call for researchers to investigate the role of student characteristics more systematically in future studies.

Our third research question concerned whether the effects of manipulatives alone (i.e., the value added by the manipulatives themselves) could be determined from the research we reviewed, and the extent to which causal effects of interventions could be established in the sample. Only two high or acceptable quality studies were designed to establish the value added by manipulatives for immediate learning and transfer, providing little support for the benefits of manipulatives over and above the effects of comparable interventions without manipulatives. Concerning causal effects, random assignment was present in one group study of high quality and a suitable baseline was established in 17 single-case studies of high or acceptable quality. Together, these results allow us to conclude that there is some evidence to show that interventions using manipulatives can cause positive mathematical outcomes in students with MLD.

Overall, we found the methodological quality in the sample far from perfect, which limits the conclusions that can be drawn. For example, the less-than-resounding evidence for causal effects of interventions with manipulatives can be explained, in large part, by the lack of carefully designed experimental group studies. Even among those studies that were designed to establish experimental control (i.e., predominantly single-case studies), not all were judged to be of high or acceptable quality. Methodological shortcomings also dilute the quality of evidence on maintenance and transfer and can account for the lack of data on the value added by the manipulatives themselves. Additionally, the required information that would allow for complete assessments to be made regarding our three research questions was not available in many of the published reports. Without information about key methodological and procedural aspects of the research, the interpretability of the data is compromised, as are the pedagogical implications that are derived from them. For instance, in many of the single-case studies that assessed maintenance and transfer, little to no information on possible confounding variables was provided (i.e., what took place in the period of time between the experiment and when the follow-up or transfer data were collected). Also, key pieces of information about the instructional interventions, such as the total duration of the instruction, the number of sessions, and the length of each session, were frequently omitted. Furthermore, critically important details about the teacher’s (or researcher’s) practices, such as what he or she said and presented to the children at key moments during instruction, were absent from almost all the reports. Information about the students themselves was rarely reported; children’s domain-general and domain-specific cognitive abilities were not assessed in the vast majority of studies, and socioeconomic variables were also rarely considered. Finally, very little information was provided about the types of manipulatives used and how they were used, and in some studies, the location of the data collection was not specified.

That few of the studies reviewed met desirable scientific thresholds does not necessarily imply that none of the interventions is effective for some outcome or another. Indeed, we maintain that the collection of studies in this review provides a number of instructional resources for practitioners. Well represented in the sample is CRA, which is an application of “concreteness fading,” an empirically supported theoretical framework for instruction in mathematics and science [12]. For example, one study of acceptable quality in the sample [57] involved the delivery of CRA, and certain core instructional techniques between the two studies could be quite useful for teachers. Aside from concreteness fading itself, the intervention involved explicit instruction, which included advance organizers, demonstrations, guided practice, and independent practice. In addition, the instructors in both studies used mnemonic devices, such as cue cards and posters, so that the students’ cognitive load involved in computation and problem solving was alleviated.

Given the designs of the studies in the sample, we were unable to pinpoint the specific aspects of instruction that were responsible for the improvements observed. In the case of CRA, for example, did the explicit explanations, or when and how they were delivered, predict improved performance? Did the teachers’ use of the manipulatives during explanation and practice account for learning? Were the tools used to alleviate cognitive load responsible for the effects observed? These are questions that cannot be answered at this time, but we argue that teachers can nevertheless use the ideas and approaches described to test whether they are useful for the students in their own classrooms. Practitioners are accustomed to testing a variety of approaches, particularly with students for whom “traditional” instruction is not effective. We maintain that the research reviewed here can be viewed by educators as a collection of resources that can inspire and motivate their practice.

5. Conclusion

The present study is, to our knowledge, the first systematic review on immediate learning, maintenance, and transfer effects of manipulatives in the context of instruction with the MLD population. Despite methodological limitations found across the sample, we can tentatively conclude that interventions with manipulatives show promise for children who struggle to learn mathematics. Our optimism must be tempered by the wide heterogeneity in methodological quality, the absence of instructional variables and student characteristics that are known to influence intervention effects, and insufficient consideration of possible confounding and moderating variables that have been shown to impact mathematics learning with manipulatives in typically developing populations. More systematic studies are needed to contribute to current theory on the instructional potential of manipulatives in the MLD population and to build instructional models that are pedagogically useful for special educators. We are also aware of the inherent publication bias (i.e., the tendency for studies that show statistically significant effects to be published over those that show null results; see [11]) in a review such as ours, which also limits our ability to draw general normative or prescriptive conclusions from the review. Although we attempted to address the bias by including some nonpublished works, we are aware of the inherent limitation of our recommendations. Despite our rather bleak assessment of the current literature in the area, we are encouraged by the research attention that is accorded to children’s difficulties in learning mathematics and the efforts to translate the findings to instructional practice in school and clinical settings. We hope that this review can steer researchers in productive directions and that future studies will build on one another in coherent ways.


The data used in this article were presented at the 2018 meeting of the Mathematical Cognition and Learning Society in Oxford, UK.

Conflicts of Interest

The authors have declared that no conflicts of interest exist.


This research was supported by a grant from the Social Sciences and Humanities Research Council of Canada (435-2015-2002) and by Concordia University. We would like to thank Thomas Kratochwill for his generous assistance in our interpretations of single-case research design.