Review Article

Performance-Based Executive Function Instruments Used by Occupational Therapists for Children: A Systematic Review of Measurement Properties

Table 2

Summary of EF tools’ measurement properties.

Instrument | Author | Year | COSMIN adequacy of measurement properties
Content validity | Structural validity | Internal consistency | Reliability | Construct validity | Cross-cultural validity | Criterion validity
BADS-C | Engel-Yeger et al. | 2009
Age: significant differences exist between the three age groups on the following: playing card test (), water test (), key search test (), and zoo map test ().
Gender: not significant.
Socioeconomic status: not significant.
Parent’s education: not significant.
The BADS-C underwent forward (Hebrew) and backward (English) translation by a bilingual clinician.
CCT | Chevignard et al. | 2009
Interrater reliability
Total number of errors:

Types of errors:

Substitution sequence errors:
Group differences: the TBI and control groups differed significantly only in the total number of errors (), the number of errors of each type (), and the results of the qualitative analysis () of the cooking task.
No significant correlation was found between the total number of errors in the cooking task and the scores on the different neuropsychological tests or behavioural questionnaires (i.e., RCF, WCST, TMT-B, Tower of London, six-part test, RBMT, BRIEF, DEX-C).
Chevignard et al. | 2010
Cronbach’s α:
Test-retest reliability
Total number of errors:

Duration of the task:

Types of error:

Substitution-inversion:

Estimation errors:

Purposeless action:
Age: the total number of errors in the CCT significantly decreased with age in the control group (; ) and in the TBI group (; ).
Group: there is a significant difference only in the total number of errors between TD and TBI children ().
The CCT was translated.
In the English version, the recipes in the cookbook and the utensils were slightly modified, as quantities were expressed in “cups” and “tablespoons” instead of glasses. The CCT was trialled by one examiner with three typically developing children, indicating that the instructions and recipes were clear and understandable.
Overall, performance in the CCT was significantly correlated () to general cognitive ability, to some of the cognitive tests of executive functions on the D-KEFS (trails, verbal fluency, sorting, twenty questions), and the cognitive subscale of the DEX-C questionnaire.
Fogel et al. | 2020
Group: significant differences were found between the groups in the CCT assessment scores ().
Error types: a discriminant function was found for group classification of participants in the descriptive () and neuropsychological () analyses of the CCT.
The CCT was translated into Hebrew through a process of forward and backward translation. Content validity was pilot tested on a group of five children and a focus group of seven OTs.
A medium positive correlation was found only between the BRIEF-SR plan/organization subscale (, ) and task duration.
CKTA | Rocke et al. | 2008
Cronbach’s α:
Interrater reliability:
Performance significantly improved as age increased (ns). Can discriminate between high- and low-scoring participants when compared to the BRIEF (inhibition: , BRI: ), D-KEFS Confirmed Correct Card Sorts (), and WISC-IV Digit Span backwards ().
Do-Eat | Josman et al. | 2010
Content and face validity: validated by five expert consultants and five experienced pediatric occupational therapists.
Cronbach’s α:
Interrater reliability:
Construct validity of the Do-Eat was assessed by gauging the tool’s ability to distinguish between children with and without DCD; significant differences were found in executive functions ().
The EF task was not specifically correlated to any EF assessment.
Rosenblum et al. | 2015
Cronbach’s α:
Significant group differences were found in the EF scores () between children with and without ADHD.
Significant correlations were found in the ADHD group between the EF Do-Eat score for “preparing chocolate milk” and BRIEF BRI () and MI () scores only.
PETA | Downes et al. | 2018
Interrater reliability:

Intrarater reliability:
Age: performance significantly increased with age in line with the rapid development of executive skills reported during this period (). Chronological age predicted 40% of the variance in TS (). Age was strongly related to performance on all quantitative domains of the PETA (TS, TC, initiation, sequencing, metacognition, completion, time for completion; ), except for judgment/safety.
Domain scores: examiner ratings of organization during the PETA task showed that the poor PETA group obtained the poorest teacher ratings on the BRIEF-P plan/organize domain, followed by the typical group and the very good group (). Other results were not significant.
The PETA TS was compared with the BRIEF-P GEC. A significant association was observed between the PETA TS and the BRIEF-P GEC (). Other correlations were not significant.

Note: EF: executive function; BADS-C: Behavioural Assessment of the Dysexecutive Syndrome for Children; CCT: Children’s Cooking Task; CKTA: Children’s Kitchen Task Assessment; DEX-C: Dysexecutive Questionnaire for Children; PETA: Preschool Executive Task Assessment; TS: total summary score; TC: total number of cues; ICC: intraclass correlation; GEC: general executive composite; BRIEF: Behavior Rating Inventory of Executive Function; BRIEF-SR: Behavior Rating Inventory of Executive Function-Self-Report; BRIEF-P: Behavior Rating Inventory of Executive Function-Preschool; BRI: Behavioural Regulation Index; MI: metacognition index; RBMT: Rivermead Behavioural Memory Test; RCF: Rey-Osterrieth Complex Figure; D-KEFS: Delis–Kaplan Executive Function System; WCST: Wisconsin Card Sorting Test; WISC-IV: Wechsler Intelligence Scale for Children-IV; TMT-B: Trail Making Test Part B; TBI: traumatic brain injury; TD: typically developing; DCD: developmental coordination disorder; ADHD: attention-deficit hyperactivity disorder. Empty cells: no evidence found.