The Modified Checklist for Autism in Toddlers (M-CHAT) questionnaire is a brief measure available in Spanish which needs to be validated for the Mexican population. Parents of children from (1) the community with typical development (TD) and (2) a psychiatric outpatient unit completed the CBCL/1.5–5 and the Mexican version of the M-CHAT (MM-CHAT). The study sample consisted of 456 children (age M = 4.46, SD = 1.12), 74.34% TD children and 25.66% with Autism Spectrum Disorders (ASD). The MM-CHAT mean score for failed key items was higher for the ASD group compared with the TD group. Internal consistency for the MM-CHAT was .76 for the total score and .70 for the 6 critical items. Correlations between the MM-CHAT and the CBCL/1.5–5 PDD and Withdrawn subscales and with ADI-R dimensions B (nonverbal) and A were high, and were moderate with ADI-R dimensions B1 (verbal) and C. The failure rate of the MM-CHAT between the groups did not reproduce all the critical items found in other studies. Although the instrument has good psychometric properties and can be used for screening purposes in primary settings or busy specialized psychiatric clinics, these results support evidence for cultural differences in item responses, making it difficult to compare M-CHAT results internationally.

1. Introduction

Autism spectrum disorders (ASDs) affect 1-2% of children [1–5]. Early detection is important because it allows the introduction of early intensive treatment strategies to improve the psychosocial adjustment of these children [6–13]. The development of measurements to assess autism spectrum disorders has increased in the last two decades. Unfortunately, using these tools for clinical and research purposes has become expensive and complicated [14]. Many of these instruments are very complex and targeted to specialized professionals with experience in autism, so their use in the primary care setting in Mexico is not feasible. Furthermore, some of them require training and take considerable time to administer and score [15, 16]. The education level, skills, and attributes of parents, in addition to health and educational services, have a crucial role in the recognition and diagnosis of ASD. First concerns are noticed by parents at 12–24 months of age [17, 18]. These initial observations of atypical development are followed by two types of delay before action is taken. Parental delays in seeking attention are approximately 4 months, and the medical delay in assigning a diagnosis (from first reported concern to medical diagnosis) is 30.1 months [19]. In Latin-American countries most parents seek initial attention through public health services [20], but only 38% receive their diagnosis through this means [19]. Furthermore, recent studies show that many children are identified in school [21], so screening instruments need to be oriented to parents, teachers, and primary medical health providers.

The Modified CHAT (M-CHAT) (Robins et al., 2001) [5] is a simple questionnaire for parents that can be completed in 10 minutes. According to the authors, this instrument improves discrimination between autism and other developmental problems. The reported sensitivity and specificity of the M-CHAT were .87 and .99, respectively, with a positive predictive power of .80 and a negative predictive power of .99 [5]. The sensitivity and specificity of the M-CHAT were determined using 2 criteria: (1) failing 2 or more of the 6 critical items and (2) failing any 3 or more of the 23 total items. According to the authors [5], the sensitivity and specificity were .97 and .95 for criterion 1 and .95 and .99 for criterion 2, respectively. Internal consistency was adequate for the full list of symptoms (α = .85) and for the 6 critical items (α = .83).
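As a minimal sketch of how the four screening indices above are derived, the following Python fragment computes them from a 2 × 2 table of screen result against reference diagnosis. The counts are hypothetical and chosen only for illustration; they are not taken from the original validation study [5] or from the present sample.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Screen result cross-tabulated against the reference diagnosis."""
    tp: int  # screen positive, diagnosis positive
    fp: int  # screen positive, diagnosis negative
    fn: int  # screen negative, diagnosis positive
    tn: int  # screen negative, diagnosis negative


def screening_metrics(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, PPV and NPV as proportions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# Hypothetical counts, for illustration only.
example = ConfusionCounts(tp=40, fp=10, fn=5, tn=945)
metrics = screening_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Positive and negative predictive power depend on the proportion of affected children in the sample, so values obtained in a case-control design are not directly comparable with those from a population screening study.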

Mexico needs reliable and valid screening instruments for autism for use in the primary level of health care and education services. The purpose of this study was to investigate the psychometric properties of the Mexican (MM-CHAT-version) in a sample of referred young children with presumptive diagnosis of ASD and a sample from the general population.

2. Material and Methods

2.1. Participants

Children from 2 different settings participated in the study.
(a) Clinical sample. Cases with a presumptive diagnosis of ASD (Autism, Asperger’s Disorder, PDD-NOS) were referred (n = 117) by the attending child psychiatrist of the PDD and ADHD outpatient clinic.
(1) An expert child psychiatrist conducted a semistructured clinical interview based on DSM-IV criteria for assigning the ASD diagnosis (Autism, Asperger’s Disorder, PDD-NOS), ADHD (subtypes inattentive, hyperactive-impulsive, and combined), and the most common comorbidities (e.g., tics, Tourette disorder, generalized anxiety, phobia, oppositional disorder and conduct disorder, dysthymia).
(2) A senior board certified child psychiatrist with 20 years of experience administered the ADI-R.
(b) Children with typical development (n = 339) were recruited from nurseries located in four different districts of the city. Parents and teachers agreed to participate in the study after receiving a detailed description of the project.
All parents from both samples completed the MM-CHAT and CBCL/1.5–5.

The inclusion criteria were children of both sexes between 18 and 72 months of age with a presumptive diagnosis of Autism Spectrum Disorder (Autism, Asperger’s Disorder, PDD-NOS) and children from the general community in the same age range as the clinical group. Children were excluded if they had known comorbid conditions with the potential to bias the MM-CHAT scores: severe chronic diseases such as asthma, diabetes, or cancer; sensory impairments such as deafness or blindness; or a genetic syndrome associated with autism such as tuberous sclerosis, Rett syndrome, or fragile X.

2.2. Measures
2.2.1. M-CHAT (Robins et al., 2001) [5]

The M-CHAT is a brief, simple instrument which takes about 15 minutes to complete. It consists of 23 items. The Mexican version (MM-CHAT) was developed by translating each item into Spanish and then adding minor cultural adjustments, such as describing the “peek-a-boo” game, since Mexican mothers do not have a specific name for it. For the purpose of this validation we used the following scores:
(1) The sum total of the items failed: MM-CHAT-T.
(2) The total sum of the 6 critical items proposed in the literature: MM-CHAT-6ci. The cutoff criteria suggested by the authors were also applied:
(a) Two or more critical items (2/6) failed: MM-CHAT-2/6.
(b) Any three or more items (3/23) failed: MM-CHAT-3/23.
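The scores and cutoffs defined above can be sketched as a small scoring function. This is an illustrative sketch, not the authors' scoring procedure: it assumes each item has already been coded as passed or failed (the pass/fail keying of individual items is not described here), and it takes the six critical items (2, 7, 9, 13, 14, 15) from Section 2.4.2. The function name `score_mmchat` is hypothetical.

```python
# Critical items as listed in Section 2.4.2 (from the original study [5]).
CRITICAL_ITEMS = frozenset({2, 7, 9, 13, 14, 15})
ALL_ITEMS = frozenset(range(1, 24))  # items 1..23


def score_mmchat(failed_items):
    """Compute the four MM-CHAT scores from the set of failed item numbers.

    Assumes the caller has already decided which items count as failed.
    """
    failed = set(failed_items)
    unknown = failed - ALL_ITEMS
    if unknown:
        raise ValueError(f"Item numbers out of range 1-23: {sorted(unknown)}")

    total_failed = len(failed)
    critical_failed = len(failed & CRITICAL_ITEMS)
    return {
        "MM-CHAT-T": total_failed,             # total failed items
        "MM-CHAT-6ci": critical_failed,        # failed critical items
        "MM-CHAT-2/6": critical_failed >= 2,   # screen positive by 2/6 rule
        "MM-CHAT-3/23": total_failed >= 3,     # screen positive by 3/23 rule
    }


# A child failing items 2, 14 and 20 fails two critical items and three overall.
result = score_mmchat({2, 14, 20})
print(result)
```

Keeping the two cutoffs as separate boolean flags makes it straightforward to compare each one against the reference diagnosis independently.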

2.2.2. Child Behavior Checklist, CBCL/1.5–5 (Achenbach and Rescorla, 2000) [22]

The CBCL/1.5–5 contains PDD, withdrawn, and ADHD subscales. It consists of 100 emotional and behavioral problem items that are common in preschoolers. The results are grouped into the following syndromes: emotional reactivity, depression, anxiety, somatic complaints, attention problems, aggressive behavior, and sleep problems. In addition, the items are organized into three general scales of problems: total, externalized, and internalized. It also contains DSM-oriented scales that assess the following problems: mood, anxiety, pervasive developmental, attention-deficit hyperactivity, and oppositional defiant behavior. The internal consistency of the scale is very high at .95 and test-retest reliability is .90.

The items are arranged in a Likert-type scale. Possible responses range from 0 = not applicable or never, 1 = sometimes, 2 = almost always.

In 2008 the scale was adapted and validated in Mexico [23]. For this study, we used the PDD, withdrawn, and ADHD subscales of the CBCL/1.5–5.

2.2.3. Autism Diagnostic Interview-Revised (ADI-R) (Lord et al., 1994) [24]

The ADI-R is a semistructured interview that should be administered by a clinician with experience evaluating children with autism. It is the gold standard for autism diagnosis of children and adults with mental ages older than 18 months [24]. The interview is organized according to the DSM-IV criteria. It contains 93 questions that explore the child’s developmental history and investigate problems associated with autism. The ADI-R algorithm generates scores for the three main domains of autistic symptomatology: (A) qualitative problems of reciprocal social behavior, (B) delayed language development, and (C) stereotyped behaviors and restrictive interests. It has an interrater reliability of .83 to .94.

The autism diagnosis in the clinical group was confirmed through both the semistructured interview based on DSM-IV criteria and the ADI-R. Clinicians who conducted the interviews were blind to the questionnaire results. Inconsistencies between the two criteria were resolved by consensus.

2.3. Ethical Issues

The study received approval from the hospital research committee. Written informed consent to participate in the study was obtained from each child’s caregiver.

2.4. Statistical Analysis
2.4.1. Demographics Variables Analysis

The demographic and clinical characteristics were expressed as means, standard deviations, and proportions. Student's t-test was used to compare continuous variables such as children's and parents' ages and socioeconomic status (SES).

2.4.2. Reliability and Internal Consistency

Internal consistency was evaluated using the Kuder-Richardson coefficient for the total items (23) of MM-CHAT and the 6 critical items (2, 7, 9, 13, 14, 15) identified by the original validity study [5].
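For readers unfamiliar with the coefficient, the following is a minimal sketch of the Kuder-Richardson formula 20 (KR-20) for dichotomous items, KR-20 = k/(k − 1) × (1 − Σ p_j q_j / σ²_total). It assumes items are coded 1 = failed and 0 = passed and uses the population variance of the total scores; software packages may use the sample variance instead, which gives slightly different values. The toy data are hypothetical and unrelated to the study sample.

```python
def kr20(responses):
    """KR-20 for a list of respondents, each a list of 0/1 item scores."""
    n = len(responses)
    k = len(responses[0])
    if n < 2 or k < 2:
        raise ValueError("Need at least 2 respondents and 2 items")

    # Variance of the total scores (population form, divisor n).
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("Total scores have zero variance")

    # Sum over items of p * q, where p is the proportion scoring 1.
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        sum_pq += p * (1 - p)

    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical toy data: 5 respondents, 4 dichotomous items.
toy = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
print(round(kr20(toy), 3))
```

KR-20 is the special case of Cronbach's alpha for dichotomous items, which is why it is the natural choice for a pass/fail checklist such as the MM-CHAT.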

2.4.3. Convergent Validity

Convergent validity was analyzed by calculating the Spearman correlations between the scores of the CBCL/1.5–5 PDD, withdrawn, and ADHD subscales and the total score of the MM-CHAT failed items.
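As an illustration of the statistic, a Spearman correlation can be computed as the Pearson correlation of the two rank vectors, with tied values given their average rank. The sketch below uses only the standard library and does not compute a P value; the example scores are hypothetical and the helper names are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("Correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y):
        raise ValueError("Inputs must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical screening totals and subscale scores for four children.
rho = spearman_rho([1, 2, 3, 4], [1, 3, 2, 4])
print(round(rho, 2))
```

A rank-based coefficient is a reasonable choice here because counts of failed items are discrete and typically skewed.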

2.4.4. Discriminant Validity

To investigate the discriminant validity we used t-tests and chi-square tests to analyse mean and percentage differences between the TD and ASD groups in the MM-CHAT-T score and in the failure rate of each item.
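The item-level comparison can be sketched as a Pearson chi-square test on a 2 × 2 table (group by pass/fail). The sketch below applies no continuity correction, which is an assumption since the correction used in the analysis is not stated, and it obtains the P value for 1 degree of freedom from the complementary error function. The counts are hypothetical.

```python
import math


def chi_square_2x2(fail_a, pass_a, fail_b, pass_b):
    """Pearson chi-square (df = 1, no continuity correction) for a 2x2 table.

    Rows are the two groups, columns are failed / passed.
    Returns (chi2, p_value).
    """
    n = fail_a + pass_a + fail_b + pass_b
    row_a, row_b = fail_a + pass_a, fail_b + pass_b
    col_fail, col_pass = fail_a + fail_b, pass_a + pass_b
    if 0 in (row_a, row_b, col_fail, col_pass):
        raise ValueError("Every row and column total must be nonzero")

    chi2 = n * (fail_a * pass_b - pass_a * fail_b) ** 2 / (
        row_a * row_b * col_fail * col_pass
    )
    # For df = 1, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 20 of 100 children fail an item in one group,
# 10 of 200 in the other.
chi2, p_value = chi_square_2x2(fail_a=20, pass_a=80, fail_b=10, pass_b=190)
print(f"chi2 = {chi2:.2f}, p = {p_value:.5f}")
```

When expected cell counts are small, as can happen for rarely failed items, Fisher's exact test is usually preferred over the chi-square approximation.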

2.4.5. Criterion Validity

The kappa coefficient (κ) was used to analyze the concordance between the categorical diagnosis of autism obtained by the ADI-R and the MM-CHAT using two cutoffs: (1) two or more critical items (2/6) failed (MM-CHAT-2/6) and (2) any three or more of the 23 items (3/23) failed (MM-CHAT-3/23).
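As a brief sketch of the concordance statistic, Cohen's kappa compares the observed agreement between two dichotomous classifications with the agreement expected by chance, κ = (p_o − p_e)/(1 − p_e). The counts below are hypothetical and do not come from the study; the function name is illustrative.

```python
def cohen_kappa_2x2(both_pos, screen_only, reference_only, both_neg):
    """Cohen's kappa for two dichotomous classifications.

    both_pos:       positive on the screen and on the reference instrument
    screen_only:    positive on the screen only
    reference_only: positive on the reference instrument only
    both_neg:       negative on both
    """
    n = both_pos + screen_only + reference_only + both_neg
    if n == 0:
        raise ValueError("Table is empty")

    observed = (both_pos + both_neg) / n

    # Chance agreement from the marginal totals of each classification.
    screen_pos = both_pos + screen_only
    reference_pos = both_pos + reference_only
    expected = (
        screen_pos * reference_pos
        + (n - screen_pos) * (n - reference_pos)
    ) / (n * n)

    if expected == 1:
        raise ValueError("Kappa undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical counts, for illustration only.
kappa = cohen_kappa_2x2(both_pos=40, screen_only=10, reference_only=5, both_neg=45)
print(round(kappa, 2))
```

Kappa is sensitive to the marginal proportions, so values from a clinical sample in which most children are positive on the reference instrument should be interpreted with that in mind.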

3. Results

3.1. Demographic and Clinical Sample

Study participants were 456 children (74% male) with an age range of 1–7 years and a mean age of 4.46 years (SD = 1.12). The sample was divided into two groups: (1) ASD (n = 117) and (2) typical development (TD) (n = 339). The groups were very similar for maternal age (ASD: M = 32.12, SD = 6.80, compared to TD: M = 31.46, SD = 7.10), paternal age (ASD: M = 36.51, SD = 7.83, compared to TD: M = 36.06, SD = 7.71), and socioeconomic status (ASD: M = 5.99, SD = 2.92, compared to TD: M = 7.39, SD = 6.64). However, the proportion of males was higher in the ASD group compared to the TD group (76.1% versus 51.91%). This difference was significant, as shown in Table 1.

3.2. Internal Consistency

Internal consistency of the MM-CHAT in the total sample was KR = .76 for the 23 items and KR = .70 for the 6 critical items (MM-CHAT-6ci).

3.3. Convergent Validity

In the ASD group convergent validity was assessed using the Spearman correlation coefficient (rho) between the MM-CHAT-T, the MM-CHAT-6ci, the CBCL/1.5–5 PDD, withdrawn, and ADHD subscales, and the ADI-R dimensions (A, B, and C). As shown in Table 2, the correlations varied. Dimension B (nonverbal) of the ADI-R had the highest correlation with the MM-CHAT-T (rho = .636, P < .01) and the CBCL/1.5–5 withdrawn subscale (rho = .66, P < .01).

The MM-CHAT-6ci showed the highest correlation with the ADI-R A domain (rho = .66, P < .01) and the CBCL/1.5–5 withdrawn subscale, as shown in Table 2. In the TD group, Spearman correlations between the MM-CHAT and the CBCL/1.5–5 PDD and withdrawn subscales were very low and nonsignificant (rho = .105, P < .19 and rho = .073, P < .26, respectively; Table 2).

3.4. Discriminant Validity

The results are shown in Figure 1. The total score of the MM-CHAT was higher for the ASD group (M = 6.66, SD = 4.21) compared to the TD group (M = 3.27, SD = 2.19); this difference was statistically significant (P < .0001), see Table 3. By age group, the youngest group (1–3 years) had a higher mean MM-CHAT score (total failed items) for the ASD group (M = 5.44, SD = 3.77) against TD (M = 2.41, SD = 1.71), F = 20.904, t = 3.30, df = 78, P < .004. For the older group (4–6 years) the average MM-CHAT score (total failed items) was higher for ASD (M = 5.97, SD = 3.90) against TD (M = 3.44, SD = 2.25), with a statistically significant difference (F = 49.90, t = 7.23, df = 343, P < .0001). The average MM-CHAT-6ci was higher for ASD (M = 1.44, SD = 1.51) against TD (M = 0.66, SD = 0.89), with a statistically significant difference between groups (P < .0001).

The chi-square test identified significantly different failure rates between the ASD and TD groups (see Figure 2) for the following 17 items (in bold), given as the percentage failing in ASD versus TD: no. 2 (interest in other children) 35 versus 19.8, no. 6 (imperative pointing) 23.9 versus 11.8, no. 10 (eye contact) 30.8 versus 11.2, no. 11 (noise) 45.3 versus 21.3, no. 12 (responds to smile) 12.8 versus 2.1, no. 13 (imitation) 33.3 versus 17.7, no. 14 (response to name) 15.4 versus 2.4, no. 15 (shares object point) 22.2 versus 4.5, no. 16 (walk) 1.7 versus 0.3, no. 17 (gaze following) 45.3 versus 13, no. 18 (unusual finger movements) 35 versus 21.1, no. 20 (hearing concerns) 34.2 versus 10.1, no. 21 (understands what is said) 38.5 versus 5.6, no. 22 (stares at nothing) 43.6 versus 9.8.

3.5. Construct Validity

To analyze the construct validity of the MM-CHAT Mexican version, we calculated the kappa coefficient ( 𝜅 ) in the ASD group using the following criteria:

MM-CHAT-2/6 or greater (cutoff suggested in the original study; Robins et al., 2001) [5]. The criterion for detection of autism is failing two or more of the 6 critical items.
(a) ADI-R (gold standard) dichotomous scoring of dimensions A, B, and C.
(b) ADI-R (gold standard) categorical diagnosis of autism.
In the ASD group the kappa coefficient (κ) between the MM-CHAT and the ADI-R dimensions ranged from .17 to .61, as shown in Table 3.

4. Discussion

This study investigated the psychometric properties of the MM-CHAT Spanish version for Mexico in two different samples: clinical and TD from the general community. Most validation studies of the M-CHAT [25, 26], including the original [5], have used large samples of the general population to identify a very small number of children with autism (n = 4 and n = 7, resp.). In this study we used a case-control design which included a large clinical group of children who were seen in the outpatient PDD clinic before an autism diagnosis was assigned. Overall the MM-CHAT could discriminate between the TD and the ASD group. The instrument showed moderate internal consistency and convergent validity with the CBCL/1.5–5. However, some results deserve a more detailed analysis. ADI-R Spearman correlations with MM-CHAT scores varied (rho = .23–.66); this result is consistent with the notion that autism is a complex and heterogeneous disorder. Overall, MM-CHAT-T correlations were better than those of the MM-CHAT-6ci, particularly for the ADI-R BV and C dimensions (rho = .23, P < .07 and rho = .36, P < .01); this gain is accompanied by a small drop in the A dimension correlation from .66 to .61, with the same P < .01 value. Adding items that explore communication abnormalities such as echolalia and language loss and/or delay, which are salient aspects of autism, could raise the BV correlation and make it significant.

All kappas were significant using the MM-CHAT-2 criteria (best cutoff in this study) and the ADI-R cutoff domains except for dimension C (κ = .17, P = .09). The MM-CHAT-T had a higher concordance with the ADI-R nonverbal dimension (BNV) (κ = .61, P = .0001) in contrast to the B verbal (BV) dimension and the C dimension of stereotypical behavior (κ = .29, P = .004 and κ = .16, P = .01, respectively). This result supports the idea suggested by other researchers that the M-CHAT is better at detecting nonverbal children with low-functioning autism [27, 28].

The MM-CHAT showed discriminant validity between the ASD and the TD group, as shown by differences in the MM-CHAT-T means and in the percentage of failed items. The critical items identified in this study are not the same as the ones proposed in the original study.

The detection of the critical items has been inconsistent across studies [5, 26]. Factors such as the sample composition (clinical versus community), the age range, and the statistical procedures used to derive the items could explain these differences. Nevertheless, there is growing evidence from studies with combined samples of different racial composition which suggests that a cultural bias does exist for autism measures like the M-CHAT [29] and other standardized measures for autism [30].

As more international validation studies are published, it is becoming evident that the M-CHAT shows important differences in the items that parents endorse most frequently. The reason for this cultural bias is unclear, but it is possible that differences in parenting and social behavior styles influence this phenomenon. Many rating scales for autism are dichotomous because they were developed when autism was understood as a categorical disorder. Recent evidence suggests that autism traits are normally distributed in the general population [30–36] and that not only parents, but also individuals without autism in their families, express these traits. Based on this latter evidence, the suggestion made by some researchers [26, 37] to modify the M-CHAT into a quantitative measure with items reorganized as a Likert-type scale seems appropriate. In 2007, a supplementary telephone interview for parents of children who screened positive was developed [38]. The combined use of the M-CHAT screen and the telephone interview increases its positive predictive value without adversely affecting its sensitivity [38]. However, this incremental validity would raise the cost of the M-CHAT excessively for countries like Mexico. There are important reasons to screen and take action at the same time. The addition of an interview requires training, which can be very difficult to provide and maintain in busy settings.

As in other studies, we also observed that some parents do not understand all of the MM-CHAT questions [25]. This has been attributed to a low education level of the caregiver. There is incipient evidence that contradicts this idea [39]. Autism signs are bizarre, elusive and some of them transient, so even highly educated parents can miss these symptoms. Parents might believe it to be irrelevant if their children do not point to share pleasure. Furthermore, parents of children with autism often share some autistic traits and there is evidence of assortative mating for these traits among spouses [40, 41]. It is unknown if having these traits can weaken the parents’ abilities to detect them in their children. In some Asian and Latin-American cultures like Mexico, making eye contact is considered inappropriate and a sign of disrespect. Some mothers discourage children from pointing with the index finger because it is considered to be rude [42, 43].

These cultural issues could explain some of the inconsistency in item responses independent of the instrument [44]. For these reasons, developmental assessment is one of the most challenging tasks for service providers and parents. The addition of pictograms employed by Inada et al., 2011 [26] is a proposal that deserves more investigation. To date, no prevalence studies on autism have been done in Mexico. In this study, 5% of parents in the TD group met the MM-CHAT-2 criteria. This figure supports the urgent need to develop and/or validate gold standard instruments to confirm an autism diagnosis.

5. Conclusions

This study has some limitations. Despite having a large sample of undiagnosed ASD children from the psychiatric outpatient PDD clinic, we were unable to include IQ test results and analyze the effect of this variable on the MM-CHAT cutoff performance. The majority of young children were unmedicated, but others with challenging behaviors were receiving medication for hyperactivity and irritability, which could have biased some of the parents’ responses. Overall, in this study we demonstrated that the MM-CHAT can discriminate between ASD and TD children. The instrument has good psychometric properties and can be used for screening purposes in primary settings or busy specialized psychiatric clinics. However, these results support evidence for cultural differences in item responses, making it difficult to compare M-CHAT results internationally.