Research Article

Psychometric Limitations of the Center for Epidemiologic Studies-Depression Scale for Assessing Depressive Symptoms among Adults with HIV/AIDS: A Rasch Analysis

Table 1

Overview of the analytic process using a Rasch model approach.

Columns: Step; Psychometric property; Statistical approach and criteria; Results (original 20-item CES-D; reduced 15-item CES-D, which omits items with poor fit; Zhang et al. 10-item CES-D [41])

Step 1. Rating scale functioning: does the rating scale function consistently across items? (substantive validity)
Statistical approach and criteria:
(i) Average measures for each step category and threshold on each item should advance monotonically
(ii) z-values < 2.0 in outfit mean square (MnSq) values for step category calibrations [42]
Results, original 20-item CES-D: rating scale met criteria
Results, reduced 15-item CES-D: rating scale met criteria
Results, Zhang et al. 10-item CES-D: rating scale met criteria

Step 2. Internal scale validity: how well do the actual item responses match the expected responses from the Rasch model? (content validity)
Statistical approach and criteria: item goodness-of-fit statistics; MnSq values < 1.3 [43]
Results, original 20-item CES-D: 5 items failed to meet the criterion:
(i) Item 4: MnSq = 1.58
(ii) Item 8: MnSq = 1.47
(iii) Item 11: MnSq = 1.33
(iv) Item 2: MnSq = 1.36
(v) Item 16: MnSq = 1.36
Results, reduced 15-item CES-D: all items met the criterion
Results, Zhang et al. 10-item CES-D: one item failed to meet the criterion (item 8: MnSq = 1.53)

Step 3. Internal scale validity: is the scale unidimensional (i.e., does it measure a single construct)? (structural validity)
Statistical approach and criteria: principal component analysis
(i) ≥50% of total variance explained by first component (depressive symptoms) [44]
(ii) Any additional component explains < 5% of the remaining variance after removing first component [44]
Results, original 20-item CES-D:
(i) First component explained 32.5% of total variance
(ii) Second component explained 9.4% of total variance
Results, reduced 15-item CES-D:
(i) First component explained 37.9% of total variance
(ii) Second component explained 7.4% of total variance
Results, Zhang et al. 10-item CES-D:
(i) First component explained 34.2% of total variance
(ii) Second component explained 12.0% of total variance

Step 4. Person-response validity: how well do the individual responses match expected responses from the Rasch model? (substantive validity)
Statistical approach and criteria: person goodness-of-fit statistics
(i) MnSq values < 1.5 and z-value ≤ 2.0
(ii) ≤5% of sample fails to demonstrate acceptable goodness-of-fit values [45]
Results, original 20-item CES-D: 52 respondents (15.0% of sample) failed to demonstrate acceptable goodness-of-fit values
Results, reduced 15-item CES-D: 36 respondents (10.3% of sample) failed to demonstrate acceptable goodness-of-fit values
Results, Zhang et al. 10-item CES-D: 38 respondents (11.0% of sample) failed to demonstrate acceptable goodness-of-fit values

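Step 5. Person-separation reliability: can the scale distinguish ≥3 distinct groups of depression in the sample tested? (reliability)
Statistical approach and criteria: person-separation index ≥2.0 [46]
Results, original 20-item CES-D: 2.04
Results, reduced 15-item CES-D: 1.90
Results, Zhang et al. 10-item CES-D: 1.42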

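Step 6. Internal consistency: are item responses consistent with each other? (reliability)
Statistical approach and criteria: Cronbach's alpha coefficient >0.80 [46]
Results, original 20-item CES-D: 0.88
Results, reduced 15-item CES-D: 0.88
Results, Zhang et al. 10-item CES-D: 0.78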

Step 7. Differential item functioning (DIF): are item difficulty calibrations stable in relation to key demographic variables? (generalizability validity)
Statistical approach and criteria: Mantel-Haenszel statistic
(i) with Bonferroni correction [47]
(ii) 1 item with DIF out of 20 may occur by chance and is deemed acceptable
Results, original 20-item CES-D: items with DIF
(i) Gender: 15 & 17
(ii) Race: 20, 19, 18, 16 & 6
(iii) Antidepressant use: 20
(iv) AIDS diagnosis: 8
Results, reduced 15-item CES-D: items with DIF
(i) Gender: 14 & 17
(ii) Race: 20, 19, 18 & 6
Results, Zhang et al. 10-item CES-D: not evaluated

Step 8. Differential test functioning (DTF): how consistent are the scores for the original CES-D and reduced-item scales?
Statistical approach and criteria:
(i) ≤5% of z-scores of the differences between the two test scores exceed ±1.96
(ii) Pearson correlation and
Results, original 20-item CES-D: not applicable
Results, reduced 15-item CES-D:
(i) 6 measures (1.7%) had z-scores exceeding ±1.96
(ii) ,
Results, Zhang et al. 10-item CES-D:
(i) 1 measure had a z-score exceeding ±1.96
(ii) = 0.942,

Note. After initial evaluation of the original 20-item CES-D, a stepwise process was used whereby items failing to meet criteria were removed one at a time, and only those meeting criteria in earlier steps advanced to subsequent steps. If more than one item failed to meet a criterion, the item with the worst fit was removed and the step was repeated with the remaining items. The reduced 15-item version omits misfitting items 2 (appetite), 4 (as good as others), 8 (hopeful), 11 (restless sleep), and 16 (enjoyed life).
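The stepwise removal described in this note can be sketched roughly as follows. This is a minimal illustration, not the authors' software: `stepwise_remove`, `toy_fit`, the item labels, and the toy MnSq values are all hypothetical, and in practice the fit function would refit the Rasch model on the remaining items at each iteration. The 1.3 cutoff is the item MnSq criterion from step 2 of Table 1.

```python
from typing import Callable, Dict, List, Tuple


def stepwise_remove(
    items: List[str],
    fit_fn: Callable[[List[str]], Dict[str, float]],
    threshold: float = 1.3,
) -> Tuple[List[str], List[Tuple[str, float]]]:
    """Drop the worst-fitting item one at a time until every item fits.

    fit_fn is a hypothetical callable that refits the model on the
    remaining items and returns an MnSq value for each of them.
    """
    remaining = list(items)
    removed: List[Tuple[str, float]] = []
    while remaining:
        mnsq = fit_fn(remaining)  # refit on the items still in the scale
        # The criterion is MnSq < threshold, so values at or above it misfit.
        misfits = {item: v for item, v in mnsq.items() if v >= threshold}
        if not misfits:
            break
        worst = max(misfits, key=misfits.get)
        # Record the MnSq from the iteration just before removal.
        removed.append((worst, misfits[worst]))
        remaining.remove(worst)
    return remaining, removed


def toy_fit(remaining: List[str]) -> Dict[str, float]:
    """Illustrative stand-in for a Rasch refit (values are made up)."""
    base = {"a": 1.60, "b": 1.10, "c": 1.25, "d": 0.95}
    values = {item: base[item] for item in remaining}
    if "a" not in remaining and "c" in remaining:
        values["c"] = 1.40  # misfit that only emerges after "a" is removed
    return values


kept, removed = stepwise_remove(["a", "b", "c", "d"], toy_fit)
print("kept:", kept)
print("removed (item, MnSq before removal):", removed)
```

Removing only one item per iteration matters because fit statistics are recomputed on the shortened scale, so an item that fits in the first pass can misfit in a later one.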
The five misfitting items did not all demonstrate misfit in the first iteration; some emerged in subsequent iterations. Items are listed in the order of removal, and the MnSq values shown reflect the iteration prior to each item's removal.
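The link in step 5 of Table 1 between a person-separation index of at least 2.0 and at least 3 distinguishable groups can be illustrated with the commonly used strata conversion, H = (4G + 1) / 3, where G is the separation index. The table does not state which conversion the authors relied on, so treating this formula as the basis for the criterion is an assumption; the function name `strata_from_separation` is likewise hypothetical. The separation values are those reported in the table.

```python
def strata_from_separation(g: float) -> float:
    """Convert a separation index G to statistically distinct strata.

    Uses the common conversion H = (4G + 1) / 3 (an assumption here;
    the table does not name the formula).
    """
    return (4.0 * g + 1.0) / 3.0


# Person-separation indices reported in Table 1, step 5.
reported = {
    "Original 20-item CES-D": 2.04,
    "Reduced 15-item CES-D": 1.90,
    "Zhang et al. 10-item CES-D": 1.42,
}

for scale, g in reported.items():
    h = strata_from_separation(g)
    verdict = "meets" if g >= 2.0 else "does not meet"
    print(f"{scale}: G = {g:.2f}, strata = {h:.2f}, {verdict} the >= 2.0 criterion")
```

Under this conversion, G = 2.0 corresponds to exactly 3 strata, so only the original 20-item scale (G = 2.04) clears the threshold for distinguishing three groups.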