Review Article

Misuse of Statistical Methods in 10 Leading Chinese Medical Journals in 1998 and 2008

Table 1

Errors/defects in statistical methods.

| Types of errors | Incorrect use in 1998, n (%) | Incorrect use in 2008, n (%) | χ² | P value | OR* (1998/2008) | 95% CI** |
|---|---|---|---|---|---|---|
| t-test | 305 (62.0%) | 253 (44.4%) | 32.83 | <0.001 | 2.04 | (1.60, 2.61) |
| (1) Using multiple t-test for multiple group comparison | 153 (31.1%) | 129 (22.6%) | 9.71 | 0.002 | 1.54 | (1.17, 2.03) |
| (2) Using paired t-test for unpaired data or vice versa | 40 (8.1%) | 34 (6.0%) | 1.91 | 0.167 | 1.40 | (0.87, 2.24) |
| (3) Using t-test under nonparametric setting | 89 (18.1%) | 60 (10.5%) | 12.52 | <0.001 | 1.88 | (1.32, 2.67) |
| (4) Using t-test without considering the baseline | 52 (10.6%) | 33 (5.8%) | 8.19 | 0.004 | 1.92 | (1.22, 3.03) |
| (5) Using t-test to conduct repeated-measure data | 73 (14.8%) | 60 (10.5%) | 4.48 | 0.034 | 1.48 | (1.03, 2.13) |
| Others | 28 (5.7%) | 8 (1.4%) | 14.82 | <0.001 | 4.24 | (1.91, 9.39) |
| Contingency tables | 154 (48.3%) | 169 (32.3%) | 21.35 | <0.001 | 1.96 | (1.47, 2.60) |
| (1) No continuity correction or Fisher exact test if needed | 52 (16.3%) | 53 (10.1%) | 6.90 | 0.009 | 1.73 | (1.15, 2.61) |
| (2) No significant level adjustment for multiple comparison | 82 (25.7%) | 74 (14.2%) | 17.53 | <0.001 | 2.10 | (1.48, 2.98) |
| (3) Misusing Chi-square test for paired fourfold table | 10 (3.1%) | 12 (2.3%) | 0.55 | 0.458 | 1.38 | (0.59, 3.23) |
| (4) Using Chi-square test for ranked data | 29 (9.1%) | 31 (5.9%) | 3.00 | 0.083 | 1.59 | (0.94, 2.69) |
| (5) Ignorance of stratification factors | 12 (3.8%) | 12 (2.3%) | 1.54 | 0.215 | 1.66 | (0.74, 3.75) |
| (6) Using P value of Chi-square test instead of contingency coefficient to describe the correlation of two variables | 8 (2.5%) | 4 (0.8%) | 3.13 | 0.077 | 3.34 | (0.98, 11.18) |
| Others | 21 (6.6%) | 19 (3.6%) | 3.81 | 0.051 | 1.87 | (0.99, 3.53) |
| ANOVA*** | 128 (63.4%) | 263 (59.0%) | 1.12 | 0.289 | 1.20 | (0.85, 1.70) |
| (1) Using one-factorial ANOVA to analyse data from multifactorial designs | 10 (5.0%) | 31 (7.0%) | 0.94 | 0.333 | 0.70 | (0.34, 1.45) |
| (2) Ignoring the setting of ANOVA for completely random design data | 25 (12.4%) | 53 (11.9%) | 0.03 | 0.858 | 1.05 | (0.63, 1.74) |
| (3) No multiple pair-wise comparison of ANOVA when needed | 25 (12.4%) | 28 (6.3%) | 6.89 | 0.009 | 2.11 | (1.20, 3.72) |
| (4) Misusing the method of multiple pair-wise comparison of ANOVA | 51 (25.3%) | 132 (29.6%) | 1.30 | 0.255 | 0.80 | (0.55, 1.17) |
| (5) Using ANOVA to analyse repeated-measures data | 45 (22.3%) | 63 (14.1%) | 6.65 | 0.010 | 1.74 | (1.14, 2.67) |
| Others | 16 (7.9%) | 10 (2.2%) | 11.64 | 0.001 | 3.75 | (1.67, 8.42) |
| Rank transformation nonparametric test | 29 (43.3%) | 33 (17.7%) | 17.57 | <0.001 | 3.56 | (1.93, 6.57) |
| (1) Using multiple pair-wise comparison for multiple group comparison | 14 (20.9%) | 20 (10.7%) | 4.43 | 0.035 | 2.21 | (1.04, 4.47) |
| (2) Using wrong type of rank sum test for different study types | 4 (6.0%) | 3 (1.6%) | 2.07 | 0.150 | 3.89 | (0.85, 17.88) |
| Others | 20 (29.9%) | 6 (3.2%) | 38.11 | <0.001 | 12.84 | (4.88, 33.77) |

*OR: odds ratio; **CI: confidence interval; ***ANOVA: analysis of variance.
Incorrect use, n (%): for each statistical method, n is the number of articles that used the method incorrectly, and the percentage = n / (number of papers using that method) × 100%; for each error listed under a statistical method, n is the number of articles containing that error, and the percentage = n / (number of papers using that method) × 100%.
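
To make the columns of Table 1 concrete, the following minimal sketch recomputes the percentage, χ², P value, odds ratio, and 95% CI for the overall t-test row. It rests on several assumptions that are not stated in the table: the denominators 492 (1998) and 570 (2008) are back-calculated from the reported percentages, the χ² is taken to be Pearson's statistic for a 2×2 table without continuity correction, and the interval is taken to be a Wald interval on the log odds ratio. The function name `compare_years` is hypothetical.

```python
import math
from statistics import NormalDist


def compare_years(n_1998, total_1998, n_2008, total_2008, level=0.95):
    """Compare misuse proportions between two years from a 2x2 table.

    Assumptions (not confirmed by the source): Pearson chi-square without
    continuity correction, and a Wald confidence interval for the odds ratio.
    """
    a, b = n_1998, total_1998 - n_1998   # 1998: incorrect, correct
    c, d = n_2008, total_2008 - n_2008   # 2008: incorrect, correct
    n = a + b + c + d

    # Percentage as defined in the table note: n / papers using the method.
    pct_1998 = 100.0 * a / total_1998
    pct_2008 = 100.0 * c / total_2008

    # Pearson chi-square for a fourfold table (1 degree of freedom).
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))

    # Odds ratio (1998 relative to 2008) with a Wald interval on the log scale.
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    ci = (
        math.exp(math.log(odds_ratio) - z * se_log_or),
        math.exp(math.log(odds_ratio) + z * se_log_or),
    )
    return {
        "pct_1998": pct_1998,
        "pct_2008": pct_2008,
        "chi2": chi2,
        "p_value": p_value,
        "odds_ratio": odds_ratio,
        "ci": ci,
    }


# Overall t-test row; totals 492 and 570 are inferred, not reported.
result = compare_years(305, 492, 253, 570)
for key, value in result.items():
    print(key, value)
```

Under these assumptions the output agrees with the reported row to two decimals (χ² about 32.83, OR about 2.04, CI about 1.60 to 2.61). That agreement is consistent with, but does not prove, these being the formulas behind the table.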