Abstract

Objectives. Psychometric properties of the Czech version of the Pittsburgh Sleep Quality Index (PSQI-CZ) have been evaluated only in patients with chronic insomnia, and thus, it is unclear whether the PSQI-CZ is suitable for use in other clinical and nonclinical populations. This study aimed to examine the validity and reliability of the PSQI-CZ and to assess whether unidimensional or multidimensional scoring of the instrument should be recommended. Methods. A total of 524 adult subjects from the Czech population participated in the study. The internal consistency of the PSQI was evaluated using Cronbach’s alpha. Known-group validity was tested using the Kruskal-Wallis test to verify the difference between patients with sleep disorders and a healthy control sample. For testing the structural validity, a cross-validation approach was used with both exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). For EFA, the maximum likelihood method with direct oblimin rotation and parallel analysis was used. Results. The internal consistency of PSQI-CZ items was moderate (). Receiver operating characteristic (ROC) curve analysis showed high specificity (0.79) and moderate sensitivity (0.64) using an optimal cut-off score of 10. The EFA revealed a 3-factor structure with factors labelled as “sleep duration and efficiency,” “sleep disturbances and quality,” and “sleep latency.” The CFA showed that the emerged 3-factor model had a partly acceptable fit, which was better than that of other previously supported models. Conclusions. A high cut-off score of 10 is recommended to define poor sleep quality. Given the inconsistency of structural analyses, alternative scoring is not recommended. However, the individual components in addition to the total score should be interpreted when assessing sleep quality. We recommend editing and verifying the PSQI-CZ translation.

1. Introduction

Disturbed sleep represents one of the most frequent health issues. It has been shown that more than half of the adult population of economically developed countries experiences unpleasant sleep disturbance [1]. The functioning of the sleep cycle can be verified by objective methods such as polysomnography or actigraphy. However, when assessing sleep, it is important to take into account the subjectively perceived quality of sleep as well as other variables such as comorbidities and environment. When assessed by objective methods, sleep quality can involve several different parameters, including sleep onset latency, sleep duration, sleep efficiency, and the number of awakenings [2]. Disruption, abnormality, or irregularity of some of these measures leads to a decrease in sleep quality. The prevalence of symptoms of difficulty in initiating or maintaining sleep ranges from 10% to 48% in the general population [3]. Poor sleep quality can contribute to absence from work, accidents at the workplace, and increased risk of negative health consequences such as sleep and neuropsychiatric disorders [4].

Although polysomnography is considered the gold standard for measuring sleep quality, the Pittsburgh Sleep Quality Index (PSQI) is the most commonly used subjective measure; it assesses important aspects of sleep quality and the presence of symptoms of frequent sleep disorders in both clinical and research settings (see Section 2.2). The PSQI has been translated into more than 46 languages. All language versions are managed by Mapi Research Trust and are available subject to compliance with the prescribed conditions of use (research, clinical practice). It is unknown whether the Czech version of the PSQI officially distributed by Mapi Research Trust is an appropriate translation of the instrument. As stated on the PSQI distributor’s website (http://eprovide.mapi-trust.org), the listed translations may not have undergone a full linguistic validation process and may require further clarification. Nevertheless, studies of different language versions have demonstrated good internal consistency (Cronbach’s alpha coefficient ranging from 0.71 to 0.85) and the appropriateness of using the PSQI in clinical and population studies [3, 5–10].

Validity and reliability of the PSQI have been verified by comparisons of healthy control groups with clinical populations of patients with psychiatric disorders [11, 12], sleep disorders [8, 9, 13], or somatic disorders [14, 15]. Although studies have shown good validity and reliability of the questionnaire across a broad spectrum of research groups, there is no uniform concept of its structural validity. A recent review pointed out that most structural validation studies had some shortcomings (e.g., an inappropriate sample, omission of the Kaiser-Meyer-Olkin test or Bartlett’s test of sphericity, and omission of one of the factor analysis approaches or of its relevant details). Insufficient or incorrectly chosen statistical methods may then create doubts about the factor structures described in individual research samples [16]. Three model proposals are currently the most common. The original single-factor model suggests that a single summed total score best captures the multidimensional nature of sleep disturbance as indexed by the PSQI [11, 12]. This model was confirmed by several studies [17, 18]. Other models question Buysse et al.’s combination of all seven PSQI components into one factor. Some suggest using 2-factor models (e.g., [5, 6, 14, 19, 20, 21]). One of the more frequently replicated models proceeds from a study by Magee et al. [19], who suggested the following factors: (1) sleep efficiency, based on the two components sleep duration and habitual sleep efficiency, and (2) perceived sleep quality, based on subjective sleep quality, sleep latency, sleep disturbance, use of sleep medications, and daytime dysfunction [19, 22]. Other studies adopt Magee et al.’s model but exclude the use of sleep medication component [20, 21]. Others recommend a 3-factor structure, which is based on Cole et al.’s study [12, 23, 24]. Cole et al. proposed three factors: (1) sleep efficiency (based on sleep duration and habitual sleep efficiency), (2) perceived sleep quality (based on subjective sleep quality, sleep latency, and use of sleep medications), and (3) daily disturbances (based on sleep disturbances and daytime dysfunction) [12]. Although no consensus has been reached, the original unidimensional scoring system and further validation were more recently recommended [1, 16].

Although the PSQI is widely used in research and clinical practice in the Czech Republic, psychometric characteristics of its Czech version (PSQI-CZ) have been evaluated only in patients with chronic insomnia [25]. Thus, the study was aimed at examining the known-group and construct validity and reliability (internal consistency) of the PSQI-CZ and at assessing whether the unidimensional or multidimensional scoring of the instrument would be recommended.

2. Materials and Methods

2.1. Study Sample

Data were collected at three clinical and research sites: Department of Neurology, First Faculty of Medicine, Charles University; Department of Sleep Medicine, National Institute of Mental Health; and private neurological clinic INSPAMED. Participants were recruited as part of 3 studies: a longitudinal study on aging and memory, the insomnia treatment programme at the National Institute of Mental Health (Czechia, NIMH-CZ), and a study directly focused on validation of the PSQI-CZ. The local institutional review boards approved the study (Ethics Committee of the General University Hospital Prague, No. 1774/15D; Ethics Committee of NIMH-CZ, No. 170/16). The study protocol was in line with international ethical standards [26]. All subjects were examined with the Czech version of the PSQI, which was distributed by Mapi Research Trust. Basic sociodemographic information (age, sex, and diagnosis) was also obtained. Questionnaires were completed in paper-and-pencil form by members of the general population and people with sleep disorders between 2015 and 2018. In the patient group, the diagnostic categories were determined according to ICD-10. The native language of all participants was Czech. We had data available from a total of 583 respondents. We then excluded individuals under 18 and above 80 years old. An incompletely or incorrectly filled out questionnaire was the second exclusion criterion. Finally, we excluded patients with unspecific or combined diagnoses. In total, 59 subjects were excluded. We did not perform multiple imputation to address missing values. Of the remaining 524 adult participants included in the study, 326 were sleep laboratory patients (patients with sleep disorders, SDis); the remaining 198 subjects formed the control group (HC). The HC group consisted of volunteers from the Czech population who responded to the invitation to participate in the research and stated that they did not suffer from any sleep or psychiatric disorder; other somatic disorders were not monitored.

2.2. PSQI

The PSQI was developed by Buysse et al. in 1989. It measures the quantitative and subjective aspects of sleep quality. The PSQI consists of 19 self-rated items that form seven clinically derived domains of sleep difficulties in the past month: subjective sleep quality, sleep latency, sleep duration, habitual sleep efficiency, sleep disturbances, use of sleep medication, and daytime dysfunction. Each of these domains is weighted equally on a 0-3 scale. The seven component scores are summed to yield the global PSQI score, which ranges between 0 and 21 points. A total PSQI score above 5 denotes worse sleep quality [11], although some studies recommend a higher cut-off score of 6 [8, 9], 7 [18, 27], 8 [15], or even 8.5 [13], which would increase the PSQI’s specificity at the cost of a very small decrease in its sensitivity. The questionnaire also includes 5 additional questions that are rated by a bed partner or a roommate. These five questions are used for clinical information only [11].
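The scoring rule described in this section can be illustrated with a short Python sketch. It assumes that the seven component scores have already been derived from the 19 items (that derivation is not reproduced here); the function and key names are hypothetical. The default cut-off follows the traditional convention of a global score above 5, and reading a cut-off as "greater than" rather than "at least" is an assumption of this sketch.

```python
from typing import Mapping

# Keys mirror the seven PSQI domains; the identifiers themselves are illustrative.
COMPONENTS = (
    "subjective_sleep_quality",
    "sleep_latency",
    "sleep_duration",
    "habitual_sleep_efficiency",
    "sleep_disturbances",
    "use_of_sleep_medication",
    "daytime_dysfunction",
)


def global_psqi_score(components: Mapping[str, int]) -> int:
    """Sum the seven component scores (each 0-3) into the 0-21 global score."""
    total = 0
    for name in COMPONENTS:
        value = components[name]
        if value not in (0, 1, 2, 3):
            raise ValueError(f"{name} must be an integer from 0 to 3, got {value!r}")
        total += value
    return total


def is_poor_sleeper(global_score: int, cutoff: int = 5) -> bool:
    """Flag poor sleep quality when the global score exceeds the cut-off (assumed '>')."""
    return global_score > cutoff


# Hypothetical respondent scoring 1 on every component.
respondent = dict.fromkeys(COMPONENTS, 1)
score = global_psqi_score(respondent)
print(score, is_poor_sleeper(score), is_poor_sleeper(score, cutoff=10))
```

Passing a different `cutoff` makes it easy to compare how the same respondent is classified under the alternative thresholds discussed in the literature.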

2.3. Statistical Analysis

PSQI scores were not normally distributed in either the control or the patient sample (). Known-group validity was tested using the Kruskal-Wallis test to confirm the difference between the patient and control samples. The effect size was calculated using eta squared (η²) and evaluated using the following criteria: 0.01 to <0.06 (small effect), 0.06 to <0.14 (moderate effect), and ≥0.14 (large effect). The test characteristics and an optimal cut-off score were calculated and tested using the receiver operating characteristic (ROC) curve [28]; the optimal cut-off value was estimated using two methods: the point closest to the top-left corner of the curve and the maximum value of the Youden index [29].
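The two cut-off selection rules can be sketched in a few lines of Python. This is a minimal illustration and not the study's R code: the scores are invented toy data, the function names are hypothetical, and the sketch assumes that a subject is classified as a poor sleeper when the score is at least the cut-off.

```python
from math import hypot


def roc_points(patient_scores, control_scores):
    """Return (cutoff, sensitivity, specificity) for every observed score value.

    Assumed convention: a subject tests positive when score >= cutoff.
    """
    candidates = sorted(set(patient_scores) | set(control_scores))
    points = []
    for cutoff in candidates:
        sensitivity = sum(s >= cutoff for s in patient_scores) / len(patient_scores)
        specificity = sum(s < cutoff for s in control_scores) / len(control_scores)
        points.append((cutoff, sensitivity, specificity))
    return points


def cutoff_by_youden(points):
    """Cut-off maximising Youden's J = sensitivity + specificity - 1."""
    return max(points, key=lambda p: p[1] + p[2] - 1)[0]


def cutoff_closest_to_top_left(points):
    """Cut-off whose ROC point lies nearest to (false positive rate 0, sensitivity 1)."""
    return min(points, key=lambda p: hypot(1 - p[2], 1 - p[1]))[0]


# Invented global scores, for illustration only.
patients = [6, 8, 10, 11, 12, 13, 14, 15]
controls = [2, 3, 4, 5, 6, 7, 8, 9]
points = roc_points(patients, controls)
print(cutoff_by_youden(points), cutoff_closest_to_top_left(points))
```

On this toy data both rules select the same cut-off; on real data they can disagree, because the Youden index weights sensitivity and specificity linearly whereas the top-left criterion penalises large deficits in either one more heavily.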

The internal consistency of the PSQI was tested with Cronbach’s alpha [30]. A reliability statistic of at least 0.70 was considered acceptable, values between 0.60 and 0.70 were considered questionable, and values lower than 0.60 were considered inadequate for an internally consistent instrument [31, 32]. Independence of the scores from age and sex was tested using basic linear models.
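For readers unfamiliar with the statistic, Cronbach’s alpha for k components is k/(k - 1) multiplied by one minus the ratio of the summed component variances to the variance of the total score. The following Python sketch computes it with the standard library only; the function names are hypothetical, and the handling of the boundary values 0.60 and 0.70 is an assumption.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k component scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two components and two respondents")
    # Sample variance of each component across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed (total) score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def interpret_alpha(alpha):
    """Label alpha using the thresholds given in the text (boundary handling assumed)."""
    if alpha >= 0.70:
        return "acceptable"
    if alpha >= 0.60:
        return "questionable"
    return "inadequate"
```

Recomputing `cronbach_alpha` after removing one column at a time gives the "alpha if component dropped" values that are typically reported alongside the overall coefficient.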

For testing the factor structure, a cross-validation approach was used; i.e., the study sample was randomly divided into two adequately sized subsamples. The first subsample was used for factor identification using exploratory factor analysis (EFA). The Bartlett test of sphericity and the Kaiser-Meyer-Olkin test were used to verify the suitability of the data for the analysis. We used the following criteria for factor extraction: , loadings of at least 0.35 [33], and every retained factor had to have a larger eigenvalue in the real data than the corresponding factor in the random data. The maximum likelihood method with direct oblimin rotation was used for factor extraction, as we assumed correlation between components. The number of factors retained was estimated using parallel analysis, i.e., a data-driven approach comparing the observed eigenvalues of a correlation matrix with those from the random data [34].

The second subsample was then used to test the emerged model and to compare its goodness of fit with that of other published models using confirmatory factor analysis (CFA). Our proposed model was compared with previously published and supported models: the original 1-factor model [11], the 3-factor model first published by Cole et al. [12], and the 2-factor model first published by Magee et al. [19]. To assess model fit, multiple fit indices were used, with the following values considered good: comparative fit index (CFI) at ≥0.95 (or ≥0.90 for acceptable fit), Tucker-Lewis index (TLI) at ≥0.95 (or ≥0.90 for acceptable fit), standardized root mean square residual (SRMR) at ≤0.08, and root mean square error of approximation (RMSEA) at ≤0.05 (or ≤0.08 for adequate fit) along with 90% confidence intervals (90% CI). Statistically nonsignificant and lower chi-squared values (χ²) were also considered to identify better models [35]. To determine the model that best fits our data, all models were compared with each other using the Bayesian information criterion (BIC), chi-squared difference tests (Δχ²), and RMSEA CI overlap. As in Cole et al. [12], a model was considered better fitted if at least two of the three criteria for significant differences were met; i.e., it had a lower BIC (by at least 10 points), lower nonoverlapping RMSEA CIs, and a significantly different χ², where a model with a lower χ² was better.
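The decision rules in this paragraph can be written down compactly. The Python sketch below is only an illustration of the thresholds and of the two-of-three comparison rule; the class, the function names, and all numeric values are hypothetical, and the outcome of the chi-squared difference test is passed in as a precomputed flag rather than calculated.

```python
from dataclasses import dataclass


@dataclass
class ModelFit:
    name: str
    bic: float
    rmsea_ci_low: float   # lower bound of the 90% CI of RMSEA
    rmsea_ci_high: float  # upper bound of the 90% CI of RMSEA


def label_index(index: str, value: float) -> str:
    """Label a single fit index using the thresholds stated in the text."""
    if index in ("CFI", "TLI"):
        return "good" if value >= 0.95 else "acceptable" if value >= 0.90 else "insufficient"
    if index == "SRMR":
        return "good" if value <= 0.08 else "insufficient"
    if index == "RMSEA":
        return "good" if value <= 0.05 else "acceptable" if value <= 0.08 else "insufficient"
    raise ValueError(f"unknown fit index: {index}")


def fits_better(candidate: ModelFit, rival: ModelFit,
                chi2_diff_significant_in_favour: bool) -> bool:
    """Two-of-three rule: the candidate wins if at least two criteria are met."""
    criteria = [
        rival.bic - candidate.bic >= 10,               # BIC lower by at least 10 points
        candidate.rmsea_ci_high < rival.rmsea_ci_low,  # lower, nonoverlapping RMSEA CI
        chi2_diff_significant_in_favour,               # significant difference, lower chi-squared
    ]
    return sum(criteria) >= 2


# Invented values, for illustration only.
model_a = ModelFit("A", bic=1000.0, rmsea_ci_low=0.04, rmsea_ci_high=0.07)
model_b = ModelFit("B", bic=1015.0, rmsea_ci_low=0.08, rmsea_ci_high=0.12)
print(fits_better(model_a, model_b, chi2_diff_significant_in_favour=False))
```

Requiring two of three criteria guards against declaring a winner on the strength of a single index, which matters when competing models differ only slightly.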

The whole analysis was performed in R language version 3.5.1 [36] and jamovi version 1.1 [37]; the following packages in R were used: Tidyverse group of packages [38], psych [39], cutpointr [40], and pROC package [41].

3. Results

3.1. Descriptive Statistics of the Studied Sample

The details of the subscale scores and total PSQI scores in our subsamples and the whole sample are displayed in Table 1. The sleep disorder group included 196 women and 130 men (woman-to-man ratio 1.51, significant difference observed, Kruskal-Wallis , , ), whereas the control group included 128 women and 70 men (woman-to-man ratio 1.83, nonsignificant difference observed, Kruskal-Wallis , , ). The primary condition of most patients was insomnia (), followed by obstructive sleep apnea (), somnambulism (), hypersomnia (), narcolepsy and cataplexy (), nightmare disorder (), REM sleep behaviour disorder (), restless legs syndrome (), sleep terrors (), and circadian rhythm sleep disorder ().

3.2. Reliability: Internal Consistency

We assessed the reliability of the PSQI-CZ by estimating its internal consistency with Cronbach’s alpha coefficient. The overall internal consistency of PSQI-CZ items was adequate (). Dropping any of the components did not result in a higher internal consistency (Table 2). The internal consistency of the PSQI was higher among patients () than controls ().

All PSQI components were positively correlated with the PSQI total score. The largest component-to-total-score correlations were observed for sleep duration (, ) and subjective sleep quality (, ), and the lowest for habitual sleep efficiency (, ) and sleep latency (, ). The largest component-to-component correlation was observed between subjective sleep quality and sleep disturbance (, ), and the lowest between habitual sleep efficiency and sleep disturbance (, ).

3.3. Validity

We tested known-group validity on a sample of healthy controls (HC) and patients with a diagnosed sleep disorder (SDis). The patient group had a higher global PSQI-CZ score () than the control group (); the difference was significant and relevant (Kruskal-Wallis , , ), with a mean difference of 4 points.

ROC analysis showed high specificity (0.79) and low sensitivity (0.635) at a cut-off score of 10, identified as the point closest to the top-left corner of the curve. When the cut-off was instead identified by the maximum Youden index, the optimal value was 12, with very high specificity (0.94) and very low sensitivity (0.50). The originally recommended cut-off score of 5 was highly unspecific (Table 3, Figure 1). The total area under the curve (AUC) was 0.80.

3.4. Exploratory Factor Analysis

We tested structural validity using a cross-validation approach with both exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). Prior to the analysis, PSQI components were tested for sphericity using Bartlett’s test (, ) and for sampling adequacy using the Kaiser-Meyer-Olkin test (). It was thus appropriate to proceed with EFA. Using EFA with data-driven parallel analysis, a 3-factor model was identified (Table 4), with sums of squared loadings (eigenvalues) of 1.38, 1.38, and 1.10. The first factor, explaining 19.85% of the variance, was termed sleep duration and efficiency, with the highest loading on sleep duration. The second factor was termed sleep disturbances and quality, with the highest loading on sleep disturbances, followed by the subjective sleep quality and daytime dysfunction components. The third factor was labelled sleep latency and was loaded by the sleep latency and sleep medication use components. All components were retained, as none of the loadings fell below the minimum critical value of 0.35. The whole model explained 55.11% of the variability. The correlations between factors were moderate (-0.54) [42].

3.5. Confirmatory Factor Analysis

To cross-validate our 3-factor solution, the second half of our sample was used for CFA. CFA was also performed on the original 1-factor model [11] and on the two established 3- and 2-factor models of Cole et al. [12] and Magee et al. [19], respectively. The goodness-of-fit indices for all selected models are shown in Table 5. The goodness-of-fit statistics for our proposed 3-factor solution and for Cole et al.’s model were acceptable for CFI and SRMR, while the other indices were insufficient. Both Buysse et al.’s and Magee et al.’s models were insufficient in all indices except the SRMR. When our model was compared to the other models, our 3-factor solution was significantly better fitted than all of them. Descriptively, we also found that both Cole et al.’s and Magee et al.’s models were better fitted than Buysse et al.’s original model and that Cole et al.’s model was not better fitted than Magee et al.’s model. Loadings in our CFA model were adequate, ranging from good to excellent (0.45 to 0.99). The correlations between factors were 0.46, 0.50, and 0.69 (medium effect) (Figure 2).

4. Discussion

To the best of the authors’ knowledge, this is the first study to examine the psychometric characteristics of the Czech version of the PSQI in various study samples (patients with sleep disorders and healthy volunteers). Our results demonstrate that the global internal consistency of the PSQI-CZ is lower () than in the original study () [11]. Studies of patients with sleep disorders, who are comparable to part of our sample, report similar [18], lower [25], and higher values of internal consistency [8, 9, 13]. Our Cronbach’s alpha was thus adequate and comparable to that of other studies that recommend the use of the questionnaire in clinical practice and research. Similar levels of Cronbach’s alpha can be found in studies performed in psychiatric patients [7, 43], cancer patients [14], the general healthy population [3, 24], and adolescents [5, 44].

In contrast to other previously published studies [5, 6, 13, 18, 27], dropping any of the PSQI components did not result in a higher internal consistency in our research sample. Also in contrast to our findings, some studies exclude one or more PSQI components (e.g., daytime dysfunction, sleep medication use) as a result of factor analyses [5, 6, 20, 21, 27, 45–47]. Our findings, however, allowed for keeping all components, which was also shown in previous studies [11, 12, 19, 22–24, 48]. The differences in results may be attributed to diversity in sample characteristics. Our study included the general healthy population as well as patients with sleep disorders, which is in contrast to other studies validating the PSQI in specific populations such as centenarians [27], adolescents [5], pregnant women [6], and psychiatric patients [7, 43].

The PSQI factor structure is a controversial research topic, as the widely used original one-factor model may not be satisfactory in all populations. In the present study, we used a cross-validation approach, with EFA first, followed by a series of CFAs that included the most frequently published structures [11, 12, 19]. Our factor analyses did not yield entirely consistent results. Our exploratory factor analysis revealed the same 3-factor structure as in the Peruvian sample of college students in Gelaye et al.’s study [22]. Our structure was different from the original 1-factor structure [11] and other commonly proposed structures [12, 19]. The 3-factor model in Peru explained approximately 59% of the total variance [22], and ours a comparable 55% of the variability. A confirmatory factor analysis verified our emerged structure but showed only a partly acceptable fit for our model. We found a similarly acceptable fit for the model of Cole et al. [12]. However, when we compared Cole et al.’s model with our model, our model resulted in a significantly better fit. The present findings thus do not confirm the previously found support for Cole et al.’s structure in a Czech insomnia sample [25]. The discrepancy with other studies can be attributed to differences in the studied populations, diverse sample characteristics, and nonuniform methodologies (e.g., factor rotation and extraction methods, estimation method selection); it highlights the inconsistency of the structural validity of the PSQI across varied clinical and nonclinical populations [1, 27].

Together, our data point to limited usability of changing the factor structure or developing alternative scoring of the instrument. Based on the present findings, it is recommended that somnologists and other professionals should not solely rely on the overall PSQI score describing sleep quality. Instead, they ought to look at all components or at least at the components with consistently high loadings (i.e., sleep duration, subjective sleep quality, and sleep disturbances).

In line with other studies [8, 13], our results showed that the patient group had a significantly higher total PSQI-CZ score than general controls. The difference between these groups was confirmed by a large effect size. Our findings point to an unexpectedly high optimal cut-off score of 10, or of 12 when the maximum Youden index criterion is used. We recommend using a cut-off score of 10 based on its clinical relevance, i.e., the best ratio between sensitivity (0.64) and specificity (0.79), in comparison to the score of 12 based on the Youden index, which has high specificity (0.95) but mediocre sensitivity (0.50). The traditional cut-off score (>5) has previously been reported to be insufficient to distinguish between healthy and diseased subjects, and higher cut-off scores have been proposed [13, 15, 49]. However, to the authors’ knowledge, no other study has proposed such a high cut-off score. Gomes et al. reported that a cut-off of 5 was optimal for detecting self-reported poor versus good sleepers in nonclinical settings, whereas a cut-off of >7 was optimal for discriminating nonclinical subjects from clinical sleep patients [18, 27]. Given the high average total PSQI score in our HC group, it is possible that the group included individuals who had undiagnosed or untreated sleep disorders. The absence of a disease does not mean that a person sleeps well, and, conversely, a diagnosis does not mean that a patient perceives their sleep as poor [50]. Moreover, it can be assumed that people who entered the study as healthy controls may have a greater degree of self-observation and interest in health. A higher level of self-observation of various changes, differences, and symptoms can then be reflected in a higher PSQI score. High values of the overall PSQI score can also be explained, especially for young adults, by the influence of social factors such as demands during university studies [51], loneliness [52], interest in sports activities [53], or exposure to blue light when using electronic devices [54].

Our study had several limitations. Firstly, the results of the correlations suggest that there may be a translation discrepancy in question number one of the PSQI-CZ. Respondents might have confused going to bed (lying down) with falling asleep when answering the first question of the PSQI-CZ. It would be worthwhile to make a linguistic adjustment of the Czech version and verify whether it changes the psychometric outcomes of the PSQI. Secondly, as subjects in our control group were considered healthy based on their self-assessment, the potential inclusion of persons with undiagnosed sleep disorders in the control group is a further limitation of our study. Nevertheless, we consider the findings important for three reasons. First, our study is the first to map the statistical properties of the Czech version of the PSQI in a relatively large research sample that included both healthy controls and patients with sleep disorders. Second, the higher cut-off found for this translation is important information for clinical practice. Finally, our data demonstrated a 3-factor structure of the Czech PSQI that was not found useful for establishing an alternative scoring system.

5. Conclusion

For the current official Czech translation of the PSQI, a cut-off score higher than 10 is recommended to define poor sleep quality. Furthermore, not only the total score but also the results of the individual components should be taken into account. It is suggested that PSQI-CZ with a modified question should be created to verify respondents’ understanding of the meaning of questions. Further studies on the psychometric properties of PSQI-CZ in various research samples (e.g., general population, somatic disorders) including the test-retest reliability and verification of a modified translated version would strengthen our understanding of the potential benefits and limitations of PSQI-CZ in clinical and research practice in the Czech Republic.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Ethical Approval

The local institutional review boards approved the study (Ethics Committee of the General University Hospital Prague, No. 1774/15D; Ethics Committee of the National Institute of Mental Health, Klecany, Czechia, No. 170/16).

Conflicts of Interest

The authors declare that there is no conflict of interest.

Acknowledgments

The authors would like to thank the Department of Neurology, First Faculty of Medicine, Charles University; Department of Sleep Medicine, National Institute of Mental Health; and private neurological clinic INSPAMED (to all staff, especially Martin Pretl, MD.) for their support and participation. This work was supported by the Ministry of Health of the Czech Republic, grant no. NV18-07-00272, all rights reserved. This work was also supported by the Grant Agency of Charles University (project no. 990217) and by the project “Progres Q35” of the 3rd Faculty of Medicine, Charles University in Prague.