WHODAS 2.0 as a Measure of Severity of Illness: Results of a FLDA Analysis
WHODAS 2.0 is the standard measure of disability promoted by the World Health Organization, whereas the Clinical Global Impression (CGI) is a widely used scale for determining severity of mental illness. Although a close relationship between these two scales would be expected, there are no relevant studies on the topic. In this study, we use Fisher Linear Discriminant Analysis (FLDA) to explore whether WHODAS 2.0 can be used to identify severity of illness as measured by CGI, and to identify which individual items of WHODAS 2.0 best predict the CGI scores given by clinicians. One hundred and twenty-two patients were assessed with WHODAS 2.0 and CGI over three months in outpatient mental health facilities of four hospitals in Madrid, Spain. Compared with the traditional scoring of WHODAS 2.0, FLDA improves accuracy by nearly 15%, so that with FLDA, WHODAS 2.0 correctly classifies 59.0% of the patients. Furthermore, FLDA identifies item 6.6 (illness effect on personal finances) and item 4.5 (damaged sexual life) as the most important items for clinicians when scoring the severity of illness.
Having accurate indicators that measure the impact of illnesses on people’s lives is a critical issue in several areas of medicine, including mental health. Disability is a useful construct for this purpose. Disability refers to the difficulty that people with a disease have in maintaining their premorbid or normal functionality. The World Health Organization (WHO) describes disability as a difficulty in functioning at the body, person, or societal levels, in one or more life domains, as experienced by an individual with a health condition in interaction with contextual factors. Knowing the degree of disability helps clinicians to measure the impact of illness on a specific patient, to decide in which areas a person needs help, and to evaluate treatment effectiveness.
The need to quantify disability first appeared in 1962, with the publication of the Health-Sickness Rating Scale (HSRS). This scale was replaced by the Global Assessment Scale (GAS) in 1976, which was further revised as the Global Assessment of Functioning Scale (GAF), included in the DSM-III and DSM-IV. GAF is still frequently used to measure a person’s psychological, social, and occupational functioning on a hypothetical continuum of mental health-illness ranging from 1 to 100; the simplicity and unidimensionality of GAF have been proposed as strengths of this scale. DSM-IV also includes the Social and Occupational Functioning Assessment Scale (SOFAS) as a functionality measure, but an important weakness of this scale is that it does not consider symptom severity.
In response to the need for a tool that evaluates functionality from a cross-cultural perspective and is also easy to apply, WHO developed the World Health Organization Disability Assessment Schedule (WHODAS) and its next version, with more domains, WHODAS 2.0. Currently, DSM-5 recommends replacing GAF with WHODAS 2.0 in order to increase the reliability of disability scores. WHODAS 2.0 has high internal consistency, high test-retest reliability, and good concurrent validity in patient classification when compared with other recognized disability measurement instruments. Nevertheless, WHODAS has certain limitations: it is not valid for children and youth, and it does not measure bodily impairments or environmental factors. WHODAS has been translated into more than ten languages; it is useful in the evaluation of disability in mental health conditions but also in a wide range of physical health diseases. Its demonstrated reliability in use favored its inclusion in DSM-5.
In routine clinical practice, clinicians generally classify patients’ illness severity according to their clinical experience, supported by the severity criteria used in measurement scales and classification manuals. Due to time restrictions in clinical practice, the use of scales and questionnaires is limited. Simple scales such as the Clinical Global Impression scale (CGI) allow the clinician to measure the severity and evolution of a patient’s illness without too much impact on care and clinical activity. CGI is a method for evaluating the seriousness of symptoms in mental illnesses. The scale is composed of three global measures: severity of illness at the moment of evaluation (CGI-S); global improvement since the last visit (CGI-I); and an efficacy index useful to compare the premorbid status and severity of treatment side effects (CGI-E). It is commonly used in clinical trials in depression or schizophrenia [9, 10] or compared with other instruments such as the Beck Depression Inventory. Nonetheless, the validity of CGI has been questioned, and CGI is occasionally described as an inconsistent, unreliable, and overly general measure [12–14].
Although the relationship between illness severity and functionality or disability has been widely studied in mental disorders such as schizophrenia, studies using these two particular questionnaires, WHODAS 2.0 and CGI, are scarce, and all previous works have used standard statistical techniques. Bastiaens et al. demonstrated a significant correlation between CGI and WHODAS 2.0 in patients with dual disorders, and Guilera et al. found a positive correlation between CGI and WHODAS 2.0 subscales.
In the present study, we use Fisher Linear Discriminant Analysis (FLDA), a pattern recognition method, to explore whether WHODAS 2.0 can be used to identify severity of illness as measured by CGI-S in a sample of outpatients in mental health facilities evaluated in real clinical practice, and to identify which individual items of WHODAS 2.0 are most discriminant for severity of illness classification. Furthermore, we hypothesized that FLDA would improve the accuracy of WHODAS 2.0.
2. Materials and Methods
2.1. Setting and Participants
From January to March 2017, a sample of 122 patients was evaluated in routine psychiatric or psychological visits at mental health facilities affiliated with the Fundación Jiménez Díaz Hospital in Madrid, Spain (Rey Juan Carlos Móstoles Hospital, Infanta Elena Valdemoro Hospital, General Hospital of Villalba, and University Hospital Fundación Jiménez Díaz).
All patients seen in the Psychiatry Department were candidates to participate in the study as long as they met the following inclusion criteria: outpatients, aged 18 or older, who gave written informed consent. Exclusion criteria were illiteracy, refusal to participate, and situations in which the patient’s state of health did not allow for written informed consent.
All clinicians (psychiatrists, psychologists, and mental health nurses) were trained in the use of WHODAS 2.0 and CGI at a consensus meeting in December 2016, and after that all of them were encouraged to use the instruments in their daily clinical practice. They were all asked to assess between 5 and 7 patients. Thirty-one clinicians participated actively in patient recruitment and they included a mean of patients.
CGI and WHODAS 2.0 were used to assess all patients, in an electronic version integrated in MEmind (https://www.memind.net), a web-based platform used in the Psychiatry Department since May 2014 as part of the standard clinical activity. At the end of 2016, all clinicians were trained in the use of WHODAS 2.0 and were instructed to use it at their discretion in addition to the usual questionnaires. In this way, until the end of March 2017, 122 patients were randomly selected and assessed.
WHODAS 2.0 arose from the recognition that the ICF is difficult to use in daily clinical practice; it has been translated into more than ten languages, including Spanish. Symptoms of disability are divided into six domains with several items in each one. For every item, users have to answer how much difficulty they have had in doing a given activity over the last 30 days. Items are scored from one to five: 1 (no difficulty), 2 (mild), 3 (moderate), 4 (severe), and 5 (extremely difficult/cannot do). WHODAS 2.0 is composed of 36 items: 6 in the “cognition” domain, 5 in the “mobility” domain, 4 items in “self-care,” 5 questions on “getting along and interacting with others,” 8 items about “life activities,” and a last domain with 8 questions about “joining in community activities.” In this study, we used the 36-item interviewer-administered version of WHODAS 2.0, which is scored from 0 to 100, with higher scores reflecting greater disability.
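As a rough illustration of how 36 item ratings on a 1 to 5 scale can be mapped to a 0 to 100 summary, the sketch below rescales the summed ratings linearly. This is an assumption made for illustration only: the text does not state which scoring algorithm was applied, and WHO also publishes a weighted scoring procedure that this sketch does not reproduce. The function name `whodas_simple_score` is hypothetical.

```python
def whodas_simple_score(item_ratings):
    """Linearly rescale 36 item ratings (each 1-5) to a 0-100 summary score.

    Illustrative assumption: unweighted sum, NOT the WHO weighted algorithm.
    """
    if len(item_ratings) != 36:
        raise ValueError("expected 36 item ratings")
    if any(r not in (1, 2, 3, 4, 5) for r in item_ratings):
        raise ValueError("each rating must be an integer from 1 to 5")
    # Shift each rating to 0-4 so that "no difficulty" contributes nothing.
    raw = sum(r - 1 for r in item_ratings)  # ranges from 0 to 144
    return 100.0 * raw / (4 * len(item_ratings))


if __name__ == "__main__":
    print(whodas_simple_score([1] * 36))  # no difficulty on any item
    print(whodas_simple_score([5] * 36))  # extreme difficulty on every item
```

Under this simplified rule, a patient rating every item as "moderate" (3) lands at the midpoint of the scale.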
CGI is an instrument to assess the severity of symptoms of mental disease according to the judgment of the clinician [21, 22]. CGI is composed of three measures: CGI-S, CGI-I, and CGI-E. With CGI-S, the measure employed in this study, the observer rates the severity of illness at the present moment on a 7-point Likert scale from 1 (normal, not ill) to 7 (most severely ill). We divided the scores into three groups of severity: 1 to 4 representing low severity; 4 representing medium severity; and 6-7 representing the most severe group.
Furthermore, information on sociodemographics and ICD 10 diagnosis was collected.
2.3. Ethical Issues
This study was conducted in compliance with the Declaration of Helsinki and approved by the IRB at Fundación Jiménez Díaz Hospital. All patients who participated in the study signed an informed consent form, which was explained by the clinician who carried out the assessment.
Concerning data protection, access to the online user interface was restricted to participating clinicians (MEmind Study Group). The data provided by the clinician was encrypted by Secure Socket Layer/Transport Layer Security (SSL/TLS) between the investigator’s computer and the server. Data was stored in an external server created for research purposes. An external auditor guaranteed that security measures met the Organic Law for Data Protection standards at a high protection level.
2.4. Statistical Analysis
In the pattern recognition community, Fisher Linear Discriminant Analysis (FLDA) is one of the most widely used analytical tools for transforming raw data into a lower dimensional subspace by maximizing a class separation criterion. Concisely, if the data contain observations belonging to K possible classes, this technique finds up to K − 1 linear projections such that the class separation is maximized and the intraclass variation is minimized. Before applying the FLDA algorithm, a principal component analysis keeping 95% of the variance was applied to remove noise. Blasco-Fontecilla et al. used this technique to readjust the Holmes and Rahe stress inventory to successfully discriminate controls from suicide attempters.
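For reference, the class separation criterion is commonly written as below. The notation is standard textbook notation and is an assumption here, since the paper does not spell out its own symbols: K classes, class c with n_c observations and mean \mu_c, overall mean \mu, and projection matrix W.

```latex
S_B = \sum_{c=1}^{K} n_c \,(\mu_c - \mu)(\mu_c - \mu)^{\top},
\qquad
S_W = \sum_{c=1}^{K} \sum_{x_i \in \text{class } c} (x_i - \mu_c)(x_i - \mu_c)^{\top},
\qquad
W^{*} = \arg\max_{W} \frac{\left| W^{\top} S_B W \right|}{\left| W^{\top} S_W W \right|}
```

Here S_B is the between-class scatter and S_W the within-class scatter. Because S_B has rank at most K − 1, no more than K − 1 useful projections exist, which for three severity groups gives at most two.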
Once the data have been transformed into a more suitable space, we use the k-nearest neighbour classifier to determine the class of a new observation. This classifier finds the k observations closest to the new observation and assigns it the majority class among them. In this article, the Euclidean distance is used and we consider k equal to 1, 3, 5, and 7.
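The nearest-neighbour rule just described can be sketched in a few lines. The function name `knn_predict`, the toy points, and the tie-handling behaviour are illustrative assumptions, not the authors' implementation; the points stand in for patients already projected into the FLDA space.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    # Count labels of the k nearest points; on a tied vote, the label
    # seen first (i.e. belonging to the nearer point) is returned.
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Made-up 2-D coordinates and severity groups, for illustration only.
    points = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (1.1, 0.9), (2.0, 2.0), (2.1, 1.9)]
    labels = ["low", "low", "medium", "medium", "high", "high"]
    print(knn_predict(points, labels, (1.05, 1.0), k=3))
```

With three classes, even an odd k can produce a tied vote, so any real implementation has to choose a tie-breaking rule; the one used here is only one reasonable option.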
An n-fold cross-validation set-up was used to evaluate the classification accuracy of this approach (FLDA + k-nearest neighbour). In this article, we use n = 122, that is, leave-one-out cross-validation: 121 observations were used to fit the FLDA and the k-nearest neighbour classifier, and the held-out observation was used to test the performance of the classifier. This process was repeated 122 times, once for each observation left out.
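A minimal sketch of the leave-one-out loop follows. To stay self-contained it plugs in a plain 1-nearest-neighbour rule on toy 2-D points; in the study, the classifier fitted on each training fold would be the FLDA projection followed by k-nearest neighbour. The names `nearest_label` and `leave_one_out_accuracy` and the data are hypothetical.

```python
import math


def nearest_label(train_x, train_y, query):
    """1-nearest-neighbour rule, used here as a stand-in classifier."""
    i = min(range(len(train_x)), key=lambda j: math.dist(train_x[j], query))
    return train_y[i]


def leave_one_out_accuracy(xs, ys, classify=nearest_label):
    """Hold out each observation once and return the share classified correctly."""
    hits = 0
    for i in range(len(xs)):
        # Train on every observation except the i-th one.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if classify(train_x, train_y, xs[i]) == ys[i]:
            hits += 1
    return hits / len(xs)


if __name__ == "__main__":
    xs = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
    ys = ["a", "a", "b", "b"]
    print(leave_one_out_accuracy(xs, ys))
```

Any fitting step, such as estimating the projection, belongs inside `classify` so that the held-out observation never influences the model that is tested on it.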
3. Results and Discussion
3.1. Sample Description
The sample contains 55 (45.1%) men and 67 (54.9%) women, with a mean age of years. Concerning marital status, 63 patients (51.6%) were married whereas the rest were single, divorced, or widowed. Concerning occupation, 70 patients (59.3%) belonged to the active population.
Using Pearson’s correlation test, we found a low positive correlation between CGI-S and total WHODAS 2.0 score ; . This result contrasts with previous studies, which have found higher correlations: 0.48 in a study of 100 patients with dual diagnoses in community correctional treatment, and correlation indexes between CGI and the different domains of WHODAS ranging from 0.341 (self-care) to 0.629 (participation) in 291 patients with bipolar disorder. As explained later, this lower correlation might be due to the fact that we analyzed a more general population than these previous works.
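For readers unfamiliar with the statistic, the Pearson coefficient can be computed as in the sketch below. The function name `pearson_r` and the paired values are made up for illustration and are not study data.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical severity ratings (1-7) paired with hypothetical 0-100 scores.
    severity = [2, 3, 4, 4, 5, 6]
    disability = [20.0, 35.0, 30.0, 55.0, 50.0, 70.0]
    print(round(pearson_r(severity, disability), 3))
```

The coefficient ranges from −1 to 1; values near 0 indicate little linear association, which is the sense in which the correlation reported above is described as low.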
3.2. FLDA Analyses
In Table 3 and Figure 1, we can observe that higher scores in the first projection imply greater illness severity, represented with red dots. This means that individual items with higher positive weights are the most important when clinicians assign patients a worse clinical condition. Specifically, the two items most strongly related to a high level of severity of illness were item 6.6 (weight = 1.3728) and item 4.5 (weight = 0.6378), which means that patients in whom illness has a negative effect on personal finances (item 6.6) or has damaged sexual life (item 4.5) tend to be scored by their doctors as severely ill or among the most extremely ill patients. Additionally, differentiated groups can be recognized in the figure, but there are also clear areas of overlap. This is not surprising, as CGI-S has been reported to have some limitations [12–14], and some authors have found that CGI does not correlate well with other measures of severity of illness in depression or dementia.
In order to determine the accuracy attained by our FLDA/k-nearest neighbour approach and to discover whether this approach improves on the accuracy obtained by the standard clinical approach, we performed a cross-validation experiment. Table 4 shows the classification accuracy of both the FLDA and the clinical approach in a 122-fold cross-validation experiment. In this table, we can see that FLDA obtains better accuracy than the clinical approach (scoring WHODAS 2.0 in the traditional way) for every value of k considered. In addition, the best value is obtained when we use 3 neighbours.
Finally, we built a classification map for the best result, which is shown in Figure 2. In this map, we observe some “islands” as a consequence of the previously described overlap.
We found that WHODAS 2.0 is a useful scale for measuring severity of illness as scored by clinicians with CGI: WHODAS 2.0 correctly classifies 59.0% of the patients. Compared with the traditional scoring of WHODAS 2.0, FLDA improves accuracy by nearly 15%. However, as shown in the classification map, the classification is far from perfect: there are overlapping areas, and some patients are assigned a low level of illness severity by WHODAS 2.0 whereas clinicians classified them with higher scores, and vice versa. Finally, FLDA shows that certain WHODAS items are more important for clinicians when considering severity of illness, specifically the items regarding the economic repercussions of illness and the detriment to sexual life.
In contrast with previous studies, our sample is composed of patients recruited in a real clinical environment with a wide variety of diagnoses, which is one strength of our study. Conducting studies in real clinical settings is important because it gives useful insight for daily practice. Furthermore, we do not just study correlations between CGI and WHODAS 2.0 but use a more sophisticated statistical method, and we show that FLDA is useful for better classifying the illness severity of patients using a disability measure, in a similar way to what we previously did in the field of suicide. Consequently, we propose this statistical method as a promising one for use in mental health and in other areas of health.
However, our study also has certain limitations. First, our sample size was relatively small, which is partly due to the data collection method, as the MEmind web platform is time consuming for clinicians. Moreover, while the wide variety of diagnoses in our sample is a strength, this heterogeneity can also be considered a limitation. As the impact of disease on functionality differs greatly between mental disorders, a further analysis differentiating by diagnosis would be necessary, but unfortunately our sample size does not allow for it. This point should be addressed in future work.
In conclusion, in this study we demonstrated an association between WHODAS 2.0 and CGI in a group of patients with heterogeneous diagnoses. Future work focusing on this relationship in particular diagnoses is warranted.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
This work was partially supported by Instituto de Salud Carlos III Fondos FEDER (ISCIII PI16/01852), Delegación del Gobierno para el Plan Nacional de Drogas (20151073), and American Foundation for Suicide Prevention (AFSP) (LSRG-1-005-16). The authors want to acknowledge the collaboration of the clinicians (MEmind Study Group) involved in the collection of data and the development of MEmind. MEmind Study Group is composed of Fuensanta Aroca, Antonio Artes-Rodriguez, Enrique Baca-García, Sofian Berrouiguet, Romain Billot, Juan Jose Carballo-Belloso, Philippe Courtet, David Delgado Gomez, Jorge Lopez-Castroman, Mercedes Perez-Rodriguez, Elsa Arrua, Rosa Ana Bello-Sousa, Covadonga Bonal-Giménez, Pedro Gutiérrez-Recacha, Elena Hernando-Merino, Marisa Herraiz, Marta Migoya-Borja, Nora Palomar-Ciria, Ruth Polo-del Rio, Alba Sedano-Capdevila, Leticia Serrano-Marugán, Iratxe Tapia-Jara, Silvia Vallejo-Oñate, María Constanza Vera-Varela, Antonio Vian-Lains, Susana Amodeo-Escribano, Olga Bautista, Maria Luisa Barrigón, Rodrigo Carmona, Irene Caro-Cañizares, Sonia Carollo-Vivian, Jaime Chamorro-Delmo, Javier Fernández-Aurrecoechea, Marta González-Granado, Jorge Hernán Hoyos-Marín, Miren Iza, Mónica Jiménez-Giménez, Ana López-Gómez, Laura Mata-Iturralde, Laura Muñoz-Lorenzo, Rocío Navarro-Jiménez, Santiago Ovejero, María Luz Palacios, Margarita Pérez-Fominaya, Ana Rico-Romano, Alba Rodriguez-Jover, Sergio Sánchez-Alonso, Juncal Sevilla-Vicente, María Natalia Silva, Ernesto José Verdura-Vizcaíno, Carolina Vigil-López, Lucía Villoria-Borrego, Ana Alcón-Durán, Ezequiel Di Stasio, Juan Manuel García-Vega, Pedro Martín-Calvo, Ana José Ortega, Marta Segura-Valverde, Edurne Crespo-Llanos, Rosana Codesal-Julián, Ainara Frade-Ciudad, Marisa Martin-Calvo, Luis Sánchez-Pastor, Miriam Agudo-Urbanos, Raquel Álvarez-García, Sara María Bañón-González, Sara Clariana-Martín, Laura de Andrés-Pastor, María Guadalupe García-Jiménez, Sara González-Granado, Diego Laguna-Ortega, Teresa Legido-Gil, Pablo Portillo-de Antonio, Pablo Puras-Rico, and Eva María Romero-Gómez.
WHO, International Classification of Functioning, Disability and Health (ICF), 2017, http://www.who.int/classifications/icf/en/.
R. Spitzer, J. Williams, and J. Endicott, “Global assessment of functioning,” in Outcomes Assessment in Clinical Practice, L. I. Sederer and B. Dickey, Eds., pp. 76–78, Williams and Wilkins, Baltimore, Md, USA, 1996.
WHO, WHO Disability Assessment Schedule 2.0 (WHODAS 2.0), 2017, http://www.who.int/classifications/icf/whodasii/en/.
A. Hale, R. M. Corral, C. Mencacci, J. S. Ruiz, C. A. Severo, and V. Gentil, “Superior antidepressant efficacy results of agomelatine versus fluoxetine in severe MDD patients: A randomized, double-blind study,” International Clinical Psychopharmacology, vol. 25, no. 6, pp. 305–314, 2010.
M. H. Hsieh, W. W. Lin, S. T. Chen et al., “A 64-week, multicenter, open-label study of aripiprazole effectiveness in the management of patients with schizophrenia or schizoaffective disorder in a general psychiatric outpatient setting,” Annals of General Psychiatry, vol. 9, article no. 35, 2010.
F. Dahlke, A. Lohaus, and H. Gutzmann, “Reliability and clinical concepts underlying global judgments in dementia: Implications for clinical research,” Psychopharmacology Bulletin, vol. 28, no. 4, pp. 425–432, 1992.
C. M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
J. L. Vázquez-Barquero et al., “Spanish version of the new World Health Organization Disability Assessment Schedule II (WHO-DAS-II): initial phase of development and pilot study. Cantabria disability work group,” Actas Espanolas de Psiquiatria, vol. 28, no. 2, pp. 77–87, 2000.
W. Guy, Early Clinical Drug Evaluation (ECDEU) Assessment Manual for Psychopharmacology, Department of Health, Education, and Welfare, Rockville, MD, US, 1976.
P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, “Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection,” in Computer Vision — ECCV '96, vol. 1064 of Lecture Notes in Computer Science, pp. 43–58, Springer Berlin Heidelberg, Berlin, Heidelberg, 1996.
H. Blasco-Fontecilla, D. Delgado-Gomez, T. Legido-Gil, J. de Leon, M. M. Perez-Rodriguez, and E. Baca-Garcia, “Can the Holmes-Rahe Social Readjustment Rating Scale (SRRS) Be Used as a Suicide Risk Scale? An Exploratory Study,” Archives of Suicide Research, vol. 16, no. 1, pp. 13–28, 2012.