Special Issue: Deep Neural Networks for Cognitive Health Assessment
An Efficient Machine Learning-Based Feature Optimization Model for the Detection of Dyslexia
Dyslexia is among the most common neurological disorders in children. Its detection therefore remains an important pursuit for researchers across various domains, as illustrated by the large body of work presented in diverse scientific articles. The work presented herein utilizes a unified gaming test of subjects (dyslexia/controls) in tandem with principal components derived from the data to detect dyslexia. The aim is to build a machine learning model for dyslexia detection using comprehensive gaming test data. We explore various kernel functions of the Support Vector Machine (SVM) on different numbers of principal components to reduce the computational complexity. The radial basis function yields a detection accuracy of 92% with 5 components and its highest detection accuracy, 93%, with 3 components. In contrast, the Artificial Neural Network (ANN) offers the added advantage of a minimal number of hyperparameters and attains an accuracy of 95% with 3 components. A comparison of the proposed method with some of the existing works shows the efficacy of this method for dyslexia detection.
1. Introduction
Dyslexia is one of the most complicated neurological disorders attracting attention among researchers in modern neuroscience. The International Dyslexia Association defines dyslexia as a disorder identified by difficulties with spelling, language processing, and accurate word recognition. The overall paradigm of dyslexia is summarized in Figure 1. The main actors of dyslexia are phonological disorder (PD), visual disorder (VD), and auditory disorder (AD). These disorders start to evolve from the time of birth and manifest themselves as an abnormality. The abnormality associated with these actors plays a critical role in shaping the personality of a person. The consequences are manifold, with numerous behavior deficits (BD) and cognitive deficits (CD). Most people think of dyslexia as a disorder in which a person sees letters and words backwards, such as seeing “b” as “d” or “was” as “saw”, and vice versa. However, people with dyslexia see things the same way as everyone else. Dyslexia is caused by a phonological processing problem, meaning that people affected by it have trouble not with seeing language but with manipulating it. For example, if a person with dyslexia hears a word such as “heat” and is then asked to remove the first sound (which is “h”), it would be very difficult for that person to tell what word is left (“eat”).
As another example, people with dyslexia tend to break a word into parts to read it, which delays reading comprehension. Dyslexia affects about 5–17% of the population across most languages. The condition emerges at some stage in childhood and evolves progressively in adolescence. It hampers academic growth and subsequently diminishes self-esteem and confidence. On the other hand, the emergence of transformative healthcare technologies has catalyzed a revolution in the provisioning and operation of healthcare services, driven mostly by Computer Aided Detection and Diagnostic systems, interchangeably referred to as CAD systems. Recent advances in imaging technologies have made it possible for medical practitioners to use advanced and hybrid imaging techniques such as PET, USG, X-rays, CT scans, fMRI, and SPECT, among others. Enhancements in these techniques enable medical practitioners to gather detailed information about body organs and physiology. These techniques typically make use of internal sources of energy, external sources of energy, or both [5, 6].
The work proposed herein has multiple advantages that make it a potential candidate as a dyslexia detection framework. It does not rely on any imaging modality for the development of a CAD system for dyslexia. The information used is generated by a gamified online test structured in such a way that behavioral and cognitive deficits are perceived and quantified. The acquired data are used to establish a machine learning framework with principal component analysis (PCA). With SVM and ANN as machine learning frameworks, PCA is seen to be very effective in the detection of dyslexia. Moreover, PCA significantly reduces the computational overhead of the model, since the model has to deal with a narrower feature space.
The paper is organized as follows: Section 2 presents some of the existing work in the domain of dyslexia detection. The proposed framework is described in Section 3, while the experimental setup and the results are discussed in Section 4. Section 5 presents the conclusion of the work.
2. Related Work
With the advent of smart devices used in domains such as healthcare, business, education, cities, and agriculture, a huge amount of data is being generated. Insights from these data open new challenges and possibilities in a wide range of applications. The information collected from various sources in a healthcare setup opens the possibility of early detection or future prediction of various diseases. The studies presented in [7–10] have leveraged healthcare data for different detection tasks. The studies in [11, 12] reveal that datasets for dyslexia are relatively cheap to collect when they are built from standardized psychoeducational tests and learners’ handwriting. This is why we use a gamified online test for this study. Such datasets provide two benefits: they are cheap to collect, and they are large in terms of the number of features, which is one of the fundamental requirements for building a stable machine learning model. The remainder of this section reviews machine learning algorithms that have been proposed for the detection of dyslexia. These studies have used different types of machine learning algorithms and datasets of varying nature and size. One study demonstrated the application of artificial neural networks to identify the presence of dyslexia in school children. It used test scores as the data and the MLP architecture of the ANN. With 10-fold cross-validation, an accuracy of 75% was reported.
Another work used a range of machine learning algorithms, including support vector machines, artificial neural networks, and k-means. MRI scans were used to classify between dyslexia and control groups. With a dataset size of 56, the ANN gave the best accuracy of 94.8%. A further work demonstrates the use of MRI scans for discriminating dyslexia and control cases. That study was carried out on a dataset of 236 subjects with SVM as the ML algorithm, and an accuracy of 83% was reported. In another study, EEG scans of 80 subjects were used for the diagnosis of dyslexia at an early stage using machine learning algorithms that included k-means, ANN, and a fuzzy logic classifier, with average accuracies of 89.6%, 89.7%, and 85.7%, respectively. Another EEG-based study was conducted on 6 subjects with a median age of 5. Here, a multilayer perceptron model was used to detect dyslexia by analyzing brain activity signals, achieving an accuracy of 85%. In a separate study, MATLAB’s LIBSVM toolbox was used to implement a linear support vector machine classifier on 61 MRI scans to discriminate a dyslexia biomarker using white matter features of the brain. The accuracy reported in that study for dyslexia detection is 83.6%.
Another work categorizes dyslexia and nondyslexia cases on MRI scan data of 925 subjects using a linear SVM and reports an accuracy of 80%. Linguistic computer game-based dyslexia detection was performed on a 267-subject dataset utilizing eye tracker features; this study reports an accuracy of 85% using the SVM from the LIBSVM toolbox of MATLAB. Another eye tracker-based dyslexia detection study was carried out on a dataset of 185 subjects. The SVM was used with automatic recursive feature elimination, obtaining an accuracy of 96%. Further work in which the SVM has been applied to eye-tracking features for dyslexia detection is reported in [20–24]. An accuracy of 80% on a dataset of 97 subjects is achieved in this work.
3. Proposed Methodology
The methodology adopted for this study is shown in Figure 2. While a large number of existing methodologies rely on imaging modalities for dyslexia detection, gaming-based tests are also being explored as a potential detection methodology for dyslexia. The next subsection discusses the mathematical framework of the work from start to finish.
Assume a feature vector $F_m$ for each subject:

$$F_m = [F_{m1}, F_{m2}, \ldots, F_{mn}], \tag{1}$$

where $m$ corresponds to the subject number and $i = 1, \ldots, n$ corresponds to each index of the feature vector. The complete feature space in 2D for $M$ subjects is then

$$F = \begin{bmatrix} F_{11} & F_{12} & \cdots & F_{1n} \\ F_{21} & F_{22} & \cdots & F_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ F_{M1} & F_{M2} & \cdots & F_{Mn} \end{bmatrix}. \tag{2}$$

With $F$, we estimate an accuracy parameter $A_u$ from a set of machine learning kernel functions corresponding to the SVM. In addition, the same accuracy parameter is estimated for the ANN. Having obtained $A_u$ from a feature space of size 3644 × 196, our aim is to reduce the feature space by weighted reduction in order to improve $A_u$. Put more generically, we aim to obtain the best possible $A_u$ with a feature space that maximizes the variance $\sigma^2$ over all $F_{m,i}$. Initially, a set of parameters is chosen for the different kernel functions of the SVM as given in equations (3)–(8).
These equations correspond to the linear, radial basis function, Laplace, hyperbolic tangent, Bessel, and linear spline kernels, respectively. These 6 kernel functions yield 6 machine learning algorithms whose potential we wish to explore as the size of the feature space changes. The behavior of the kernels given in eq. 4, eq. 5, and eq. 7 largely depends on their respective tunable parameters. The selection of these 3 parameters determines the efficacy of the kernel in particular and of the SVM as a classifier in general. The parameter values should be neither overestimated nor underestimated. If the values are overestimated, the kernel function behaves more like a linear function and thus loses its capability for nonlinear projection. On the other hand, if the values are underestimated, the decision boundary becomes sensitive to noisy data, and regularization is lost.
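The effect of the tunable parameter described above can be illustrated with a small sketch. It implements common textbook forms of four of the six kernels (linear, radial basis function, Laplace, and hyperbolic tangent); the exact forms and parameter names used in this work may differ. The function names are hypothetical, and `sigma` is assumed to be a width parameter, which matches the behavior described in the text: a large width flattens the kernel, while a small width makes it respond only to very close points.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def dist(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def linear_kernel(x, y):
    # k(x, y) = <x, y>
    return dot(x, y)


def rbf_kernel(x, y, sigma):
    # k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)); sigma assumed to be a width
    return math.exp(-dist(x, y) ** 2 / (2.0 * sigma ** 2))


def laplace_kernel(x, y, sigma):
    # k(x, y) = exp(-||x - y|| / sigma); sigma assumed to be a width
    return math.exp(-dist(x, y) / sigma)


def tanh_kernel(x, y, scale=1.0, offset=1.0):
    # k(x, y) = tanh(scale * <x, y> + offset); scale and offset are illustrative
    return math.tanh(scale * dot(x, y) + offset)


# Two illustrative points; the widths below are chosen only to show the trend.
x = [1.0, 0.0, 2.0]
y = [0.0, 1.0, 2.0]
for sigma in (0.15, 1.5, 15.0):
    print(sigma, rbf_kernel(x, y, sigma), laplace_kernel(x, y, sigma))
```

With a very small width the two points look almost unrelated (kernel value near 0), and with a very large width almost every pair of points looks alike (kernel value near 1), which is why the parameter has to be balanced between the two extremes.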
In line with this rationale, the values of these parameters are set to 0.15. With all the parameters of the kernel functions set, we apply principal component analysis to the feature space to extract a new set of coefficients. The main idea of applying principal component analysis is to replace a high-dimensional feature space containing highly correlated data with a lower-dimensional feature space containing weakly correlated data. The principal components derived from the original data tend to capture most of the variance of the data and hence can be effectively utilized to train a classifier model. Figure 3 shows an instance in which we have plotted 100 principal components against the amount of variance that they capture, expressed as eigenvalues. As can be seen in Figure 3, the first few components capture almost all the variance of the data, which indicates the efficacy of the approach. Algorithm 1 shows pseudocode for the proposed methodology.
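The dimensionality reduction step can be sketched as follows. This is a minimal illustration under stated assumptions, not a reproduction of Algorithm 1: it centers the data, forms the sample covariance matrix, and extracts the leading eigenpairs by power iteration with deflation (a technique chosen here only to keep the example dependency-free). The toy data and all function names are hypothetical; whether the features were standardized before PCA in this work is not stated.

```python
import math
import random


def center(rows):
    """Subtract each column's mean so every feature has zero mean."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of zero-mean rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenpair(mat, iters=500, seed=0):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix (power iteration)."""
    d = len(mat)
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # matrix is numerically zero, nothing left to extract
            break
        v = [x / norm for x in w]
    # Rayleigh quotient v^T A v gives the eigenvalue estimate for the unit vector v.
    lam = sum(v[i] * mat[i][j] * v[j] for i in range(d) for j in range(d))
    return lam, v


def pca(rows, k):
    """Return eigenvalues, explained-variance ratios, and scores for k components."""
    x = center(rows)
    cov = covariance(x)
    d = len(cov)
    total = sum(cov[j][j] for j in range(d))  # total variance = trace of covariance
    eigvals, comps = [], []
    for _ in range(k):
        lam, v = top_eigenpair(cov)
        eigvals.append(lam)
        comps.append(v)
        # Deflation: subtract the component just found before searching for the next.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    ratios = [lam / total for lam in eigvals]
    scores = [[sum(r[j] * c[j] for j in range(d)) for c in comps] for r in x]
    return eigvals, ratios, scores


# Toy data (hypothetical): 200 "subjects", 6 correlated features from 2 latent factors.
random.seed(0)
rows = []
for _ in range(200):
    a, b = random.gauss(0, 3), random.gauss(0, 1)
    rows.append([a, a + random.gauss(0, 0.1), b, a - b,
                 random.gauss(0, 0.1), b + random.gauss(0, 0.1)])

eigvals, ratios, scores = pca(rows, k=3)
print("explained variance ratios:", [round(r, 4) for r in ratios])
```

The `scores` rows are the reduced feature vectors that would be passed to a classifier in place of the original features. Because the toy features are strongly correlated, the first two components account for nearly all of the variance, which mirrors the qualitative behavior described for Figure 3.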
4. Experimental Results
The dataset chosen for this study is a thorough evaluation of the following components of language speaking and understanding: phonological awareness, morphological awareness, visual discrimination and categorization, alphabetic awareness, syllabic awareness, semantic awareness, auditory discrimination and categorization, visual working memory, and sequential auditory working memory. The setup contrasts with setups that use different types of imaging modalities as a tool for detecting dyslexia. The dataset comprises 3644 subjects with two class labels and 196 attributes. Figure 4 shows the distribution of the cases with respect to various age groups. The dyslexic and nondyslexic cases are well distributed over the range of 7 to 17 years.
Figure 5 shows the percentage of dyslexic subjects by age. The point to observe here again is that the distribution is almost even. With the given feature set, the proposed model uses two classifiers. Several classification methods exist, including quadratic discriminant analysis (QDA), linear discriminant analysis (LDA), decision trees, the maximum entropy classifier, the Naive Bayes classifier, K-nearest neighbor, the support vector machine (SVM), and the Artificial Neural Network (ANN). The work herein uses the said dataset to detect dyslexia with the SVM and the ANN. First, we propose PCA-driven new feature vectors as the indicators of dyslexia.
Table 1 gives the dyslexia detection accuracy using 10 principal components. The highest accuracy is achieved by the radial basis kernel function with a hyperparameter value σ1 = 0.5. The lowest detection accuracy is obtained using the spline kernel function for the SVM. Similarly, Table 2 gives a comparative detection accuracy of the 6 kernel functions with 5 principal components: PC1, PC2, PC3, PC4, and PC5. As expected, the accuracies obtained are slightly better than the results in Table 1. The reason is illustrated in Figure 3, where the first few principal components are seen to capture most of the variance in the data. The detection accuracy improves further when the number of components used in the framework of SVM kernels is reduced to 3, as shown in Table 3.
Compared with the detection accuracy of 92% obtained from the radial basis function with 5 components, the highest detection accuracy obtained from the radial basis function with 3 components is 93%. The capability of principal components in detecting dyslexia is also illustrated by the score plot shown in Figure 6. The plot shows how dyslexic and nondyslexic cases are segregated by the first two principal components. First, PC1, shown as the dotted red vertical line, divides all the given cases in the direction of maximum variance. The number of outliers after applying the first principal component alone is significantly large. The second principal component, shown as the purple dotted horizontal line, is seen to reduce the number of outliers.
The same methodology for predicting dyslexia using the online gaming-based test is carried out using the ANN. The aim of this part of the study was to observe the change in accuracy as the input size of the ANN changes. We choose a fixed hidden layer size of 10. With two output classes, dyslexic and nondyslexic, the input to the ANN was changed according to the number of principal components retained from the feature space. Table 4 shows the number of weights learnt by each network as the number of inputs changes. On the one hand, retaining fewer components discards part of the information in the data; on the other hand, it reduces the number of weights that need to be learnt. The comparison of the proposed methodology with some of the recent works reported in [15, 25, 26] is tabulated in Table 5. Most of the work reported for the detection of dyslexia can be characterized by 4 main parameters, namely, the size of the dataset, the nature of the dataset, the underlying machine learning approach, and the performance of the overall methodology. As these 4 parameters in Table 5 show, most of the work has been carried out on relatively small datasets. The demerit of a small dataset in a machine learning framework is that the resulting model generalizes poorly. The proposed work is carried out on a dataset that is much larger than those of the other reported works and hence is better in terms of generalization.
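The relation between the number of retained components and the number of weights can be made concrete with a short sketch. It assumes a single fully connected hidden layer of 10 units, 2 output units, and one bias per hidden and output unit; the function name is hypothetical, and the counts in Table 4 may differ if the weights were counted differently (for example, without biases).

```python
def mlp_parameter_count(n_inputs, n_hidden=10, n_outputs=2, bias=True):
    """Count trainable parameters of a one-hidden-layer, fully connected network."""
    b = 1 if bias else 0
    hidden_layer = (n_inputs + b) * n_hidden   # input -> hidden weights (+ biases)
    output_layer = (n_hidden + b) * n_outputs  # hidden -> output weights (+ biases)
    return hidden_layer + output_layer


# Inputs: 3, 5, and 10 principal components, and the full 196-attribute feature space.
for k in (3, 5, 10, 196):
    print(k, mlp_parameter_count(k))
# prints:
# 3 62
# 5 82
# 10 132
# 196 1992
```

Under these assumptions, moving from the full 196 attributes to 3 principal components shrinks the network from 1992 to 62 trainable parameters.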
5. Conclusion
Dyslexia affects about 10% of the population across the globe, with consequences ranging from moderate to severe personality changes. In Saudi Arabia, the incidence rate of dyslexia is found to be around 7%. Early detection of this disorder can enable effective treatment in most cases. With researchers, clinicians, and experts from various domains working to address this issue, success is not far away. Artificial intelligence and machine learning, in conjunction with medical imaging modalities, have opened up new possibilities. The work presented herein successfully used an online gaming test-based strategy for the detection of dyslexia. It is pertinent to mention that, given the age and lifestyle of the subjects under consideration, an online gaming methodology for data acquisition becomes one of the first choices. The work uses the acquired data to establish a machine learning framework with principal component analysis. With SVM and ANN as machine learning frameworks, PCA is seen to be very effective in the detection of dyslexia. Moreover, PCA significantly reduces the computational overhead of the model, since the model has to deal with a narrower feature space. The work herein reports an accuracy of 95% with PCA and ANN on 3644 subjects in the overall experimental setup. The proposed work shows potential, as indicated by the comparison of this methodology with some of the existing works. This work can be a promising candidate for the development of a learning management system for dyslexia. In the future, the authors will try to improve the results of this research by employing a deep learning model in which optimization is carried out on input images directly [27, 28].
Data Availability
The data used in this article will be shared upon request to the corresponding author.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
References
J. M. Fletcher, G. R. Lyon, M. Barnes et al., “Classification of learning disabilities: an evidence based evaluation,” in Proceedings of the Identification of Learning Disabilities: Research to Practice, R. Bradley, L. Danielson, and D. P. Hallahan, Eds., pp. 185–250, Erlbaum Associates Publishers, Washington, DC, USA, 2002.
S. O. Wajuihian, “Neurobiology of developmental dyslexia Part 1: a review of evidence from autopsy and structural neuro-imaging studies,” Optometry Vis Develop, vol. 43, no. 3, pp. 121–131, 2012.
H. Barrett and W. Swindell, “Radiological Imaging: The Theory of Image Formation,” Academic Press, vol. 1, no. 4, 2 pages, 1981.
J. T. Bushberg, J. A. Seibert, E. M. Leidholdt, and J. M. Boone, The Essentials of Medical Imaging, Williams & Wilkins, Philadelphia, USA, vol. 2, 2002.
P. Sharma and M. Kaur, “Classification in pattern recognition: a review,” International Journal of Advanced Research in Computer Science and Software Engineering, vol. 6, no. 5, p. 2495, 2013.
K. Spoon, D. Crandall, and K. Siek, “Towards detecting Dyslexia in children’s handwriting using neural networks,” in Proceedings of the International Conference on Machine Learning AI for Social Good Workshop, pp. 1–5, Long Beach, CA, USA, 2019.
K. Spoon, K. Siek, D. Crandall, and M. Fillmore, “Can we (and should we) use AI to detect Dyslexia in children’s handwriting?” in Proceedings of the International Conference on Machine Learning AI for Social Good Workshop, pp. 1–6, Long Beach, CA, USA, 2019.
M. Kohli and T. V. Prasad, “Identifying dyslexic students by using artificial neural networks,” in Proceedings of the World Congress on Engineering, vol. 1, no. 1, London, U.K, July 2010.
P. Płoński, W. Gradkowski, A. Marchewka, K. Jednoróg, and P. Bogorodzki, “Dealing with the heterogeneous multi-site neuroimaging data sets: a discrimination study of children dyslexia,” in Proceedings of the Brain Informatics and Health, vol. 8609, pp. 471–480, Springer, Switzerland, Europe, 2014.
I. Karim, W. Abdul, and N. Kamaruddin, “Classification of Dyslexic and normal children during resting condition using KDE and MLP,” in Proceedings of the 5th International Conference on Information and Communication Technology for the Muslim World (ICT4M), pp. 4–8, Rabat, Morocco, March 2013.
S. Talwani, K. Alhazmi, J. Singla, H. J. Alyamani, and A. K. Bashir, “Allocation and migration of virtual machines using machine learning,” Computers, Materials & Continua- Tech Science, vol. 70, no. Sept, 2021.
G. Hemant, M. Awais, A. K. Bashir et al., “AI-enabled radiologist in the loop: novel AI-based framework to augment radiologist performance for COVID-19 chest CT medical image annotation and classification from pneumonia,” Neural Computing and Applications, pp. 1–19, 2022.
M. Usman, O. Lateef, R. ChandrenMuniyandi, K. Omar, and M. Mohamad, “Advance Machine Learning Methods for Dyslexia Biomarker Detection: A Review of Implementation Details and Challenges,” IEEE Access, vol. 9, pp. 36879–36897, 2021.
Y. Lakretz, G. Chechik, N. Friedmann, and M. Rosen-Zvi, “Probabilistic graphical models of Dyslexia,” in Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1919–1928, New York, NY, USA, August 2015.