Applied Bionics and Biomechanics
Volume 2017 (2017), Article ID 6848014, 12 pages
Research Article

Prediction of Epileptic Seizure by Analysing Time Series EEG Signal Using k-NN Classifier

1Department of EEE, Khulna University of Engineering & Technology (KUET), Khulna 9203, Bangladesh
2FSTK, University Sultan Zainal Abidin (UniSZA), 21300 Kuala Terengganu, Terengganu, Malaysia

Correspondence should be addressed to Md. Kamrul Hasan; moc.liamg@teukeeelurmak

Received 30 November 2016; Revised 29 March 2017; Accepted 11 April 2017; Published 13 August 2017

Academic Editor: Thibault Lemaire

Copyright © 2017 Md. Kamrul Hasan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


The electroencephalographic (EEG) signal contains information about brain activity and is used for the detection of epilepsy, since epileptic seizures are caused by a disturbance in the electrophysiological activity of the brain. The prediction of epileptic seizure usually requires a detailed analysis of the EEG by an experienced reader. In this paper, we introduce a statistical analysis of the EEG signal that recognizes epileptic seizure with a high degree of accuracy and supports automatic detection of epileptic seizure for epileptic subjects of different ages. To this end, we extract several features from the epileptic signal, namely, approximate entropy (ApEn), standard deviation (SD), standard error (SE), modified mean absolute value (MMAV), roll-off, and zero crossing (ZC). The k-nearest neighbours (k-NN) algorithm is used for the classification of epilepsy, and regression analysis is then used to predict the epilepsy level at different ages of the patients. Using the statistical parameters and regression analysis, a prototype mathematical model is proposed that relates the epileptic randomness to the age of the subjects. The accuracy of this prototype equation depends on proper analysis of the dynamic information in the epileptic EEG.

1. Introduction

Epilepsy is a chronic neurological disorder characterized by repeated, unprovoked seizures, that is, electrophysiological disturbances in the human brain which may range from brief lapses of attention or muscle jerks to severe and prolonged convulsions. Epileptic seizures are the visible manifestations produced when the brain briefly becomes dysfunctional because of abnormal paroxysmal discharge of the nerve cells in the cerebral cortex [1–3]. Equivalently, epilepsy is a group of neurological disorders characterized by epileptic seizures [4, 5]. Epileptic seizures are episodes that can vary from brief and nearly undetectable to long periods of vigorous shaking [6]. In epilepsy, seizures tend to recur and have no immediate underlying cause, whereas seizures that occur due to a specific cause are not deemed to represent epilepsy [4, 7]. The characteristics of a seizure depend on where in the brain the disturbance first starts and how far it spreads. Temporary symptoms occur, such as loss of awareness or consciousness and disturbances of movement, sensation (including vision, hearing, and taste), mood, or other cognitive functions. Figure 1(a) represents normal neuronal ion-channel function; here the resting membrane potential is −70 mV, which is maintained by the sodium and potassium channels and is a primary requirement for the action potential.

Figure 1: Ion-channel dysfunction for the formation of epilepsy.

The action potential consists of a depolarizing phase, produced by sodium-channel opening, and a repolarizing phase, produced by potassium-channel opening and sodium-channel inactivation. The remaining potassium channels contribute to a longer-term repolarization that prevents repetitive excitation of the neuron. In Figure 1(b), mutations in SCN1B, which encodes a voltage-gated sodium-channel subunit, are associated with generalized epilepsy with febrile seizures plus [4]. The apparent outcome of these mutations is an increased sodium current, which would lead to a greater depolarization during the action potential and an increased tendency to fire repetitive bursts. Similarly, in Figure 1(c), mutations in KCNQ2 and KCNQ3, which encode potassium channels, are associated with benign familial neonatal convulsions. People with seizures are injured more frequently than controls, for example through fractures and bruising, and have higher rates of psychological problems such as anxiety or depression. Similarly, the risk of premature death in people with epilepsy is up to 3 times higher than in the general population, with the highest rates found in low- and middle-income countries and in rural rather than urban areas. A great proportion of the causes of death related to epilepsy in low- and middle-income countries are potentially preventable, such as falls, drowning, burns, and prolonged seizures [8–10]. There are more than 30 different forms of epilepsy and more than 40 different types of seizures [2]. According to a report of the World Health Organization (WHO) [11], around 50 million people worldwide have epilepsy. Around 90% of them are from developing countries and one-fourth of them do not have access to medication.
Epilepsy cannot be cured, but it can usually be controlled with medication. For initial treatment of epilepsy, antiepileptic drugs (AEDs) are used [12]. Epilepsy is not transmissible. Idiopathic epilepsy is the most common type of epilepsy; it may affect 6 out of 10 people with the disorder and has no detectable cause. Epilepsy that arises from a known cause is called secondary epilepsy or symptomatic epilepsy. The major causes of secondary epilepsy [11] might be as follows:
(i) Brain impairment from injuries
(ii) Inherited abnormalities with associated brain defects
(iii) A severe head injury
(iv) A stroke, which may limit the amount of oxygen reaching the brain
(v) Infections of the brain such as meningitis and encephalitis
(vi) A brain tumour, which creates more randomness.

There are several methods to diagnose epilepsy, such as electroencephalography (EEG), magnetic resonance imaging (MRI), functional magnetic resonance imaging (fMRI), single-photon emission computed tomography (SPECT), positron emission tomography (PET), and magnetoencephalography (MEG). Because EEG is fast, has high time resolution, and is noninvasive, it remains one of the most useful and effective tools in the treatment of epilepsy. Methods for predicting epileptic seizure from EEG signals can be separated into three classes: time domain, frequency domain, and nonlinear methods [13]. More recently, recorded seizures have also been used to quantify the clinical picture and to propose video-based seizure recognition. In some papers, information-based measures are also proposed for the detection of epileptic seizure [14]. Entropy is a measure of the rate of information that may be used in signal processing for the detection of noise, where a higher value corresponds to increased unpredictability and a lower value corresponds to higher predictability [15]. In our proposed research, we use six features for the classification; among these, entropy is the highest-ranked feature and is the one used in the regression model for predicting the level of epilepsy.

2. Mathematical Background of Classifier and Statistical Features

The mathematical background for the classifier (k-NN) and the statistical features (approximate entropy (ApEn), standard deviation (SD), standard error (SE), modified mean absolute value (MMAV), roll-off, and zero crossing (ZC)) is described below.

2.1. k-Nearest Neighbours (k-NN)

The k-nearest neighbours (k-NN) algorithm is a nonparametric learning algorithm mainly used for the classification of signal patterns, that is, pattern recognition, as shown in Figure 2(a). Its goal is to assign to an unseen point the dominant class among its k nearest neighbours within the training set [16, 17].

Figure 2: Architecture of k-NN classifier: (a) simple classification and (b) cluster classification.

Among classification methods such as support vector machine (SVM), artificial neural network (ANN), linear discriminant analysis (LDA), naive Bayes (NB), and RBF neural network (RBFNN), k-NN is well suited to statistical pattern recognition and neighbour cluster selection, as shown in Figure 2(b), because of its consistently high performance without a priori assumptions. The k-NN classifier extends the nearest neighbour idea by taking the k nearest points and assigning the label of the majority [18]. The positive integer k indicates how many neighbours guide the classification. The case $k = 1$ is called the nearest neighbour algorithm. In classification analysis, k-NN is a supervised learning algorithm [19, 20]. The learning algorithm of k-NN for the classification of any data set is described below step by step.
(1) Let the training categories be $C_1, C_2, \ldots, C_c$ and let the training set contain $N$ samples in total, each described by an m-dimensional feature vector.
(2) The sample to be classified must have the same dimension for proper classification; it is denoted by $X = (X_1, X_2, \ldots, X_m)$.
(3) The similarity between every training sample and $X$ is calculated. Taking the jth training sample $d_j = (d_{j1}, d_{j2}, \ldots, d_{jm})$, the similarity is
$$\mathrm{SIM}(X, d_j) = \frac{\sum_{i=1}^{m} X_i \, d_{ji}}{\sqrt{\sum_{i=1}^{m} X_i^2} \, \sqrt{\sum_{i=1}^{m} d_{ji}^2}}.$$
(4) Select the $k$ samples with the largest of the $N$ similarities $\mathrm{SIM}(X, d_j)$; they form the k-NN set of $X$. The probability that $X$ belongs to category $C_l$ then has the following mathematical form:
$$P(X, C_l) = \sum_{d_j \in k\text{-NN}(X)} \mathrm{SIM}(X, d_j) \, y(d_j, C_l),$$
where $y(d_j, C_l)$ is the category attribute function, which satisfies
$$y(d_j, C_l) = \begin{cases} 1, & d_j \in C_l, \\ 0, & d_j \notin C_l. \end{cases}$$
(5) Finally, assign the sample $X$ to the category with the largest value of $P(X, C_l)$.
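The five steps above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' MATLAB implementation; the function names `cosine_similarity` and `knn_classify`, the two-feature training vectors, and the labels are all hypothetical.

```python
import math
from collections import defaultdict


def cosine_similarity(x, d):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(x, d))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in d))
    return dot / norm if norm else 0.0


def knn_classify(x, train_vectors, train_labels, k=3):
    """Assign x to the category with the largest summed similarity
    among its k most similar training samples (steps 3 to 5)."""
    sims = [(cosine_similarity(x, d), label)
            for d, label in zip(train_vectors, train_labels)]
    sims.sort(key=lambda pair: pair[0], reverse=True)  # most similar first
    scores = defaultdict(float)
    for sim, label in sims[:k]:
        scores[label] += sim  # similarity-weighted vote per category
    return max(scores, key=scores.get)


# Hypothetical two-feature training set (not data from the paper).
train_vectors = [(1.0, 0.1), (0.9, 0.2), (0.1, 1.0), (0.2, 0.9)]
train_labels = ["normal", "normal", "epileptic", "epileptic"]
print(knn_classify((0.15, 0.95), train_vectors, train_labels, k=3))
```

Weighting each vote by its similarity, rather than counting neighbours equally, means that a single very close neighbour can outweigh two distant ones.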

In the k-NN classifier, the distance between two data points is measured by a distance metric; the metrics considered here are the Euclidean distance, cityblock distance, cosine distance, and correlation distance. In what follows, $x = (x_1, \ldots, x_m)$ and $y = (y_1, \ldots, y_m)$ denote two feature vectors.

In statistical mathematics, the Euclidean distance is the distance between two points in Euclidean space, which is a metric space whose norm is commonly known as the Euclidean norm. The Euclidean distance is
$$d_{\mathrm{euc}}(x, y) = \sqrt{\sum_{i=1}^{m} (x_i - y_i)^2}.$$

The cityblock distance, also known as the Manhattan length [21], is the sum of the absolute differences of the Cartesian coordinates of two points:
$$d_{\mathrm{city}}(x, y) = \sum_{i=1}^{m} |x_i - y_i|.$$

Cosine distance is the complement of the cosine similarity in positive space:
$$d_{\cos}(x, y) = 1 - \frac{\sum_{i=1}^{m} x_i y_i}{\sqrt{\sum_{i=1}^{m} x_i^2} \, \sqrt{\sum_{i=1}^{m} y_i^2}}.$$

Correlation distance is one minus the sample correlation between the two feature vectors:
$$d_{\mathrm{corr}}(x, y) = 1 - \frac{\sum_{i=1}^{m} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{m} (x_i - \bar{x})^2} \, \sqrt{\sum_{i=1}^{m} (y_i - \bar{y})^2}},$$
where $\bar{x} = \frac{1}{m} \sum_{i=1}^{m} x_i$ and $\bar{y} = \frac{1}{m} \sum_{i=1}^{m} y_i$.
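As an illustration, the four distance measures can be written directly from their textbook definitions. The sketch below is plain Python with hypothetical function names; it assumes equal-length vectors and, for the cosine and correlation distances, vectors that are not constant or all zero (otherwise the denominator vanishes).

```python
import math


def euclidean(x, y):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cityblock(x, y):
    """Sum of absolute coordinate differences (Manhattan length)."""
    return sum(abs(a - b) for a, b in zip(x, y))


def cosine_distance(x, y):
    """One minus the cosine of the angle between x and y."""
    dot = sum(a * b for a, b in zip(x, y))
    norms = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / norms


def correlation_distance(x, y):
    """One minus the sample correlation: cosine distance of the centred vectors."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    return cosine_distance([a - mean_x for a in x], [b - mean_y for b in y])


print(euclidean((0, 0), (3, 4)), cityblock((0, 0), (3, 4)))
```

The correlation distance ignores both the offset and the scale of the two vectors, whereas the Euclidean and cityblock distances are sensitive to both, which is one reason feature normalization matters for the latter two.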

The statistical features used for the classification using the k-NN classifier in this research are described below.

2.2. Approximate Entropy (ApEn)

ApEn is a statistical feature that indicates how predictable the current amplitude values of a physiological signal, for example, EEG, are from its earlier amplitude values. The value of ApEn drops sharply during an epileptic seizure, and this property is used to detect epileptic seizures. A high value of approximate entropy signifies more irregularity; on the contrary, a low value signifies that the time series is deterministic. When applied to EEG signals, ApEn reflects the intracortical information flow in the brain [22, 23]. The value of ApEn can be calculated by using
$$\mathrm{ApEn}(m, r, N) = \Phi^{m}(r) - \Phi^{m+1}(r),$$
where $m$ is the embedding dimension, $r$ is the tolerance, $N$ is the number of samples, and $\Phi^{m}(r)$ is the average logarithm of the fraction of length-$m$ patterns that lie within $r$ of each pattern.

Mathematical procedures of approximate entropy (ApEn) calculation are described in a flow chart [23, 24] in Figure 3.
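The procedure can be sketched as follows. This is a direct, unoptimized reading of the standard ApEn definition in plain Python, not the authors' code; the defaults `m=2` and `r=0.2` are common illustrative choices rather than values taken from the paper, and `r` is expressed in the same units as the signal.

```python
import math


def approximate_entropy(signal, m=2, r=0.2):
    """ApEn(m, r, N) = phi_m(r) - phi_{m+1}(r)."""
    n = len(signal)

    def phi(dim):
        # All overlapping template vectors of length dim.
        templates = [signal[i:i + dim] for i in range(n - dim + 1)]
        total = 0.0
        for a in templates:
            # Count templates within Chebyshev distance r of a
            # (the self-match is included, so the count is never zero).
            matches = sum(
                1 for b in templates
                if max(abs(u - v) for u, v in zip(a, b)) <= r
            )
            total += math.log(matches / len(templates))
        return total / len(templates)

    return phi(m) - phi(m + 1)


# A strictly alternating signal is highly regular, so its ApEn is close to zero.
print(approximate_entropy([0.0, 1.0] * 50))
```

The running time grows quadratically with the number of samples, which is acceptable for short EEG epochs but would call for a vectorized implementation on long recordings.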

Figure 3: Computational flow diagram for ApEn.
2.3. Standard Deviation (SD) and Standard Error (SE)

The standard deviation (SD) is the square root of the variance of a random variable, statistical population, data set, or probability distribution. It can be defined for any distribution with finite first two moments, and for a data set it can be computed (written here in the sample form, with $N - 1$ in the denominator) by using
$$\mathrm{SD} = \sqrt{\frac{1}{N - 1} \sum_{n=1}^{N} (x_n - \bar{x})^2},$$
where $N$ is the number of samples in the data set, $x_n$ is the actual value of the nth term in the data set, and $\bar{x}$ is the average value of the data set. The standard error (SE) of the mean is the standard deviation of the sample mean, that is, a measure of how far the sample mean is likely to lie from the population mean. SE is calculated using
$$\mathrm{SE} = \frac{\mathrm{SD}}{\sqrt{N}}.$$
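A short sketch of both quantities using Python's standard library is given below. It assumes the sample form of the standard deviation (divisor N − 1), which is what `statistics.stdev` computes; the function names and the example values are hypothetical.

```python
import math
import statistics


def standard_deviation(samples):
    """Sample standard deviation (divisor N - 1, an assumption of this sketch)."""
    return statistics.stdev(samples)


def standard_error(samples):
    """Standard error of the mean: SD divided by the square root of N."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical amplitude values, not data from the paper.
epoch = [2, 4, 4, 4, 5, 5, 7, 9]
print(standard_deviation(epoch), standard_error(epoch))
```

If the population form (divisor N) is wanted instead, `statistics.pstdev` can be substituted for `statistics.stdev`.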

2.4. Modified Mean Absolute Value (MMAV)

Mean absolute value (MAV) is the moving average of the full-wave rectified data, that is, the average of the absolute values of the data set. MMAV is an extension of MAV in which each individual value is multiplied by a weighting function $w_n$ [24]; it can be determined by
$$\mathrm{MMAV} = \frac{1}{N} \sum_{n=1}^{N} w_n \, |x_n|,$$
where $N$ is the number of samples and $x_n$ is the nth sample.
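A minimal sketch of a weighted mean absolute value follows. The weighting used here (1.0 for the central half of the window and 0.5 for the outer quarters) is one common convention in the signal-feature literature and is an assumption of this example; the weighting actually used in the paper is not recoverable from the text.

```python
def mmav(samples):
    """Weighted mean absolute value of a window of samples."""
    n_total = len(samples)
    total = 0.0
    for n, x in enumerate(samples, start=1):
        # Assumed weights: full weight in the central half, half weight elsewhere.
        weight = 1.0 if 0.25 * n_total <= n <= 0.75 * n_total else 0.5
        total += weight * abs(x)
    return total / n_total


# Hypothetical eight-sample window.
print(mmav([1, -1, 1, -1, 1, -1, 1, -1]))
```

Down-weighting the edges makes the feature less sensitive to activity that only partly falls inside the analysis window.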

2.5. Roll-Off

In general, roll-off is the steepness of a transmission function with frequency; as a signal feature, it is defined as the frequency below which 85% of the magnitude distribution of the data set is concentrated [24]. It is also a measure of spectral shape, which can be written mathematically as
$$\sum_{n=1}^{R} M[n] = 0.85 \sum_{n=1}^{N} M[n],$$
where $M[n]$ is the magnitude of the nth frequency bin, $N$ is the number of bins, and $R$ is the roll-off bin.
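The definition can be illustrated with a short Python sketch that takes a precomputed magnitude spectrum and returns the first bin at which the cumulative magnitude reaches 85% of the total. The function name `rolloff_bin` and the example spectrum are hypothetical, and bins are indexed from zero here.

```python
def rolloff_bin(magnitudes, fraction=0.85):
    """Index of the first bin where the cumulative magnitude reaches
    the given fraction of the total magnitude."""
    if not magnitudes:
        raise ValueError("magnitude spectrum must not be empty")
    threshold = fraction * sum(magnitudes)
    running = 0.0
    for index, value in enumerate(magnitudes):
        running += value
        if running >= threshold:
            return index
    return len(magnitudes) - 1  # guard against floating-point shortfall


# Hypothetical four-bin spectrum: the threshold 8.5 is first reached at bin 2.
print(rolloff_bin([4, 3, 2, 1]))
```

To convert the returned bin index to a frequency in hertz, multiply it by the bin spacing, that is, the sampling rate divided by the transform length.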

2.6. Zero Crossing (ZC)

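Zero crossing (ZC) is a feature, computed in the time domain, that reflects the frequency content of the data set: it counts the number of times that the amplitude of the data crosses the zero level of the y-axis [24]. It can be expressed mathematically as
$$\mathrm{ZC} = \frac{1}{2} \sum_{n=1}^{N-1} \left| \operatorname{sgn}(x_{n+1}) - \operatorname{sgn}(x_n) \right|,$$
where $\operatorname{sgn}(x) = 1$ for $x \geq 0$ and $\operatorname{sgn}(x) = -1$ otherwise.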
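A minimal Python sketch of the zero-crossing count is shown below. It treats a sample that is exactly zero as non-negative, which is a convention chosen for this example; the function name is hypothetical.

```python
def zero_crossings(samples):
    """Number of sign changes between consecutive samples
    (a value of exactly zero is treated as non-negative)."""
    count = 0
    for prev, curr in zip(samples, samples[1:]):
        if (prev < 0 <= curr) or (prev >= 0 > curr):
            count += 1
    return count


# Hypothetical signal that changes sign at every step.
print(zero_crossings([1, -1, 1, -1]))
```

In practice a small amplitude threshold is often added so that low-level noise around zero does not inflate the count; that refinement is omitted here.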

2.7. Regression Analysis

In mathematics, regression analysis is the procedure for finding the mathematical relationship between a dependent variable and one or more independent variables. In limited conditions, regression analysis can be used to infer causal relationships between the independent and dependent variables. However, in many applications, especially with small effects or questions of causality based on observational data, regression methods can give misleading results. A polynomial regression model fitted by the method of linear least squares has the form
$$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \cdots + \beta_k X^k,$$
where $Y$ represents the predicted outcome value for the polynomial model with regression coefficients $\beta_1$ to $\beta_k$ for the kth-order polynomial and intercept $\beta_0$.
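To make the fitting step concrete, the sketch below fits a kth-order polynomial by solving the least-squares normal equations in plain Python. It is an illustration only: the function names are hypothetical, the data are synthetic, and for real work a numerically more robust library routine would normally be preferred, because the normal equations become ill-conditioned as the order grows.

```python
def polyfit_least_squares(xs, ys, order):
    """Return coefficients [b0, b1, ..., bk] minimising the squared error."""
    size = order + 1
    # Normal equations (V^T V) b = V^T y, with V the Vandermonde matrix.
    matrix = [[sum(x ** (i + j) for x in xs) for j in range(size)]
              for i in range(size)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(size)]
    # Gaussian elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda row: abs(matrix[row][col]))
        matrix[col], matrix[pivot] = matrix[pivot], matrix[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for row in range(col + 1, size):
            factor = matrix[row][col] / matrix[col][col]
            for j in range(col, size):
                matrix[row][j] -= factor * matrix[col][j]
            rhs[row] -= factor * rhs[col]
    # Back substitution.
    coeffs = [0.0] * size
    for row in range(size - 1, -1, -1):
        tail = sum(matrix[row][j] * coeffs[j] for j in range(row + 1, size))
        coeffs[row] = (rhs[row] - tail) / matrix[row][row]
    return coeffs


def polyval(coeffs, x):
    """Evaluate b0 + b1*x + ... + bk*x**k."""
    return sum(b * x ** i for i, b in enumerate(coeffs))


# Synthetic data generated from Y = 1 + 2X + 3X^2 (not data from the paper).
xs = [0, 1, 2, 3, 4]
ys = [1, 6, 17, 34, 57]
print(polyfit_least_squares(xs, ys, 2))
```

With age as the independent variable and ApEn as the dependent variable, the returned coefficients would play the role of the intercept and regression coefficients in the polynomial model.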

3. Proposed Research Architecture

The overall proposed methodology is summarized in the flow diagram shown in Figure 4. The epileptic EEG signal is loaded into the MATLAB workspace to compute its feature vector, which consists of approximate entropy (ApEn), standard deviation (SD), standard error (SE), modified mean absolute value (MMAV), roll-off, and zero crossing (ZC). These feature vectors are classified against the standard feature vector using the k-NN classifier. After classification of epilepsy, a regression equation is used to find the level of ApEn for epileptic subjects of different ages from the epileptic EEG. A highly irregular time series EEG signal gives a higher value of ApEn and vice versa; that is, a higher value of ApEn indicates more irregularity of the epileptic EEG signal. The level of ApEn increases with the age of the epileptic patients. Finally, the error of each fit is measured to find the best-fitting equation, which is taken as the most suitable and optimized regression equation for the prediction.

Figure 4: Proposed flow diagram of the research work.

4. Results and Discussions

4.1. Classification and Clustering Using k-NN

The epileptic EEG data is processed to obtain the feature vectors, and a template, shown in Table 1, is then formed for training the k-NN network. In Table 1, each column holds one normalized feature and each row corresponds to one subject used for training the network. In Figure 5(a), the nearest neighbours are determined by the trained k-NN network; the arrows indicate the nearest neighbours, the blue squares indicate the training feature set, and the red diamonds are the query points whose nearest neighbours are sought. In Figure 5(b), the cluster of the feature vectors (ApEn, MMAV, SD, SE, roll-off, and ZC) obtained from the k-NN classification is represented by a circle. To accomplish the research goal, one desired standard feature point is set as a reference, the k-NN network is trained, and its clustering circle is determined around the point of interest. In Figure 5(b), the nearest neighbours inside the circle are determined to find the closest approximation of the epileptic EEG signal from the feature vectors (ApEn, MMAV, SD, SE, roll-off, and ZC); for this, one feature vector from normal EEG data (free from epilepsy) of the patient is required.

Table 1: Training and testing template of feature vectors of epileptic EEG data.
Figure 5: Classification using k-NN classifier. (a) Nearest neighbour searching. (b) Clustering with k-nearest neighbour.
4.2. Accuracy Analysis of k-NN Classifier

In this research work, four distance measures, namely, cityblock, correlation, cosine, and Euclidean, are used, and their performance is analysed while the other parameters are kept constant. Similarly, the performance of the classifier rules, namely, nearest neighbour (NN), random neighbour (RN), and smallest neighbour (SN), as well as of different k values, is analysed while the corresponding parameters are kept constant. The performance of the k-NN classifier, summarized as a confusion matrix, is shown in Tables 2 and 3 and Figure 6. From Tables 2 and 3 as well as Figure 6, it is concluded that the lowest classification rate is found for the "cityblock" distance at a particular value of k when the classifier rule is smallest neighbour (SN).
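For reference, the percentage accuracy reported from a confusion matrix is the share of correctly classified samples, that is, the sum of the diagonal divided by the sum of all entries. The sketch below assumes this standard definition; the matrix values are hypothetical and are not results from Tables 2 and 3.

```python
def accuracy_from_confusion(matrix):
    """Percentage of samples on the diagonal of a square confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total


# Hypothetical two-class confusion matrix (rows: true class, columns: predicted).
confusion = [[8, 2],
             [1, 9]]
print(accuracy_from_confusion(confusion))
```

Repeating this calculation for each combination of distance measure, classifier rule, and k gives one accuracy figure per configuration, which is the kind of comparison the tables present.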

Table 2: Percentage of accuracy for different numbers of nearest neighbours k, with the other parameters kept constant.
Table 3: Percentage of accuracy for different classification rules, with the other parameters kept constant.
Figure 6: Presentation of confusion matrix for various k values, distance types, and classification rules.
4.3. Regression Model for Level of Epilepsy

The 3rd-order fitting of the approximate entropy (ApEn) is shown in Figure 7(a). The corresponding regression equation is given in (15). By substituting the age of an epileptic person into this equation, we may estimate the degree of randomness of the EEG signal.

Figure 7: (a) 3rd-order fitting and (b) residual of ApEn with different ages of subjects.

The difference between the actual value and the predicted value of the dependent variable is called the residual, which is a measure of the accuracy of prediction. The residual of the 3rd-order fitting is shown in Figure 7(b) and its regression equation is (16). From this equation, we may find the error of prediction at any age of the epileptic persons.

In a similar manner, the 4th-order fitting of the approximate entropy (ApEn) is shown in Figure 8(a). The corresponding regression equation is given in (17). By substituting the age of an epileptic person into this equation, we may estimate the degree of randomness of the EEG signal.

Figure 8: (a) 4th-order fitting and (b) residual of ApEn with different ages of subjects.

The residual of the 4th-order fitting is shown in Figure 8(b) and its regression equation is (18). From this equation, we may find the error of prediction at any age of the epileptic persons.

4.4. Error Analysis of Prediction

Table 4 shows the error of prediction; the accuracy of prediction (interpretation) is highest for the 3rd-order fitting. The 1st-order fitting is a linear fitting, which has a higher error probability and a larger residual than the other fittings of ApEn. From the table, it is noticed that increasing the order of fitting may reduce the error probability, but beyond the 3rd-order fitting, the error probability as well as the computational complexity increases. Hence, the optimum prediction equation for the epileptic seizure is the 3rd-order one, which has less computational complexity and less error probability than the 4th-order fitting. From the table, it is also notable that the prediction error is larger at younger ages of the epileptic subjects, because the epileptic EEG becomes more severe with increasing age.
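The percentage deviation used to compare fittings can be illustrated as the absolute difference between the predicted and the measured ApEn, expressed as a percentage of the measured value. This definition is an assumption of the sketch, since the paper's exact formula is not recoverable from the text, and the numbers below are hypothetical.

```python
def percent_deviation(predicted, actual):
    """Absolute prediction error as a percentage of the actual value."""
    return 100.0 * abs(predicted - actual) / abs(actual)


# Hypothetical predicted and measured values for one test age.
print(percent_deviation(110, 100))
```

Evaluating this quantity for each order of fitting at each test age, and then comparing the values across orders, reproduces the kind of comparison summarized in Table 4.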

Table 4: Error (% deviation) calculation for different orders of fitting and for different test values (age) of subjects.

5. Conclusions

The electrophysiological activity of the brain, recorded as the EEG signal, can be analysed for the prediction and diagnosis of epilepsy in living animals. The epileptic EEG signal is highly random, and EEG containing epileptic activity is not suitable for brain-computer interface (BCI) paradigms. Hence, prediction of epilepsy is a vital issue in the modern biomedical field of research. For the prediction of epilepsy, a statistical approach was explained in this manuscript. In our research, the epileptic EEG signals of epileptic subjects of different ages were analysed, and one of the vital features, approximate entropy (ApEn), which is an indicator of the randomness of any time domain signal, was measured. The regression equation of ApEn with respect to the age of the epileptic persons may help BCI researchers or neural researchers to predict the randomness, namely, the level of epilepsy, corresponding to different ages. This may help clinicians to plan the treatment of an epileptic person after finding the level of randomness.

Conflicts of Interest

The authors declare no conflict of interest.


  1. R. Appleton and T. Marson, Epilepsy: The Facts (Facts), Oxford University Press, 2009.
  2. M. F. Mendez, P. Catanzaro, B. C. Doss, B. Arguello, and W. H. Frey, “Seizures in Alzheimer’s disease: clinicopathologic study,” Journal of Geriatric Psychiatry and Neurology, vol. 7, no. 4, pp. 230–233, 2016.
  3. B. Liu, L. Yan, L. Li, and W. Wang, “Comparing study of nonlinear model for epileptic preictal prediction,” in 2010 4th International Conference on Bioinformatics and Biomedical Engineering, pp. 1–4, Chengdu, June 2010.
  4. B. S. Chang and D. H. Lowenstein, “Epilepsy,” The New England Journal of Medicine, vol. 349, pp. 1257–1266, 2003.
  5. R. S. Fisher, C. Acevedo, A. Arzimanoglou et al., “ILAE official report: a practical clinical definition of epilepsy,” Epilepsia, vol. 55, no. 4, pp. 475–482, 2014.
  6. R. S. Fisher, W. V. Boas, W. Blume et al., “Epileptic seizures and epilepsy: definitions proposed by the International League Against Epilepsy (ILAE) and the International Bureau for Epilepsy (IBE),” Epilepsia, vol. 46, no. 4, pp. 470–472, 2005.
  7. D. Longo, A. Fauci, D. Kasper, S. Hauser, J. Jameson, and J. Loscalzo, Harrison’s Principles of Internal Medicine, McGraw-Hill Professional, 19th edition, 2011.
  8. “Epilepsy,” September 2015,
  9. D. J. Thurman, E. Beghi, and C. E. Begley, “Standards for epidemiologic studies and surveillance of epilepsy,” Epilepsia, vol. 52, Supplement 7, pp. 2–26, 2011.
  10. J. R. Hughes, “Absence seizures: a review of recent reports with new concepts,” Epilepsy & Behavior, vol. 15, no. 4, pp. 404–412, 2009.
  11. U. R. Acharya, F. Molinari, and S. V. Sree, “Automated diagnosis of epileptic EEG using entropies,” Biomedical Signal Processing and Control, vol. 7, no. 4, pp. 401–408, 2012.
  12. T. Glauser, E. Ben-Menachem, and B. Bourgeois, “ILAE treatment guidelines: evidence-based analysis of antiepileptic drug efficacy and effectiveness as initial monotherapy for epileptic seizures and syndromes,” Epilepsia, vol. 47, no. 7, pp. 1094–1120, 2006.
  13. G. Giannakakis, V. Sakkalis, M. Pediaditis, and M. Tsiknakis, “Methods for seizure detection and prediction: an overview,” Modern Electroencephalographic Assessment Techniques, vol. 91, pp. 131–157, 2015.
  14. V. Sakkalis, G. Giannakakis, C. Farmaki et al., “Absence seizure epilepsy detection using linear and nonlinear EEG analysis methods,” in 2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Osaka, Japan, July 2013.
  15. U. R. Acharya, H. Fujita, V. K. Sudarshan, S. Bhat, and J. E. W. Koh, “Application of entropies for automated diagnosis of epilepsy using EEG signals: a review,” Knowledge-Based Systems, vol. 88, pp. 85–96, 2015.
  16. F. Lotte, M. Congedo, A. Lecuyer, and F. Lamarche, “A review of classification algorithms for EEG-based brain–computer interfaces,” Journal of Neural Engineering, vol. 4, no. 2, article R1, 2007.
  17. R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, John Wiley & Sons, 2012.
  18. M. J. Islam, Q. J. Wu, M. Ahmadi, and M. Sid-Ahmed, “Investigating the performance of naive-Bayes classifiers and k-nearest neighbor classifiers,” in 2007 International Conference on Convergence Information Technology (ICCIT 2007), pp. 1541–1546, Gyeongju, 2007.
  19. N. Suguna and K. Thanushkodi, “An improved K-nearest neighbor classification using genetic algorithm,” International Journal of Computer Science Issues, vol. 7, no. 2, pp. 18–21, 2010.
  20. B. V. Dasarathy, Nearest Neighbor (NN) Norms: NN Pattern Classification Techniques, IEEE Computer Society Press, 1991.
  21. “Taxicab geometry,”
  22. H. Ocak, “Automatic detection of epileptic seizures in EEG using discrete wavelet transform and approximate entropy,” Expert Systems with Applications, vol. 36, no. 2, pp. 2027–2036, 2009.
  23. C. Cobelli and E. Carson, Introduction to Modeling in Physiology and Medicine, Academic Press, 2008.
  24. M. A. Riheen, M. W. Rahman, and A. B. M. Aowlad Hossain, “Selection of proper frequency band and compatible features for left and right hand movement from EEG signal analysis,” in 2013 16th International Conference on Computer and Information Technology (ICCIT), pp. 272–277, Khulna, 2013.