Abstract

An automatic artifact extraction system is proposed based on a hybridization of Stone’s BSS and a genetic algorithm. This hybridization is called the evolutionary Stone’s BSS algorithm (ESBSS). The original Stone’s BSS uses constant short- and long-term half-life parameters; changes in these parameters directly affect the separated signals, and there is no established way to determine the best values. The genetic algorithm is a suitable technique to overcome this problem: it starts from randomly generated half-life parameters and searches for the optimum values for Stone’s BSS. The proposed system automatically extracts common artifacts, such as ocular and heartbeat artifacts, from EEG mixtures without damaging the data; no notch filter is used in the proposed system, so that no useful information is lost.

1. Introduction

Electrical activities of the brain are usually measured by electroencephalogram (EEG) to describe the state of the patient’s brain. Visual analysis of EEG activity by technicians is very difficult because this activity is submerged in artifacts [1]. Artifacts are one of the limitations of the EEG acquisition unit and may be mistaken for wanted data in brain signal analysis or in a brain computer interface (BCI) system [2].

The common artifacts in EEG signals are power line noise interference (LN), electrocardiogram (ECG), and electrooculogram (EOG) [3]. Numerous approaches have been developed in the time, frequency, and time-frequency domains to remove or separate these artifacts [4].

Many researchers have used blind source separation (BSS) techniques to separate the artifacts from brain signals [5]. An automatic approach for removing EOG artifacts from EEG data based on BSS is presented in [6]. In [7], two ICA algorithms, InfoMax (IICA) and Extended-InfoMax (EIICA), were used to extract eye movements and 50 Hz power line noise from EEG data; EIICA can isolate both super-Gaussian artifacts (eye blinks) and sub-Gaussian signals (power line noise interference), whereas IICA is restricted to removing super-Gaussian artifacts (eye blinks). BSS and parallel factor analysis (PFA) are integrated to reject the EEG artifacts [8, 9]. Wavelet transforms (WT) with independent component analysis (ICA) and a statistical autoregressive moving average model have been used to reject the artifacts [10]. Pesin [11] demonstrates an approach that recognizes and rejects eye blink artifacts in EEG by integrating a wavelet technique with FastICA to locate the temporal position of each eye blink and then remove it. In recent years, studies have extracted the EEG data from EEG mixtures using modified BSS algorithms; for example, [2] proposes a complete artifact rejection system based on constrained independent component analysis (cICA) to separate ECG and EOG artifacts from EEG signals measured inside MRI.

ICA algorithms are commonly used in EEG signal processing; the most widely used are Infomax [12], FastICA [13], SOBI [14], and BGSEP [15]. SOBI and BGSEP use second-order statistics, whereas InfoMax and FastICA use higher-order statistics [1]. A comparison study of ICA algorithms was presented in [16]. ICA algorithms have inherent disadvantages: (i) source ambiguity, (ii) undetermined variances of the components, and (iii) degraded performance when the dataset is small, while with a large dataset the redundancy is not sufficient to recover the independent components [2].

Stone’s BSS is used instead of ICA because of these limitations [17–19]; it was first proposed by Stone [20]. Many researchers have discussed and modified it to enhance the separation process [17, 21–23]. Stone’s BSS was used successfully to extract the ocular artifact from an EEG mixture [19].

Stone’s BSS algorithm is based on a temporal predictability measure to recover the sources from the mixture. Short- and long-term half-life parameters are used to calculate the temporal predictability of the signal; these parameters are chosen so that the long-term half-life is 100 times longer than the corresponding short-term half-life. Changes in these parameters affect the output, and there is no technique to calculate the best values.

Evolutionary algorithms such as the genetic algorithm (GA) and particle swarm optimization (PSO) have been used with partial success to solve the BSS problem in some applications, but two issues arise when using evolutionary algorithms for BSS: (i) randomly generated initial coefficients of the separating matrix may not yield good candidate solutions; (ii) the search is relatively slow because of the large population size.

To overcome the limitations of both the original Stone’s BSS and the genetic algorithm, the proposed algorithm hybridizes Stone’s BSS with a genetic algorithm.

A simple ad hoc criterion called the sparsity measure, proposed in [1], is used in the proposed system to classify each extracted signal as artifact or not. This criterion targets high-amplitude, short-duration artifacts such as EOG and ECG.

In this paper, an automatic artifact extraction system is proposed to clean the brain mixtures of common artifacts. This topic is of importance to those working in brain signal analysis.

2. EEG Signals and Artifacts

The EEG system measures the brain signals by electrodes placed on the head surface (scalp); these electrodes (channels) are commonly arranged according to the international 10–20 system as shown in Figure 1 [24]. This system has been adopted by the American Electroencephalographic Society. In this system there are two reference points, the nasion and the inion, which define the electrode locations. The channel name indicates a specific brain region: (Fp) frontal polar, (F) frontal, (C) central region, (P) parietal, (Pg) nasopharyngeal, (O) occipital area, and (A) ear lobe [24, 25]. The neurophysiological signals measured by EEG vary in amplitude, frequency, and shape [26]. The frequency of the EEG signal can be divided into 5 subbands as shown in Table 1 [11, 24, 27].

The EEG signal is a highly nonstationary, random, weak signal (Figure 2(a)). The amplitude of EEG brain signals is divided into three divisions. Artifacts represent one of the limitations in brain signal analysis and may be mistaken for a brain signal. The common artifacts in EEG signal analysis are described below.

2.1. Power Line Noise Interference

Brain EEG signals are often contaminated by power line noise interference (50 or 60 Hz AC power supply). This signal has a monomorphic waveform and is distributed over several electrodes. The power line noise is generated by wires, fluorescent lights, and other equipment in the recording system. Fluorescent lighting usually produces artificial spikes in the signals recorded from the brain. Figure 2(b) shows an EEG signal submerged in the power line noise waveform [3, 28].

2.2. Electrocardiography (ECG)

Cardiac activity has high electrical energy and a pronounced effect on EEG signals. The ECG artifact appears as regular spikes in the EEG recording as shown in Figure 2(c). These types of artifacts may be clinically misleading [29]. ECG artifacts, or heartbeat artifacts, are produced when an electrode is placed on or near a blood vessel [27, 30].

2.3. Electrooculogram (EOG)

The electrical activity produced by eye blinks or eye movement is known as the electrooculogram (EOG) artifact or ocular artifact (OA). An electrical dipole is formed by the positive cornea and the negative retina of the eye, and movements or blinks of the eye change this dipole and produce EOG artifacts [19, 31]. Eye blinks have a spike shape (Figure 2(d)) while eye movements have a square shape (Figure 2(e)). The frequency of the eye blink artifact is lower than 4 Hz. Eye blinks have low propagation but eye movements have high propagation [32]. For clinical interpretation these artifacts should be removed from the EEG data. The EOG signal is measured by EOG electrodes placed above and below the eye as shown in Figure 3 [30].

3. Artifact Rejection Methods

During the recording process the data are contaminated by different types of artifacts. These artifacts should be removed before analyzing the EEG signal. There are many techniques used for this purpose.

3.1. Manual Method

This is a very simple method to eliminate the artifacts from EEG mixtures. The method is governed by the following condition: if an artifact exists in an epoch, then remove the corrupted epoch. Important data will be lost during the removal process, particularly when a limited amount of data is available or many artifacts are submerged in the EEG signals [33].
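The epoch rejection rule can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of [33]: the function name `reject_epochs`, the peak-amplitude test used to decide that an epoch contains an artifact, and the threshold value are all assumptions made for the example.

```python
import numpy as np

def reject_epochs(eeg, epoch_len, threshold):
    """Drop every epoch whose peak absolute amplitude exceeds `threshold`.

    eeg has shape (channels, samples). The amplitude test is an assumed,
    illustrative stand-in for whatever artifact detector is actually used.
    """
    n_epochs = eeg.shape[1] // epoch_len
    epochs = eeg[:, :n_epochs * epoch_len].reshape(eeg.shape[0], n_epochs, epoch_len)
    peak = np.abs(epochs).max(axis=(0, 2))   # one peak value per epoch
    keep = peak <= threshold                 # True for epochs judged clean
    return epochs[:, keep, :], keep

# Synthetic two-channel recording with one large artifact in epoch index 4.
rng = np.random.default_rng(0)
eeg = rng.normal(0.0, 1.0, size=(2, 1000))
eeg[0, 450:460] += 50.0
clean, keep = reject_epochs(eeg, epoch_len=100, threshold=10.0)
```

The whole corrupted epoch is discarded, including its artifact-free samples on the other channel, which is exactly the data loss described above.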

3.2. Filtering Method

This method depends on the analysis of frequency characteristic of EEG signal and artifacts. The frequency features provide efficient information for identifying the artifacts, but the spectra of artifacts are overlapped with EEG signal spectra. Therefore, the important data may be lost during the filtering process [34].

3.3. Regression Method

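The regression method removes the artifact from the contaminated EEG by subtraction. The procedure for regression analysis to remove the ocular artifact is defined in [31]. The recorded EEG signal ($\mathrm{EEG}_r$) can be described as the sum of the original EEG signal ($\mathrm{EEG}_o$) and a fraction ($\beta$) of the EOG signal [31, 35]:

$$\mathrm{EEG}_r = \mathrm{EEG}_o + \beta\,\mathrm{EOG}. \tag{2}$$

The correlation ($r$) at zero lag between the EOG signal and the observed EEG is given by

$$r = E[\mathrm{EEG}_r \cdot \mathrm{EOG}]. \tag{3}$$

Substitute (2) in (3):

$$r = E[(\mathrm{EEG}_o + \beta\,\mathrm{EOG}) \cdot \mathrm{EOG}]. \tag{4}$$

Equating (3) and (4) provides

$$E[\mathrm{EEG}_r \cdot \mathrm{EOG}] = E[\mathrm{EEG}_o \cdot \mathrm{EOG}] + \beta\,E[\mathrm{EOG}^2]. \tag{5}$$

However, in the regression technique, it is assumed that there is no correlation between $\mathrm{EEG}_o$ and EOG; therefore,

$$E[\mathrm{EEG}_o \cdot \mathrm{EOG}] = 0. \tag{6}$$

Substitute (6) in (5):

$$E[\mathrm{EEG}_r \cdot \mathrm{EOG}] = \beta\,E[\mathrm{EOG}^2]. \tag{7}$$

From (7) the value of the propagation factor ($\beta$) can be calculated by

$$\beta = \frac{E[\mathrm{EEG}_r \cdot \mathrm{EOG}]}{E[\mathrm{EOG}^2]}. \tag{8}$$

The $\mathrm{EEG}_o$ signal can be calculated by inserting the propagation factor in (2):

$$\mathrm{EEG}_o = \mathrm{EEG}_r - \beta\,\mathrm{EOG}. \tag{9}$$

The regression approach is very easy to implement, but several assumptions must be satisfied [11, 33, 36]:
(i) The EEG signal and the artifacts are uncorrelated.
(ii) The EEG signal is a linear combination.
(iii) The artifact signal must not contain any brain activity, so that no data are lost in the subtraction.
(iv) The propagation factors are the same for different artifacts.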
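As a minimal sketch of the regression procedure, the propagation factor can be estimated as the ratio of the zero-lag EEG/EOG cross-product to the EOG power, and the scaled EOG then subtracted from the recording. The function name `regress_out_eog` and the synthetic signals are assumptions made for illustration; sample means stand in for the expectations.

```python
import numpy as np

def regress_out_eog(eeg_r, eog):
    """Estimate the propagation factor and subtract the scaled EOG."""
    eeg_r = eeg_r - eeg_r.mean()
    eog = eog - eog.mean()
    # Zero-lag cross-product divided by the EOG power.
    beta = np.dot(eeg_r, eog) / np.dot(eog, eog)
    return eeg_r - beta * eog, beta

# Synthetic check: uncorrelated "EEG" plus 30% of a strong "EOG" signal.
rng = np.random.default_rng(1)
eeg_o = rng.normal(size=5000)
eog = 10.0 * rng.normal(size=5000)
eeg_r = eeg_o + 0.3 * eog
eeg_hat, beta = regress_out_eog(eeg_r, eog)
```

The estimate is accurate here only because the synthetic EEG and EOG are uncorrelated by construction; if the EOG channel also picks up brain activity, that activity is subtracted as well.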

3.4. Blind Source Separation Method

In the EEG acquisition unit, the electrodes are placed close together on the scalp, and each electrode senses a mixture of brain stimuli that depends on its distance from the sources as shown in Figure 4 [2].

Many sources (neurons) are stimulated for any action in the brain and there is no information about the sources and the mixing procedure which happened inside the brain. Brain signal analysis is a blind source separation problem as mentioned in [2]. Typical BSS mixing model is shown in Figure 5 [18].

The mixing system without noise is

$$x(t) = A\,s(t),$$

where $x(t) = [x_1(t), \dots, x_M(t)]^T$ are the mixed signals from the sensors (known), $s(t) = [s_1(t), \dots, s_N(t)]^T$ are the source signals (unknown), superscript $T$ refers to the transpose operator, $A$ is a mixing matrix (unknown), and the symbol $t$ is the time or sample index. The goal is to recover $s$ from $x$ without knowing $A$; to solve this problem the separating matrix $W$ should be found to calculate the recovered signals by

$$y(t) = W\,x(t),$$

where $y$ is a permutation of the source signals up to a scaling factor.

Finally, the limitations summary for each method is shown in Table 2.

4. Proposed Work

An automatic artifact extraction system is proposed based on a modified Stone’s BSS (called the evolutionary Stone’s BSS algorithm (ESBSS)) and an artifact detection measure to clean EEG brain signals of common artifacts. ESBSS combines the original Stone’s BSS with a genetic algorithm (GA). For easy reference, the outline of the proposed work is summarized in Figure 6.

Each block will be explained below.

(i) Raw EEG. Most brain electrical activity is measured by an electroencephalography (EEG) device. The main characteristics of EEG signals are as follows: they are easily recorded by electrodes, are complex spatiotemporal signals, have very good temporal resolution, have poor spatial resolution, and depend on the number of electrodes [37]. These signals are submerged in artifact signals. Different raw EEG data are used to test the proposed system as shown in the results section.

(ii) Preprocessing. The raw EEG data are preprocessed by centering and whitening to make the BSS problem simpler and better conditioned [38]. Centering is necessary to simplify the BSS estimation; it subtracts the sample mean from the received variables (12); that is, the sample mean is removed from the received vectors and added back after the original sources are recovered [27]:

$$x_c = x - E\{x\}, \tag{12}$$

where $x_c$ is the centered signal, $x$ is the received signal, and $E\{x\}$ is the expectation of $x$.

Whitening is a linear transformation used to simplify the calculation by transforming the received vector ($x_c$) into another vector ($z$), whereby the whitened components are uncorrelated and their variance equals unity:

$$z = V x_c, \qquad E\{z z^T\} = I.$$

Usually, the eigenvalue decomposition of the covariance matrix is used to obtain the whitening matrix:

$$C_x = E\{x_c x_c^T\} = U \Lambda U^T, \qquad V = \Lambda^{-1/2} U^T,$$

where $C_x$ is the covariance matrix, $U$ is the orthogonal matrix of eigenvectors, and $\Lambda$ is the diagonal matrix of eigenvalues.

The mixing matrix $A$ is transformed into an orthogonal mixing matrix $A_w$:

$$z = V A s = A_w s,$$

where

$$A_w = V A.$$

The number of parameters to be estimated is reduced from $n^2$ in $A$ to $n(n-1)/2$ in $A_w$ [39].
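The preprocessing step can be sketched as follows: subtract the sample mean of each channel, then whiten using the eigenvalue decomposition of the sample covariance matrix. The function name `center_and_whiten`, the test mixing matrix, and the Laplacian sources are assumptions made for this example.

```python
import numpy as np

def center_and_whiten(x):
    """Center each channel, then whiten with the covariance eigendecomposition."""
    mean = x.mean(axis=1, keepdims=True)
    xc = x - mean                               # centering
    cov = xc @ xc.T / xc.shape[1]               # sample covariance matrix
    eigval, eigvec = np.linalg.eigh(cov)        # cov = eigvec diag(eigval) eigvec^T
    V = np.diag(eigval ** -0.5) @ eigvec.T      # whitening matrix
    return V @ xc, V, mean

# Three mixed channels with a nonzero offset.
rng = np.random.default_rng(2)
s = rng.laplace(size=(3, 4000))
A = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.6],
              [0.4, 0.1, 1.0]])
x = A @ s + 5.0
z, V, mean = center_and_whiten(x)
```

The returned `mean` is kept so that it can be added back after the sources have been recovered, as described above.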

(iii) Evolutionary Stone’s BSS Algorithm (ESBSS). The evolutionary Stone’s BSS algorithm combines the original Stone’s BSS with a genetic algorithm. The values of the half-life parameters ($h_S$, $h_L$) are generated randomly and tuned by the genetic algorithm to enhance the separation process of the original Stone’s BSS.

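Stone’s BSS uses a temporal predictability (TP) measure, together with a conjecture, to separate the original sources from their mixture. The conjecture of Stone is as follows: the TP of any signal mixture is less than or equal to that of any of its components. This conjecture is used to find the weight vector which gives an orthogonal projection of the mixtures [20]. Stone’s measure of temporal predictability of a signal $y$ is defined as [20]

$$F(y) = \log \frac{V(y)}{U(y)} = \log \frac{\sum_{t=1}^{n} (\bar{y}_t - y_t)^2}{\sum_{t=1}^{n} (\tilde{y}_t - y_t)^2}, \tag{17}$$

where $n$ is the number of samples of $y$, $\bar{y}_t = \lambda_L \bar{y}_{t-1} + (1 - \lambda_L)\, y_{t-1}$ is the long-term prediction, $\tilde{y}_t = \lambda_S \tilde{y}_{t-1} + (1 - \lambda_S)\, y_{t-1}$ is the short-term prediction, $\lambda = 2^{-1/h}$, and $h_S$, $h_L$ are the half-life parameters. The half-life $h_L$ of $\lambda_L$ is 100 times longer than the corresponding half-life $h_S$ of $\lambda_S$ according to Stone [20], but this restriction may not give the optimal solution; therefore the proposed work uses a genetic algorithm to find the parameters that give the best separation between the components.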
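A minimal sketch of the temporal predictability measure follows, assuming the exponentially weighted short- and long-term predictions of Stone [20] with decay factor 2^(-1/h) for a half-life of h samples. The function names and the default half-lives (1 and 100 samples, which respect the 100:1 ratio) are illustrative assumptions.

```python
import numpy as np

def ema_predict(y, half_life):
    """Predict y[t] as an exponentially weighted mean of the samples before t."""
    lam = 2.0 ** (-1.0 / half_life)           # decay factor from the half-life
    pred = np.zeros_like(y)
    for t in range(1, len(y)):
        pred[t] = lam * pred[t - 1] + (1.0 - lam) * y[t - 1]
    return pred

def temporal_predictability(y, h_short=1.0, h_long=100.0):
    """log(V / U): long-term variability over short-term prediction error."""
    y = y - y.mean()
    V = np.sum((y - ema_predict(y, h_long)) ** 2)
    U = np.sum((y - ema_predict(y, h_short)) ** 2)
    return np.log(V / U)

fs = 256
t = np.arange(10 * fs) / fs
slow = np.sin(2 * np.pi * 2 * t)                        # smooth, predictable
noise = np.random.default_rng(3).normal(size=t.size)    # unpredictable
tp_slow = temporal_predictability(slow)
tp_noise = temporal_predictability(noise)
tp_mix = temporal_predictability(0.5 * slow + 0.5 * noise)
```

In this example the smooth sinusoid scores higher than both the white noise and the mixture, which is the behaviour the separation relies on.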

The GA randomly generates the half-life parameters ($h_S$, $h_L$) and tunes these values until the stopping criteria are satisfied, instead of using the fixed values of the original Stone’s measure. Each chromosome consists of two real genes, where the first gene represents the short-term half-life parameter and the second gene represents the long-term half-life parameter as shown in Figure 7.

The parameters of the genetic algorithm are
(i) maximum number of generations = 20,
(ii) population size (pop.) = 40,
(iii) length of chromosome = 2,
(iv) probability of crossover = 0.95,
(v) probability of mutation = 0.05,
(vi) fitness function:

$$\mathrm{Fit} = \frac{1}{I(y) + \varepsilon}, \tag{18}$$

where $y$ are the separated signals, $H$ is the entropy of the signals, $I(y) = \sum_i H(y_i) - H(y)$ represents the mutual information calculated using the concept of differential entropy between signals, and $\varepsilon$ is a constant value (0.0001).

The mutual information is always nonnegative and is zero if the components are statistically independent. The constant epsilon ($\varepsilon$) is added to the mutual information in the denominator of the fitness function to avoid division by zero. The definition of the fitness function (Fit) is the key point in the performance of the genetic algorithm [40].
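The fitness evaluation can be illustrated with a simple estimator. The sketch below swaps in a histogram (plug-in) estimate of pairwise mutual information and sums it over all pairs of components; this is an assumed stand-in for the differential-entropy calculation, and the function names and bin count are hypothetical.

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """Histogram estimate of I(a; b) = H(a) + H(b) - H(a, b), in nats."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1)
    p_b = p_ab.sum(axis=0)

    def entropy(p):
        p = p[p > 0]                      # empty bins contribute nothing
        return -np.sum(p * np.log(p))

    return max(0.0, entropy(p_a) + entropy(p_b) - entropy(p_ab.ravel()))

def fitness(y, eps=1e-4):
    """Inverse of the summed pairwise mutual information (assumed form)."""
    n = y.shape[0]
    total = sum(mutual_information(y[i], y[j])
                for i in range(n) for j in range(i + 1, n))
    return 1.0 / (total + eps)

rng = np.random.default_rng(5)
a = rng.normal(size=5000)
b = rng.normal(size=5000)
fit_independent = fitness(np.vstack([a, b]))
fit_dependent = fitness(np.vstack([a + 0.8 * b, b]))
```

Histogram estimates are biased upward for finite data, so the independent pair does not reach the theoretical maximum of 1/eps; only the ordering of the fitness values matters for the search.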

The GA attempts to maximize the fitness function, which is equivalent to minimizing the mutual information between the components. Therefore the inverse of the mutual information is taken as the fitness function (18). The dependence among the separated components is minimized when the fitness is maximized [41].
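A compact sketch of a real-coded GA with the listed settings (population 40, 20 generations, crossover probability 0.95, mutation probability 0.05, two genes per chromosome) is given below. The operators are assumptions, since the text does not specify them: binary tournament selection, arithmetic crossover, uniform-reset mutation, and elitism. The search bounds and the toy fitness function are also hypothetical; in ESBSS the fitness would be evaluated by running Stone’s BSS with the candidate half-lives and scoring the separated components.

```python
import numpy as np

def run_ga(fitness, bounds, pop_size=40, generations=20,
           p_cross=0.95, p_mut=0.05, seed=0):
    """Maximize `fitness` over real-valued chromosomes within `bounds`."""
    rng = np.random.default_rng(seed)
    low, high = np.array(bounds, dtype=float).T
    pop = rng.uniform(low, high, size=(pop_size, low.size))
    best, best_fit = None, -np.inf
    for _ in range(generations):
        fit = np.array([fitness(c) for c in pop])
        if fit.max() > best_fit:                         # track the best so far
            best_fit, best = fit.max(), pop[fit.argmax()].copy()
        # Binary tournament selection (assumed operator).
        i, j = rng.integers(pop_size, size=(2, pop_size))
        parents = np.where((fit[i] >= fit[j])[:, None], pop[i], pop[j])
        # Arithmetic crossover between consecutive parents (assumed operator).
        children = parents.copy()
        for k in range(0, pop_size - 1, 2):
            if rng.random() < p_cross:
                a = rng.random()
                children[k] = a * parents[k] + (1 - a) * parents[k + 1]
                children[k + 1] = a * parents[k + 1] + (1 - a) * parents[k]
        # Mutation: redraw a gene uniformly within its bounds (assumed operator).
        mask = rng.random(children.shape) < p_mut
        children[mask] = rng.uniform(low, high, size=children.shape)[mask]
        children[0] = best                               # elitism
        pop = children
    return best, best_fit

# Toy fitness with its peak at h_short = 2, h_long = 300 (hypothetical values).
def toy_fitness(c):
    return 1.0 / ((c[0] - 2.0) ** 2 + ((c[1] - 300.0) / 100.0) ** 2 + 1e-4)

bounds = [(0.5, 10.0), (50.0, 5000.0)]
best, best_fit = run_ga(toy_fitness, bounds)
```

Each chromosome is the pair (short-term half-life, long-term half-life), so the chromosome length is 2 as listed in the parameters.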

For easy reference, the flowchart of the ESBSS is depicted in Figure 8.

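The separated signals are calculated by $y_i = w_i x$, $i = 1, \dots, M$; then (17) is rewritten as

$$F(w_i, x) = \log \frac{w_i \bar{C} w_i^T}{w_i \tilde{C} w_i^T},$$

where $\bar{C}$ is a long-term covariance matrix ($M \times M$) between signal mixtures; $\tilde{C}$ is a short-term covariance matrix ($M \times M$) between signal mixtures; and between the $i$th and $j$th mixtures:

$$\bar{C}_{ij} = \sum_{t=1}^{n} (x_{it} - \bar{x}_{it})(x_{jt} - \bar{x}_{jt}), \qquad \tilde{C}_{ij} = \sum_{t=1}^{n} (x_{it} - \tilde{x}_{it})(x_{jt} - \tilde{x}_{jt}).$$

The main aim is to maximize Rayleigh’s quotient ($F$) to yield the unmixing vectors; the generalized eigenvectors of ($\bar{C}$, $\tilde{C}$) solve this problem [20, 21, 42]. These are the eigenvectors of the matrix $\tilde{C}^{-1}\bar{C}$, which are orthogonal in the metrics of the covariance matrices, where

$$w_i \bar{C} w_j^T = 0, \qquad w_i \tilde{C} w_j^T = 0, \qquad i \neq j.$$

When the short-term half-life parameter tends toward zero ($h_S \to 0$):

$$\tilde{y}_t \to y_{t-1}, \qquad (y_t - \tilde{y}_t) \approx \frac{dy}{dt} = y'_t.$$

Also, when the long-term half-life parameter tends toward infinity ($h_L \to \infty$) and $y$ has zero mean, the long-term mean is

$$\bar{y}_t = 0.$$

Under these conditions the expectation of $w_i \bar{C} w_j^T$ is equal to zero:

$$E[y_i y_j] = 0.$$

This indicates that each recovered signal $y_i$, calculated by $y_i = w_i x$, is uncorrelated with every other signal $y_j$, calculated by $y_j = w_j x$; also, if $y_i$ and $y_j$ are independent, then the expectation is zero. This method is powerful for any linear mixture of statistically independent signals and is guaranteed to separate the independent components. Likewise, the expectation of $w_i \tilde{C} w_j^T$ is equal to zero, so the temporal derivative of each recovered signal is uncorrelated with that of every other signal:

$$E[y'_i y'_j] = 0.$$

The separating matrix is calculated by the Matlab eigenvalue function as the generalized eigenvectors

$$W = \mathrm{eig}(\bar{C}, \tilde{C}).$$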

One of the advantages of Stone’s BSS is to simplify the BSS problem into generalized eigenproblem [22].
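The reduction of Stone’s BSS to a generalized eigenproblem can be sketched in NumPy as follows. The function names, the half-life defaults, the mixing matrix, and the three sinusoidal test sources are assumptions made for illustration. Instead of Matlab’s eig, the sketch reduces the generalized problem to a symmetric one through a Cholesky factor of the short-term covariance matrix.

```python
import numpy as np

def ema_predict(x, half_life):
    """Predict x[:, t] as an exponentially weighted mean of earlier samples."""
    lam = 2.0 ** (-1.0 / half_life)
    pred = np.zeros_like(x)
    for t in range(1, x.shape[1]):
        pred[:, t] = lam * pred[:, t - 1] + (1.0 - lam) * x[:, t - 1]
    return pred

def stone_bss(x, h_short=1.0, h_long=100.0):
    """Unmix x (channels x samples) via the generalized eigenproblem."""
    x = x - x.mean(axis=1, keepdims=True)
    d_short = x - ema_predict(x, h_short)
    d_long = x - ema_predict(x, h_long)
    c_short = d_short @ d_short.T            # short-term covariance matrix
    c_long = d_long @ d_long.T               # long-term covariance matrix
    # Solve c_long w = mu * c_short w by reducing it to a symmetric problem.
    chol_inv = np.linalg.inv(np.linalg.cholesky(c_short))
    _, vecs = np.linalg.eigh(chol_inv @ c_long @ chol_inv.T)
    W = (chol_inv.T @ vecs).T                # rows are the unmixing vectors
    return W, W @ x

# Toy sources with clearly different temporal predictability.
fs = 256
t = np.arange(10 * fs) / fs
sources = np.vstack([np.sin(2 * np.pi * 2 * t),
                     np.sin(2 * np.pi * 11 * t),
                     np.sin(2 * np.pi * 50 * t)])
A = np.array([[1.0, 0.6, 0.3],
              [0.5, 1.0, 0.4],
              [0.2, 0.7, 1.0]])
W, recovered = stone_bss(A @ sources)
corr = np.abs(np.corrcoef(np.vstack([sources, recovered]))[:3, 3:])
```

Each row of `corr` should contain one entry close to 1, showing that every source is recovered up to permutation, sign, and scale. Sources with nearly equal temporal predictability would give close eigenvalues and a less reliable separation.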

(iv) Artifact Detection. The artifact detection process is based on a simple ad hoc criterion called the sparsity measure (28), which was proposed in [1]. The measure is computed from the $i$th component, the number of samples in the frame, the standard deviation (std) of the component, and the time index.

This criterion requires that high-amplitude artifacts have a short duration compared with the selected frame length; such signals are called sparse in the time domain [1]. The sparsity value exceeds 2.5 for super-Gaussian artifacts (i.e., EOG and ECG) as mentioned in [1], but for sub-Gaussian signals (i.e., power line noise interference) it is less than 1, as concluded from the simulation and experimental results.
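The threshold rule can be sketched as follows, assuming a sparsity measure of the form max|y| / std(y) · log(std(y) / median|y|). That form is an assumption made for this sketch and may differ from (28); the function names, class labels, and synthetic signals are likewise hypothetical. Only the thresholds 2.5 and 1 come from the text.

```python
import numpy as np

def sparsity(y):
    """Assumed sparsity form: peak-to-std ratio times log(std / median|y|)."""
    y = np.asarray(y, dtype=float)
    s = y.std()
    return np.abs(y).max() / s * np.log(s / np.median(np.abs(y)))

def classify(y, high=2.5, low=1.0):
    """Label a separated component using the two thresholds."""
    v = sparsity(y)
    if v > high:
        return "short-duration artifact"
    if v < low:
        return "line noise"
    return "not an artifact"

fs = 256
t = np.arange(10 * fs) / fs
rng = np.random.default_rng(6)
line = np.sin(2 * np.pi * 50 * t)                       # 50 Hz interference
blink = sum(8.0 * np.exp(-((t - c) / 0.05) ** 2) for c in (2.0, 5.0, 8.0))
blink = blink + 0.2 * rng.normal(size=t.size)           # spikes on a quiet baseline
background = rng.normal(size=t.size)                    # Gaussian stand-in for EEG
labels = [classify(sig) for sig in (line, blink, background)]
```

Under this assumed form a pure sinusoid scores close to zero, Gaussian background activity falls between the two thresholds, and spiky signals score well above 2.5.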

5. Results

Simulated and real EEG data are used to test the proposed system. The performance on the simulated and semisimulated data is evaluated by the interference to signal ratio (ISR) (29) and a cross-correlation measure between the original and estimated artifacts. The ISR is computed from the original signals and the recovered signals over the time or sample index.

A lower ISR indicates a better separation. For real EEG data the ISR measure is not applicable because there is no information about the original sources. Therefore, EOG electrodes (vEOG and hEOG) are used to measure the face activity (artifacts), and these recorded artifacts are then compared with the extracted artifacts. The results are compared with different BSS algorithms (EFICA [43], original Stone’s BSS [20], FICA, SOBI, and JADE). The 50 Hz power line noise interference is separated in the same way as a biological artifact; that is, no notch filter is used during the recording process, in order not to lose any useful information around 50 Hz, since the notch filter range lies within the gamma band (25–100 Hz).
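Both evaluation measures can be sketched as below. The correlation is the absolute zero-lag correlation coefficient, which ignores the sign and scale ambiguity of BSS. The ISR form used here (residual energy over signal energy, in dB, after normalizing and sign-aligning the two signals) is an assumption and may differ from (29); the function names and test signals are hypothetical.

```python
import numpy as np

def normalize(v):
    """Zero mean, unit variance."""
    v = v - v.mean()
    return v / v.std()

def correlation(s, y):
    """Absolute zero-lag correlation between original and estimate."""
    return abs(np.mean(normalize(s) * normalize(y)))

def isr_db(s, y):
    """Assumed ISR: residual energy over signal energy in dB (lower is better)."""
    s, y = normalize(s), normalize(y)
    if np.mean(s * y) < 0:          # undo the sign ambiguity of BSS
        y = -y
    return 10.0 * np.log10(np.sum((s - y) ** 2) / np.sum(s ** 2))

rng = np.random.default_rng(7)
t = np.arange(2560) / 256.0
s = np.sin(2 * np.pi * 2 * t)
good = -2.0 * s + 0.05 * rng.normal(size=t.size)   # flipped, rescaled, nearly clean
bad = s + 1.0 * rng.normal(size=t.size)            # heavily contaminated estimate
isr_good, isr_bad = isr_db(s, good), isr_db(s, bad)
```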

5.1. Simulation Results

The simulated EEG signal and common artifacts (LN, EOG, and ECG) are generated in Matlab based on [2, 44, 45].

For easy reference, the procedure is divided into four steps.

Step 1: Generate Artificial Sources. The EEG signal is very weak compared with the artifacts. It is a highly nonstationary random signal with notable frequency domain characteristics. Most artifacts have high amplitude and are localized in the time and/or frequency domains [46]. The simulation of the EEG signal and the different types of artifacts is implemented based on the characteristics of each signal. Two theories are predominantly used to generate simulated EEG signals: the classical and phase-resetting theories [2, 44, 45]. In the classical theory, the peaks in event-related potential (ERP) waveforms reflect phasic bursts of activity in one or more brain regions that are triggered by experimental events of interest. Specifically, it is assumed that an ERP-like waveform is evoked by each event and that the ERP “signal” is buried in ongoing EEG “noise.”

In the phase-resetting theory, the experimental events reset the phase of ongoing oscillations [44]. The phase-resetting method proposed in [44] is used here to generate the data. Figure 9 shows the simulated original artificial signals for EEG and the different types of artifacts (EOG, ECG, and LN). The eye blink artifact is simulated using the sinc function [45, 47], the ECG artifact is simulated using the ecg function in Matlab, and the power line noise interference is simulated by a 50 Hz sinusoid. Figure 10 shows the signals with zero mean and unit variance.
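A rough Python counterpart of the source generation step is sketched below. Only the sinc-shaped blink and the 50 Hz sinusoid follow the description directly. The EEG source is replaced by white Gaussian noise (not the phase-resetting model of [44]) and the ECG source by a smoothed spike train (not Matlab’s ecg function); both substitutions, and all numeric settings, are assumptions made for illustration.

```python
import numpy as np

fs = 256                                   # sampling rate in Hz (assumed)
t = np.arange(10 * fs) / fs                # 10 s of data
rng = np.random.default_rng(4)

# Stand-in for the EEG source: white Gaussian noise (assumption).
eeg = rng.normal(size=t.size)

# Eye blinks: sinc pulses centred at 2, 5, and 8 s.
blink = sum(np.sinc(4.0 * (t - c)) for c in (2.0, 5.0, 8.0))

# Stand-in for the ECG source: one smoothed spike roughly every 0.8 s (assumption).
ecg = np.zeros(t.size)
ecg[np.arange(100, t.size, 205)] = 1.0
ecg = np.convolve(ecg, np.hanning(9), mode="same")

# Power line interference: 50 Hz sinusoid.
line = np.sin(2 * np.pi * 50 * t)

# Stack the sources and normalize each to zero mean and unit variance.
sources = np.vstack([eeg, blink, ecg, line])
sources = sources - sources.mean(axis=1, keepdims=True)
sources = sources / sources.std(axis=1, keepdims=True)
```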

Step 2: Mixing Process. The signals are mixed randomly by a mixing matrix to produce the mixtures. All the possibilities of the mixing process are taken as shown in the schematic diagram of the mixing (Figure 11) in order to cover all the expected mixtures and to produce different types of mixtures (Figures 12, 13, 14, 15, and 16).

Step 3: Extraction Process. The EEG signals are considered a projection of a group of mixed signals from brain data and artifacts. The main challenge in human brain signal analysis is to clean the EEG data of common artifacts by separating the mixture into its individual components. The ESBSS algorithm is used to extract the artifact signals from the brain mixture. Very good extraction results are obtained by ESBSS as shown in Figures 17, 18, 19, 20, and 21. In these figures the extracted signals are shifted vertically for display purposes.

The comparison of ISR values for different BSS algorithms is shown in Tables 3, 4, 5, 6, and 7. The ESBSS algorithm significantly outperforms the other BSS algorithms as shown in Tables 8, 9, 10, 11, and 12.

Step 4: Artifact Detection. The sparsity measure is used to detect the artifacts as mentioned in the proposed work section. Table 13 classifies the separated components based on their sparsity values. For power line noise the sparsity value is very low (less than 1), but for EOG or ECG artifacts the sparsity value is high (more than 2.5) [1].

5.2. Semisimulated Data

The real EEG signals are apparently artifact-free signals recorded from a newborn during active sleep (Figures 22(a1) and 22(a2)). These signals are mixed randomly (Figure 23) with simulated artifacts to produce contaminated data (Figure 24). The real EEG data are available at http://sisec2010.wiki.irisa.fr/. The sinc function is used to produce the shape of the eye blink artifact [45]. The heartbeat artifact is simulated by the ecg function in Matlab; the power line noise interference is simulated by the sin function.

Figure 25 shows the original source signals and the corresponding signals recovered by the ESBSS algorithm, shifted vertically for display purposes.

The performance of the ESBSS algorithm is evaluated by interference to signal ratio ISR as shown in Table 14. Another performance evaluation measure based on cross-correlation between original and the estimated artifact is shown in Table 15.

The sparsity value is used to indicate the artifact components; for short-duration artifacts (i.e., eye blinking and heartbeat) the sparsity value should exceed 2.5 [1], but for line noise interference it should not exceed 1. Therefore, if the sparsity value of any separated component lies within these limits (greater than 2.5 or less than 1), then the separated component is an artifact as shown in Table 16.

5.3. Real Data with 8 Channels

Real EEG data are taken from a computerized EEG device. Two studies are implemented with the main goal of extracting the artifacts as independent components. The brain signals are recorded from six electrodes placed on the scalp according to the 10–20 system, with the ground placement shown in Figure 26; the sampling rate is 256 Hz. Furthermore, the EOG electrodes (vEOG and hEOG) are used to measure EOG activity from the eyes; these electrodes are placed above and on the side of the left eye socket. The EOG channels (vEOG and hEOG) are used to assess the performance of the proposed system.

The ISR index is not applicable to real EEG data because the mixing process is unknown. Therefore, the correlation measure is used to calculate the correlation between the extracted artifacts and the recorded artifacts (i.e., EOG channels). The sparsity measure is used to classify each separated component as artifact or not [48].

5.3.1. Data Set I

The EEG signals are contaminated by eye blink artifact and 50 Hz power line noise. The eye blink artifact is clearly visible in the frontal channels and decreases with the distance of the electrodes from the eye. All the EEG channels are contaminated by 50 Hz power line noise but to strongly different degrees. Power line noise is pronounced on the central and occipital channels [48] as shown in Figure 27(a). The ESBSS algorithm extracts the line noise interference and the eye blink artifact very well: the line noise is concentrated and separated in IC1 and the eye blink artifact is clearly isolated in IC6, as shown in Figure 28. Table 17 shows the correlation between the vEOG channel (Figure 27(b)) and the extracted eye blink artifact component (IC6) (Figure 28). The correlation result illustrates that the ESBSS algorithm extracts the eye blink artifact more effectively than the other BSS algorithms. Table 18 shows the type of each separated component based on its sparsity value, where IC1 is power line noise because its value is very low (less than 1), and IC6 is classified as an artifact signal due to its high sparsity value (more than 2.5).

The result is also confirmed by the power spectra of the mixture signals and the extracted components. As mentioned above, all channels are contaminated by 50 Hz line noise, particularly the central and occipital channels, as shown in Figure 29. The frequency components of the extracted signals show that the power line noise was successfully extracted by ESBSS as shown in Figure 30.
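The spectral check can be sketched as follows: compute the power spectrum of a signal and measure the fraction of its power that falls within a narrow band around 50 Hz. The function names, the 1 Hz half-width, and the synthetic signals are assumptions made for illustration.

```python
import numpy as np

def power_spectrum(x, fs):
    """One-sided periodogram of a real signal."""
    x = x - x.mean()
    spec = np.abs(np.fft.rfft(x)) ** 2 / x.size
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    return freqs, spec

def band_fraction(x, fs, f0=50.0, half_width=1.0):
    """Fraction of total power within f0 +/- half_width Hz."""
    freqs, spec = power_spectrum(x, fs)
    band = np.abs(freqs - f0) <= half_width
    return spec[band].sum() / spec.sum()

fs = 256
t = np.arange(10 * fs) / fs
rng = np.random.default_rng(8)
background = rng.normal(size=t.size)              # stand-in for clean EEG
line = np.sin(2 * np.pi * 50 * t)                 # isolated line-noise component
channel = background + 2.0 * line                 # contaminated channel
frac_line = band_fraction(line, fs)
frac_channel = band_fraction(channel, fs)
frac_background = band_fraction(background, fs)
```

A component that has captured the line noise concentrates nearly all of its power near 50 Hz, whereas the remaining components should show no peak there.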

5.3.2. Data Set II

The EEG channels are contaminated by eye movement (eye muscle artifact) and eye blink artifacts as well as by power line noise interference during the recording process. Eye muscle artifacts are present in all channels; eye blink artifacts appear strongly in the frontal channels, and the 50 Hz power line noise appears strongly on the central and occipital channels as shown in Figure 31. Good extraction of the artifacts by the ESBSS algorithm is shown in Figure 32. The line noise is concentrated and separated in IC1, the eye muscle artifact is isolated in IC5, and the eye blink artifact is clearly isolated in IC6. The performance of the proposed system is tested by the correlation between the EOG channels (Figure 31(b)) and the extracted artifacts as shown in Table 19. The sparsity value is used to indicate the artifact components as shown in Table 20. The frequency components of the EEG data and the extracted components around 50 Hz show that the power line noise is successfully isolated as shown in Figures 33 and 34.

5.4. Real EEG with 19 Channels

Real EEG data contaminated by power line noise interference and EOG artifact (eye blink) are measured by computerized EEG device in Ibn-Rushd Hospital, Baghdad, Iraq. The computerized EEG is a computer with a PCI card of data acquisition unit that acquires the signals from the scalp through macroelectrodes as shown in Figure 35.

One healthy subject, male, 24 years old, participated in this study. EEG signals were measured using 19 electrodes placed on the scalp according to the 10–20 system and referenced against the forehead [27]. According to the specification of the computerized EEG device, the recorded signals were digitized at 256 Hz, and the trial length is 10 s (2560 samples), during which the subject was allowed to blink.

The block diagram of the proposed procedure is explained in Figure 36.

After the EEG trace has been finished, it can be saved as an ASCII file from File > Export >. This ASCII file can be opened using the Notepad program. Figure 37 shows the EEG trace arranged in columns (the sequences of channels are predefined). The data are imported into Microsoft Excel to delete the first column, which contains the timing information, and to delete the channel names as shown in Figure 38, and then the data are imported into Matlab.
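The same cleanup can also be done programmatically. The sketch below assumes a whitespace-delimited export with one header row of channel names and a leading time column; this layout, the channel names, and the sample values are hypothetical, since the actual export format of the device is only shown in Figure 37.

```python
import io
import numpy as np

# Hypothetical excerpt of an exported ASCII trace: time column plus 3 channels.
ascii_export = io.StringIO(
    "Time Fp1 Fp2 C3\n"
    "0.0000 1.5 -0.3 2.1\n"
    "0.0039 1.7 -0.1 2.0\n"
    "0.0078 1.2 0.4 1.8\n"
    "0.0117 0.9 0.6 1.9\n"
)

raw = np.loadtxt(ascii_export, skiprows=1)   # skip the channel-name row
eeg = raw[:, 1:].T                           # drop the time column; channels x samples
```

For a real export, the `io.StringIO` object would be replaced by the path of the exported file.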

Figure 39 shows the contaminated signals; these signals are preprocessed for simplification as shown in Figure 40. ESBSS and different BSS algorithms were applied to the 19 channels of 10-second data to extract the 50 Hz power line noise and the eye blink artifact. Figure 41 shows the components extracted by the ESBSS algorithm. The eye blink artifact and the power line noise interference are each separated successfully into individual components without using a notch filter. The performance of the proposed system is evaluated by the correlation measure between the EOG channel and the estimated artifact as shown in Table 21. The separated signals are classified based on the sparsity measure as shown in Table 22.

6. Conclusion

An automatic artifact extraction system is proposed based on the evolutionary Stone’s BSS algorithm (ESBSS). The system has been shown to be a powerful technique for extracting both super-Gaussian and sub-Gaussian signals from brain EEG mixtures automatically and simultaneously. ESBSS performed better than different types of blind source separation algorithms in both simulated and experimental results. The proposed system addresses the limitations of the original Stone’s BSS and of the genetic algorithm by hybridizing the two. Most previous works in the artifact extraction field used a notch filter as a preprocessing step to remove power line noise, but in the proposed system no filter is used, in order not to lose any information. The results obtained by the ESBSS algorithm are encouraging, and the algorithm can be used to extract other types of artifacts.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant no. 61172159) and the Fundamental Research Funds for the Central Universities (HEUCFT1101). The authors would like to express their thanks to Dr. Danilo P. Mandic, Imperial College, United Kingdom, for giving them real data (8 channels data); also they express their thanks to Mr. Salim, Wasit University, Iraq, for giving them real data (19 channels).