Feedback Artificial Shuffled Shepherd Optimization-Based Deep Maxout Network for Human Emotion Recognition Using EEG Signals

Bhanumathi, K. S.; Jayadevappa, D.; Tunga, Satish

doi:https://doi.org/10.1155/2022/3749413

International Journal of Telemedicine and Applications

Research Article | Open Access

Volume 2022 | Article ID 3749413 | https://doi.org/10.1155/2022/3749413

K. S. Bhanumathi,¹ D. Jayadevappa,¹ and Satish Tunga²

Academic Editor: Fei Hu

Received 31 Aug 2021

Accepted 24 Dec 2021

Published 21 Jan 2022

Abstract

Emotion recognition is important for humans because it enhances self-awareness and helps them react appropriately to the actions around them. Because emotions are complex and varied, EEG-based emotion recognition remains a difficult problem. Hence, an effective human emotion recognition approach is designed using the proposed feedback artificial shuffled shepherd optimization- (FASSO-) based deep maxout network (DMN) for recognizing emotions from EEG signals. The proposed technique combines the feedback artificial tree (FAT) algorithm and the shuffled shepherd optimization algorithm (SSOA). A median filter is used in preprocessing to remove the noise present in the EEG signals. Features such as DWT, spectral flatness, logarithmic band power, fluctuation index, spectral decrease, spectral roll-off, and relative energy are then extracted for further processing. Based on the data-augmented results, emotion recognition is accomplished using the DMN, whose training is performed using the proposed FASSO method. The experimental results show that the proposed algorithm attains maximal accuracy, specificity, and sensitivity values of 0.889, 0.89, and 0.886, respectively.

1. Introduction

Electroencephalography (EEG) produces composite time series data, and because it is a noninvasive technique, experts use it in numerous applications. Emotion recognition is an affective computing procedure that studies how emotions are processed and recognized by computers. Emotions support humans in daily life for reasoning, planning, decision-making, and several other mental tasks. A psychologically balanced individual with positive emotions tends to lead a more successful life than an emotionally unbalanced individual, whereas negative emotions may harm a person’s decision-making skills and health. Moreover, emotion recognition has numerous applications in fields such as the entertainment industry, e-learning, adaptive games, adaptive advertisement, and emotion-based music players. Clinical analysts use EEG signals for recognizing emotions because the EEG cap is a portable device that suits many emotion recognition applications [1]. Emotion models can generally be categorized into two types, namely, discrete and dimensional models. According to Ekman’s theory, emotions are classified into discrete entities, such as anger, fear, happiness, disgust, and surprise [2]. Dimensional models describe emotions by means of underlying dimensions, such as valence, dominance, and arousal [3], where emotions are measured from unpleasant to pleasant, submissive to dominant, and passive to active, respectively. EEG measures the voltage variations from the cortex regions of the brain and thereby discloses significant information about the different mental states associated with human emotions [4]. For instance, greater relative left frontal EEG activity has been observed when experiencing positive emotions. The voltage variations in various regions of the brain are measured by electrodes connected to the scalp, and every electrode acquires the EEG signal of a single channel [5].

Emotion recognition concentrates on recognizing human emotions from different modalities, such as body language, audio-visual expressions, and physiological signals. Physiological signals, such as the electrocardiogram (ECG), electroencephalogram (EEG), and electromyogram (EMG), have the advantage of being difficult to disguise or hide compared with other modalities. EEG-driven emotion recognition has recently gained significant attention in various research fields and applications because of the progressive development of simpler, inexpensive, and noninvasive EEG recording devices. Emotion is a basic psychological condition that affects human perception, cognition, and rational decision-making. Automatic recognition of human emotions has recently grown rapidly alongside improvements in brain-computer interaction (BCI). Although emotions can be detected from body movement and facial expression, EEG is considered an efficient way to identify human emotions because EEG signals record the electrical activity of various regions of the brain [6, 7]. Emotions can be recognized from speech, body posture, facial expressions, physiological actions, etc. These methods rely on outwardly expressed emotions and cannot measure inner feelings. EEG signals, in contrast, reveal this hidden information and offer various emotion-related patterns [8, 9]. Therefore, a variety of feature extraction techniques were developed for measuring the information in EEG signals. These features can capture the nonlinearity and underlying complexity of the EEG signals [10, 11]. Recently, deep learning approaches have enabled automatic feature extraction and selection, and these approaches have had a huge impact on signal and information processing [12].

Various deep learning techniques, namely, the deep belief network (DBN), autoencoder (AE), and convolutional neural network (CNN), have been used efficiently for processing physiological signals and have achieved significant results compared with conventional shallow models. The bimodal deep autoencoder (BDAE) extracts high-level representation features and was considered very significant for recognizing emotions. Deep learning methods such as AE or CNN, however, cannot capture the temporal information of EEG signals. The recurrent neural network (RNN) is a deep learning algorithm in which the connections among the units form a loop for processing sequence data. RNNs have been commonly used in speech recognition and machine translation. Various types of RNN, such as long short-term memory (LSTM), were employed for processing and analyzing physiological signals. Moreover, a bimodal LSTM approach was devised for recognizing emotions from multimodal signals. EEG signals and eye movement features were used as inputs with the SEED dataset, whereas peripheral physiological signals and EEG signals were used with the DEAP dataset. In addition, a hybrid deep learning system incorporating LSTM and CNN was developed to extract task-driven features, explore interchannel correlation, and combine the contextual data from the frames.

The key contribution of this research work is an effective and accurate emotion recognition approach that classifies human emotions using the proposed FASSO-based deep maxout network (DMN). In the first stage, the input EEG signals are acquired from the DEAP dataset and preprocessed using a median filter. The required features are then extracted from the preprocessed signal. The extracted features are passed to the data augmentation stage, and the emotion recognition phase is then performed using the DMN classifier. The DMN classifier is trained by the FASSO algorithm, which combines the FAT and SSOA models.

2. Related Work

Emotion encompasses consciousness as well as cognition in all human beings, and it plays a significant role in all aspects of human life. Hence, emotion recognition has become a very important research area. As part of the proposed work, this section reviews various emotion recognition techniques that use EEG signals, along with the advantages, challenges, and limitations of the existing approaches.

Zhong et al. [5] developed regularized graph neural networks (RGNN) for automated emotion recognition. This technique reduced overfitting, but it failed to control the imbalance between the training and testing sets. A firefly integrated optimization algorithm (FIOA) [20] was proposed to strengthen EEG-based emotion recognition. This technique significantly reduced the workload of manual selection. However, it suffers from computational complexity issues. Sharma et al. [10] designed a long short-term memory (LSTM)-driven deep learning method for automated emotion recognition. This method did not require any prior knowledge of the functional parameters. However, the major challenge lies in maximizing the processing speed when using larger datasets. Wei et al. [12] designed a simple recurrent unit (SRU) network with an ensemble learning technique for EEG-based emotion recognition. This method achieved a comparatively lower computational cost. However, deep learning is highly reliant on computing power, and it requires a long training time.

Chao and Dong [13] presented an advanced convolutional neural network (CNN) for recognizing emotions from multichannel EEG signals. Its distinctive filter grouping technique preserves the regional features of the diverse areas, but the technique suffers from higher computational complexity. Salankar et al. [14] developed an empirical mode decomposition (EMD) approach for emotion recognition based on EEG signals. This method was more effective for the medical recognition of high- and low-dominance regions in the subjects. However, it failed to classify states such as Alzheimer’s, depression, and epilepsy. Yin et al. [15] introduced ECLGCNN, which combines graph convolutional neural networks and LSTM, for recognizing emotions using EEG signals. The processing time of this method was low, but its recognition accuracy remained poor. Pandey and Seeja [1] devised a deep CNN for EEG emotion recognition and found that frontal electrodes are more effective for recognizing emotions than all other electrodes. However, this technique did not apply attention mechanisms to the various brain regions, which could improve its performance.

Maheshwari et al. [16] proposed a deep CNN architecture with eight convolution, three average pooling, four batch normalization, three spatial dropout, two dropout, one global average pooling, and three dense layers. It is validated using three publicly available databases, including DEAP, but it still suffers from poor accuracy in classifying various emotions. Hector et al. [17] architected, designed, implemented, and tested a handcrafted hardware convolutional neural network, named BioCNN, optimized for EEG-based emotion detection and other biomedical applications. The EEG signals are generated using a low-cost, off-the-shelf device, namely, Emotiv Epoc+, and then denoised and preprocessed ahead of their use by BioCNN. For training and testing, BioCNN uses three repositories of emotion classification datasets, including the publicly available DEAP and DREAMER datasets. Hu et al. [18] presented a novel convolutional layer, called the scaling layer, which can adaptively extract effective data-driven spectrogram-like features from raw EEG signals. It exploits convolutional kernels scaled from one data-driven pattern to expose a frequency-like dimension, which addresses the shortcomings of prior methods that require hand-extracted features or their approximations. This approach achieved state-of-the-art results on the established DEAP and AMIGOS benchmark datasets. Liu and Fu [19] proposed emotion recognition by deeply learned multichannel textual and EEG features. In this work, multichannel features from the EEG signal are applied for human emotion recognition, and the EEG signal is generated by sound signal stimulation. Specifically, multichannel EEG and textual features are fused in the time domain to recognize different human emotions, where six statistical time-domain features are fused into a feature vector for emotion classification. The method extracts EEG and textual features from both the time and frequency domains.

Various challenges of human emotion recognition approaches are as follows.
(i) SRU was developed for emotion recognition. However, the major challenge lies in selecting appropriate SRU network parameters, such as the training parameters and the total number of nodes, by a trial-and-error technique [12].
(ii) FIOA offers a hybridized optimization scheme for recognizing the patterns of higher dimensionality datasets, but the experimental data used in this method are only multiple physiological signals. Hence, the major challenge lies in applying FIOA to the automatic pattern detection of medical image data [20].
(iii) The EMD technique can be valuable for the medical recognition of low- and high-dominance regions in the subjects, but the major challenge lies in classifying various brain conditions, such as sadness, Alzheimer’s, and epilepsy [14].
(iv) The deep CNN approach was very efficient for subject-independent emotion recognition in terms of classification accuracy when compared with various existing techniques. It can be enhanced by applying attention mechanisms to various regions of the brain to improve the classification accuracy [1].

Based on the literature review, ECLGCNN achieved better classification accuracy, but it explores only the binary categorization of emotions, such as positive or negative valence and low or high arousal. This limitation can be overcome by using ECLGCNN as a multiclass classifier to distinguish the different emotional states effectively [15].

3. Proposed Method

The proposed FASSO-based deep maxout network (DMN) for human emotion recognition consists of four stages, namely, preprocessing, feature extraction, data augmentation, and emotion recognition. In the first stage, the input EEG signals are preprocessed using the median filter to remove unwanted noise. After that, feature extraction is carried out to obtain the required features, such as DWT, spectral flatness, logarithmic band power, fluctuation index, spectral decrease, spectral roll-off, and relative energy. Once the feature extraction phase is completed, data augmentation is done by adding noise to the original signals to generate new samples. Finally, emotion recognition is performed using the DMN classifier [21] to recognize emotions such as pride, elation, joy, satisfaction, relief, hope, interest, surprise, sadness, fear, shame, guilt, envy, disgust, contempt, and anger. The DMN classifier is trained using the proposed FASSO approach, which is newly designed by integrating FAT [22] and SSOA [23]. Figure 1 shows the schematic representation of the proposed FASSO-based DMN for human emotion recognition using EEG signals.

Let us consider the DEAP dataset $D$ containing $n$ EEG signals of human emotions, which is given as follows:

$$D = \{E_1, E_2, \ldots, E_i, \ldots, E_n\},$$

where $E_i$ signifies the EEG signal of the $i$th data and $n$ indicates the total number of human emotion signals. The input EEG signal is passed to the preprocessing phase to remove the noise from the original signal. Noise removal is the fundamental step that enhances the input signals for further processing. A median filter is used for efficient preprocessing, and in its standard sliding-window form it is expressed as follows:

$$M(t) = \operatorname{median}\{E_i(t-k), \ldots, E_i(t), \ldots, E_i(t+k)\},$$

where $E_i(t-k), \ldots, E_i(t+k)$ are the signal samples in a window of half-width $k$ centered on sample $t$, and the median filter output is denoted as $M$.
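The preprocessing step can be illustrated with a minimal sketch of a sliding-window median filter. The window half-width, the edge handling (the window is simply truncated at the signal boundaries), and the function name `median_filter` are assumptions made for illustration; the paper does not state these details, and its MATLAB implementation may differ.

```python
from statistics import median

def median_filter(signal, half_width=1):
    """Replace each sample by the median of a window centred on it."""
    n = len(signal)
    filtered = []
    for t in range(n):
        # Truncate the window at the signal edges instead of padding (assumption).
        lo = max(0, t - half_width)
        hi = min(n, t + half_width + 1)
        filtered.append(median(signal[lo:hi]))
    return filtered

# Toy signal: the value 9.0 plays the role of an impulsive artifact.
noisy = [1.0, 1.1, 9.0, 1.2, 1.0, 1.1]
clean = median_filter(noisy, half_width=1)
```

With a three-sample window, the isolated spike is replaced by a neighbouring value while the slowly varying samples are left almost unchanged, which is why median filtering suits impulsive noise.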

3.1. Feature Extraction

Once the preprocessing is performed, the feature extraction can be done for extracting the significant features for further processing. The median filter output is applied to extract features such as DWT, spectral flatness, logarithmic band power, fluctuation index, spectral decrease, spectral roll-off, and relative energy. The extracted features are explained in the following sections.

3.1.1. DWT Features

The DWT feature [24] is used to transform the signal from the time domain into the time-frequency domain, and in this case, the Haar wavelet transform is used. This wavelet transform extracts information from the signal at different scales by passing the EEG signal through low-pass and high-pass filters. Moreover, the wavelet features offer superior multiresolution capability and energy compaction. The DWT feature is signified as $f_1$.
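A single decomposition level of the Haar transform can be sketched as follows. This is an illustrative implementation of the textbook orthonormal Haar filters, not the authors' code; the number of decomposition levels used in the paper is not stated, so only one level is shown, and the function name `haar_dwt` is hypothetical.

```python
import math

def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        signal = signal[:-1]  # use an even number of samples (assumption)
    scale = 1.0 / math.sqrt(2.0)
    # Low-pass branch: scaled pairwise sums; high-pass branch: scaled pairwise differences.
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail

samples = [4.0, 4.0, 2.0, 0.0]
approx, detail = haar_dwt(samples)
```

Because the filters are orthonormal, the energy of the input equals the combined energy of the approximation and detail coefficients, which is the energy compaction property mentioned in the text. Deeper levels are obtained by applying the same step again to `approx`.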

3.1.2. Spectral Flatness

Spectral flatness is used in digital signal processing to characterize the audio spectrum and is measured in decibels. It is the ratio of the geometric mean to the arithmetic mean of the power spectrum, and the equation is expressed as follows:

$$f_2 = \frac{\left(\prod_{k=1}^{K} x(k)\right)^{1/K}}{\frac{1}{K}\sum_{k=1}^{K} x(k)},$$

where $x(k)$ signifies the magnitude of bin number $k$ and $K$ is the total number of bins. The spectral flatness feature is denoted as $f_2$.
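The ratio of geometric to arithmetic mean can be checked numerically with a short sketch. It assumes the power spectrum has already been computed and that every bin is strictly positive; the function name and the toy spectra are illustrative only.

```python
import math

def spectral_flatness(power_spectrum):
    """Geometric mean divided by arithmetic mean of a strictly positive power spectrum."""
    n = len(power_spectrum)
    # Average the logarithms so the product of many bins cannot underflow.
    log_mean = sum(math.log(p) for p in power_spectrum) / n
    geometric_mean = math.exp(log_mean)
    arithmetic_mean = sum(power_spectrum) / n
    return geometric_mean / arithmetic_mean

flat = spectral_flatness([1.0, 1.0, 1.0, 1.0])        # white-noise-like spectrum
peaky = spectral_flatness([100.0, 0.01, 0.01, 0.01])  # energy concentrated in one bin
```

A perfectly flat spectrum gives a ratio of 1, while a spectrum dominated by a single bin gives a value close to 0. Taking `10 * math.log10(...)` of the ratio yields the value in decibels.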

3.1.3. Logarithmic Band Power

Logarithmic band power is the logarithm of the power in the different EEG signal bands. For a given band, this feature is commonly computed as follows:

$$f_3 = \log\left(\frac{1}{N}\sum_{t=1}^{N} \left|M(t)\right|^2\right),$$

where $M(t)$ denotes the samples of the preprocessed signal within the band, $N$ is the number of samples, and $f_3$ represents the logarithmic band power.

3.1.4. Fluctuation Index

The fluctuation index characterizes how strongly the EEG signal varies from sample to sample, and in its common form the equation is expressed as follows:

$$f_4 = \frac{1}{N}\sum_{t=1}^{N-1}\left|M(t+1) - M(t)\right|,$$

where $M(t)$ represents the signal data points and $N$ signifies the number of points in the signal. The fluctuation index feature is indicated as $f_4$.
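As a rough illustration, the fluctuation index can be computed as the mean absolute difference between consecutive samples. This common definition is an assumption here, since the exact equation did not survive in the text, and the function name is hypothetical.

```python
def fluctuation_index(signal):
    """Sum of absolute consecutive differences divided by the number of samples."""
    n = len(signal)
    total_variation = sum(abs(signal[t + 1] - signal[t]) for t in range(n - 1))
    return total_variation / n

steady = fluctuation_index([2.0, 2.0, 2.0, 2.0])       # constant signal
oscillating = fluctuation_index([0.0, 1.0, 0.0, 1.0])  # three unit jumps over four samples
```

A constant signal yields 0, and the value grows with the intensity of sample-to-sample changes, so the index acts as a simple measure of signal variability.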

3.1.5. Spectral Decrease

The spectral decrease quantifies how steeply the magnitude spectrum falls off while emphasizing the slopes at the lower frequencies. In its standard form, the equation for spectral decrease is formulated as follows:

$$f_5 = \frac{\sum_{k=2}^{K} \frac{x(k) - x(1)}{k - 1}}{\sum_{k=2}^{K} x(k)},$$

where $x(k)$ is the magnitude of bin number $k$, $K$ is the total number of bins, and the term $f_5$ indicates the spectral decrease.

3.1.6. Spectral Roll-Off

Spectral roll-off is defined as the frequency below which a particular percentage of the overall spectral energy lies. The term $f_6$ specifies the spectral roll-off feature.
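The definition of spectral roll-off translates directly into a cumulative sum over the power spectrum. The sketch below assumes an 85% energy threshold, which is a common default but is not stated in the paper; the function name and the toy spectrum are also illustrative.

```python
def spectral_rolloff(freqs, power, fraction=0.85):
    """Lowest frequency at which the cumulative energy reaches `fraction` of the total."""
    threshold = fraction * sum(power)
    cumulative = 0.0
    for frequency, energy in zip(freqs, power):
        cumulative += energy
        if cumulative >= threshold:
            return frequency
    return freqs[-1]  # fallback, only reached if rounding keeps the sum below the threshold

freqs = [0.0, 10.0, 20.0, 30.0, 40.0]  # bin centre frequencies in Hz (toy values)
power = [4.0, 3.0, 2.0, 1.0, 0.0]      # energy per bin (toy values)
rolloff = spectral_rolloff(freqs, power)
```

For this toy spectrum the cumulative energy is 4, 7, and 9 out of 10 after the first three bins, so the 85% threshold is crossed at 20 Hz.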

3.1.7. Relative Energy

Relative energy is used to examine the variations in the EEG frequency bands and is indicated as $f_7$. Finally, the extracted features are combined to generate the feature vector output:

$$F = \{f_1, f_2, f_3, f_4, f_5, f_6, f_7\},$$

where $f_1$ to $f_7$ denote the DWT, spectral flatness, logarithmic band power, fluctuation index, spectral decrease, spectral roll-off, and relative energy features, respectively.

The feature vector $F$ is fed as an input to the data augmentation phase to improve emotion recognition performance. In the data augmentation process, additive white Gaussian noise is added to the original input signal to generate new samples. The new samples are then combined with the feature vector obtained from the feature extraction phase to form the final feature output. The data augmentation output is denoted as $A$.
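The augmentation step can be sketched as adding zero-mean Gaussian noise to each sample of a signal. The noise standard deviation, the number of generated copies, and the fixed seed below are assumptions chosen for illustration; the paper does not report the noise level it used.

```python
import random

def augment_with_awgn(signal, noise_std=0.05, copies=2, seed=0):
    """Create noisy copies of a signal by adding white Gaussian noise to every sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    augmented = []
    for _ in range(copies):
        augmented.append([x + rng.gauss(0.0, noise_std) for x in signal])
    return augmented

original = [0.0, 0.5, 1.0, 0.5, 0.0]
new_samples = augment_with_awgn(original)
dataset = [original] + new_samples  # original plus generated samples
```

Each generated sample keeps the overall shape of the original signal, so the training set grows without new recordings. The noise level controls the trade-off between sample diversity and fidelity to the original.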

3.2. Emotion Recognition

Once the data augmentation is performed, emotion recognition is carried out using the deep maxout network. The data-augmented output is presented as the input to the DMN classifier for recognizing the emotions, and the classifier is trained by the proposed FASSO approach. The DMN classifier [9] is a network with trainable activation functions arranged in a multilayer structure. The major benefit of this classifier is that it can efficiently improve the speed of the training process. The input is fed into the network, and in the standard maxout form the activation of hidden unit $i$ in layer $l$ is expressed as follows:

$$h_i^{(l)} = \max_{j \in [1, k_l]} \left( \left(h^{(l-1)}\right)^{T} W_{ij}^{(l)} + b_{ij}^{(l)} \right), \quad l = 1, \ldots, n,$$

where $h^{(0)}$ is the network input, $W_{ij}^{(l)}$ and $b_{ij}^{(l)}$ are the weight vector and bias of the $j$th piece of unit $i$, $k_l$ represents the overall units in layer $l$, and $n$ signifies the total layers in the DMN classifier. Conventional nonlinear activation functions, such as the absolute value rectifier and the rectified linear unit, can be approximated effectively by the DMN. Arbitrary activation functions, including complex nonlinear ones, can be approximated by the deep maxout network by taking the maximum over a sufficient number of linear pieces. The classified output obtained is denoted as $O$. Figure 2 illustrates the structure of the DMN classifier.
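The claim that maxout units can reproduce the absolute value rectifier and the rectified linear function can be verified with a small sketch of one maxout layer. The nested-list layout of the weights and the function name `maxout_layer` are illustrative choices, not the structure of the authors' implementation.

```python
def maxout_layer(inputs, weights, biases):
    """Maxout forward pass: each hidden unit outputs the maximum of its linear pieces.

    weights[i][j] is the weight vector of piece j of hidden unit i,
    biases[i][j] is the matching bias.
    """
    outputs = []
    for unit_weights, unit_biases in zip(weights, biases):
        pieces = [
            sum(w * x for w, x in zip(piece_weights, inputs)) + bias
            for piece_weights, bias in zip(unit_weights, unit_biases)
        ]
        outputs.append(max(pieces))
    return outputs

# One unit with pieces x and -x reproduces the absolute value rectifier.
abs_weights, abs_biases = [[[1.0], [-1.0]]], [[0.0, 0.0]]
# One unit with pieces x and 0 reproduces the rectified linear function.
relu_weights, relu_biases = [[[1.0], [0.0]]], [[0.0, 0.0]]

abs_out = maxout_layer([-3.0], abs_weights, abs_biases)
relu_out = maxout_layer([-3.0], relu_weights, relu_biases)
```

Because the weights of every piece are learned, the shape of the activation function itself is trained. Stacking several such layers gives the deep maxout structure described above.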

3.2.1. Training Procedure of Deep Maxout Network Using FASSO

The training of the DMN is carried out by the proposed FASSO algorithm to obtain an optimal solution. The FASSO algorithm is designed by hybridizing FAT [21] and SSOA [23]. FAT is motivated by the transport of organic matter in trees and the update of branches. Here, the organic matter transport is designed together with a feedback mechanism for moisture, so that the exchange of material takes both the organic matter transfer and the moisture feedback into consideration. FAT is very efficient in solving different kinds of optimization problems, and it can adaptively handle its parameters to enhance the search efficiency. SSOA, on the other hand, is motivated by the behavior of a shepherd. Here, the agents are partitioned into multiple communities, and the optimization procedure imitates the shepherd's behavior in nature, operating on every community. The proposed FASSO method exhibited robust performance and achieved a better optimal solution. By incorporating FAT with SSOA, the optimization complexities are reduced effectively. The algorithmic phases of the proposed FASSO model are as follows.

(i) Initialization: The branch population is initialized with $m$ branches and is given as follows:

$$B = \{B_1, B_2, \ldots, B_j, \ldots, B_m\},$$

where $m$ denotes the total branches and $B_j$ signifies the $j$th branch.

(ii) Compute fitness measure: The fitness measure is used to identify the optimal solution, and it is formulated here as the mean squared error between the classifier output and the target:

$$\text{Fitness} = \frac{1}{n}\sum_{i=1}^{n}\left(T_i - O_i\right)^2,$$

where $O_i$ indicates the classifier output and $T_i$ represents the target output for the $i$th of $n$ training samples.

(iii) Update solution: Once the fitness value is computed, the solution is updated using the proposed FASSO approach. According to the FAT algorithm, the update of the self-evolution operator in its standard form is given as follows:

$$B_j^{\tau+1} = B_j^{\tau} + r\left(B_{best} - B_j^{\tau}\right).$$

To solve the optimization issues and to improve the algorithm performance, FAT is incorporated with SSOA, whose standard update equation is as follows:

$$B_j^{\tau+1} = B_j^{\tau} + \beta\, r_1 \circ \left(B_d - B_j^{\tau}\right) + \alpha\, r_2 \circ \left(B_s - B_j^{\tau}\right).$$

Substituting the SSOA update into the FAT update gives the FASSO update rule, where $B_{best}$ signifies the best position in the branch, $r$, $r_1$, and $r_2$ lie in the range $[0, 1]$, $\tau_{max}$ denotes the maximum iterations over which the factors $\alpha$ and $\beta$ are scheduled, $\tau$ represents the current iteration, and $B_d$ and $B_s$ indicate the solution vectors of the chosen horse and chosen sheep.

(iv) Feasibility evaluation: The feasibility evaluation is performed to obtain the optimal value with respect to the fitness function. If the newly obtained solution has a better fitness value than the existing one, the existing solution is replaced with the new one.

(v) Termination: All the abovementioned steps are repeated until the best solution is achieved.

The pseudo code of the proposed FASSO algorithm is as follows.

Input: Branch population
Output: Best solution
Begin
  Initialize the branch population
  Count the total branches in the branch population
  Compute the fitness of every branch
  While the termination criterion is not met
    For each branch in the population
      If the branch territory is not crowded
        Execute the crossover operator to produce a new branch
      Else
        Perform the self-evolution and dispersive propagation operators to create a new branch
      End if
    End for
    Combine the current and new populations and evaluate the better branches
    Attain the new branch population of the existing generation
    Update the optimal solution
  End while
  Return the best solution
End
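The overall loop of initialization, fitness evaluation, solution update, feasibility evaluation, and termination can be sketched with a deliberately simplified population-based optimizer. This is not the FASSO update rule, whose final hybrid equation is not recoverable from the text. The sketch uses only an SSOA-style step toward a better-ranked member (the "horse") and a worse-ranked member (the "sheep"). The toy `sphere` fitness stands in for the DMN training error, and the parameter schedules for `alpha` and `beta` are arbitrary assumptions.

```python
import random

def sphere(x):
    """Toy fitness standing in for the training error: lower is better."""
    return sum(v * v for v in x)

def optimize(fitness, dim=3, pop_size=10, max_iter=50, seed=0):
    rng = random.Random(seed)
    # (i) Initialization: random population of candidate solutions.
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    history = []
    for it in range(max_iter):
        # (ii) Fitness: rank the population, best first.
        pop.sort(key=fitness)
        history.append(fitness(pop[0]))
        alpha = 0.5 * (1.0 - it / max_iter)  # exploration weight shrinks (assumed schedule)
        beta = 1.0 + it / max_iter           # exploitation weight grows (assumed schedule)
        for i in range(1, pop_size):
            current = pop[i]
            horse = pop[rng.randrange(0, i)]         # member ranked ahead of the current one
            sheep = pop[rng.randrange(i, pop_size)]  # member ranked at or behind the current one
            # (iii) Update: SSOA-style step toward the horse and the sheep.
            candidate = [
                c + beta * rng.random() * (h - c) + alpha * rng.random() * (s - c)
                for c, h, s in zip(current, horse, sheep)
            ]
            # (iv) Feasibility evaluation: keep the candidate only if it improves.
            if fitness(candidate) < fitness(current):
                pop[i] = candidate
    # (v) Termination: a fixed iteration budget is used here.
    best = min(pop, key=fitness)
    return best, history + [fitness(best)]

best, history = optimize(sphere)
```

Because a member is replaced only by a strictly better candidate, the best fitness recorded in `history` never increases from one iteration to the next. In the paper's setting, each candidate would encode DMN weights and the fitness would be the error between the classifier output and the target.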

4. Results and Discussion

The experiment with the proposed FASSO-driven deep maxout network technique is conducted using the DEAP dataset [25]. The implementation is carried out in a MATLAB environment. Figure 3 illustrates the experimental outcomes of the proposed method. The input EEG signals are shown in Figure 3(a), and the DWT features for the corresponding input signals are depicted in Figure 3(b). The EEG input and feature output signals are plotted with amplitude along the y-axis and the number of samples along the x-axis. Each graph shows the variation in the amplitude range and the corresponding extracted DWT features.

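Figure 3: Experimental outcomes of the proposed method. (a) Input EEG signal. (b) Corresponding DWT features.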

The performance evaluation of the proposed algorithm is carried out based on three metrics, namely, accuracy, sensitivity, and specificity.

(i) Accuracy: It measures the proportion of true positive and true negative results among all recognized emotions:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN},$$

where $TP$ is the true positive value, $TN$ is the true negative value, $FP$ is the false positive value, and $FN$ is the false negative value.

(ii) Specificity: It measures the proportion of true negative outcomes that are correctly identified:

$$\text{Specificity} = \frac{TN}{TN + FP}.$$

(iii) Sensitivity: It measures the proportion of true positive outcomes that are correctly identified:

$$\text{Sensitivity} = \frac{TP}{TP + FN}.$$
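The three metrics follow directly from the entries of a binary confusion matrix, as the short sketch below shows. The counts used in the example are made up for illustration and do not come from the paper's experiments.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity

# Hypothetical counts for one emotion class treated as "positive".
accuracy, sensitivity, specificity = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

For a multiclass problem such as emotion recognition, these counts are typically obtained per class in a one-versus-rest fashion and then averaged. The paper does not state which averaging it uses.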

The experimental outcomes of the proposed FASSO-based deep maxout network are illustrated in Figure 3. The input EEG signal and the corresponding DWT feature output for the first set of samples are shown in Figures 3(a) and 3(b), respectively. Similarly, Figures 3(c) and 3(d) depict the input EEG signal and the corresponding DWT feature output.

4.1. Performance Analysis

The performance evaluation of the proposed FASSO-based deep maxout network is carried out by varying the percentage of training data and the number of iterations.

4.1.1. Performance Analysis Based on Training Data Percentage

The performance of the proposed method with respect to accuracy, specificity, and sensitivity is illustrated in Figure 4 for different numbers of training epochs. Figures 4(a), 4(b), and 4(c) depict the accuracy, specificity, and sensitivity, respectively. For 60% training data, the accuracy obtained by the FASSO-based deep maxout network at epochs 20, 40, 60, 80, and 100 is 0.771, 0.791, 0.817, 0.830, and 0.865, respectively. Similarly, for 50% training data, the specificity achieved by the proposed method is 0.755 at epoch 20, 0.771 at epoch 40, 0.806 at epoch 60, 0.829 at epoch 80, and 0.855 at epoch 100. For 70% training data, the sensitivity of the proposed method is 0.763 at epoch 20, 0.783 at epoch 40, 0.792 at epoch 60, 0.798 at epoch 80, and 0.842 at epoch 100.

Figure 4: Performance analysis of the proposed method. (a) Accuracy. (b) Specificity. (c) Sensitivity.

4.1.2. Comparative Analysis

This section compares the proposed FASSO-based DMN with existing methods based on the training data percentage and the k-fold value. The comparative analysis is carried out against regularized graph NN [5], LSTM [10], RNN+ensemble [12], CNN+LSTM [15], deep maxout network, FAT-based deep maxout network, and SSOA-based deep maxout network (Table 1).

Analysis using training data: Figure 5 illustrates the comparative assessment of the developed FASSO-based deep maxout network in terms of accuracy, specificity, and sensitivity for different training data percentages. Figure 5(a) illustrates the accuracy assessment. At 60% training data, the accuracy values obtained by regularized graph NN, LSTM, RNN+ensemble, CNN+LSTM, deep maxout network, FAT-based deep maxout network, SSOA-based deep maxout network, and the proposed FASSO-based deep maxout network are 0.759, 0.780, 0.791, 0.824, 0.847, 0.849, 0.856, and 0.871, respectively. The performance improvement achieved by the developed FASSO-based deep maxout network over these existing techniques is 12.80%, 10.40%, 9.138%, 5.327%, 2.76%, 2.53%, and 1.72%, respectively.
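The reported improvement figures appear to be relative gains normalized by the proposed method's score. This formula is an assumption inferred from the numbers rather than stated in the text. The sketch below applies it to the accuracy values at 60% training data for the three maxout-based baselines, where it reproduces the reported 2.76%, 2.53%, and 1.72%.

```python
def relative_improvement(proposed, baseline):
    """Gain of the proposed method over a baseline, as a percentage of the proposed score."""
    return (proposed - baseline) / proposed * 100.0

proposed_accuracy = 0.871  # FASSO-based deep maxout network, 60% training data
baseline_accuracy = {
    "deep maxout network": 0.847,
    "FAT-based deep maxout network": 0.849,
    "SSOA-based deep maxout network": 0.856,
}
gains = {
    name: round(relative_improvement(proposed_accuracy, value), 2)
    for name, value in baseline_accuracy.items()
}
```

Normalizing by the proposed score instead of the baseline score gives slightly smaller percentages than the more common definition of relative gain, which is worth keeping in mind when comparing these figures with other papers.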

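Figure 5: Comparative assessment based on training data percentage. (a) Accuracy. (b) Specificity. (c) Sensitivity.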

The analysis using the specificity metric is presented in Figure 5(b). At 80% training data, the proposed FASSO-based deep maxout network achieved a specificity of 0.889, while the specificity values achieved by the existing techniques are 0.782, 0.811, 0.815, 0.865, 0.866, 0.874, and 0.877, respectively. The performance gains of the proposed method over the existing techniques are 12.08%, 8.80%, 8.353%, 2.686%, 2.59%, 1.69%, and 1.35%, respectively. Figure 5(c) presents the assessment based on the sensitivity metric. At 70% training data, the sensitivity achieved by the proposed method is 0.879, while the sensitivity values obtained by the existing techniques are 0.771, 0.801, 0.804, 0.841, 0.861, 0.865, and 0.877. The corresponding performance improvements of the proposed FASSO-based deep maxout network over the existing methods are 12.34%, 8.956%, 8.604%, 4.387%, 2.05%, 1.59%, and 0.23%.

Analysis using k-fold: The comparative analysis of the developed FASSO-driven deep maxout network for different k-fold values, in terms of accuracy, specificity, and sensitivity, is presented in Figure 6. Figure 6(a) portrays the accuracy assessment. When the k-fold value is 5, the accuracy achieved by the developed FASSO-based deep maxout network is 0.845, while the accuracy values obtained by the other techniques are 0.728 for regularized graph NN, 0.756 for LSTM, 0.786 for RNN+ensemble, 0.798 for CNN+LSTM, 0.808 for deep maxout network, 0.815 for FAT-based deep maxout network, and 0.828 for SSOA-based deep maxout network. The corresponding performance improvements of the proposed FASSO-based deep maxout network over the existing methods are 13.79%, 10.47%, 6.923%, 5.514%, 4.38%, 3.55%, and 2.01%.

Figure 6: Comparative assessment based on k-fold value. (a) Accuracy. (b) Specificity. (c) Sensitivity.

The assessment based on the specificity metric is presented in Figure 6(b). When the k-fold value is 9, the specificity of the proposed FASSO-based deep maxout network is 0.88, and the specificity values of the existing methods are 0.799, 0.819, 0.83, 0.844, 0.871, 0.878, and 0.879, respectively. The performance improvements achieved by the developed technique over the existing techniques are 9.2%, 6.93%, 5.68%, 4.09%, 1.02%, 0.113%, and 0.113%. The analysis using sensitivity is presented in Figure 6(c).

For the k-fold value of 9, the proposed FASSO-based deep maxout network achieved a sensitivity of 0.875, while the sensitivity values achieved by the existing techniques are 0.804, 0.808, 0.819, 0.857, 0.859, 0.866, and 0.868, respectively. The performance gains of the developed FASSO-based deep maxout network over the existing techniques are 8.11%, 7.66%, 6.4%, 2.06%, 1.83%, 1.03%, and 0.8%.

4.2. Analysis Based on Computational Cost

The analysis based on the computational cost is provided in Table 2. The proposed FASSO-based deep maxout network has a computational time of 0.0189654 sec, which is the minimum among the compared methods. The regularized graph NN has a high computational time of 0.634567 sec.

5. Conclusion

An efficient and robust FASSO-based deep maxout network classifier is developed, and its performance is evaluated using EEG signals. The proposed method is designed by combining the FAT and SSOA algorithms. The experiment is conducted using the DEAP database, and the performance is evaluated with various metrics. The proposed method is also compared with existing techniques to validate the obtained results. The improved outcomes in terms of accuracy, specificity, and sensitivity show the effectiveness of the proposed algorithm. Future work will consider various other deep learning classifiers for the accurate recognition and classification of human emotions. The training process can also be further improved by incorporating other optimization algorithms.

Data Availability

The DEAP dataset [24] is used for analyzing emotions from EEG, peripheral physiological, and video signals. The EEG and peripheral physiological signals of 32 participants are recorded, and every participant rates each video with respect to the levels of arousal, dominance, valence, familiarity, and like/dislike. Frontal face video is also recorded for 22 of the 32 participants. Moreover, the input size of each data record in the file and the label size are arranged such that every class is processed effectively.

Conflicts of Interest

The authors have no conflict of interest.

References

P. Pandey and K. R. Seeja, “Subject independent emotion recognition system for people with facial deformity: an EEG based approach,” Journal of Ambient Intelligence and Humanized Computing, vol. 12, no. 2, pp. 2311–2320, 2021.
P. Ekman and D. Keltner, Universal facial expressions of emotion, Nonverbal Communication: Where Nature Meets Culture, Lawrence Erlbaum Associates, 1997.
A. Mehrabian, “Pleasure-arousal-dominance: a general framework for describing and measuring individual differences in temperament,” Current Psychology, vol. 14, no. 4, pp. 261–292, 1996.
L. A. Schmidt and L. J. Trainor, “Frontal brain electrical activity distinguishes valence and intensity of musical emotions,” Cognition and Emotion, vol. 15, no. 4, pp. 487–500, 2001.
P. Zhong, D. Wang, and C. Miao, “EEG-based emotion recognition using regularized graph neural networks,” IEEE Transactions on Affective Computing, vol. 1, no. 1, 2020.
B. García-Martínez, A. Martínez-Rodrigo, R. Alcaraz, A. Fernández-Caballero, and P. González, “Nonlinear methodologies applied to automatic recognition of emotions: an EEG review,” in International Conference on Ubiquitous Computing and Ambient Intelligence (ICUCAI), pp. 754–765, 2017.
U. Rajendra Acharya, H. Fujita, V. K. Sudarshan, S. Bhat, and J. E. W. Koh, “Application of entropies for automated diagnosis of epilepsy using EEG signals: a review,” Knowledge-Based Systems, vol. 88, pp. 85–96, 2015.
G. G. Knyazev, J. Y. Slobodskoj-Plusnin, and A. V. Bocharov, “Gender differences in implicit and explicit processing of emotional facial expressions as revealed by event-related theta synchronization,” Emotion, vol. 10, no. 5, pp. 678–687, 2010.
M. Cabanac, “Physiological role of pleasure,” Science, vol. 173, no. 4002, pp. 1103–1107, 1971.
R. Sharma, R. B. Pachori, and P. Sircar, “Automated emotion recognition based on higher order statistics and deep learning algorithm,” Biomedical Signal Processing and Control, vol. 58, p. 101867, 2020.
X. Chai, Q. Wang, Y. Zhao, X. Liu, O. Bai, and Y. Li, “Unsupervised domain adaptation techniques based on auto-encoder for non-stationary EEG-based emotion recognition,” Computers in Biology and Medicine, vol. 79, pp. 205–214, 2016.
C. Wei, L. L. Chen, Z. Z. Song, X. G. Lou, and D. D. Li, “EEG-based emotion recognition using simple recurrent units network and ensemble learning,” Biomedical Signal Processing and Control, vol. 58, p. 101756, 2020.
H. Chao and L. Dong, “Emotion recognition using three-dimensional feature and convolutional neural network from multichannel EEG signals,” IEEE Sensors Journal, vol. 21, no. 2, pp. 2024–2034, 2020.
N. Salankar, P. Mishra, and L. Garg, “Emotion recognition from EEG signals using empirical mode decomposition and second-order difference plot,” Biomedical Signal Processing and Control, vol. 65, p. 102389, 2021.
Y. Yin, X. Zheng, B. Hu, Y. Zhang, and X. Cui, “EEG emotion recognition using fusion model of graph convolutional neural networks and LSTM,” Applied Soft Computing, vol. 100, p. 106954, 2021.
D. Maheshwari, S. K. Ghosh, R. K. Tripathy, M. Sharma, and U. R. Acharya, “Automated accurate emotion recognition system using rhythm-specific deep convolutional neural network technique with multi-channel EEG signals,” Computers in Biology and Medicine, vol. 134, p. 104428, 2021.
H. A. Gonzalez, S. Muzaffar, J. Yoo, and I. M. Elfadel, “BioCNN: a hardware inference engine for EEG-based emotion detection,” IEEE Access, vol. 8, pp. 140896–140914, 2020.
J. Hu, C. Wang, Q. Jia, Q. Bu, R. Sutcliffe, and J. Feng, “ScalingNet: extracting features from raw EEG data for emotion recognition,” Neurocomputing, vol. 63, no. 6, pp. 177–184, 2021.
Y. Liu and G. Fu, “Emotion recognition by deeply learned multichannel textural and EEG features,” Future Generation Computer Systems, vol. 119, pp. 1–6, 2021.
H. He, Y. Tan, J. Ying, and W. Zhang, “Strengthen EEG-based emotion recognition using firefly integrated optimization algorithm,” Applied Soft Computing, vol. 94, p. 106426, 2020.
W. Sun, F. Su, and L. Wang, “Improving deep neural networks with multi-layer maxout networks and a novel initialization method,” Neurocomputing, vol. 278, pp. 34–40, 2018.
Q. Q. Li, Z. C. He, and E. Li, “The feedback artificial tree (FAT) algorithm,” Soft Computing, vol. 24, no. 17, pp. 1–28, 2020.
A. Kaveh and A. Zaerreza, “Shuffled shepherd optimization method: a new meta-heuristic algorithm,” Engineering Computations, vol. 37, no. 7, 2020.
E. Gupta and R. S. Kushwah, “Combination of global and local features using DWT with SVM for CBIR,” in 4th International Conference on Reliability, Infocom Technologies and Optimization (ICRITO), pp. 1–6, 2015.
The DEAP database, March 2021, http://www.eecs.qmul.ac.uk/mmv/datasets/deap/index.html.

Copyright

Copyright © 2022 K. S. Bhanumathi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
