Review Article

Academic Emotion Classification Using FER: A Systematic Review

Table 3

Summary of the advantages and disadvantages of the emotion classifier algorithms.

Algorithm | Advantage | Disadvantage

FER classification method: conventional machine learning algorithms

Viola-Jones algorithm | Fast face detection [34] | Lower detection accuracy than more complex algorithms [34]
Support vector machine (SVM) | High accuracy in classification tasks [77] | Computationally intensive, especially for large datasets or complex models [77]
Support vector regression (SVR) | Less susceptible to overfitting [57] | Computationally expensive, particularly for large datasets [78]
Extreme learning machine (ELM) | Performs classification quickly [59] | Takes more computational time and has lower accuracy than PNN [59]
Probabilistic neural network (PNN) | Takes less computational time than ELM and is more efficient in classification [59] | Possible overfitting of the data [59]
Decision tree | Can handle missing data by not incorporating the missing feature during decision-making [79] | More computation time to build than KNN [60]
K-nearest neighbors (KNN) | Higher accuracy than the decision tree algorithm, with lower training-time cost [60] | Slower than a trained decision tree when classifying new samples [60]
Multilayer perceptron (MLP) | Capable of adaptive learning and optimized processing [80] | Lower classification accuracy than the random forest algorithm [69]
Adaptive boosting (AdaBoost) | Builds a strong classifier out of weak learners [81] | Susceptible to noisy data [82]
Random forest (RF) | Robust to noisy data and outliers [83] | Prone to overfitting [84]

FER classification method: deep learning algorithms

Convolutional neural network (CNN) | Effective at handling complex image and video data [61] | Requires a large amount of training data and significant data augmentation to avoid overfitting [85]
Neural network | Able to achieve high classification accuracy [33] | Long processing time [33]
Deep belief network (DBN) | Robust in classification [35] | Requires large amounts of training data [35]
Long short-term memory (LSTM) | Addresses the vanishing gradient problem [52] | Slow computation for complex architectures [52]
Temporal relation network (TRN) | Achieves state-of-the-art performance on FER benchmarks [56] | Prone to overfitting or underfitting on small datasets [56]
Deep facial spatiotemporal network (DFSTN) | Able to fuse facial spatiotemporal information [64] | Requires larger amounts of training data to learn effective feature representations and avoid overfitting [64]
Deep CNN (DCNN) | Effective at learning complex features from raw image data [86] | Difficult to interpret: the mechanisms behind the model's decision-making process can be hard to understand [87]
Squeeze and excitation-deep adaptation network (SE-DAN) | Suited to transfer learning and domain adaptation [71] | Requires significant computational resources and time to train [71]
Multitask cascaded convolutional neural network (MTCNN) | High accuracy in detection and classification tasks; able to detect multiple faces in a single image [75] | Requires a large amount of training data to achieve high accuracy [75]

FER classification method: hybrid algorithms

Support vector machine + convolutional neural network (SVM+CNN) | Better classification performance than either algorithm alone [30] | Hyperparameter tuning across the two combined algorithms is challenging and time-consuming [88]
BERN (combination of temporal convolution, bidirectional LSTM, and attention mechanism) | Achieved state-of-the-art performance [47] | Requires a large amount of training data and a long training time [47]
Hybrid convolutional neural network (hybrid CNN) | More robust to variations in input data [89] | Long training time [48]
Hybrid deep neural network (hybrid DNN) | Handles a wide range of data types and classification tasks [90] | Requires a large amount of training data [54]
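The trade-off noted for KNN in the table, little to no training cost but a full scan of the stored data at classification time, can be made concrete with a minimal sketch in plain Python. The 2D feature points and emotion labels below are illustrative toy values, not data from any of the cited studies:

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of ((x, y), label) pairs. There is no training step:
    all the work happens at prediction time, which is why KNN slows down
    as the stored dataset grows."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy 2D "feature" points with two emotion labels (illustrative only)
train = [((0.0, 0.0), "neutral"), ((0.1, 0.2), "neutral"),
         ((0.9, 1.0), "happy"), ((1.1, 0.8), "happy"), ((1.0, 1.1), "happy")]

print(knn_predict(train, (0.95, 0.9)))  # the 3 nearest points are all "happy"
```

Every call to `knn_predict` sorts the whole training list, so prediction cost grows with dataset size, while a decision tree pays its computation cost once at build time and then classifies quickly.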
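AdaBoost's advantage in the table, building a strong classifier out of weak learners [81], can also be sketched in a few lines of plain Python. The weak learner here is a one-dimensional threshold "stump", and the data is an illustrative toy pattern, not from any of the reviewed studies:

```python
import math

def train_adaboost(xs, ys, rounds=5):
    """AdaBoost over 1-D threshold stumps (labels are +1/-1).

    Each round picks the stump with the lowest weighted error, then boosts
    the weights of the points it misclassified so the next stump focuses
    on them."""
    n = len(xs)
    w = [1.0 / n] * n                      # uniform initial weights
    ensemble = []                          # list of (alpha, threshold, sign)
    for _ in range(rounds):
        best = None
        for t in xs:                       # candidate thresholds at the data
            for s in (1, -1):              # stump predicts s if x >= t else -s
                err = sum(wi for wi, x, y in zip(w, xs, ys)
                          if (s if x >= t else -s) != y)
                if best is None or err < best[0]:
                    best = (err, t, s)
        err, t, s = best
        alpha = 0.5 * math.log((1 - err) / max(err, 1e-10))
        ensemble.append((alpha, t, s))
        # increase weight on misclassified points, then renormalise
        w = [wi * math.exp(-alpha * y * (s if x >= t else -s))
             for wi, x, y in zip(w, xs, ys)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(a * (s if x >= t else -s) for a, t, s in ensemble)
    return 1 if score >= 0 else -1

xs, ys = [0, 1, 2, 3], [1, -1, -1, 1]      # no single threshold separates this
model = train_adaboost(xs, ys)
print([predict(model, x) for x in xs])     # → [1, -1, -1, 1]
```

No single stump can label all four points correctly, but the boosted ensemble does, which is the "weak learners combined into a strong one" behaviour the table describes. The noted susceptibility to noisy data [82] is also visible in the sketch: misclassified points, including mislabelled ones, receive exponentially growing weight.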