Advances in Multimedia
 Journal metrics
Acceptance rate: 8%
Submission to final decision: 84 days
Acceptance to publication: 26 days
CiteScore: 1.800
Impact Factor: -

Face Alignment Algorithm Based on an Improved Cascaded Convolutional Neural Network

 Journal profile

Advances in Multimedia publishes research on the technologies associated with multimedia systems, including computer-media integration for digital information processing, storage, transmission, and representation.

 Editor spotlight

Advances in Multimedia maintains an Editorial Board of practicing researchers from around the world, to ensure manuscripts are handled by editors who are experts in the field of study.

 Special Issues

Do you think there is an emerging area of research that really needs to be highlighted? Or an existing research area that has been overlooked or would benefit from deeper investigation? Raise the profile of a research area by leading a Special Issue.

Latest Articles

Research Article

Multimedia Archives: New Digital Filters to Correct Equalization Errors on Digitized Audio Tapes

Multimedia archives face the problem of obsolescing and degrading analogue media (e.g., speech and music recordings and video art). In response, researchers in the field have recently begun studying ad hoc tools for the preservation of, and access to, historical analogue documents. This paper investigates the active preservation process of audio tape recordings, specifically focusing on possible means of compensating for equalization errors introduced in the digitization process. If the accuracy of corrective equalization filters is validated, an archivist or musicologist would be able to experience the audio as a historically authentic document without having to recover the original analogue audio document or redigitize the audio. Thus, we conducted a MUSHRA-inspired perception test (n = 14) containing 6 excerpts of electronic music (3 stimuli recorded NAB and 3 recorded CCIR). Participants listened to 6 different equalization filters for each stimulus and rated them in terms of similarity. Filters included a correctly digitized “Reference,” an intentionally incorrect “Foil” filter, and a subsequent digital correction of the Foil filter that was produced with a MATLAB script. When stimuli were collapsed according to their filter type (NAB or CCIR), no significant differences were observed between the Reference and MATLAB correction filters. As such, the digital correction appears to be a promising method for compensating for equalization errors, although future study is recommended, specifically with a larger sample size and additional correction filters for comparison.

Research Article

Knowledge Graph Reasoning Based on Tensor Decomposition and MHRP-Learning

When learning and reasoning over a knowledge graph, existing tensor decomposition techniques consider only the direct relationships between entities and ignore the graph structure of the knowledge graph. To solve this problem, this paper proposes a knowledge graph reasoning algorithm based on multihop relational path learning (MHRP-learning) and tensor decomposition. First, MHRP-learning is used to obtain the relation paths between entity pairs in the knowledge graph. Then, tensor decomposition is performed to obtain a novel learning framework. Finally, experiments show that the proposed method achieves advanced results and is applicable to knowledge graph reasoning.
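The two stages described in this abstract, collecting multihop relation paths and then scoring with a factorized model, can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not name the decomposition, so a RESCAL-style bilinear score is assumed as a stand-in, and the function names, toy triples, and hand-set matrices are all hypothetical.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
Matrix = List[List[float]]


def find_relation_paths(triples: List[Triple], head: str, tail: str,
                        max_hops: int = 2) -> List[List[str]]:
    """Enumerate relation paths of at most max_hops edges from head to tail."""
    adjacency: Dict[str, List[Tuple[str, str]]] = {}
    for h, r, t in triples:
        adjacency.setdefault(h, []).append((r, t))
    paths: List[List[str]] = []
    stack = [(head, [], {head})]
    while stack:
        node, relations, visited = stack.pop()
        if node == tail and relations:
            paths.append(relations)
            continue
        if len(relations) == max_hops:
            continue
        for relation, neighbour in adjacency.get(node, []):
            if neighbour not in visited:  # avoid cycles
                stack.append((neighbour, relations + [relation],
                              visited | {neighbour}))
    return sorted(paths)


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product, kept dependency-free for the sketch."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def compose_path(relation_mats: Dict[str, Matrix], path: List[str]) -> Matrix:
    """Assumed composition rule: multiply relation matrices along the path."""
    result = relation_mats[path[0]]
    for relation in path[1:]:
        result = matmul(result, relation_mats[relation])
    return result


def bilinear_score(e_head: List[float], w: Matrix, e_tail: List[float]) -> float:
    """RESCAL-style score e_head^T W e_tail (an assumed stand-in)."""
    return sum(e_head[i] * w[i][j] * e_tail[j]
               for i in range(len(e_head)) for j in range(len(e_tail)))


# Toy data, invented for illustration only.
TRIPLES = [("alice", "born_in", "paris"),
           ("paris", "city_of", "france"),
           ("alice", "nationality", "france")]
SWAP = [[0.0, 1.0], [1.0, 0.0]]
RELATION_MATS = {"born_in": SWAP, "city_of": SWAP}
```

In a trained model the entity vectors and relation matrices would be learned from data; here they are set by hand so that the path composition is easy to follow.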

Research Article

A Vehicle Reidentification Algorithm Based on Double-Channel Symmetrical CNN

Accurately identifying previously seen vehicles in massive volumes of surveillance data has become a challenging research topic. The challenge is that a vehicle's pose, viewing angle, illumination, and other conditions vary widely across images, and these complex variations seriously degrade recognition performance. In recent years, convolutional neural networks (CNNs) have achieved great success in vehicle reidentification. However, because vehicle reidentification datasets contain few annotated vehicles, existing CNN models are not fully trained, which limits the discriminative ability of the deep learning model. To solve these problems, a double-channel symmetric CNN vehicle recognition algorithm is proposed by improving the network structure. The method takes two samples as input at the same time, and each sample has complementary characteristics. As a result, even with limited training samples, the input combinations are more diverse and the CNN training process is richer. Experiments show that the recognition accuracy of the proposed algorithm is better than that of other existing methods, which verifies its effectiveness.
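The claim that paired inputs diversify a small training set can be made concrete with a short sketch. The abstract does not say how the two input samples are chosen, so exhaustive pairing is assumed here, and `make_pairs` and the sample identifiers are hypothetical names used only for illustration.

```python
from itertools import combinations
from typing import List, Tuple

Sample = Tuple[str, str]  # (image_id, vehicle_id)


def make_pairs(samples: List[Sample]) -> List[Tuple[Tuple[str, str], int]]:
    """Build every unordered image pair with a same-vehicle label.

    Label 1 means both images show the same vehicle, 0 means different
    vehicles. n samples yield n * (n - 1) / 2 pairs, which is why a
    two-input network sees far more distinct inputs than n single images.
    """
    return [((img_a, img_b), int(veh_a == veh_b))
            for (img_a, veh_a), (img_b, veh_b) in combinations(samples, 2)]


# Toy annotations, invented for illustration only.
SAMPLES = [("img0", "car_A"), ("img1", "car_A"),
           ("img2", "car_B"), ("img3", "car_B")]
PAIRS = make_pairs(SAMPLES)
```

With four annotated images this produces six pairs, two of which are same-vehicle pairs; in practice the positive and negative pairs would usually be rebalanced before training.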

Research Article

Lifting-Based Fractional Wavelet Filter: Energy-Efficient DWT Architecture for Low-Cost Wearable Sensors

This paper proposes and evaluates the LFrWF, a novel lifting-based architecture to compute the discrete wavelet transform (DWT) of images using the fractional wavelet filter (FrWF). To reduce the memory requirement of the proposed architecture, only one image line is read into a buffer at a time. Aside from an LFrWF version with multipliers, we develop a multiplier-less LFrWF version, which reduces the critical path delay (CPD) to the delay of an adder. The two proposed LFrWF architectures are compared with state-of-the-art DWT architectures in terms of the required adders, multipliers, memory, and critical path delay. Moreover, the proposed architectures and the state-of-the-art FrWF architectures (with and without multipliers) are compared through implementation on the same FPGA board. The LFrWF with multipliers requires 22% fewer look-up tables (LUT), 34% fewer flip-flops (FF), and 50% fewer compute cycles (CC) and consumes 65% less energy than the FrWF with multipliers. Also, the multiplier-less LFrWF requires 50% fewer CC and consumes 43% less energy than the multiplier-less FrWF. Thus, the proposed LFrWF architectures appear suitable for computing the DWT of images on wearable sensors.
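As a software illustration of a lifting step applied to one image line at a time, the sketch below uses the reversible integer 5/3 wavelet, which needs only additions and bit shifts. This is an assumption: the abstract does not state which wavelet filter or boundary handling the LFrWF hardware uses, and the function names are hypothetical.

```python
from typing import List, Tuple


def lifting_53_forward(line: List[int]) -> Tuple[List[int], List[int]]:
    """One level of the integer 5/3 lifting transform on a single line.

    Returns (approximation, detail). Uses only adds and shifts, with
    mirrored samples at both ends of the line.
    """
    n = len(line)
    if n < 2 or n % 2:
        raise ValueError("line length must be even and at least 2")
    even, odd = line[0::2], line[1::2]
    half = n // 2
    detail = []
    for i in range(half):  # predict step: odd sample minus neighbour average
        right = even[i + 1] if i + 1 < half else even[i]
        detail.append(odd[i] - ((even[i] + right) >> 1))
    approx = []
    for i in range(half):  # update step: even sample plus smoothed details
        left = detail[i - 1] if i > 0 else detail[0]
        approx.append(even[i] + ((left + detail[i] + 2) >> 2))
    return approx, detail


def lifting_53_inverse(approx: List[int], detail: List[int]) -> List[int]:
    """Undo lifting_53_forward by running the two steps in reverse order."""
    half = len(approx)
    even = []
    for i in range(half):  # undo update
        left = detail[i - 1] if i > 0 else detail[0]
        even.append(approx[i] - ((left + detail[i] + 2) >> 2))
    line = []
    for i in range(half):  # undo predict and interleave
        right = even[i + 1] if i + 1 < half else even[i]
        line.append(even[i])
        line.append(detail[i] + ((even[i] + right) >> 1))
    return line


LINE = [10, 12, 14, 13, 9, 7, 8, 8]
APPROX, DETAIL = lifting_53_forward(LINE)
RESTORED = lifting_53_inverse(APPROX, DETAIL)
```

Because each lifting step is undone exactly by its inverse, the round trip restores the original line, and processing rows one by one keeps the working buffer to a single line.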

Research Article

Stock Index Prices Prediction via Temporal Pattern Attention and Long-Short-Term Memory

This study attempts to predict stock index prices using multivariate time series analysis. The study’s motivation is based on the notion that datasets of stock index prices involve weak periodic patterns and both long-term and short-term information, for which traditional approaches and current neural networks, such as autoregressive models and support vector machines (SVMs), may fail. To overcome this issue, the study applies Temporal Pattern Attention and Long-Short-Term Memory (TPA-LSTM) for prediction. The results show that stock index price prediction with the TPA-LSTM algorithm achieves better prediction performance than traditional deep neural networks, such as the recurrent neural network (RNN), convolutional neural network (CNN), and long and short-term time series network (LSTNet).

Research Article

Context-Aware Attention Network for Human Emotion Recognition in Video

Recognition of human emotion from facial expression is affected by distortions of pictorial quality and facial pose, which are often ignored by traditional video emotion recognition methods. In addition, context information can provide varying degrees of extra clues, which can further improve recognition accuracy. In this paper, we first build a video dataset with seven categories of human emotion, named human emotion in the video (HEIV). With the HEIV dataset, we trained a context-aware attention network (CAAN) to recognize human emotion. The network consists of two subnetworks that process face and context information. Features from facial expression and context clues are fused to represent the emotion of video frames and are then passed through an attention network to generate emotion scores. The emotion features of all frames are then aggregated according to their emotion scores. Experimental results show that our proposed method is effective on the HEIV dataset.
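The final aggregation step, combining per-frame features according to their emotion scores, can be sketched as a weighted average. The abstract does not specify the weighting, so softmax normalisation of the scores is assumed here, and `aggregate_frames` is a hypothetical name.

```python
import math
from typing import List, Tuple


def aggregate_frames(features: List[List[float]],
                     scores: List[float]) -> Tuple[List[float], List[float]]:
    """Combine per-frame feature vectors into one video-level vector.

    Each frame is weighted by the softmax of its score (an assumed
    choice), so higher-scoring frames contribute more to the result.
    Returns (video_feature, weights).
    """
    if not features or len(features) != len(scores):
        raise ValueError("need one score per frame and at least one frame")
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    video = [sum(w * frame[k] for w, frame in zip(weights, features))
             for k in range(dim)]
    return video, weights


# Two toy frames with equal scores reduce to a plain average.
VIDEO, WEIGHTS = aggregate_frames([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

When one frame's score is much higher than the others, its weight approaches 1 and the video-level feature approaches that frame's feature.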
