Grass Leaf Identification Using dbN Wavelet and CILBP
Advances in Multimedia publishes research on the technologies associated with multimedia systems, including computer-media integration for digital information processing, storage, transmission, and representation.
Advances in Multimedia maintains an Editorial Board of practicing researchers from around the world, to ensure manuscripts are handled by editors who are experts in the field of study.
Latest Articles
A Novel Multisupervised Coupled Metric Learning for Low-Resolution Face Matching
This paper presents a new multisupervised coupled metric learning (MS-CML) method for low-resolution face image matching. While coupled metric learning has achieved good performance in degraded face recognition, most existing methods adopt only the category label as supervision, which easily changes the distribution of samples in the coupled space, and these changes seriously affect the accuracy of degraded image matching. To address this problem, we propose an MS-CML method that trains a linear and a nonlinear metric model, each of which projects face pairs of different resolutions into the same latent feature space, where the distance of each positive pair is reduced and that of each negative pair is enlarged. We define a novel multisupervised objective function, which consists of a main objective function and an auxiliary objective function. The supervised information of the main objective function is the category label, which plays the major supervisory role. The supervised information of the auxiliary objective function is the distribution relationship of the samples, which plays an auxiliary supervisory role. Under the supervision of both the category label and the distribution information, the learned model can better deal with the intraclass multimodal problem, and the features obtained in the coupled space are more easily matched correctly. Experimental results on three different face datasets validate the efficacy of the proposed method.
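The pull-together, push-apart behaviour described in this abstract can be illustrated with a minimal sketch. It assumes two linear projections (the hypothetical `W_hr` and `W_lr`) and a contrastive-style pair loss with a margin; these names, the loss form, and the margin value are illustrative assumptions, not the paper's objective. Only label-based supervision is shown; the auxiliary distribution term is not defined in the abstract and is left out.

```python
import math

def project(matrix, vec):
    """Apply a linear map (given as a list of rows) to a feature vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def coupled_distance(W_hr, W_lr, x_hr, x_lr):
    """Euclidean distance between the two projections in the shared space."""
    a = project(W_hr, x_hr)
    b = project(W_lr, x_lr)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def pair_loss(dist, same_identity, margin=1.0):
    """Assumed contrastive-style loss: positive pairs are penalised for any
    distance, negative pairs only while they are closer than the margin."""
    if same_identity:
        return dist ** 2
    return max(0.0, margin - dist) ** 2

# Hypothetical toy projections: 2-D high-resolution and 3-D low-resolution
# features are both mapped into a 2-D coupled space.
W_hr = [[1.0, 0.0], [0.0, 1.0]]
W_lr = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
d = coupled_distance(W_hr, W_lr, [1.0, 2.0], [1.0, 2.0, 5.0])
print(d, pair_loss(d, True), pair_loss(d, False))
```

In a training loop, the projection matrices would be updated to reduce the summed pair loss; that optimisation step is omitted here.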
Pyramidal Part-Based Model for Partial Occlusion Handling in Pedestrian Classification
Pedestrian detection and classification are of increasing interest in intelligent transportation systems (ITS). The challenging issues include tiny and occluded appearances, large variations in human pose, cluttered backgrounds, and complex environments. Handling partial occlusion is important for avoiding accidents between pedestrians and vehicles, since pedestrians who suddenly cross the road are difficult to detect. To solve the partial occlusion problem, a pyramidal part-based model (PPM) is proposed that obtains a more accurate prediction from the majority vote of the confidence scores of the visible parts by cascading the pyramidal structure. In experiments, the proposed scheme achieved 96.25% accuracy on the INRIA dataset and 81% accuracy on the PSU (Prince of Songkla University) dataset. The proposed model can be applied in real-world environments to classify partially occluded pedestrians using the part representations available at each pyramid layer.
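The majority vote over visible parts can be sketched as follows. This is a simplified illustration under stated assumptions: each part is assumed to come with a confidence score and a visibility flag, and the 0.5 threshold and the function name `classify_by_part_votes` are hypothetical. The cascaded pyramid layers of the PPM are not modelled.

```python
def classify_by_part_votes(part_scores, visible, threshold=0.5):
    """Return True (pedestrian) when more than half of the visible parts
    have a confidence score at or above the threshold.

    part_scores: per-part confidence scores in [0, 1] (assumed input format)
    visible:     per-part flags; occluded parts are excluded from the vote
    """
    votes = [score >= threshold
             for score, is_visible in zip(part_scores, visible)
             if is_visible]
    if not votes:
        # No visible part: nothing to vote on, so reject by convention.
        return False
    return sum(votes) * 2 > len(votes)

# The occluded third part has a low score but does not get a vote.
print(classify_by_part_votes([0.9, 0.8, 0.1], [True, True, False]))
```

Excluding occluded parts from the vote is what keeps a low score on a hidden region from overturning the decision of the parts that can actually be seen.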
High Dynamic Range Imaging Based on Bidirectional Structural Similarities and Weighted Low-Rank Matrix Completion
High dynamic range (HDR) imaging, which aims to increase the dynamic range of an image by merging multiexposure images, has attracted much attention. Ghosts are often observed in the resultant image because of camera motion and object motion in the scene. Low-rank matrix completion (LRMC) provides an effective tool to remove ghosts. However, it requires the user to specify the included or excluded regions. In this paper, we propose a novel HDR imaging method based on bidirectional structural similarities and weighted low-rank matrix completion. In our method, we first propose the bidirectional structural similarities, comprising forward-projection structural similarity (FPSS) and backward-projection structural similarity (BPSS), to divide each image into four groups: motion regions, saturated regions in the source image, saturated regions in the reference image, and static and unsaturated regions. Then, the weight maps and the motion maps constructed from FPSS and BPSS are introduced into the weighted LRMC model to reconstruct the background irradiance maps. Experiments are conducted on several challenging image sets with complex scenes, and the results show that the proposed method outperforms three current state-of-the-art methods and Photoshop CS6 and is robust to the choice of reference image.
A Color-Image Encryption Scheme Using a 2D Chaotic System and DNA Coding
This paper proposes a method of encrypting images with password protection for secure sharing, based on deoxyribonucleic acid (DNA) sequence operations and the tangent-delay ellipse reflecting cavity map system (TD-ERCS). The initial values of the TD-ERCS are generated from a user’s password, and the TD-ERCS is used to scramble the pixel locations of the R, G, and B matrices of the original image. Next, the permuted color image is encoded into three DNA-sequence matrices. Then, the TD-ERCS is employed to generate three chaotic sequences, which are also encoded as three DNA-sequence matrices. Thereafter, a DNA exclusive OR (XOR) operation is executed between the DNA sequences of the permuted image and the DNA sequences generated by the TD-ERCS to produce three encrypted scrambled matrices. Finally, the matrices of the DNA sequences are decoded, and the R, G, and B channels are recombined to form an encrypted color image. The results of simulation and security tests reveal that the proposed algorithm offers robust encryption and demonstrates the ability to resist exhaustive, statistical, and differential attacks.
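The DNA encoding and DNA XOR steps can be sketched for a single color channel as shown here. The base mapping (00 to A, 01 to C, 10 to G, 11 to T) is one assumed rule, and the keystream is passed in as plain bytes rather than generated by the TD-ERCS; both are illustrative assumptions. The pixel-scrambling step is omitted.

```python
BASES = "ACGT"  # assumed coding rule: 00->A, 01->C, 10->G, 11->T

def byte_to_dna(value):
    """Encode one 8-bit value as four bases, most significant bits first."""
    return "".join(BASES[(value >> shift) & 0b11] for shift in (6, 4, 2, 0))

def dna_to_byte(strand):
    """Decode four bases back into one 8-bit value."""
    value = 0
    for base in strand:
        value = (value << 2) | BASES.index(base)
    return value

def dna_xor(a, b):
    """XOR two strands base by base, using the 2-bit code of each base."""
    return "".join(BASES[BASES.index(x) ^ BASES.index(y)] for x, y in zip(a, b))

def encrypt_channel(pixels, keystream):
    """DNA-encode pixels and keystream, XOR them, and decode the result."""
    return [dna_to_byte(dna_xor(byte_to_dna(p), byte_to_dna(k)))
            for p, k in zip(pixels, keystream)]

# Hypothetical keystream bytes standing in for a chaotic sequence.
cipher = encrypt_channel([27, 200], [255, 15])
print(byte_to_dna(27), cipher, encrypt_channel(cipher, [255, 15]))
```

Because XOR is its own inverse, applying `encrypt_channel` again with the same keystream recovers the original pixel values, which is why decryption needs only the password-derived sequences.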
Euclidean Distance-Based Weighted Prediction for Merge Mode in HEVC
Merge mode can achieve a considerable coding gain by reducing the cost of coding motion information in video codecs. However, simply adopting the motion information from the neighbouring blocks may not achieve optimal performance, because the motion correlation between a pixel and a neighbouring block decreases as the distance between them increases. To address this problem, the paper proposes a Euclidean distance-based weighted prediction algorithm as an additional candidate in the merge mode. First, several predicted blocks are generated by motion compensation prediction (MCP) with the motion information from the available neighbouring blocks. Second, an additional predicted block is generated as a weighted average of these predicted blocks, where each weighting coefficient depends on the Euclidean distance from the neighbouring candidate to the pixel in the current block. Finally, the best merge mode is selected by rate-distortion optimization (RDO) among the original merge candidates and the additional candidate. Experimental results show that, on the joint exploration test model 7.0 (JEM 7.0), the proposed algorithm achieves better coding performance than the original merge mode under all configurations, including random access (RA), low delay B (LDB), and low delay P (LDP), with a slight increase in coding complexity. For the LDP configuration in particular, the proposed method achieves a 1.50% bitrate saving on average.
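The distance-weighted averaging step can be sketched as below. The abstract does not give the weighting formula, so inverse-distance weights are assumed here purely for illustration; the function name and the way candidate positions are represented as `(x, y)` coordinates outside the block are likewise hypothetical.

```python
import math

def weighted_merge_prediction(pred_blocks, candidate_positions):
    """Blend several predicted blocks into one additional candidate block.

    pred_blocks:         list of 2-D lists (rows of samples), one per candidate
    candidate_positions: (x, y) location of each neighbouring candidate,
                         assumed to lie outside the current block so that no
                         pixel-to-candidate distance is zero
    """
    height = len(pred_blocks[0])
    width = len(pred_blocks[0][0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Assumed rule: a closer candidate gets a proportionally larger weight.
            weights = [1.0 / math.hypot(x - cx, y - cy)
                       for cx, cy in candidate_positions]
            blended = sum(w * blk[y][x] for w, blk in zip(weights, pred_blocks))
            out[y][x] = blended / sum(weights)
    return out

# Toy 1x2 block with one candidate to the left and one to the right.
left_pred = [[0.0, 0.0]]
right_pred = [[30.0, 30.0]]
print(weighted_merge_prediction([left_pred, right_pred], [(-1, 0), (2, 0)]))
```

Each output sample leans toward the prediction of whichever neighbour is nearer, which mirrors the stated motivation that motion correlation weakens with distance. In an encoder, this blended block would then compete with the original merge candidates in the RDO decision.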
A Human-Computer Interaction System for Agricultural Tools Museum Based on Virtual Reality Technology
Traditional museums and most digital museums use window displays to exhibit their collections. However, agricultural tools are distinctive for their practical use and the wisdom they embody. Therefore, this paper first proposes a virtual interactive display method for agricultural tools based on virtual reality technology, which combines static display with dynamic use to show the tools vividly. To address the problems of rigid interaction and poor user experience in human-computer interaction, four interaction technologies are proposed to design and construct a more user-friendly system: intelligent scene switching, multichannel introduction, interactive virtual roaming, and task-based interaction. The evaluation results demonstrate that the proposed system performs well in fluency, instructiveness, amusement, and practicability. This human-computer interaction system can not only show the wisdom of traditional Chinese agricultural tools to users around the world but also offer a new method of digital museum design.