Mathematical Problems in Engineering

Volume 2015 (2015), Article ID 671419, 21 pages

http://dx.doi.org/10.1155/2015/671419

## Human Motion Estimation Based on Low Dimensional Space Incremental Learning

School of Electronic and Information Engineering, South China University of Technology, Guangzhou, Guangdong 510641, China

Received 28 September 2014; Revised 5 January 2015; Accepted 9 January 2015

Academic Editor: Mohamed Djemai

Copyright © 2015 Wanyi Li and Jifeng Sun. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

This paper proposes a novel algorithm, low dimensional space incremental learning (LDSIL), to estimate human motion in 3D from the silhouettes of multiview images of human motion. The proposed algorithm takes advantage of stochastic extremum memory adaptive searching (SEMAS) and an incremental probabilistic dimension reduction model (IPDRM) to collect new high dimensional data samples. These samples are selected to update the mapping from the low dimensional space to the high dimensional space, so that incremental learning can estimate human motion from a small number of samples. Compared with three traditional algorithms, the proposed algorithm performs well in disambiguating silhouettes, overcoming transient occlusion, and reducing estimation error.

#### 1. Introduction

Human motion estimation has become an active research topic [1–3], but it remains a challenging task. In particular, we are interested in estimating human motion in 3D from the silhouettes of multiview images of human motion. The main challenges are as follows: first, it is hard to build the mapping between multiview silhouettes and human motion in 3D; second, the matching between multiview silhouettes and human motion in 3D is ambiguous; finally, it is hard to determine the spatial position of the human motion depicted in the multiview images. In the past few years, a number of algorithms have been proposed to estimate human motion. Sigal et al. [4] and Deutscher and Reid [5] use improved particle filters to estimate human motion in 3D. These methods do not work well because they search (sample) in high dimensional (HD) space many times. Searching directly in HD space causes problems: searching over a large scale yields invalid data, whereas searching over a small scale fails to reach the target data. Moreover, searching many times over a small scale also produces invalid data. Li et al. use principal component analysis (PCA) to reduce the dimension of the HD data samples and simulated annealing particle swarm optimization (SAPSO) to estimate human motion in 3D [6]. This algorithm is time-consuming and does not perform well, because the HD data recovered from the corresponding low dimensional (LD) data can differ considerably from the original HD data. Besides, it does not consider the spatial position of the human motion. Some traditional Monte Carlo methods [7–9] cannot ensure that the best sample is collected each time during searching; stochastic extremum memory adaptive searching (SEMAS) is proposed to solve this problem. In the work of Wang et al. [10], the Gaussian process dynamical model (GPDM) is used to reduce the dimension of the HD data, acquire the corresponding LD data, and build the mapping from LD space to HD space. However, GPDM cannot quickly reduce the dimension of a new HD data sample to acquire the new corresponding LD data; the incremental probabilistic dimension reduction model (IPDRM), which is based on GPDM, is proposed to solve this problem. Some improved incremental or nonincremental learning algorithms [11–14] cannot satisfy our needs. Their output data, which denote a class label or other simple information, have only one dimension, which is insufficient to describe some outputs, and they cannot carry out unsupervised incremental learning of HD data. Inspired by the research above, we observe that the key to estimating human motion in 3D is generating better prior information. Human motion in 3D can then be estimated more accurately by searching around the better prior information over a small scale only once. In this paper, we mainly focus on regular human motion cycles (walking or running).

Our task is to use a small number of HD data samples as prior information to estimate the human motion in 3D that matches the multiview images. Building on the work above, we propose a novel algorithm called low dimensional space incremental learning (LDSIL). LDSIL uses SEMAS and IPDRM to collect new HD samples and updates the mapping from LD space to HD space through the selection of new HD samples, so that searching in the LD space generates more accurate HD data for estimating the corresponding human motion in 3D. SEMAS is used to find the spatial position of the human motion in 3D, and it finds the best data sample during searching more easily. IPDRM is used to reduce the dimension of a new HD data sample and acquire the new corresponding LD data, and it helps to select the new HD samples through the mapping of incremental dimension reduction. Moreover, it provides the LD space from which valid HD data are generated. Based on IPDRM, the HD data samples for incremental learning are selected by comparing their corresponding LD data.

The main contributions of this paper are as follows:

(1) SEMAS is proposed to find the spatial position of the human motion model. It finds the best data sample more reliably than the traditional Monte Carlo methods.

(2) IPDRM is proposed to reduce the dimension of a new HD data sample and acquire the new corresponding LD data. It supports incremental learning in LD space to update the mapping from LD space to HD space. Besides, it provides the LD space from which valid HD data are generated through searching.

(3) A method of selecting the HD data samples is proposed; it is used to update the mapping from LD space to HD space.

Overall, because LDSIL makes use of LD data, it can solve the problems mentioned above and contributes to estimating human motion in 3D. It performs better than other traditional algorithms in disambiguating silhouettes, overcoming transient occlusion, and reducing estimation error.

The rest of this paper is organized as follows. Section 2 introduces the data and models used to estimate the human motion. Section 3 proposes the SEMAS algorithm to find the spatial position of the human motion model. Section 4 proposes IPDRM to achieve incremental dimension reduction. Section 5 proposes orthogonal least squares learning of multiple outputs (OLSLMO) to learn the mappings (HD space to LD space and LD space to HD space). Section 6 discusses LDSIL based on Sections 2–5 and describes how to select new HD data samples from the estimated human motion models to achieve incremental learning in the LD space. Section 7 proposes the method of searching in LD space to estimate the human motion model; the method takes advantage of SEMAS and IPDRM. Section 8 shows the validity of the proposed algorithm (LDSIL) through experiments and evaluations. Section 9 discusses the limitations of the LDSIL algorithm and future improvements.

The following sections give a more detailed discussion.

#### 2. Corresponding Data and Models

We adopt the corresponding data and models from [4, 5]. All image data can be found in the HumanEva-I dataset [4], as shown in Figures 1(a)–1(d). Figure 1(a) shows the human motion model denoting the human motion in 3D, which is described by HD data. The model is the object we estimate, and it needs to match the limbs in the multiview images. Figure 1(b) shows the multiview images, which depict the human motion and its spatial position. After processing the multiview images with image segmentation algorithms [15–17], we obtain the silhouettes shown in Figure 1(c). Then, we project the model to the corresponding views and obtain the projection images in Figure 1(d). The images in Figure 1(d) are compared with the images in Figure 1(c).
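The comparison between projected and segmented silhouettes can be illustrated with a minimal sketch. This section does not state which matching measure is used, so the overlap-based mismatch below (one minus intersection over union, averaged over views) is an assumption chosen for illustration only. The function names `silhouette_mismatch` and `multiview_mismatch` and the toy masks are hypothetical.

```python
import numpy as np


def silhouette_mismatch(observed: np.ndarray, projected: np.ndarray) -> float:
    """Return 1 - IoU of two binary masks (0 = identical, 1 = disjoint).

    Assumed measure for illustration; not taken from the paper.
    """
    obs = observed.astype(bool)
    proj = projected.astype(bool)
    union = np.logical_or(obs, proj).sum()
    if union == 0:
        # Both masks are empty, so there is nothing to disagree about.
        return 0.0
    overlap = np.logical_and(obs, proj).sum()
    return float(1.0 - overlap / union)


def multiview_mismatch(observed_views, projected_views) -> float:
    """Average the per-view mismatch over all camera views."""
    if len(observed_views) != len(projected_views) or len(observed_views) == 0:
        raise ValueError("expected the same non-zero number of views")
    scores = [
        silhouette_mismatch(obs, proj)
        for obs, proj in zip(observed_views, projected_views)
    ]
    return float(np.mean(scores))


# Toy 4x4 masks standing in for a segmented silhouette (Figure 1(c))
# and a model projection (Figure 1(d)).
observed = np.zeros((4, 4), dtype=np.uint8)
observed[1:3, 1:3] = 1          # 4 foreground pixels
projected = np.zeros((4, 4), dtype=np.uint8)
projected[1:3, 2:4] = 1         # 4 foreground pixels, shifted one column

single_view = silhouette_mismatch(observed, projected)
two_views = multiview_mismatch([observed, observed], [observed, projected])
```

In this toy case the two masks share 2 of the 6 pixels in their union, so the single-view mismatch is 2/3. A lower score indicates that the projected model agrees more closely with the segmented silhouettes.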