Department of Environmental Engineering, Faculty of Environmental Engineering, The University of Kitakyushu, Kitakyushu-shi 808-0135, Japan
Abstract
We propose two compression methods for the human motion in 3D space, based on the forward and inverse kinematics. In a motion chain, a movement of each joint is represented by a series of vector signals in 3D space. In general, specific types of joints such as end effectors often require higher precision than other general types of joints in, for example, CG animation and robot manipulation. The first method, which combines wavelet transform and forward kinematics, enables users to reconstruct the end effectors more precisely. Moreover, progressive decoding can be realized. The distortion of parent joint coming from quantization affects its child joint in turn and is accumulated to the end effector. To address this problem and to control the movement of the whole body, we propose a prediction method further based on the inverse kinematics. This method achieves efficient compression with a higher compression ratio and higher quality of the motion data. By comparing with some conventional methods, we demonstrate the advantage of ours with typical motions.
1. Introduction
3D motion capture systems have been widely used in CG entertainment and human motion analysis, such as games and athlete training. To use these human motion capture data to produce compelling animation, users need a motion library that stores the existing motion data. Large motion databases cannot hold the data in uncompressed form, since motion data are often huge. For example, a 3-second sample of skeleton motion capture data with 200 frames occupies about 200 KB in a typical motion format. In addition, motion data transmission often requires compactly coded motion [1]. Considering these issues, we believe that motion compression is essential for all these tasks.
In this paper, we propose two compression methods for human motion in 3D space, based on forward and inverse kinematics. Analyzing human motion signals such as walking, dancing, and kicking, we find that these motions contain mainly low-frequency components and that discarding some small wavelet coefficients has little effect on the motion. Thus, in the first method, we propose a compression algorithm for motion data which combines the wavelet transform and forward kinematics (FK) to achieve progressive motion compression. Because quantization introduces distortion, the error is propagated from higher to lower levels in a motion chain hierarchy. To reduce this effect, we compensate the error in a forward kinematics fashion. The method also gives a hierarchical description of the motion by virtue of the wavelet transform, so that progressive encoding/decoding is possible, which is efficient for motion editing and key framing [2–5].
The second method uses a prediction based on inverse kinematics (IK). Although it is not a hierarchical coding, it gives more efficient compression in terms of accuracy. This method exploits two redundancies, the correlation between frames and the correlation between joints. For motion compression, these two should be taken into account simultaneously. However, few techniques specifically address both. In our method, the correlation between the joints is efficiently reduced by inverse kinematics.
In general motion formats, the motion of each joint in a frame is described by three rotation angles with respect to the $x$, $y$, and $z$ axes. In this paper, instead of the three rotation angles, we compress a converted format, which includes two angles of transformation and one angle of orientation for each joint. This approach has two advantages. First, the position can be controlled more precisely by assigning it more bits than the orientation, which is less important than the position in many cases. Second, the two angles of transformation yield a closed-form expression of the Jacobian matrix, which saves computation time and improves the compression efficiency.
The remainder of the paper is organized as follows. After a review of previous work in Section 2, we introduce the transformation from the general motion format to the converted format in Section 3 and give a brief description of the wavelet-based compression algorithm and the use of forward kinematics for data optimization in Section 4. In Section 5, we explain the inverse kinematics-based compression algorithm in detail. The motion compression procedure with a prediction technique is also presented in that section. Illustrative motion examples are given in Section 6 to show the advantage of our approach. Section 7 concludes the paper.
2. Background
Many works in computer graphics
address the problem of huge motion data and present compact expression of the
motion [1–3, 6]. These
previous methods focus on the reduction of the number of motion samples. Liu
et al. introduced a system for analyzing and indexing motion databases [7]. Their
method reduces the size of the database by selecting the principal markers and
constructing simple models to describe groups of similar poses. Furthermore, recent research on motion identification, extraction, analysis, and classification pays more attention to controlling the motion signals in low dimensionality [8–10]. Although the need for compressing large motion databases exists in many fields, these studies are limited to using key poses to represent an action synopsis as a sequence of motions. For a monotonous and periodic sample, the key poses can summarize the action well, but they are not sufficient for complicated actions. In this case, only a compressed representation of the whole motion fulfills the requirements of the users.
While motion compression attempts to represent the whole motion with less data without subsampling, conventional key framing methods subsample the motion and smoothly interpolate between the key frames at rendering time. Key framing realizes a compact representation as well. However, if one applies key framing to compression, some problems arise. First, the selected key frames are essentially uncorrelated, so it is difficult to compress them much further even though the number of frames is smaller. Moreover, when the key frames are regularly sampled, it is difficult to compensate sudden changes by interpolation, which often results in overshoots and undershoots. When irregular key frame sampling is applied (which is much more common in CG), one needs to encode not only the data but also the time indexes, which increases the amount of data. Lim and Thalmann achieve motion
compression by key-posture extraction of motion data using motion curve simplification [6]. Based on the method in [6], Etou et al. proposed using only five joints as the important joints and applied the motion curve simplification method to these selected joints to reduce the dimensionality [2]. Ahmed et al. utilized the wavelet technique to compress each sample [11]. In [12], Arikan introduced a lossy compression algorithm that approximates short clips of motion using Bezier curves and clustered principal component analysis (PCA). To avoid solving the nonlinear problem of the orientations, the method represents the motion with 3 times more storage (3 virtual marker positions for each frame instead of 3 joint angles). This representation clearly increases the space complexity. To reduce the high dimensionality, the method groups similar-looking clips into clusters and applies PCA in each cluster. This processing in turn increases the time complexity. For motion compression, one should take into account (1) the correlation between joints and (2) the correlation in the time domain. In other words, the level of accuracy should vary across frames and joints; for example, end effectors often require higher precision than other general types of joints because of the motion characteristics. Most of these conventional methods [2, 6, 11, 12] focus only on the correlation in the time domain.
To address these two points, we proposed a compression algorithm for motion data which combines the wavelet transform and forward kinematics. To reduce the distortion along the hierarchy chain, we compensate the error in a forward kinematics fashion. The method also gives a hierarchical description of the motion by virtue of the wavelet transform, so that progressive encoding/decoding is possible, which is efficient for motion editing and key framing.
However, according to forward kinematics, the distortion of a joint in a motion chain that comes from quantization warps the position of its child joint. The distortion of this child joint affects its grandchild joint in turn, and so on. The warp may accumulate at the end effector, which is usually treated as the most important joint for characterizing the motion. To reduce the propagation of the warp to the end effectors, we have to minimize the errors between the actual positions and the compressed results in the lower levels of the hierarchy. Forward kinematics alone cannot fully address this problem.
To control the movement of the whole body, it is common to use inverse kinematics [13]. The specified joints, called the end effectors, are assumed to move from their preceding positions to assigned target positions. From the position changes of the end effectors, one can obtain the variations of the motion of the entire body using the inverse kinematics algorithm. Therefore, for most joints, recovering the motion in the decoder requires only a series of small modifications to the corresponding values. The author of [12] recognizes the importance of the end effectors and compresses them separately by the DCT. Unfortunately, that paper does not offer a solution for exploiting the correlation between joints.
Linear prediction has been widely and successfully applied to the compression of time series data. If we can predict every next frame, we only need to save the first frame and the differences between the real values and their predictions. The better the predictions, the more often the same correction values recur and the more bits we can save. The inverse kinematics based approach solves the above problems efficiently [14].
We further present an inverse kinematics based compression method in this paper. In our work, we improve the calculation of the prediction of the next frame by adding the constraint of minimizing the acceleration of the motion instead of minimizing the velocity as described in [14]. The constraint of minimizing the velocity, which leads to the Moore-Penrose pseudoinverse, relates a series of the smallest changes of the rotation angles to a small displacement of the end effector, and is therefore not well suited to motion compression.
3. Preliminary
3.1. General Format
In CG applications, a human figure is modeled by a hierarchical chain in which the connections between two neighboring joints are rigid. For example, the shoulder and elbow joints move, but the distance between them does not change. In this framework, the motion data are expressed by a series of rotations instead of coordinates.
A motion chain, which is hierarchically constructed from linked joints, has one end that is free to move, which is called an end effector. The other end of the chain is fixed and called a terminator (see Figure 1). In Figure 1, a joint defined as the origin of coordinates is called a root. The root may have multiple trees and several end effectors. Kinematics based motion data processing handles the motion chains, such as the trunk, upper limbs, and lower limbs.
The motion capture data format,
such as BVH format which is employed in our experiment, typically
includes the position of the root and orientation of other joints [15]. For the
orientation of the joints, the three Euler rotation angles are adopted rather
than the quaternion.
In the motion chain, to calculate the position of a joint we need to create a rigid transformation matrix from the local translation and rotation information. A rotation matrix $R$ is composed of three Euler rotation matrices $R_x$, $R_y$, and $R_z$ with respect to the $x$, $y$, and $z$ axes [16]. Given a rotation order, concatenating the Euler rotation matrices in that order gives $R$. By applying a matrix $M$, which is a homogeneous matrix representing both the translation and the rotation in one common equation, the position of a joint in global coordinates can be described by
$$\mathbf{v}' = M\mathbf{v}, \quad (1)$$
where $\mathbf{v}' = (x', y', z', 1)^{T}$, $\mathbf{v} = (x, y, z, 1)^{T}$ is the position of this joint in local coordinates,
$$M = \begin{pmatrix} R & T \\ \mathbf{0}^{T} & 1 \end{pmatrix}, \quad (2)$$
and $T$ is a translation vector $(t_x, t_y, t_z)^{T}$.
Once the local transformation of a joint is created, it is concatenated with the transformation of its parent, then its grandparent, and so on. The position of this joint in world coordinates can be obtained by
$$\mathbf{v}_{\mathrm{world}} = M_{\mathrm{root}} \cdots M_{\mathrm{grandparent}}\, M_{\mathrm{parent}}\, M\,\mathbf{v}. \quad (3)$$
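The concatenation of local transformations described above can be illustrated with a minimal Python sketch. The helper names (`local_transform`, `world_position`), the z-x-y rotation order, and the sample chain are assumptions made for illustration only; in practice the rotation order is dictated by the motion file.

```python
import math

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def local_transform(angles, offset):
    """4x4 homogeneous matrix [R T; 0 1]; rotation order z, x, y is assumed."""
    az, ax, ay = angles
    r = matmul(matmul(rot_z(az), rot_x(ax)), rot_y(ay))
    return [r[0] + [offset[0]],
            r[1] + [offset[1]],
            r[2] + [offset[2]],
            [0.0, 0.0, 0.0, 1.0]]

def world_position(chain, local_point=(0.0, 0.0, 0.0)):
    """chain lists (angles, offset) pairs from the root down to the joint."""
    m = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for angles, offset in chain:
        m = matmul(m, local_transform(angles, offset))  # parent first, child last
    v = [[local_point[0]], [local_point[1]], [local_point[2]], [1.0]]
    out = matmul(m, v)
    return (out[0][0], out[1][0], out[2][0])
```

For example, a root rotated by 90 degrees about $z$ with a child offset of one unit along the local $x$ axis places the child on the world $y$ axis.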
3.2. Converted Angle Format
We assume that the motion chains are expressed by rigid transforms, except for the root joint, whose position is given in world coordinates. We first convert the three Euler rotation angles to two rotations and one orientation. The two rotations completely specify the position in 3D space, and the remaining angle represents the orientation. Since in most applications the position is more important than the orientation, assigning more bits to the position during compression gives a degree of freedom to reconstruct the position more precisely than the orientation. This format also realizes scalability of the data. That is, if motions are described only by positions (i.e., orientation is not included), as in data captured by most motion capture equipment, or if a user at the decoder needs only the positions (i.e., orientation is not required), it is possible to transmit only the two position angles out of the three angles, which considerably reduces the amount of data to send. Moreover, as we will explain later, this converted format gives a closed-form expression of the Jacobian matrix in the inverse kinematics algorithm, that is, of the partial derivatives of the joint position with respect to the set of angles, and thus saves computation time.
Suppose the length of a link
connecting a child joint and a parent joint is r, and the two positions are
related by a rotation
:
(4)
is
represented by two angles:
(5)where
and
can substitute for the three Euler angles to be compressed in the IK algorithm in Section 5. Figure 2 shows the direction of these spherical angles in a 3D coordinate system.
Figure 2: Direction of the spherical angles in a 3D coordinate system.
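As a rough illustration of replacing a link rotation by two spherical angles, the following Python sketch converts a parent/child position pair into a link length and two angles, and back. The angle convention (polar angle measured from the $z$ axis, azimuth measured in the $x$-$y$ plane) and the function names are assumptions; the convention actually used in (5) is the one defined by Figure 2.

```python
import math

def to_angles(parent, child):
    """Return (r, theta, phi) for the link from parent to child.

    Assumed convention: theta is the polar angle from +z,
    phi is the azimuth in the x-y plane.
    """
    dx, dy, dz = (c - p for c, p in zip(child, parent))
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    if r == 0.0:
        raise ValueError("parent and child coincide; angles are undefined")
    theta = math.acos(max(-1.0, min(1.0, dz / r)))
    phi = math.atan2(dy, dx)
    return r, theta, phi

def from_angles(parent, r, theta, phi):
    """Rebuild the child position from the parent position and two angles."""
    return (parent[0] + r * math.sin(theta) * math.cos(phi),
            parent[1] + r * math.sin(theta) * math.sin(phi),
            parent[2] + r * math.cos(theta))
```

Because the link length is fixed by the skeleton, only the two angles need to be coded per frame to recover the child position.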
To represent the motion, the positions of the joints in world coordinates can be sufficiently represented by the two angles in (5), and these positions can represent the skeleton motion instead of the orientations of the joints. However, to apply our compression algorithm to more general CG animations, we further need a parameter, the orientation angle, to retrieve the three Euler angles, since a joint may have different orientations at the same position in world coordinates. We can calculate the orientation angle
by a standard matrix of rotation around an
arbitrary axis [17]. Suppose
is the matrix that rotates by angle
about the axis
is
(6)where
and
and
are the components of a unit vector on the axis
though the origin and
. In our case,
holds. Suppose
is the rotation matrix which is built by current
position
and target position
. Combine (4), (5), and (6), then we can easily get
(7)Note that
,
, and
at each frame are data to be encoded. Since the variance of
is usually small, this angle format often
improves compression efficiency in practice.
The
three Euler angles may exhibit discontinuity when the angles are limited in
.
This is easily addressed by the conventional phase unwrapping technique. We
actually use “unwrap” in the MATLAB library. In
the decoder, after reconstructing the rotation matrix
, the orientation angle can be retrieved by limiting three
Euler angles in a quadrant and the absolute value of them between
.
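The phase unwrapping step mentioned above can be sketched in a few lines of Python. This is only an illustration of the idea (removing jumps of a full period between consecutive samples); it is not the MATLAB routine itself, and the function name and default period are assumptions.

```python
import math

def unwrap(angles, period=2.0 * math.pi):
    """Remove jumps of a whole period between consecutive samples."""
    if not angles:
        return []
    out = [angles[0]]
    for a in angles[1:]:
        d = a - out[-1]
        # Bring the increment back into half a period around zero.
        d -= period * round(d / period)
        out.append(out[-1] + d)
    return out
```

A sequence that jumps from 3.0 to -3.0 radians is turned into a continuous one that moves from 3.0 to about 3.283, which is a much smoother signal to transform and quantize.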
4. Wavelet Coding Algorithm
4.1. General Wavelet Coding for Motion
Analyzing human motion signals such as walking, dancing, and kicking, we find that these motions contain mainly low-frequency components and that discarding some small wavelet coefficients does not greatly affect the motion. The author of [11] reports the same observation. However, using a constant quantization stepsize for all motion signals may introduce visible errors, such that the motion appears to jitter or otherwise looks unnatural. Thus, variable quantization stepsizes are required to encode the motion signals of different joints.
As in image compression and other computer graphics applications [18, 19], the general wavelet-based compression steps are as follows. In the encoder, we decompose a signal into a sequence of wavelet coefficients, quantize the coefficients with multiple stepsizes to obtain a sequence of quantized values, and finally apply entropy coding to compress the quantized values into a coded sequence. In the decoder, the inverse operations are performed in reverse order. Because the motion data are quantized adaptively, the decoder receives compressed data without visible artifacts.
In our compression algorithm, we perform the 9/7 tap wavelet transform, and our aim is to compress the curves of all the joints, each represented by a series of rotations over all frames.
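The transform and quantization stages of such a pipeline can be sketched as follows. For brevity, this sketch substitutes an orthonormal Haar transform for the 9/7 tap filter used in the paper and omits the entropy coder; the function names and the single uniform stepsize are assumptions made for illustration.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(signal):
    """Full orthonormal Haar decomposition; length must be a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        avg = [(coeffs[2 * i] + coeffs[2 * i + 1]) / SQRT2 for i in range(half)]
        det = [(coeffs[2 * i] - coeffs[2 * i + 1]) / SQRT2 for i in range(half)]
        coeffs[:n] = avg + det  # low-pass part first, details after it
        n = half
    return coeffs

def haar_inverse(coeffs):
    """Inverse of haar_forward."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged += [(a + d) / SQRT2, (a - d) / SQRT2]
        out[:2 * n] = merged
        n *= 2
    return out

def quantize(coeffs, step):
    """Uniform scalar quantization to integer indices."""
    return [round(c / step) for c in coeffs]

def dequantize(indices, step):
    return [q * step for q in indices]
```

Since the transform is orthonormal, the reconstruction error of the signal has the same Euclidean norm as the quantization error of the coefficients, which is what makes per-band or per-joint stepsizes a direct handle on the distortion.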
4.2. ROI Coding of Motion
In motion compression, to keep some special characteristics, we have to consider two constraints that are also specified in motion editing [4, 5].
The first constraint describes an articulated figure [13]: for example, the elbow and knee joints should not bend backward, that is, the rotation angles of these types of joints should stay within certain ranges. The second constraint guarantees that the end effector is placed at a particular position in some frames. For example, in a motion in which a human puts a box on a desk, the box should be placed precisely on the desk in the last few frames. In this paper, we call these frames “constraint frames.” The constraint frames are derived automatically from the interaction between the figure and the environment or are specified by the user. In the encoding steps, it is useful that the user can compress the joints adaptively. For example, a smaller quantization stepsize is used for important joints. The important joints, which must be located more precisely than the others, are called “constraint joints” in this paper. This can be done by region-of-interest (ROI) coding.
To make this ROI coding possible, we apply the max shift method [18]. In image compression, the max shift method is used to define an ROI that is encoded and transmitted with better quality than the rest of the image and decoded before any background information [20]. Before quantization, the constraint frames are scaled up. Thus, these frames are effectively quantized with a different stepsize, which realizes different compression ratios according to the motion behavior. In motion reconstruction, the signals are scaled down after dequantization. The decoder can distinguish these scaled signals from general signals. More accurate frames are obtained, and hence the motion features are preserved.
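The effect of scaling the constraint frames before quantization can be sketched as below. This is a simplified variant: the set of constraint frames is passed to the decoder explicitly as `roi_mask`, whereas the max shift method lets the decoder infer it from the magnitude of the received values. The function names and the power-of-two scale are assumptions.

```python
def encode_roi(values, roi_mask, step, shift):
    """Quantize values; constraint (ROI) samples are scaled up by 2**shift first."""
    scale = 2 ** shift
    return [round((v * scale if in_roi else v) / step)
            for v, in_roi in zip(values, roi_mask)]

def decode_roi(indices, roi_mask, step, shift):
    """Dequantize, then scale the ROI samples back down."""
    scale = 2 ** shift
    return [(q * step / scale if in_roi else q * step)
            for q, in_roi in zip(indices, roi_mask)]
```

With the same nominal stepsize, a scaled sample is reconstructed with an error of at most `step / 2**(shift + 1)`, so the constraint frames are located more precisely than the remaining frames.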
4.3. Forward Kinematics for Data Optimization
The ROI-based approach preserves many features of the motion. However, the lossy compression of rotation angles produces quantization error. In Figure 3(a), the position of
is warped to
and in turn
is warped to
.
Finally, the warp is accumulated to the end effector.
To reduce the
propagation of the warp, we have to minimize the error of position between
and
.
Utilizing the quantized position of the root joint and a correct position of
the child joint, we can obtain an optimal position of the child joint instead
of its warped one (Figure 3(b)).
Since the length between
and
is fixed, suppose this link is
, rotate the link
with
respect to
to meet the line
, we get an intersection
.
In triangle
of Figure 3(b),
,
we can easily deduce
.
That means
is closer to
than
, that is,
,
where
denotes the error of distance between
and
,
while
denotes the error of distance between
and
.
is clearly the optimal position. Calculate
corresponding to
.
indicates the rotation of
about
and can be calculated by
and
.
is original position of
in its parent coordinate with origin
and
is optimal position of
in its parent coordinate with origin
.
Then, in each motion chain, the optimal rotation angles of each child joint are obtained sequentially from the warped position of its parent and are encoded instead of the original data [21].
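The geometric idea of this compensation can be sketched in Python: given the decoded (warped) position of the parent and the original position of the child, the compensated child position is the point at link-length distance from the decoded parent that lies on the line toward the original child. The function and variable names are hypothetical and are not taken from Figure 3.

```python
import math

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def compensate_child(parent_decoded, child_original, link_length):
    """Point on the sphere of radius link_length around the decoded parent
    that is closest to the original child position."""
    d = [c - p for c, p in zip(child_original, parent_decoded)]
    norm = math.sqrt(sum(x * x for x in d))
    if norm == 0.0:
        raise ValueError("direction to the original child is undefined")
    return tuple(p + link_length * x / norm
                 for p, x in zip(parent_decoded, d))
```

Any other point at the same distance from the decoded parent, including the naively warped child, is at least as far from the original child, which is why the rotation toward this point is the one worth encoding.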
A motion data
compression algorithm is shown in Figure 4.
Figure 4: Adaptive algorithm for optimization-based motion data compression.
5. Inverse Kinematics Based Algorithm
5.1. Inverse Kinematics
According to the
forward kinematics, in a motion chain, the transformation of a parent joint
causes a change of its child joint position. The change of this child joint in
turn affects its grand child joint, and so on. Finally the changes are accumulated
to the end effector. Motion is inherited down the hierarchy from the parents to
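the children (Figure 5(a)). For simplicity, we discuss the two-dimensional case. In frame $t$, the position $\mathbf{p}_t$ of the end effector in two dimensions can be determined by
$$\mathbf{p}_t = f(\boldsymbol{\theta}_t), \quad (8)$$
where $\boldsymbol{\theta}_t$ is composed of all angles $\theta_{i,t}$ for each joint $i$ in frame $t$.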
In the inverse kinematics, motion is inherited up the hierarchy, from
the extremities to the root (Figure 5(b)). The role of the IK algorithm is to
automatically work out how each joint in a chain should be transformed so that
the end effector can reach the goal. To find the
set of the changes of the angles which satisfy a given displacement of the positions
of the end effectors, we need to solve
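$$\boldsymbol{\theta}_t = f^{-1}(\mathbf{p}_t). \quad (9)$$
However, this inverse is, in most cases, difficult to solve. Instead, a Jacobian-based method is utilized [22–24]. Equation (8) is written in differential form:
$$d\mathbf{p} = J(\boldsymbol{\theta})\, d\boldsymbol{\theta}, \quad (10)$$
where $J$ is the Jacobian matrix of the displacement $d\mathbf{p}$ of the position of the end effector with respect to the changes $d\boldsymbol{\theta}$ of the joint angles, and
$$J(\boldsymbol{\theta}) = \frac{\partial f}{\partial \boldsymbol{\theta}}. \quad (11)$$
To get the desired $\Delta\boldsymbol{\theta}$, one has to solve
$$J\,\Delta\boldsymbol{\theta} = \Delta\mathbf{p}. \quad (12)$$
Since natural human body motion is typically represented with over 30 degrees of freedom (DOF), which is larger than the rank of $J$, the set of equations (12) is underdetermined. To solve it, some constraints are needed. Moreover, for motion compression, the constraint used to adjust the joint angles must be chosen to match the motion characteristics. Traditionally, one minimizes the norm of the velocity of the joint angles, $\|\Delta\boldsymbol{\theta}\|$, under the constraint $J\,\Delta\boldsymbol{\theta} = \Delta\mathbf{p}$.
Then, the solution of (12) is written as
$$\Delta\boldsymbol{\theta} = J^{+}\Delta\mathbf{p}, \quad (13)$$
where
$$J^{+} = J^{T}\left(JJ^{T}\right)^{-1}, \quad (14)$$
which is called the Moore-Penrose pseudoinverse. Equation (13)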
linearly relates the displacement of the end effectors to the change of the joint angles. However, it maps a small displacement of the end effectors to the smallest possible changes of the rotation angles, which is not desirable for motion compression. It is possible to consider a constraint other than this one. In this paper, we employ the
constraint of minimizing the norm of the differential acceleration
and get the solution in our prediction
algorithm:
(15) where
,
is the parameter that determines
the weighting between the solution and the error. It is clear that this method
leads to satisfying the natural motion character better than traditional
minimizing the velocities of the joint angles. When
, which is a displacement of end
effector from previous position to current position, is given, the change of
each joint can be determined by (13).
Our IK algorithm consists of the following steps.
(1)
Calculate the increment
of the position of the end effector
from the
frame
to the frame
:
(16)
(2)
Calculate Jacobian matrix 
using the angles of last frame
, where
and
are the two angles in (5). Since
(17)where,
(18) by (5) and
is the length of link
, then by (10),
and
(19) where
(20)
(3)
Get the pseudoinverse of
by (14).
(4)
In each motion chain, obtain the changes in frame
for
by (15).
is the change of the angles in frame
and is given by
. We obtain the good results with
.
In step (2), if the
general format, which is composed of rotation angles about
axes, is applied to calculate the Jacobian matrix, each element of the
Jacobian matrix almost involves
all the related trigonometric functions of the corresponding angles in the motion chain. While in our converted format, since
and
are
calculated by the product of all the related rotation matrices from current
joint to root joint, using converted
format, the Jacobian
is given in a closed form. Although the two-angle format is sufficient to represent the positions of the joints in world coordinates, the constraints on the joints are generally controlled by the orientation of the joints. To apply our compression algorithm to more general CG animations, we further use the orientation angle, which is calculated by the standard matrix of rotation around an arbitrary axis introduced in Section 3.2. Therefore, the initial orientation given by the three Euler angles of the joint is retrieved exactly.
Finally, the algorithm is stated in Algorithm 1.
Algorithm 1: IK Algorithm.
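To make the Jacobian-based update concrete, here is a minimal Python sketch for a planar chain, following the two-dimensional simplification used in the text. It implements only the traditional minimum-norm solution of (13) and (14), not the acceleration-constrained solution of (15); the relative-angle parameterization and all function names are assumptions.

```python
import math

def end_effector(angles, lengths):
    """Position of the chain tip; each angle is relative to the previous link."""
    x = y = total = 0.0
    for a, length in zip(angles, lengths):
        total += a
        x += length * math.cos(total)
        y += length * math.sin(total)
    return x, y

def jacobian(angles, lengths):
    """Rows (d x / d angle_i) and (d y / d angle_i) of the 2 x n Jacobian."""
    n = len(angles)
    cum, total = [], 0.0
    for a in angles:
        total += a
        cum.append(total)
    jx = [-sum(lengths[k] * math.sin(cum[k]) for k in range(i, n)) for i in range(n)]
    jy = [sum(lengths[k] * math.cos(cum[k]) for k in range(i, n)) for i in range(n)]
    return jx, jy

def pinv_step(angles, lengths, dp):
    """Minimum-norm angle change J^T (J J^T)^-1 dp for a tip displacement dp."""
    jx, jy = jacobian(angles, lengths)
    a = sum(v * v for v in jx)
    b = sum(u * v for u, v in zip(jx, jy))
    d = sum(v * v for v in jy)
    det = a * d - b * b
    if abs(det) < 1e-12:
        raise ValueError("Jacobian is rank deficient in this pose")
    # w = (J J^T)^-1 dp, using the closed-form inverse of a 2x2 matrix.
    wx = (d * dp[0] - b * dp[1]) / det
    wy = (-b * dp[0] + a * dp[1]) / det
    return [jx[i] * wx + jy[i] * wy for i in range(len(angles))]
```

Both encoder and decoder can evaluate such a step from the previously decoded angles and the transmitted displacement of the end effector, so only the remaining prediction error has to be coded.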
5.2. Compression with Prediction Technique
Considering the motion characteristics, we have to assign more bits to some special joints, such as the end effectors, than to other general types of joints. An adaptive quantization approach preserves the features of the motion well. To achieve this, hierarchical stepsizes for different joints can be implemented in the quantization step.
Meanwhile, since the amount of motion data is considerable, a high compression rate is needed. Prediction based techniques have been widely and successfully applied to the compression of data series. If we can predict every next frame, we only have to save the first frame and the differences between the real values and the predicted results. The better the predictions, the more often the same correction values recur and the more bits we can save.
These two points distinguish our IK compression approach from previous works. In fact, no conventional algorithm both handles the constraints on joints, that is, precise reconstruction of the end effectors, and achieves an efficient compression rate at the same time.
To predict
every next frame, an intuitive method is to utilize the last frame directly.
Taking the difference
between
the current frame
and last frames
may
be one of the simplest methods. By
this method, we can decode current value
in decoder using the equation
(21)
With this method, the compression rate can be improved only by changing the stepsizes. We therefore have to explore a better prediction method whose output is closer to the actual data of the current frame.
Inverse kinematics provides exactly such a solution. In our compression method, the encoder calculates the differences of the positions of the end effectors between two sequential frames. More bits are assigned to them in quantization to keep the precision of the end effectors. These differences are then sent to the decoder. Next, in both the encoder and the decoder, using these differences and the angles of the previous frame, we can approximately calculate the change of the rotation angle of each joint by the IK algorithm. Suppose the
difference between the value predicted by IK and the value in the last
frame is
and
(22)
can be calculated
by (14) as the
. We
need transfer
to the decoder which may reconstruct current
value
by
(23)
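The encoder/decoder symmetry of such predictive coding can be sketched in Python. The predictor is passed in as a function so that the last-frame prediction of (21) is the default and a model-based prediction could be substituted; the function names, the scalar signal, and the initial state of zero are assumptions made for illustration.

```python
def encode(frames, step, predict=lambda prev: prev):
    """Quantize prediction errors; predictions use decoded values only."""
    indices, recon = [], 0.0
    for x in frames:
        guess = predict(recon)
        q = round((x - guess) / step)
        indices.append(q)
        recon = guess + q * step  # the value the decoder will also hold
    return indices

def decode(indices, step, predict=lambda prev: prev):
    """Mirror of encode: same predictor, same state update."""
    out, recon = [], 0.0
    for q in indices:
        recon = predict(recon) + q * step
        out.append(recon)
    return out
```

Because the encoder predicts from reconstructed rather than original values, quantization errors do not accumulate over time, and each decoded sample stays within half a stepsize of the original.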
We also adopt a prediction for the orientation angle of each joint. This prediction is done simply by subtracting the last frame from the current frame. The IK based compression procedure with the prediction technique is shown in Figure 6. In the encoder, the data of the rotation angles of the end effectors and of the other general joints are processed separately.
Figure 6: IK algorithm-based compression flow chart.
Firstly, for the general joints, the change of the rotation angle in each
frame
can
be predicted by the compressed angles
of previous frames and the change of the position
of the end effectors. Here we use the
quantized version
instead of
and
instead of
to get the same result in the decoder. After calculating the
prediction error
,
we adopt general quantization and entropy coding to the prediction error.
Meanwhile, we adopt the simple prediction method by every last frame for the
orientation angle
of each joint to get
.
For the end effectors, after the position calculation, we also adopt
the simple prediction method by every last frame. The quantization and entropy
coding are applied into the simple predicted sequence of the end effectors.
The bits of the end effectors and the other general joints are sent to
the decoder, respectively.
For the bits of the other general joints, entropy decoding and dequantization are performed as the inverse of the processing in the encoder. Using
the IK algorithm, we predict the rotation angles of a current frame by the last
decoded frame
,
then add the compressed prediction error
and the change of the position of the end effectors
using
(24) This process is the
same as the prediction in the encoder.
For the end effectors, after the entropy decoding and dequantization of
the bits, we retrieve the position of the end effector of each frame. Finally,
the angle conversion of the end effectors is added to convert the position of
the end effectors to the rotation angles following the original data format.
6. Experimental Results
In our experiment, we adopted one of the best-known formats of human motion, the BVH file format [15, 25]. The BVH file has two parts: a header section, which describes any number of skeleton hierarchies and the initial pose of the skeleton by the translational offsets of child segments from their parents, and a data section, which contains the position of the root joint and the rotation information of all joints in each frame. In the BVH format, the motion is described by a series of three orientation matrices with respect to the $x$, $y$, and $z$ axes. We convert them by (5) into the two angles that represent the position and into the orientation angle, and then compress them using the prediction method.
To evaluate
the efficiency, we calculate the error of the position of joint
of the compressed motion comparing with the original one by
(25)and the error of all joints in all frames by
(26)where
and
are the 3D positions of joint
in
frame
of original and compressed motions,
respectively in world coordinate. And
is the number of frames,
while
is the number of
total joints.
We also calculate the error of three orientation angles of joint
of the compressed motion
comparing with the original one by
(27) and the error of all joints in
all frames by:
(28) where
and
are orientation angles of joint
in frame
of original and compressed
motions, respectively, and
is the number of frames, while
is the total number of joints. Note that the three angles of our method are the two position angles of (5) and the orientation angle, while the original format consists of three angles with respect to the $x$, $y$, and $z$ axes.
We compare the proposed FK-based compression (FKBC) and IK-based prediction (IKBP) methods with three other methods: the simple prediction by last frame (SPBLF) based on (21) in Section 5, the interpolation between the key frames (IBKF), and the motion compression technique using the discrete wavelet transform (DWT) that appears in [11]. (The wavelet transform and the interpolations are implemented with MATLAB, whose implementations we consider reliable.) In the IBKF-based compression method, we apply Piecewise Cubic Hermite (PCH) interpolation to the key frames obtained by the curve simplification in [6], while in the DWT method we adopt the 9/7 wavelet transform to compress the motion data. We give the RD curves of four motions in Figure 7. The $x$-axis represents the sum of the entropies of the three angles obtained by each of the five methods.
Figure 7: RD curve of the position of the walk, ballet, throw, and kick motions; IKBP—IK-based prediction method, FKBC—FK-based compression, SPBLF—simple prediction by last frame, IBKF—interpolation between the key frames, DWT—discrete wavelet transform.
Entropy is widely used to estimate the bit rate of a coded stream, since it gives the theoretical minimum bit rate. The entropy of a random variable $X$, which is defined as $H(X) = -\sum_{x \in A} p(x) \log_2 p(x)$, can be interpreted as the average amount of information conveyed by the outcome of the random variable $X$. Here $A$ is the alphabet and $p(x)$ is the probability of the outcome $x$. In our experiment, since we use the same arithmetic coding in all methods, the entropy values reflect the compression ratio of each method.
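A minimal sketch of how such an entropy estimate can be computed from quantized symbols is given below. The quantization step, the sample angle values, and the last-frame difference used to form the residuals are illustrative assumptions, and `entropy_bits` and `quantize` are hypothetical helper names rather than part of our implementation.

```python
import math
from collections import Counter


def entropy_bits(symbols):
    """Empirical Shannon entropy in bits per symbol."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def quantize(values, step):
    """Uniform scalar quantization to integer indices (assumed quantizer)."""
    return [round(v / step) for v in values]


# Toy angle track that changes smoothly from frame to frame (made-up values).
angles = [10.0, 10.5, 11.0, 11.5, 12.0, 12.5, 13.0, 13.5, 14.0]
step = 0.5

raw = quantize(angles, step)
# Residuals with respect to the previous frame; the first frame is kept as is.
residuals = [raw[0]] + [b - a for a, b in zip(raw, raw[1:])]

print(entropy_bits(raw))        # every symbol is distinct, so log2(9) bits
print(entropy_bits(residuals))  # mostly one repeated value, far fewer bits
```

In this toy case the residual stream is dominated by a single symbol, so its entropy is much lower than that of the raw quantized angles. The same reasoning explains why a better predictor lowers the estimated bit rate.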
Figure 7 shows the results of the five methods, including the proposed ones. The variance of the joint position error obtained by the IKBP method is smaller than that obtained by the FKBC, SPBLF, and IBKF methods. This demonstrates that the proposed IKBP algorithm yields more common correction values and saves more bits than the general algorithms.
For the curves generated by the IBKF method, we vary the number of key frames, which are obtained by the method introduced in [2], to get different compressed forms of the motion data. Since the position error of the walking motion by IBKF exceeds 100 when the entropy is smaller than 1.0, and the position error of the ballet motion by IBKF exceeds 50 when the entropy is smaller than 1.5, we clip these large error values so that the curves can be compared in the range from 0 to 35. The same processing is applied to the other two motions.
At low bit rates, FKBC is inferior to SPBLF, but it performs better at low compression ratios, that is, at higher bit rates. Note that FKBC is a hierarchical coding scheme in which progressive decoding is possible by virtue of the wavelet transform, while IKBP and SPBLF do not have this property.
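The progressive property of a wavelet representation can be illustrated with the short Python sketch below. It is not the FKBC coder: for brevity it swaps in the Haar wavelet for the 9/7 wavelet, works on a single made-up angle track, and omits quantization. The names `haar_forward` and `haar_inverse` are hypothetical.

```python
def haar_forward(signal):
    """Full Haar decomposition of a signal whose length is a power of two.

    Returns [approximation, coarsest details, ..., finest details].
    """
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Insert at the front so that coarser bands come first.
        details.insert(0, [(a - b) / 2 for a, b in pairs])
        approx = [(a + b) / 2 for a, b in pairs]
    return [approx] + details


def haar_inverse(bands, levels_used):
    """Reconstruct from the first `levels_used` detail bands only.

    Detail bands that have not been received yet are treated as zero,
    which mimics a decoder that stops early in a progressive stream.
    """
    approx = list(bands[0])
    for i, detail in enumerate(bands[1:]):
        if i >= levels_used:
            detail = [0.0] * len(detail)
        finer = []
        for a, d in zip(approx, detail):
            finer.extend([a + d, a - d])
        approx = finer
    return approx


# Made-up angle track with 8 frames.
track = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]
bands = haar_forward(track)

for levels in range(len(bands)):
    decoded = haar_inverse(bands, levels)
    max_err = max(abs(a - b) for a, b in zip(track, decoded))
    print(levels, decoded, max_err)
```

Each additional detail band refines the decoded track, so a decoder can stop after any band and still obtain a usable, coarser motion. A purely predictive scheme has no such intermediate reconstructions.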
Figure 8 shows the RD curves for the error of the three orientation angles. The $x$-axis represents the sum of the entropies of the three angles. For the curve generated by the IBKF method, since the rotation angle error of the walking motion by IBKF exceeds 100 when the entropy is smaller than 3, we again clip the large IBKF errors so that the curves can be compared in the error range from 0 to 9.
Figure 8: RD curve of the rotation angle of the walk, ballet, throw, and kick motions;
IKBP—IK-based prediction method, FKBC—FK-based compression, SPBLF—simple prediction by last frame, IBKF—interpolation between the key frames, DWT—discrete wavelet transform.
We also present the position and orientation errors of each joint in a limb chain at different entropy values, for the walk motion in Tables 1 and 2 and for the ballet motion in Tables 3 and 4, respectively, to demonstrate the advantage of our approach. Since different motions have different skeleton hierarchies, the left shoulder chain includes 4 joints in the walk motion (left shoulder, left humerus, left radius, and left hand), while the same chain has 3 joints in the ballet motion (left shoulder, left elbow, and left wrist). When the entropy is smaller than 1, the position error of some joints by IBKF is larger than 100 and the recovered motion is totally distorted. At the same compression ratio, the proposed IKBP algorithm produces smaller position errors in these joints than the general algorithms.
Table 1: Position error of each joint in a limb chain for the walking motion. The methods are listed at nearly equal entropy, which stands for the compression ratio, together with the position errors of several joints, which indicate the quality of the compressed motion. In the walking motion, the left shoulder chain includes 4 joints (left shoulder, left humerus, left radius, and left hand).
Table 2: Angle error of each joint in a limb chain for the walking motion. The methods are listed at nearly equal entropy, which stands for the compression ratio, together with the orientation angle errors of several joints, which indicate the quality of the compressed motion. In the walking motion, the left shoulder chain includes 4 joints (left shoulder, left humerus, left radius, and left hand).
Table 3: Position error of each joint in a limb chain for the ballet motion. The methods are listed at nearly equal entropy, which stands for the compression ratio, together with the position errors of the joints, which indicate the quality of the compressed motion. In the ballet motion, the left shoulder chain includes 3 joints (left shoulder, left elbow, and left wrist).
Table 4: Angle error of each joint in a limb chain for the ballet motion. The methods are listed at nearly equal entropy, which stands for the compression ratio, together with the orientation angle errors of the joints, which indicate the quality of the compressed motion. In the ballet motion, the left shoulder chain includes 3 joints (left shoulder, left elbow, and left wrist).
Figure 9 shows series of samples of the original motions and of the corresponding motions decoded by the IKBP method. Each series covers one period of the motion. In the walk motion, when the entropy is larger than 1.1206, it is difficult to discern any difference between the original motion and the decoded one. We also show the results of the other three motions, “ballet,” “throw,” and “kick.” Table 5 describes these four motions.
Table 5: Four original motion data.
Figure 9: Series of samplings of the motions.
Finally, we compare the compression and decompression times in Table 6. The experiments were run on a PC with a Pentium(R) 4 3.00 GHz CPU and 0.99 GB of memory. We record the per-frame encoding and decoding times for the walk motion, which has 580 frames. Although our proposed method is about 0.3 milliseconds slower than the SPBLF method and about 0.18 milliseconds slower than the IBKF method in compressing a frame, considering the compression ratio and the quality of the recovered motion, our algorithm is much better than IBKF and better than the other methods.
Table 6: Compression and decompression times for one frame.
7. Conclusion
The compression of captured motion data is an important issue in motion storage, retrieval, editing, and transmission. In motion compression, some specific types of joints such as end effectors often require higher precision than other general types of joints in, for example, CG animation and robot manipulation. No conventional algorithm both handles such joint-specific constraints and achieves an efficient compression rate. Using forward kinematics, we implemented a constraint-based compression algorithm for motion data with special characteristics, which can be indicated by the motion behavior or specified by the user. However, forward kinematics alone cannot fully solve the problem that the quantization distortion of a parent joint in turn affects its child joint and accumulates at the end effector.
Inverse kinematics is a common approach to controlling the movement of the whole body. The end effector can be moved from its preceding position to a specified target position, and from the changes in the positions of the end effectors we can obtain the variations of the motion of the entire body. Inverse kinematics therefore supports prediction-based compression. To predict the motion in the decoder, we save only the first frame and a series of small differences between the real values and the predictions obtained from the variations of the end effector positions. In this way, it can address the problems of specific joints and achieve efficient compression of the motion data.
We applied the FK-based compression and the IK-based compression in our reduced two-angle format. Experimental results on example motions demonstrate the advantage of our methods over conventional methods.
Acknowledgments
This work was partly supported by a Grant-in-Aid for Young Scientists (no. 14750305) of the Japan Society for the Promotion of Science, a fund from MEXT via the Kitakyushu innovative cluster project, and the Kitakyushu IT Open Laboratory of the National Institute of Information and Communications Technology (NiCT).
References
- A. Bruderlin and L. Williams, “Motion signal processing,” in Proceedings of the 22nd Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '95), pp. 97–104, Los Angeles, Calif, USA, August 1995.
- H. Etou, Y. Okada, and K. Niijima, “Feature preserving motion compression based on hierarchical curve simplification,” in Proceedings of IEEE International Conference on Multimedia and Expo (ICME '04), vol. 2, pp. 1435–1438, Taipei, Taiwan, June 2004.
- S. Li, M. Okuda, and S.-I. Takahashi, “Embedded key-frame extraction for CG animation by frame decimation,” in Proceedings of IEEE International Conference on Multimedia and Expo (ICME '05), pp. 1404–1407, Amsterdam, The Netherlands, July 2005.
- J. Lee and S. Y. Shin, “A hierarchical approach to interactive motion editing for human-like figures,” in Proceedings of the 26th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '99), pp. 39–48, Los Angeles, Calif, USA, August 1999.
- M. Gleicher, “Motion editing with spacetime constraints,” in Proceedings of the Symposium on Interactive 3D Graphics, pp. 139–148, Providence, RI, USA, April 1997.
- I. S. Lim and D. Thalmann, “Key-posture extraction out of human motion data by curve simplification,” in Proceedings of the 23rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC '01), vol. 2, pp. 1167–1169, Istanbul, Turkey, October 2001.
- G. Liu, J. Zhang, W. Wang, and L. McMillan, “A system for analyzing and indexing human-motion databases,” in Proceedings of the ACM International Conference on Management of Data (SIGMOD '05), pp. 924–926, Baltimore, Md, USA, June 2005.
- J. Assa, Y. Caspi, and D. Cohen-Or, “Action synopsis: pose selection and illustration,” ACM Transactions on Graphics, vol. 24, no. 3, pp. 667–676, 2005.
- A. Safonova, J. K. Hodgins, and N. S. Pollard, “Synthesizing physically realistic human motion in low-dimensional, behavior-specific spaces,” ACM Transactions on Graphics, vol. 23, no. 3, pp. 514–521, 2004.
- J. Chai and J. K. Hodgins, “Performance animation from low-dimensional control signals,” in Proceedings of International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '05), pp. 686–696, Los Angeles, Calif, USA, July-August 2005.
- A. Ahmed, A. Hilton, and F. Mokhtarian, “Adaptive compression of human animation data,” in Proceedings of the Annual Conference of the European Association for Computer Graphics (Eurographics '02), Saarbrücken, Germany, September 2002.
- O. Arikan, “Compression of motion capture databases,” in Proceedings of the 33rd International Conference and Exhibition on Computer Graphics and Interactive Techniques (SIGGRAPH '06), pp. 890–897, Boston, Mass, USA, July-August 2006.
- M. Naganand and F. Stuart, “Specialised constraints for an inverse kinematics animation system applied to articulated figures,” in Proceedings of the Annual Conference of the European Association for Computer Graphics (Eurographics '98), pp. 215–223, Leeds, UK, 1998.
- S. Li, M. Okuda, and S. Takahashi, “Kinematics based motion compression for human figure animation,” in Proceedings of the 30th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), vol. 2, pp. 1077–1080, Philadelphia, Pa, USA, March 2005.
- J. Lander, “Working with motion capture formats,” January 1998, Darwin 3D, LLC.
- V. M. Zatsiorsky, Kinematics of Human Motion, Human Kinetics, Champaign, Ill, USA, 1998.
- A. S. Glassner, Graphics Gems, Academic Press, Boston, Mass, USA, 1990.
- C. Christopoulos, J. Askelof, and M. Larsson, “Efficient region of interest coding techniques in the upcoming JPEG2000 still image coding standard,” in Proceedings of IEEE International Conference on Image Processing (ICIP '00), vol. 2, pp. 41–44, Vancouver, BC, Canada, September 2000.
- A. Secker and D. Taubman, “Motion-compensated highly scalable video compression using an adaptive 3D wavelet transform based on lifting,” in Proceedings of International Conference on Image Processing (ICIP '01), vol. 2, pp. 1029–1032, Thessaloniki, Greece, October 2001.
- I. Ueno and W. A. Pearlman, “Region-of-interest coding in volumetric images with shape-adaptive wavelet transform,” in Image and Video Communications and Processing 2003, vol. 5022 of Proceedings of SPIE, pp. 1048–1055, Santa Clara, Calif, USA, January 2003.
- S. Li, M. Okuda, and S. Takahashi, “Hierarchical human motion compression with constraints on frames,” in Proceedings of the 47th Midwest Symposium on Circuits and Systems (MWSCAS '04), vol. 1, pp. 253–256, Hiroshima, Japan, July 2004.
- S. R. Buss, “Introduction to inverse kinematics with Jacobian transpose, pseudoinverse and damped least squares methods,” April 2004.
- C. H. Huang and C. A. Klein, “Review of pseudoinverse control for use with kinematically redundant manipulators,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 13, no. 2, pp. 245–250, 1983.
- M. Girard and A. A. Maciejewski, “Computational modeling for the computer animation of legged figures,” ACM SIGGRAPH Computer Graphics, vol. 19, no. 3, pp. 263–270, 1985.
- M. Meredith and S. Maddock, “Motion capture file formats explained,” Tech. Rep. CS-01-11, Department of Computer Science, University of Sheffield, Sheffield, UK, 2001.