Abstract

Dynamic texture classification has attracted growing attention, and characterizing a dynamic texture is central to the classification problem. This paper proposes a dynamic texture descriptor based on the dual-tree complex wavelet transform and the Gumbel distribution. The method extracts the median of the coefficient magnitudes in each nonoverlapping block of a detail subband and models these medians with the Gumbel distribution. Classification is performed by comparing the similarity between the estimated distributions of all detail subbands. Experimental results on a benchmark dynamic texture database demonstrate better histogram fitting and promising classification performance of the proposed descriptor compared with existing methods.

1. Introduction

A dynamic texture is a spatially repetitive, time-varying visual pattern that forms an image sequence and exhibits certain spatiotemporal stationarity properties [1]. Examples include fire, smoke, sea waves, trees waving in the wind, a moving escalator, and a walking crowd. Dynamic texture classification aims to identify the types of regions or objects using their dynamic texture properties. Significant work has been done in recent years; nevertheless, it remains an interesting and challenging research field.

Dynamic texture exhibits two basic properties: motion and appearance. To address the dynamic texture classification problem, a method should describe both properties and obtain a dynamic texture feature. The existing approaches can be roughly classified into two categories: methods based on motion analysis [1] and methods combining motion and appearance properties [2-4]. Methods of the first kind estimate a motion field, such as an optical flow field or block-based motion vectors, and then extract features from the estimated motion field for dynamic texture classification. However, because the grey level around a location changes rapidly in some dynamic textures, it is difficult to estimate the true motion with traditional methods, which degrades the classification performance. The model-based classification approach is the most widely used method in the second category. A state-of-the-art pixel-level method is the autoregressive moving average model-based dynamic texture model [3], which has also been extended by Chan [4].

The wavelet transform provides time-frequency localization and multiresolution analysis of a signal and has been successfully applied to texture analysis. Wavelet-domain statistical methods, which model the wavelet detail subbands with probability distributions, have shown promising performance in texture classification [5]. In [6], the authors extended the wavelet transform-based image texture processing method and used the Weibull distribution to model each wavelet subband for dynamic texture classification.

Although the traditional discrete wavelet transform (DWT) possesses excellent properties, it suffers from shift dependency, a lack of directional selectivity, and the absence of phase information. The dual-tree complex wavelet transform (DT-CWT) [7] overcomes these disadvantages of the DWT at the cost of very limited redundancy. Many researchers have developed statistical models of the complex wavelet coefficients for image texture classification [8, 9]. Motivated by these methods, we performed experiments that model the magnitudes of the DT-CWT coefficients with known probability distributions, such as the generalized Gaussian distribution and the gamma distribution. However, the results show poor fitting at the peak or near the tail of the histogram. Therefore, this paper proposes a possible solution that selects the median values of the nonoverlapping blocks of a detail subband and models them with the Gumbel distribution.

The Gumbel distribution has been applied in ocean, structural, and hydraulics engineering, meteorology, and the study of material strength, traffic, corrosion, and pollution. To the best of our knowledge, however, this model has not been used for dynamic texture analysis. We aim to combine this statistical model with the properties of wavelet coefficients to characterize the dynamic texture. The similarity between two dynamic textures is measured with a distance based on the Kullback-Leibler divergence. Comparative experimental results show that the approach outperforms the existing methods considered.

This paper is organized as follows. The next section presents a brief overview of the dual-tree complex wavelet transform. The dynamic texture feature extraction using the DT-CWT and the Gumbel distribution is introduced in the third section. The experimental results and the conclusion are presented in the fourth and fifth sections, respectively.

2. Dual-Tree Complex Wavelet Transform

The wavelet transform is a powerful tool for multiscale analysis and has been successfully used to describe image texture. Hence, a natural idea is to characterize the dynamic texture in the spatiotemporal wavelet domain. Owing to the good properties of the dual-tree complex wavelet transform, this paper extracts the dynamic texture descriptor by using the spatiotemporal DT-CWT.

The dual-tree complex wavelet transform was originally proposed in 1998 and further developed by Selesnick et al. [7]; it is a relatively recent enhancement of the DWT. Its approximate shift invariance, good directional selectivity, and availability of phase information, compared with the standard DWT, make it an excellent candidate for representing the dynamic texture.

The DT-CWT employs two real DWTs; the first DWT gives the real part of the transform while the second DWT gives the imaginary part. The two filter banks are designed to satisfy the following desired properties: approximate half-sample delay, orthogonality, finite support, vanishing moments, and linear phase. Each DWT is implemented by iterating the 2-channel analysis filter bank. Figure 1 shows the 1D dual-tree complex wavelet transform over two levels. The filter bank structure of the DT-CWT resembles that of the standard DWT, with twice the complexity. Here, and (for convenience, we omit the superscript and subscript) are one-dimensional lowpass and highpass decomposition filters associated to the scaling function and the mother wavelet, respectively. We denote and . The 2D DT-CWT produces six complex detail subbands at each decomposition level, which are strongly oriented at ±15°, ±45°, and ±75°. In addition to being spatially oriented, the 3D DT-CWT is motion selective (each subband is associated with motion in a specific direction) and is implemented as eight critically sampled separable 3D DWTs operating in parallel.

3. Dynamic Texture Classification

Dynamic texture classification identifies the category to which a new dynamic texture belongs. It comprises a training stage and a classification stage. In the training stage, the dynamic texture features are extracted and the classifiers are designed. In the classification stage, the features of an unknown dynamic texture are extracted and the texture is classified with the designed classifier.

3.1. Feature Extraction

To characterize a dynamic texture, a traditional wavelet transform-based method extracts energy features of each subband, such as variance and entropy [8]. An alternative is to model the wavelet coefficients with known probability distribution models; classification is then performed by comparing the similarity between the estimated distributions of different dynamic textures.

Motivated by the methods for image texture, the generalized Gaussian distribution (GGD), the gamma distribution, and the Weibull distribution can be used to model the magnitudes of the DT-CWT coefficients. However, because there are many near-zero coefficients and a few coefficients with large values, the histogram has a high peak near the origin and a long tail, which leads to poor fitting, as shown in Figure 2. A possible solution is to select a subset of the coefficients in a subband that covers (almost) all the information necessary for dynamic texture classification and can be modeled with a known probability distribution. In this paper, we choose the median values of the nonoverlapping blocks of a detail subband and model them with the Gumbel distribution.

The Gumbel distribution is a special case of the generalized extreme value distribution (GEVD). Written in its maximum form, the probability density function of the Gumbel distribution is

$$p(x;\mu,\beta)=\frac{1}{\beta}\exp\left(-\frac{x-\mu}{\beta}\right)\exp\left(-\exp\left(-\frac{x-\mu}{\beta}\right)\right),$$

where $\mu$ is the location parameter and $\beta$ is the scale parameter. The case where $\mu=0$ and $\beta=1$ is called the standard Gumbel distribution. It has been applied in ocean, structural, and hydraulics engineering, meteorology, and the study of material strength, traffic, corrosion, pollution, and so on [10].

The Gumbel distribution is commonly used to model the distribution of the maximum (or the minimum) of a number of samples drawn from various distributions. However, it is well known that the magnitudes of most coefficients of a detail subband are close to zero. If we took the minimum value in each block, the resulting values would not carry the information necessary for classification. On the other hand, we cannot directly use the Gumbel distribution to model the maximum values of the blocks in a subband, because those values may be affected by unstable factors, such as noise and transient changes that are not intrinsic to the dynamic texture. Therefore, we choose the median values of the nonoverlapping blocks in the detail subband.

Given a 3D complex wavelet detail subband with the size of , we break it into nonoverlapping blocks with the size of as shown in Figure 3 and calculate the median of the magnitudes of the DT-CWT coefficients in each block. These median values are then modeled with the Gumbel distribution. Figure 2(d) shows the corresponding histogram and the fitted Gumbel distribution of a particular dual-tree complex wavelet subband for a dynamic texture. It can be seen that the Gumbel fit is better than the fits obtained with the gamma and Weibull distributions. We also compute the chi-square test statistic to assess the fitting performance, where and are the values of the histogram and distribution function on the th bin, respectively. The corresponding chi-square test statistic values of the Gumbel, gamma, and Weibull distributions are 0.14, 0.24, and 0.57, respectively, which also indicates that the Gumbel distribution is the better fitting model.
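The block-median step and the fit check can be sketched in Python as follows. This is a minimal illustration, not the implementation used in the experiments: it assumes the subband is stored as a nested list indexed `[t][y][x]`, drops partial blocks at the borders, and uses the Pearson form of the chi-square statistic, which is an assumption because the exact expression is not recoverable here. The function names, the block size, and the toy values are hypothetical.

```python
import statistics


def block_medians(subband, block):
    """Median coefficient magnitude in each nonoverlapping 3D block.

    subband: nested list indexed [t][y][x] of complex coefficients.
    block:   (bt, by, bx) block size; partial border blocks are dropped.
    """
    bt, by, bx = block
    nt, ny, nx = len(subband), len(subband[0]), len(subband[0][0])
    medians = []
    for t0 in range(0, nt - bt + 1, bt):
        for y0 in range(0, ny - by + 1, by):
            for x0 in range(0, nx - bx + 1, bx):
                mags = [abs(subband[t][y][x])
                        for t in range(t0, t0 + bt)
                        for y in range(y0, y0 + by)
                        for x in range(x0, x0 + bx)]
                medians.append(statistics.median(mags))
    return medians


def chi_square(hist, model):
    """Pearson-style statistic between histogram and model bin values (assumed form)."""
    return sum((h - p) ** 2 / p for h, p in zip(hist, model) if p > 0)


# Toy 4x4x4 subband with hypothetical values, split into 2x2x2 blocks.
toy = [[[complex(t + y, x) for x in range(4)] for y in range(4)] for t in range(4)]
meds = block_medians(toy, (2, 2, 2))
print(len(meds), meds[0])
```

Each returned value summarizes one block, so the number of samples available for fitting equals the number of complete blocks in the subband.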

3.2. Parameters Estimation

To fit the Gumbel distribution to the histogram, its parameters must be estimated. The typical estimation methods are the moment matching (MM) method and maximum likelihood estimation (MLE). In our practical implementation, the parameters estimated with the MM method serve as the initial values for finding the MLE.

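The moment estimators [10] of the parameters of the Gumbel distribution, written for its maximum form with location parameter $\mu$ and scale parameter $\beta$, are

$$\tilde{\beta}=\frac{\sqrt{6}}{\pi}\,s,\qquad \tilde{\mu}=\bar{x}-\gamma\tilde{\beta},$$

where $\gamma\approx 0.5772$ is the Euler-Mascheroni constant and $\bar{x}$ and $s$ are the mean and standard deviation, respectively. They can be estimated from the samples $x_1,\dots,x_N$ of size $N$ as follows:

$$\bar{x}=\frac{1}{N}\sum_{i=1}^{N}x_i,\qquad s=\sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i-\bar{x}\right)^2}.$$

According to the MLE method, the log-likelihood function is given by

$$\ln L(\mu,\beta)=-N\ln\beta-\sum_{i=1}^{N}\frac{x_i-\mu}{\beta}-\sum_{i=1}^{N}\exp\left(-\frac{x_i-\mu}{\beta}\right).\tag{4}$$

Differentiating (4) with respect to the parameters and equating to zero, we obtain

$$\begin{aligned}\frac{\partial\ln L}{\partial\mu}&=\frac{N}{\beta}-\frac{1}{\beta}\sum_{i=1}^{N}\exp\left(-\frac{x_i-\mu}{\beta}\right)=0,\\ \frac{\partial\ln L}{\partial\beta}&=-\frac{N}{\beta}+\sum_{i=1}^{N}\frac{x_i-\mu}{\beta^{2}}\left[1-\exp\left(-\frac{x_i-\mu}{\beta}\right)\right]=0.\end{aligned}\tag{5}$$

From (5), we have

$$\hat{\mu}=-\hat{\beta}\ln\left[\frac{1}{N}\sum_{i=1}^{N}\exp\left(-\frac{x_i}{\hat{\beta}}\right)\right],\tag{6}$$

$$\hat{\beta}=\bar{x}-\frac{\sum_{i=1}^{N}x_i\exp\left(-x_i/\hat{\beta}\right)}{\sum_{i=1}^{N}\exp\left(-x_i/\hat{\beta}\right)}.\tag{7}$$

Because (7) is a nonlinear equation that involves only the scale parameter $\beta$, the estimate $\hat{\beta}$ can be found by solving this equation with an iterative algorithm. The location parameter $\hat{\mu}$ then follows from (6) with the estimated scale parameter $\hat{\beta}$. The estimated model parameters of the wavelet detail subband coefficients form the feature vector denoted by "FV."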
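The two-stage estimation described above (moment matching as the starting point, followed by an iterative solution of the likelihood equation for the scale) can be sketched as follows. The sketch assumes the maximum form of the Gumbel distribution and uses Newton's method as the iterative algorithm, which is one possible choice and not necessarily the one used in the experiments. The function names, tolerances, and synthetic test data are hypothetical.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329


def gumbel_moments(samples):
    """Moment-matching estimates (mu, beta) for the maximum-form Gumbel."""
    beta = statistics.stdev(samples) * math.sqrt(6.0) / math.pi
    mu = sum(samples) / len(samples) - EULER_GAMMA * beta
    return mu, beta


def gumbel_mle(samples, tol=1e-10, max_iter=100):
    """Maximum likelihood estimates (mu, beta), started from the moment estimates."""
    n = len(samples)
    mean = sum(samples) / n
    x_min = min(samples)  # shifting by x_min keeps the exponentials well scaled
    _, beta = gumbel_moments(samples)
    for _ in range(max_iter):
        w = [math.exp(-(x - x_min) / beta) for x in samples]
        a = sum(w)
        m1 = sum(x * wi for x, wi in zip(samples, w)) / a      # weighted mean
        m2 = sum(x * x * wi for x, wi in zip(samples, w)) / a  # weighted 2nd moment
        g = beta - mean + m1                       # residual of the scale equation
        dg = 1.0 + (m2 - m1 * m1) / (beta * beta)  # its derivative, always >= 1
        new_beta = beta - g / dg
        if new_beta <= 0.0:                        # keep the scale positive
            new_beta = beta / 2.0
        converged = abs(new_beta - beta) < tol
        beta = new_beta
        if converged:
            break
    w_sum = sum(math.exp(-(x - x_min) / beta) for x in samples)
    mu = x_min - beta * math.log(w_sum / n)
    return mu, beta


# Synthetic check: draw maximum-form Gumbel samples with mu = 2.0, beta = 0.5.
random.seed(0)
data = [2.0 - 0.5 * math.log(random.expovariate(1.0)) for _ in range(5000)]
mu_mm, beta_mm = gumbel_moments(data)
mu_ml, beta_ml = gumbel_mle(data)
print(mu_mm, beta_mm, mu_ml, beta_ml)
```

Because the residual of the scale equation is increasing in the scale parameter, a bracketing method such as bisection would also work and may be preferable when robustness matters more than speed.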

3.3. Similarity Measurement

In this paper, we use the Gumbel distribution as a dynamic texture signature to model the histogram of wavelet detail coefficients, which has been shown to be an effective characterization in Figure 2. To classify a new dynamic texture, the similarity between two dynamic textures is measured by comparing the discrepancies between their corresponding signatures. The Kullback-Leibler (KL) distance [11] is a measure of the distance (similarity) between two probability density functions $p(x)$ and $q(x)$ and is defined as

$$D_{\mathrm{KL}}(p\,\|\,q)=\int p(x)\ln\frac{p(x)}{q(x)}\,dx.$$

The KL distance can be shown to be always nonnegative. However, from a mathematical point of view, it is not a true distance measure, since it is not symmetric. The KL distance is also referred to as the KL divergence.

Manuel [12] has proved that where , and are parameters of the Gumbel distribution . Thus, the KL divergence between and is
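As a worked illustration, a closed form can be derived directly when both densities are taken in the maximum form of the Gumbel distribution, $p_i(x)=\frac{1}{\beta_i}e^{-z_i}e^{-e^{-z_i}}$ with $z_i=(x-\mu_i)/\beta_i$. The symbols $\mu_i$ and $\beta_i$ and the choice of the maximum form are assumptions of this sketch, and the result is not necessarily written in the same way as in [12].

```latex
\begin{align*}
\mathbb{E}_{p_1}[\ln p_1] &= -\ln\beta_1 - \gamma - 1,\\
\mathbb{E}_{p_1}[\ln p_2] &= -\ln\beta_2 - \frac{\mu_1-\mu_2+\gamma\beta_1}{\beta_2}
  - e^{(\mu_2-\mu_1)/\beta_2}\,\Gamma\!\left(\frac{\beta_1}{\beta_2}+1\right),\\
D_{\mathrm{KL}}(p_1\,\|\,p_2) &= \mathbb{E}_{p_1}[\ln p_1]-\mathbb{E}_{p_1}[\ln p_2]\\
  &= \ln\frac{\beta_2}{\beta_1} + \gamma\left(\frac{\beta_1}{\beta_2}-1\right)
  + \frac{\mu_1-\mu_2}{\beta_2}
  + e^{(\mu_2-\mu_1)/\beta_2}\,\Gamma\!\left(\frac{\beta_1}{\beta_2}+1\right) - 1 .
\end{align*}
```

Here $\gamma\approx 0.5772$ is the Euler-Mascheroni constant and $\Gamma(\cdot)$ is the gamma function. The steps use $\mathbb{E}[z_1]=\gamma$ and $\mathbb{E}[e^{-t z_1}]=\Gamma(1+t)$ for a standard Gumbel variable $z_1$. Setting $\mu_1=\mu_2$ and $\beta_1=\beta_2$ gives zero, as expected.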

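Given the Gumbel distribution model, the probability density function of each subband is completely defined by two parameters, so the similarity between two wavelet subbands can be computed accurately and efficiently from the model parameters. To measure the similarity between two dynamic textures, we define the $i$th dynamic texture feature as $\mathrm{FV}_i=\{(\mu_k^{(i)},\beta_k^{(i)}),\ k=1,\dots,K\}$, where $\mu_k^{(i)}$ and $\beta_k^{(i)}$ are the location and scale parameters in the $k$th subband, respectively, and $K$ is the number of subbands. Then, assuming that all subbands are statistically independent, the similarity between the $i$th and $j$th dynamic textures is defined as the sum of the KL divergences between all corresponding wavelet detail subbands:

$$D(i,j)=\sum_{k=1}^{K}D_{\mathrm{KL}}\left(p\left(x;\mu_k^{(i)},\beta_k^{(i)}\right)\,\Big\|\,p\left(x;\mu_k^{(j)},\beta_k^{(j)}\right)\right),\tag{11}$$

where $D_{\mathrm{KL}}(\cdot\,\|\,\cdot)$ denotes the KL divergence between the two subband distributions.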
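The parameter-based similarity can be sketched in Python as follows. The sketch assumes the maximum form of the Gumbel distribution and the closed-form KL divergence that follows from it, and it represents a feature vector as a list of (location, scale) pairs, one per subband. The function names `kl_gumbel` and `texture_distance` and the toy parameter values are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def kl_gumbel(mu1, beta1, mu2, beta2):
    """KL divergence D(p1 || p2) between two maximum-form Gumbel densities."""
    r = beta1 / beta2
    return (math.log(beta2 / beta1)
            + EULER_GAMMA * (r - 1.0)
            + (mu1 - mu2) / beta2
            + math.exp((mu2 - mu1) / beta2) * math.gamma(r + 1.0)
            - 1.0)


def texture_distance(fv_a, fv_b):
    """Sum of per-subband KL divergences; fv_* are lists of (mu, beta) pairs."""
    return sum(kl_gumbel(m1, b1, m2, b2)
               for (m1, b1), (m2, b2) in zip(fv_a, fv_b))


# Two toy feature vectors with three subbands each (hypothetical values).
fv_1 = [(0.0, 1.0), (0.5, 0.8), (1.2, 0.3)]
fv_2 = [(1.0, 2.0), (0.4, 0.9), (1.0, 0.35)]
print(texture_distance(fv_1, fv_2), texture_distance(fv_2, fv_1))
```

The two printed values differ because the KL divergence is not symmetric, so the order of the query and the reference texture matters when this quantity is used as a distance.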

3.4. Classifier

(1) K-Nearest Neighbor (kNN) Classifier

The kNN classifier is a good reference method and has been used in various classification problems. Its principle is straightforward: for a new feature vector, search for its $k$ closest training vectors according to some similarity measure, and then assign it to the class to which the majority of these nearest neighbors belong. The similarity measure plays an important role in the kNN classifier. Two distances are used in this paper. The first is the city-block distance (CBD)

$$d_{\mathrm{CB}}(\mathbf{x},\mathbf{y})=\sum_{k}\left|x_k-y_k\right|,$$

where $\mathbf{x}$ and $\mathbf{y}$ are the feature vectors. The second is the distance in (11), which is derived from the Kullback-Leibler divergence.
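A minimal sketch of the kNN rule with the city-block distance is given below. The function names, the value of `k`, and the toy feature vectors and labels are hypothetical; any other distance, such as a KL-based one, could be passed through the `distance` argument.

```python
from collections import Counter


def city_block(u, v):
    """City-block (L1) distance between two feature vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def knn_classify(query, train_vectors, train_labels, k=1, distance=city_block):
    """Assign the majority label among the k nearest training vectors."""
    order = sorted(range(len(train_vectors)),
                   key=lambda i: distance(query, train_vectors[i]))
    votes = Counter(train_labels[i] for i in order[:k])
    # Counter keeps first-seen order among equal counts, so a tie goes to
    # the label of the nearer neighbor.
    return votes.most_common(1)[0][0]


# Toy training set: (location, scale) features of a single subband.
train = [[0.1, 1.0], [0.2, 1.1], [2.0, 0.5], [2.1, 0.4]]
labels = ["fire", "fire", "waves", "waves"]
predicted = knn_classify([0.15, 1.05], train, labels, k=3)
print(predicted)
```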

(2) Support Vector Machines (SVMs)

The SVM classifier has often been found to provide higher classification accuracies than other widely used pattern recognition techniques. An important advantage of SVMs is that they are based on the principle of structural risk minimization. Moreover, unlike other pattern recognition methods, SVMs do not depend explicitly on the dimensionality of the problem. LIBSVM [13] is one of the most widely used SVM software packages, and readers may refer to it for further details. The Gaussian kernel is used in the SVM classifier:

$$K(\mathbf{x},\mathbf{y})=\exp\left(-\gamma\left\|\mathbf{x}-\mathbf{y}\right\|^{2}\right),$$

where $\mathbf{x}$ and $\mathbf{y}$ are the feature vectors and $\gamma>0$ is the kernel width parameter.
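For illustration, the Gaussian kernel and the resulting kernel (Gram) matrix can be computed as follows. The sketch assumes the parameterization exp(-gamma * ||x - y||^2); the value of `gamma`, the function names, and the toy vectors are hypothetical, and the kernel parameter used in the experiments is not stated here.

```python
import math


def gaussian_kernel(u, v, gamma=0.5):
    """Gaussian (RBF) kernel value between two feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def gram_matrix(vectors, gamma=0.5):
    """Symmetric matrix of pairwise kernel values."""
    return [[gaussian_kernel(u, v, gamma) for v in vectors] for u in vectors]


features = [[0.0, 0.0], [1.0, 1.0], [3.0, 0.0]]
K = gram_matrix(features)
print(K[0][1], K[0][2])
```

Identical vectors give a kernel value of 1, and the value decays toward 0 as the feature vectors move apart, at a rate controlled by `gamma`.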

4. Experimental Results

The dynamic texture classification experiments are conducted on the benchmark dynamic texture database DynTex [14], which contains more than 650 classes of various dynamic textures, such as waves, smoke, fire, a flag blowing in the wind, and a walking crowd. In the DynTex dataset, all sequences consist of at least 250 frames, and the size of a frame is . Figure 4 shows thirty selected examples from the DynTex database.

The average correct classification rate (ACCR) is used to measure the classification performance and is defined as

$$\mathrm{ACCR}=\frac{1}{C}\sum_{c=1}^{C}\frac{n_c}{N}\times 100\%,$$

where $n_c$ is the number of correctly classified samples of class $c$, $N$ is the total number of test samples for each class, and $C$ is the number of dynamic texture categories.
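The ACCR can be computed from true and predicted labels as sketched below. The sketch averages the per-class correct rates and reports a percentage, which is how the definition above is read here; the function name and the toy labels are hypothetical.

```python
def accr(true_labels, predicted_labels):
    """Average of the per-class correct classification rates, in percent."""
    classes = sorted(set(true_labels))
    rates = []
    for c in classes:
        indices = [i for i, t in enumerate(true_labels) if t == c]
        correct = sum(1 for i in indices if predicted_labels[i] == c)
        rates.append(correct / len(indices))
    return 100.0 * sum(rates) / len(rates)


truth = ["fire", "fire", "waves", "waves"]
guess = ["fire", "waves", "waves", "waves"]
print(accr(truth, guess))
```

When every class has the same number of test samples, this per-class average coincides with the overall fraction of correctly classified samples.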

The selected dynamic textures are converted and cropped into grey videos with size in space to show the representative properties. Each dynamic texture is divided into a set of nonoverlapping subvideos (clips) of 32 frames, which generates 5 video clips of size in the first dataset. The second and third datasets are built by dividing the video clips of the first dataset into subvideos of size and , respectively. Thus, there are 600 and 2400 subvideo samples in the second and third datasets, respectively.

To evaluate the performance of the proposed classification scheme, we compare it with the ordinary DWT-based method and the texture feature in [8] over the three datasets. In the first experiment, all subvideos are decomposed with the two-level wavelet transform. The leave-one-out method is used, together with the kNN classifier with different similarity measures. The classification results (listed in Table 1) show that the discrimination power of the DT-CWT is higher than that of the DWT and that the proposed dynamic texture feature achieves higher ACCRs than the other methods.

To ensure that the ACCRs are not affected by the choice of samples, we randomly select some subvideos to design the kNN and SVM classifiers and use the others as test samples. The experimental results for the second dataset are presented in Tables 2 and 3. These results also show that the proposed method performs better than the DWT-based method and the feature introduced in [8].

To further verify the effectiveness of the proposed method, we perform complex wavelet decompositions with different numbers of levels on the dynamic textures. The ACCRs are listed in Tables 4 and 5. It can be seen that the proposed method achieves the best classification performance among the compared methods.

5. Conclusion

In this paper, we propose a new method for dynamic texture classification based on the Gumbel distribution in the complex wavelet domain. The location and scale parameters of the Gumbel distribution serve as the new texture feature for dynamic texture classification. The experiments are performed on the benchmark dynamic texture database DynTex. The results suggest that the proposed classification method outperforms both the existing energy-based feature and the method based on the real discrete wavelet transform.

Acknowledgments

This work was partially supported by the National Natural Science Foundation of China (60902064 and 61172159), the Postdoctoral Science-Research Developmental Foundation of Heilongjiang Province (LBH-Q09128), and the Fundamental Research Funds for the Central Universities (HEUCFT1101 and DL12BB07). The authors also gratefully thank the reviewers for their helpful comments and suggestions, which have improved the quality of this paper.