Abstract

Texture classification is an important research topic in image processing. In 2012, the scattering transform, computed by iterating over successive wavelet transforms and modulus operators, was introduced. This paper presents new approaches to texture feature extraction using the scattering transform. Scattering statistical features and scattering cooccurrence features are derived from the subbands of the scattering decomposition and from the original images. These features are then used for classification on four datasets containing 20, 30, 112, and 129 texture images, respectively. Experimental results show that the proposed approaches yield promising classification results.

1. Introduction

Texture is one of the main characteristics of an image. Texture segmentation, texture classification, and shape recovery from texture are three primary issues in texture analysis [1]. Among them, texture classification plays an important role in many tasks, ranging from remote sensing and medical imaging to query by content in large image databases [2]. Texture analysis is one of the most important techniques for analyzing and interpreting images that consist of repetitions or quasi-repetitions of some fundamental image elements (e.g., [3]). Various feature extraction and classification techniques have been suggested for texture analysis in the past. Since natural textures vary widely, different features should be chosen according to the characteristics of the texture images in order to achieve the best performance in texture analysis or retrieval. It is well recognized that these texture analysis methods capture different texture properties of the image.

There are four major stages in texture analysis: feature extraction, texture discrimination, texture classification, and shape from texture [4]. The first stage is feature extraction, which computes characteristics that describe the texture properties of a digital image. The texture features obtained from this step are used to discriminate textures, classify image textures, or determine object shape. Texture discrimination partitions a textured image into regions, each corresponding to a perceptually homogeneous texture. Texture classification designs a rule that assigns a given test image of unknown class to one of the known classes. Shape from texture reconstructs 3D surface geometry from texture information. Feature extraction techniques mainly include first-order histogram based features, cooccurrence matrix based features, and multiscale features [4]. First-order histogram based features use the shape of the histogram of intensity levels to provide a number of clues as to the character of the image. The second-order histogram is the cooccurrence matrix [5], and cooccurrence matrix based features are estimates of the joint probability distributions of pairs of pixels. To calculate multiscale features, many time-frequency methods have been adopted [6]. Common methods are Wigner distributions, Gabor functions, the wavelet transform, and the ridgelet transform. Wigner distributions can produce interference terms which lead to wrong signal interpretation. Gabor filters result in redundant features at different scales or channels [7]. The wavelet transform is a linear operation that can localize the spectral features of a signal in time, which makes it attractive for texture analysis. The ridgelet transform deals effectively with line singularities in 2D. Texture classification based on ridgelet statistical features (RSFs) and ridgelet cooccurrence features (RCFs) has been carried out by Arivazhagan et al. [8].

In the last few decades, wavelet theory has been widely used for texture classification [9–11]. However, the wavelet transform is not translation invariant. In 2012, Mallat introduced the scattering transform, which is invariant to translations and Lipschitz continuous with respect to deformations [12], and thus overcomes this weakness of the wavelet transform. The scattering transform is computed by iterating over successive wavelet transforms and modulus operators. It maps the high frequency information of images to low frequencies and can thus provide a stationary representation. The scattering transform has found applications in texture classification (e.g., [13, 14]). These classification tasks are based on the original scattering vectors.

In this paper, the scattering transform is applied to a set of texture images. Statistical features and cooccurrence features are extracted from the original images and from each of the scattering subbands. These features are used for classification. For comparative analysis, classification is also carried out using RSFs, RCFs, wavelet statistical features (WSFs), and wavelet cooccurrence features (WCFs). The experimental results show that the success rate of our feature extraction techniques is promising, although not yet fully satisfactory. We therefore regard the results as a proof of concept for scattering statistical features (SSFs) and scattering cooccurrence features (SCFs).

The rest of this paper is organized as follows. In Section 2, the theory of scattering transform is briefly reviewed. The feature extraction and texture classification are explained in Section 3. In Section 4, texture classification experimental results are discussed in detail. Finally, concluding remarks are given in Section 5.

2. Scattering Transform

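The wavelet transform filters the original signal with a bank of band-pass filters [15]. Let $G$ denote a discrete, finite rotation group in $\mathbb{R}^2$. A wavelet function $\psi$ is a band-pass filter. Its rotated and dilated versions are
$$\psi_\lambda(x) = 2^{-2j}\,\psi\left(2^{-j} r^{-1} x\right), \tag{1}$$
with $\lambda = 2^{-j} r$ and $r \in G$, where $r$ is the rotation parameter and $2^{j}$, for $0 \le j < J$, is the dilation parameter.

A texture is modeled as a realization of a stationary process $X$. The wavelet transform of $X$ is written as
$$W_\lambda X = X \star \psi_\lambda, \tag{2}$$
and $|X \star \psi_\lambda|$ is the wavelet modulus of $X$. The high frequency coefficients of the wavelet transform are mapped to low frequencies by the modulus operator [16].

The convolution of the texture with the scaling function $\phi_J$ gives the low frequency information; that is,
$$A_J X = X \star \phi_J, \tag{3}$$
where $\phi_J(x) = 2^{-2J}\phi(2^{-J}x)$. The resulting wavelet modulus operator is
$$U_J X = \left\{ X \star \phi_J,\ |X \star \psi_\lambda| \right\}_{\lambda}, \tag{4}$$
where $\lambda$ ranges over all scales and rotations in (1).

The high frequency information lost by the wavelet modulus operator can be recovered by the next wavelet modulus operator [12], resulting in the scattering propagators along different paths $p = (\lambda_1, \lambda_2, \ldots, \lambda_m)$:
$$U[p]X = \Big|\,\cdots\big|\,|X \star \psi_{\lambda_1}| \star \psi_{\lambda_2}\big| \cdots \star \psi_{\lambda_m}\Big|. \tag{5}$$
In particular, when $p = \emptyset$, $U[\emptyset]X = X$.

The wavelet modulus operator is applied iteratively to progressively map the high frequency information to low frequencies. The scattering operator $S_J$ is thus defined from $A_J$ and $U[p]$ [12]. The information of the texture is scattered along different paths in the iterative process. The scattering operator implements a sequence of wavelet convolutions and modulus operations, followed by a convolution with $\phi_J$:
$$S_J[p]X = A_J U[p]X = U[p]X \star \phi_J. \tag{6}$$
If $p = \emptyset$, then $S_J[\emptyset]X = X \star \phi_J$.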

The scattering transform is thus computed with a cascade of wavelet transforms and modulus operators. This process can be described by a deep network architecture (see, e.g., [17, 18]), as shown in Figure 1. Mallat has proved in [12] that the energy of the deepest layer converges quickly to zero as the path length increases. Bruna and Mallat [19] have illustrated that most of the energy is concentrated in the first few layers. Further details about the scattering transform are presented in [12].
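The cascade of wavelet convolution, modulus, and low-pass averaging can be illustrated with a deliberately simplified one-dimensional analogue. The sketch below is not the two-dimensional transform used in this paper: it assumes Haar-like band-pass filters, a box averaging filter, circular convolution, no rotations, and paths restricted to strictly increasing scale index. All function names are hypothetical.

```python
def circular_convolve(signal, kernel):
    """Circular convolution of two real sequences of equal length."""
    n = len(signal)
    return [sum(signal[(t - k) % n] * kernel[k] for k in range(n))
            for t in range(n)]


def haar_wavelet(n, j):
    """Zero-mean Haar-like band-pass filter at scale index j, padded to length n."""
    half = 2 ** j
    h = [0.0] * n
    for k in range(half):
        h[k] = 1.0 / (2 * half)
        h[k + half] = -1.0 / (2 * half)
    return h


def box_lowpass(n, J):
    """Averaging filter of support 2**J, padded to length n."""
    width = 2 ** J
    return [1.0 / width if k < width else 0.0 for k in range(n)]


def scattering_1d(x, J, max_order=2):
    """Map each path (tuple of scale indices) to its averaged modulus cascade."""
    n = len(x)
    phi = box_lowpass(n, J)
    psi = [haar_wavelet(n, j) for j in range(J)]
    layer = {(): [float(v) for v in x]}   # the empty path propagates x itself
    coefficients = {}
    for order in range(max_order + 1):
        next_layer = {}
        for path, u in layer.items():
            # Output of this path: low-pass average of the propagated signal.
            coefficients[path] = circular_convolve(u, phi)
            if order < max_order:
                # Propagate only to coarser scales (assumed path restriction).
                first_scale = path[-1] + 1 if path else 0
                for j in range(first_scale, J):
                    band = circular_convolve(u, psi[j])
                    next_layer[path + (j,)] = [abs(v) for v in band]
        layer = next_layer
    return coefficients
```

In this toy setting, circularly shifting the input shifts every coefficient sequence by the same amount, so any statistic computed over a whole sequence, such as its mean, is unchanged by the shift.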

3. Feature Extraction and Texture Classification

The steps involved in texture training and texture classification are shown in Figure 2.

Texture Training. At the texture training stage, the known texture images are decomposed using the scattering transform. Then, the mean and standard deviation of the original images and of the subbands of the two-layer scattering decomposition are calculated as features using
$$m = \frac{1}{N^2}\sum_{i,j=1}^{N} p(i,j), \tag{7}$$
$$sd = \sqrt{\frac{1}{N^2}\sum_{i,j=1}^{N}\left[p(i,j) - m\right]^2}, \tag{8}$$
where $p(i,j)$ is the transformed value at $(i,j)$ for any image of size $N \times N$ [20]. These features are stored in the features library as scattering statistical features (SSFs), which are further used in the texture classification phase.
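The statistical features described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `mean_std` and `ssf_vector` are hypothetical, images and subbands are assumed to be given as lists of rows, and the standard deviation is assumed to be the population form (dividing by the number of coefficients).

```python
import math


def mean_std(band):
    """Mean and population standard deviation of a 2-D array given as rows."""
    values = [v for row in band for v in row]
    m = sum(values) / len(values)
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return m, sd


def ssf_vector(image, subbands):
    """Concatenate (mean, sd) of the original image and of every subband."""
    features = []
    for band in [image] + list(subbands):
        features.extend(mean_std(band))
    return features
```

Under these assumptions, an image with 25 subbands would yield a vector of 2 × 26 = 52 values, two for the original image and two for each subband.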

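In addition, in order to further verify the classification rate, a cooccurrence matrix $C(i,j)$ [21] is formed for each subband of the scattering transform and for each image. From the cooccurrence matrix, features such as cluster prominence, cluster shade, contrast, and local homogeneity, as given by Arivazhagan and Ganesan [9], are obtained by (9)–(12). These features are stored in the feature database as scattering cooccurrence features (SCFs):
$$\text{cluster prominence} = \sum_{i,j}\left(i - M_x + j - M_y\right)^4 C(i,j), \tag{9}$$
$$\text{cluster shade} = \sum_{i,j}\left(i - M_x + j - M_y\right)^3 C(i,j), \tag{10}$$
$$\text{contrast} = \sum_{i,j}\left(i - j\right)^2 C(i,j), \tag{11}$$
$$\text{local homogeneity} = \sum_{i,j}\frac{1}{1 + (i - j)^2}\, C(i,j), \tag{12}$$
where $M_x = \sum_{i,j} i\,C(i,j)$, $M_y = \sum_{i,j} j\,C(i,j)$, and $C(i,j)$ is the $(i,j)$th element of the cooccurrence matrix $C$.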
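The four cooccurrence features named above can be sketched as follows. This is a sketch under stated assumptions, not the authors' code: it uses the standard definitions of cluster prominence, cluster shade, contrast, and local homogeneity, a normalized cooccurrence matrix built for a single pixel offset (one pixel to the right by default), and inputs that already hold integer gray levels in `range(levels)`. Real-valued scattering subbands would have to be quantized first, and the quantization scheme is not specified here. The function names are hypothetical.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Normalized gray-level cooccurrence matrix for the pixel offset (dy, dx)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]


def cooccurrence_features(C):
    """Cluster prominence, cluster shade, contrast, and local homogeneity of C."""
    idx = [(i, j) for i in range(len(C)) for j in range(len(C))]
    mx = sum(i * C[i][j] for i, j in idx)
    my = sum(j * C[i][j] for i, j in idx)
    return {
        "cluster_prominence": sum((i - mx + j - my) ** 4 * C[i][j] for i, j in idx),
        "cluster_shade": sum((i - mx + j - my) ** 3 * C[i][j] for i, j in idx),
        "contrast": sum((i - j) ** 2 * C[i][j] for i, j in idx),
        "local_homogeneity": sum(C[i][j] / (1 + (i - j) ** 2) for i, j in idx),
    }
```

A single offset keeps the example short; averaging the features over several offsets or directions is a common variation.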

Texture Classification. Here, the unknown texture images are decomposed using the scattering transform. Then, the SSFs and SCFs of the original images and of the subbands of the scattering decomposed images are extracted using (7)–(12). These features are compared with the corresponding feature values stored in the features library using the distance
$$D(i) = \sum_{j=1}^{n}\left| f_j(x) - f_j(i) \right|, \tag{13}$$
where $x$ is an unknown texture, $n$ indicates the number of features, $f_j(x)$ represents the features of $x$, $i$ is the known $i$th texture in the library, and $f_j(i)$ is the features of the known $i$th texture. If the distance $D(i)$ is the minimum among all textures available in the library, then the unknown texture is classified as the $i$th texture. This classification approach is simple, efficient, and effective in many fields [22], and the rule is widely used in object recognition [23], text categorization [24], pattern recognition [25], and so on.
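The minimum-distance rule described above can be sketched as follows. The sketch assumes that the distance is the sum of absolute feature differences and that the features library is a mapping from texture label to feature vector; both the data layout and the function names are hypothetical.

```python
def feature_distance(unknown, known):
    """Sum of absolute differences between two feature vectors (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(unknown, known))


def classify(unknown, library):
    """Return the label of the library texture closest to the unknown features."""
    return min(library, key=lambda label: feature_distance(unknown, library[label]))


# Toy library with two textures and two features each.
library = {"bark": [1.0, 2.0], "brick": [4.0, 0.5]}
label = classify([1.2, 1.5], library)
```

Features on very different numeric scales would dominate such a distance, so some normalization may be needed in practice; the text above does not say whether any is applied.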

Performance of the feature sets is measured by the success rate. Let $N_c$ be the number of subimages correctly classified and let $N_t$ be the total number of subimages derived from each texture image. Then the classification success rate is calculated using
$$\text{success rate} = \frac{N_c}{N_t} \times 100\%. \tag{14}$$
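The success rate defined above is the percentage of subimages assigned to the correct texture. A minimal sketch, with a hypothetical function name and the labels given as two parallel lists:

```python
def success_rate(predicted, actual):
    """Percentage of subimages whose predicted label equals the true label."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * correct / len(actual)


# Three of four subimages of a "bark" image are classified correctly.
rate = success_rate(["bark", "bark", "brick", "bark"], ["bark"] * 4)
```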

4. Experimental Results and Discussion

In this section, several experiments are carried out on texture databases from the Brodatz texture album [26] and the VisTex color image database [27]. Four experiments are conducted with a single objective: to investigate the texture classification performance of the proposed feature extraction methods. For comparison, each classification experiment is repeated with RSFs, RCFs, WSFs, and WCFs. To verify the performance of our feature extraction methods on both small and large amounts of data, the VisTex color image database is used three times, the first two times with a small number of images and the third time with a large number. Furthermore, the efficiency of the proposed feature extraction approaches is demonstrated with the average success rate and the image regions correctly classified.

Since Bruna and Mallat have illustrated that most of the scattering energy is concentrated in the first few layers, we mainly consider the first three layers in the current work. The computing cost of the first three layers is larger than that of the first two layers, while the classification performance of the first three layers is only slightly better. Therefore, the maximum number of scattering layers is 2 in our experiments. In addition, to obtain suitable values for the number of orientations and the maximum scale, we varied these parameters over a large number of experiments. Taking both computational complexity and classification performance into account, the number of scattering orientations is set to 4 and the maximum scale of the scattering transform to 2. The number of scattering matrices differs from layer to layer: there is only one scattering matrix in the zeroth layer, there are 8 scattering matrices in the first layer, and there are 16 in the second layer, for a total of 25.
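The number of scattering matrices per layer can be checked with a short calculation. The sketch assumes the standard scattering construction, in which only paths with strictly increasing scale index are kept, so that layer $m$ contains $\binom{J}{m} L^m$ paths for $L$ orientations and maximum scale $J$. This counting rule is an assumption about the implementation, and the function name is hypothetical.

```python
from math import comb


def scattering_band_counts(num_orientations, max_scale, max_order):
    """Number of scattering paths in each layer 0..max_order (assumed rule)."""
    return [comb(max_scale, m) * num_orientations ** m
            for m in range(max_order + 1)]


counts = scattering_band_counts(num_orientations=4, max_scale=2, max_order=2)
total = sum(counts)
```

With 4 orientations, a maximum scale of 2, and two layers, `counts` is `[1, 8, 16]` and `total` is 25, which agrees with the 25 subbands reported per image region in the experiments.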

Firstly, Dataset-1 contains twenty monochrome images of equal size obtained from the VisTex color image database. Texture image classification is done for Dataset-1 using SSFs and SCFs. Here, each texture image is subdivided into sixty-four, sixteen, and four nonoverlapping image regions of increasing size, and all of these subimage regions form the database. By decomposing an image region using the scattering transform, 25 subbands are obtained for each of the three region sizes. SSFs and SCFs are calculated over all the scattering decomposed subbands. Furthermore, SSFs and SCFs of the image regions themselves are also obtained.
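The subdivision into nonoverlapping regions can be sketched as follows. The function name is hypothetical, the image is assumed to be a square list of rows whose side is divisible by the region size, and a toy 8 × 8 image stands in for a texture image.

```python
def split_into_regions(image, region_size):
    """Split a 2-D image (list of rows) into nonoverlapping square regions."""
    rows, cols = len(image), len(image[0])
    return [[row[c:c + region_size] for row in image[r:r + region_size]]
            for r in range(0, rows - region_size + 1, region_size)
            for c in range(0, cols - region_size + 1, region_size)]


# Toy image: an 8 x 8, 4 x 4, and 2 x 2 grid of regions from one image.
side = 8
image = [[r * side + c for c in range(side)] for r in range(side)]
regions = {size: split_into_regions(image, size)
           for size in (side // 8, side // 4, side // 2)}
per_image = sum(len(v) for v in regions.values())
```

Splitting one image into sixty-four, sixteen, and four regions in this way gives 84 regions per image.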

The experimental results are summarized in Table 1. For statistical features, that is, the mean and standard deviation of the original images and of the subbands of the transform decomposed images, the mean success rate obtained from SSFs is higher than those obtained from RSFs and WSFs. SCFs perform better than RCFs but worse than WCFs. The mean success rates for the feature vectors F3, R3, and W3, which combine statistical features and cooccurrence features, are also reported in Table 1.

Next, Dataset-2, which contains thirty monochrome images of equal size obtained from the VisTex color image database, is used for analysis. In a similar manner, each texture image in Dataset-2 is subdivided into four, sixteen, and sixty-four nonoverlapping image regions of decreasing size, and the resulting subimage regions form the database. SSFs and SCFs are extracted from the original images and from the subbands of the scattering transform decomposed images.

The classification results obtained for all the subimage regions derived from each texture image in Dataset-2 are given in Table 2. Table 2 shows the following: (i) the mean success rate obtained using SSFs as feature vector F1 is higher than that obtained using F2; (ii) the mean success rate obtained using F3 is lower than that obtained using F1.

In addition, our proposed approaches are compared with R1, R2, R3, W1, W2, and W3 in terms of classification performance. Among the statistical features, SSFs perform best, ahead of RSFs and WSFs. Among the cooccurrence features, SCFs achieve better classification performance than WCFs, while their mean accuracy is slightly lower than that of RCFs. Table 2 also shows that the mean success rate obtained with F3 is slightly lower than those obtained with W3 and R3.

Then, Dataset-3, which contains one hundred and twelve monochrome images of equal size obtained from the Brodatz texture album, is used for analysis. Each texture image is subdivided into four and sixteen nonoverlapping image regions, and the resulting subimage regions form the database. The feature vectors SSFs and SCFs for each image are calculated from the subbands of the scattering transform decomposed image and from the original image.

The classification results, including the mean success rates of the feature vectors F1, F2, and F3, are summarized in Table 3. As shown in Table 3, among the statistical features the highest mean success rate is obtained using SSFs. SCFs perform better than WCFs and worse than RCFs. Likewise, the mean classification accuracy obtained using F3 is higher than that achieved using W3 and lower than that obtained using R3.

Finally, Dataset-4 is created from one hundred and twenty-nine monochrome images from the VisTex color image database. The database is constructed by dividing each image into four and sixteen nonoverlapping image regions. SSFs and SCFs are extracted from the subbands of the scattering transform decomposed image and from the original image. Classification is done using three different feature vectors, all calculated from the scattering subbands and the original images: F1 contains the SSFs, that is, the mean and standard deviation; F2 contains the SCFs, that is, cluster prominence, cluster shade, contrast, and local homogeneity; and F3 is the combination of SSFs and SCFs.

The classification results are summarized in Table 4, which reports the mean success rates achieved using the feature vectors F1, F2, and F3. The mean success rate obtained using F1 is higher than that obtained using F2, and the mean success rate obtained using F3 is also higher than that obtained using F2.

Compared with RSFs and WSFs, SSFs achieve the highest average correct classification rate. SCFs perform better than RCFs and worse than WCFs. Likewise, the mean classification rate obtained using F3 is higher than that achieved using W3 and lower than that of R3.

From the experimental results of this section, a common phenomenon is that F2 is much worse than F1, whilst F3 is slightly worse than F1; we speculate that this may be due to high variance in the estimation of the cooccurrence features. In comparison with the wavelet transform and the ridgelet transform, the classification performance based on scattering statistical features is the best on all four datasets. For cooccurrence features, the mean classification accuracy of SCFs is comparable with that of RCFs and WCFs in this study. When statistical features and cooccurrence features are combined, the average classification accuracy obtained by F3 is lower than that achieved by the feature vectors R3 and W3 on the small datasets. On the large datasets, the results obtained by F3 are better than those achieved using W3 but worse than those of R3.

5. Conclusion

In the present work, the highest mean success rates achieved using scattering statistical and cooccurrence features on Dataset-1, Dataset-2, Dataset-3, and Dataset-4 are those reported in Tables 1–4, respectively. Our methods may not be competitive with state-of-the-art feature extraction methods that use significant image knowledge and heuristics. However, we find these results promising and view them as a proof of concept for SSFs and SCFs. From the extensive experiments conducted with texture image datasets, it is inferred that statistical features computed from scattering representations provide a good compromise between discriminability and good feature properties, whereas cooccurrence features are less discriminative.

Our current work has so far focused on algorithmic development and experimental justification. A more thorough theoretical analysis of the proposed feature extraction methods is left for future work. Furthermore, this work can be extended toward the design of an efficient classification system with a high classification success rate.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The first author would like to express gratitude to her advisor Professor Jiangshe Zhang for his valuable comments and suggestions, which led to a substantial improvement of this paper. This work was supported by the National Basic Research Program of China (973 Program) under Grant no. 2013CB329404 and the Major Research Project of the National Natural Science Foundation of China under Grant nos. 91230101, 11131006, 11201367, and 61572393.