Abstract

Automatically recognizing the cloud type of ground-based images is of great value to the weather service but poses a significant challenge. Based on the symmetric positive definite (SPD) matrix manifold, a novel method named “manifold kernel sparse coding and dictionary learning” (MKSCDL) is proposed for cloud classification. Different from classical features extracted in the Euclidean space, the SPD matrix fuses multiple features and captures non-Euclidean geometric characteristics. MKSCDL is composed of three steps: feature extraction, dictionary learning, and classification. With the learned dictionary, the SPD matrix of a cloud image can be described by its sparse code. Experiments are conducted on two different ground-based cloud image datasets. Benefiting from the sparse representation on the Riemannian matrix manifold, MKSCDL achieves more competitive performance than recent baselines on both grayscale and colour image datasets.

1. Introduction

Clouds play an essential role in the circulation of water vapour and affect the earth’s energy balance [1-3]. In the study of weather forecasting and climate change, clouds are regarded as a core factor [4]. Traditional cloud observation depends heavily on the observers’ experience and is therefore time-consuming. The substantial development of hardware and digital imaging techniques makes it possible to observe clouds automatically and continuously. Compared with satellite images, ground-based images offer high spatial resolution at a local scale [5].

Owing to the application of sky imagers and ceilometers, automatic observation of cloud cover and cloud base height has been realized [6]. To identify the cloud type accurately and effectively, many attempts have recently been made to address this challenging issue [5-17]. Buch et al. [7] adopted texture, position, and pixel brightness features from whole-sky images and classified the data with decision trees. To test texture feature extraction approaches, autocorrelation, cooccurrence matrices, edge frequency, Laws’ features, and primitive length were applied for cloud recognition [8]. Calbó and Sabburg [5] presented statistical texture features, Fourier transform features, and thresholded image features to identify images taken by the whole-sky imager (WSI) and total sky imager (TSI). Heinle et al. [9] extracted 12 statistical features to represent the image’s colour and texture and then employed the k-nearest-neighbour (KNN) classifier to distinguish seven different sky conditions. As a clear distinction in texture orientation exists between satellite images and ground-based images, Gabor-based multiple features were utilized for classification with a support vector machine (SVM) and achieved an overall accuracy of 88.3% [10]. Different from the typical local binary patterns (LBPs), weighted local binary patterns (WLBPs) [11] fuse the variance of a local patch into LBP to enhance the contrast for recognizing cloud types. Cheng and Yu [12] combined statistical features and LBPs with the Bayesian classifier to perform block-based classification (BC). In addition to texture features, Liu et al. [13] employed 7 structure features from the edge image to describe the structural characteristics of infrared clouds. Zhuo et al. [14] indicated that using texture or structure features alone may not produce excellent classification performance; hence, both texture and structure features were captured to obtain the cloud type with SVM. Furthermore, Xia et al. [6] and Xiao et al. [15] proposed to use multiple features together, including colour, texture, and structure features, for cloud-type recognition, and their experiments validated that the integration of various features performed better than the alternatives. Physical features are also of great importance for representing clouds. Kazantzidis et al. [16] introduced the solar zenith angle, the total cloud coverage, the visible percentage of the sun, and the existence of rain in sky images to describe physical properties. Besides 12 image features extracted from the sky camera image, Tato et al. [17] combined another 7 cloud layer features from the ceilometer and adopted random forests for classification.

To represent cloud images more effectively, Li et al. [18] put forward a discriminative model based on a bag of microstructures (BoMS), which showed competitive performance in cloud-type recognition. To compensate for the weakness of BoMS in describing complex categories, the duplex norm-bounded sparse representation model [19] was reported. This model demonstrated promising classification performance and was validated to be capable of capturing the most prominent patterns of complex cloud categories, and thus attained higher accuracy.

Recently, the symmetric positive definite (SPD) matrix manifold has gained much popularity in action recognition, object detection, face recognition, etc. [20, 21]. In addition, sparse representation on SPD matrix manifolds has been applied in these areas to achieve better performance [22, 23]. Despite its effectiveness, the matrix manifold approach has seldom been investigated for the task of cloud classification [24].

In this paper, manifold kernel sparse coding and dictionary learning (MKSCDL) on the SPD matrix manifold is proposed for ground-based cloud classification. The rest of this paper is organized as follows: Section 2 introduces the datasets, Section 3 describes the methodology of MKSCDL, and Section 4 reports the experimental results and discussions. Conclusions are summarized in Section 5.

2. Dataset

2.1. Zenithal Dataset

The zenithal dataset is provided by the National University of Defense Technology in China and was acquired from historical ground-based infrared images taken by the whole-sky infrared cloud-measuring system (WSIRCMS) [25]. The images are grouped into five categories according to the morphology and generating mechanism of the cloud [26]: stratiform, cumuliform, waveform, and cirriform clouds and clear sky. The dataset contains 100 cloud images in each category. Typical samples from each category are shown in Figure 1.

2.2. SWIMCAT Dataset

The SWIMCAT dataset [27] contains 784 images taken by a daytime WSI called the wide-angle high-resolution sky-imaging system (WAHRSIS). The images are classified into 5 distinct categories: clear sky, patterned clouds, thick dark clouds, thick white clouds, and veil clouds, with 224, 89, 251, 135, and 85 images, respectively. The images were obtained from January 2013 to May 2014; they were selected based on visual characteristics and categorized with the help of experts from the Singapore Meteorological Services. Representative samples from each category are shown in Figure 2.

3. Method

In this section, the methodology is introduced in three parts: feature extraction, dictionary learning, and classification, as illustrated in Figure 3.

3.1. Feature Extraction and Stein Kernel

Given an image I of size W × H, the feature image F is defined by computing a d-dimensional feature vector at every pixel:

F(x, y) = \varphi(I, x, y), (1)

where \varphi is the feature mapping, for example,

\varphi(I, x, y) = \left[ x,\; y,\; I(x, y),\; |I_x|,\; |I_y|,\; |I_{xx}|,\; |I_{yy}|,\; \sqrt{I_x^2 + I_y^2} \right]^{T}, (2)

where (x, y) is the pixel location; I(x, y) represents the pixel gray value; I_x, I_y, I_{xx}, and I_{yy} denote the first- and second-order derivatives of I in the directions of x and y, respectively; and \sqrt{I_x^2 + I_y^2} is the modulus of the gradient.

The covariance descriptor (CovD) of the feature image is computed by the following equation:

C = \frac{1}{n - 1} \sum_{i=1}^{n} (f_i - \mu)(f_i - \mu)^{T}, (3)

where f_i is the feature vector at the i-th pixel, \mu represents the mean feature vector, and n is the number of pixels in the image.

In general, the CovD is an SPD matrix. All SPD matrices of the same size form a Riemannian manifold when endowed with a Riemannian metric. Note that the SPD matrix is adopted as the extracted feature to describe the image; it therefore differs from traditional features used for classification in the Euclidean space.
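To make the feature extraction concrete, the following minimal Python sketch (not the authors' implementation) computes the CovD of Equation (3) under the 8-dimensional feature map suggested by Equation (2); the small eps regularization is an added safeguard to keep the matrix strictly positive definite.

import numpy as np

def covariance_descriptor(img, eps=1e-6):
    """img: 2-D grayscale array; returns an 8 x 8 SPD matrix (Equation (3))."""
    imgf = img.astype(float)
    h, w = imgf.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(float)   # pixel locations
    ix = np.gradient(imgf, axis=1)              # first-order derivative in x
    iy = np.gradient(imgf, axis=0)              # first-order derivative in y
    ixx = np.gradient(ix, axis=1)               # second-order derivative in x
    iyy = np.gradient(iy, axis=0)               # second-order derivative in y
    grad = np.sqrt(ix ** 2 + iy ** 2)           # modulus of the gradient
    feats = np.stack([xx, yy, imgf, np.abs(ix), np.abs(iy),
                      np.abs(ixx), np.abs(iyy), grad], axis=-1).reshape(-1, 8)
    mu = feats.mean(axis=0)                     # mean feature vector
    diff = feats - mu
    cov = diff.T @ diff / (feats.shape[0] - 1)  # Equation (3)
    return cov + eps * np.eye(8)                # regularization keeps the matrix SPD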

In this paper, we adopt the Stein divergence as the Riemannian metric, and the SPD matrix manifold is mapped into the reproducing kernel Hilbert space (RKHS). The Stein divergence is defined as follows:

S(X, Y) = \log\det\!\left(\frac{X + Y}{2}\right) - \frac{1}{2}\log\det(XY), (4)

where X and Y are points on the SPD matrix manifold, and S(X, Y) measures the distance between these two points.

The Stein kernel is defined as follows:

k(X, Y) = \exp\{-\beta\, S(X, Y)\}. (5)

It is a positive definite kernel for certain choices of \beta [28]. With the Stein kernel, we can map the SPD manifold \mathcal{M} into the RKHS \mathcal{H}:

\phi : \mathcal{M} \rightarrow \mathcal{H}, \quad X \mapsto \phi(X). (6)

Specifically, k(X_i, X_j) = \langle \phi(X_i), \phi(X_j) \rangle_{\mathcal{H}}.
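For illustration, a minimal Python sketch of the Stein divergence and Stein kernel of Equations (4) and (5) is given below; the default value of beta is arbitrary, and only certain choices make the kernel positive definite [28].

import numpy as np

def stein_divergence(X, Y):
    """Stein (Jensen-Bregman LogDet) divergence between two SPD matrices, Equation (4)."""
    _, ld_mean = np.linalg.slogdet((X + Y) / 2.0)
    _, ld_x = np.linalg.slogdet(X)
    _, ld_y = np.linalg.slogdet(Y)
    return ld_mean - 0.5 * (ld_x + ld_y)

def stein_kernel(X, Y, beta=0.5):
    """k(X, Y) = exp(-beta * S(X, Y)), Equation (5); positive definite only for certain beta [28]."""
    return np.exp(-beta * stein_divergence(X, Y))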

3.2. Kernel Sparse Coding and Dictionary Learning

In this section, we give a framework for manifold kernel sparse coding and dictionary learning (MKSCDL), which is outlined in Algorithm 1. Let \{X_1, \ldots, X_m\} denote points on \mathcal{M} and \mathbb{D} = \{D_1, \ldots, D_N\} be a dictionary with N atoms. With the Stein kernel, we update the dictionary by iterating two steps: kernel sparse coding and kernel dictionary learning. The model is an optimization problem with a sparsity constraint:

\min_{\mathbb{D},\, \{y_i\}} \sum_{i=1}^{m} \left( \left\| \phi(X_i) - \sum_{j=1}^{N} y_{ij}\, \phi(D_j) \right\|_{\mathcal{H}}^{2} + \lambda \|y_i\|_{1} \right), (7)

where X_i is a sample from \{X_1, \ldots, X_m\}, y_i = [y_{i1}, \ldots, y_{iN}]^{T} represents its sparse coefficient, \lambda is a regularization parameter, and the first term denotes the reconstruction error.

Input: SPD matrices \{X_1, \ldots, X_m\} and atom number N of the dictionary
Output: updated dictionary \mathbb{D} = \{D_1, \ldots, D_N\}
Initialize the dictionary by k-means on Riemannian manifolds using the Fréchet mean [29].
while not converged do
 Kernel Sparse Coding
 while not converged do
  Compute the gradient \nabla f(y^{(t)}) with Equations (12)-(14)
  Update the sparse code with the shrinkage step of Equation (9)
 end
 Kernel Dictionary Learning
 for r = 1 to N do
  Fix the sparse codes and all atoms except D_r
  while not converged do
   Compute the Riemannian gradient with Equations (17) and (18)
   Update the atom D_r with Equation (19)
  end
 end
end
Return \mathbb{D}

Algorithm 1: Manifold kernel sparse coding and dictionary learning (MKSCDL).
3.2.1. Kernel Sparse Coding

When the dictionary is fixed, the sparse coding problem is formulated as follows:

\min_{y} \left\| \phi(X) - \sum_{j=1}^{N} y_j\, \phi(D_j) \right\|_{\mathcal{H}}^{2} + \lambda \|y\|_{1}. (8)

The iterative shrinkage-thresholding algorithm (ISTA) [30] is adopted to solve this optimization problem.

Let f(y) = \| \phi(X) - \sum_{j=1}^{N} y_j\, \phi(D_j) \|_{\mathcal{H}}^{2}; then the sparse vector is updated as follows:

y^{(t+1)} = S_{\lambda\delta}\!\left( y^{(t)} - \delta\, \nabla f\!\left(y^{(t)}\right) \right), (9)

where \delta is the step size, y^{(t)} represents the sparse coefficient at the t-th iteration, and the shrinkage operator S_{\tau}(\cdot) is defined as follows:

S_{\tau}(v) = \arg\min_{u} \frac{1}{2}\|u - v\|_{2}^{2} + \tau \|u\|_{1}. (10)

Equation (10) is equal to the componentwise soft-thresholding S_{\tau}(v)_i = \operatorname{sign}(v_i)\max(|v_i| - \tau, 0).

Now, the problem is transformed to calculating the gradient of f(y) with respect to y:

\nabla f(y) = \nabla_{y} \left\| \phi(X) - \sum_{j=1}^{N} y_j\, \phi(D_j) \right\|_{\mathcal{H}}^{2}. (11)

As k(X_i, X_j) = \langle \phi(X_i), \phi(X_j) \rangle_{\mathcal{H}}, the gradient of f(y) with respect to y is

\nabla f(y) = \nabla_{y}\left( -2 \sum_{j=1}^{N} y_j\, k(X, D_j) \right) + \nabla_{y}\left( \sum_{j=1}^{N}\sum_{l=1}^{N} y_j y_l\, k(D_j, D_l) \right). (12)

The first term of Equation (12) is

\nabla_{y}\left( -2 \sum_{j=1}^{N} y_j\, k(X, D_j) \right) = -2\, \mathbf{k}_{X}, \quad \mathbf{k}_{X} = \left[ k(X, D_1), \ldots, k(X, D_N) \right]^{T}. (13)

Similarly, the second term of Equation (12) is

\nabla_{y}\left( \sum_{j=1}^{N}\sum_{l=1}^{N} y_j y_l\, k(D_j, D_l) \right) = 2\, \mathbf{K}\, y, \quad [\mathbf{K}]_{jl} = k(D_j, D_l). (14)

As a result, the gradient is obtained by adding the right-hand sides of Equations (13) and (14), and the sparse code is computed with Equation (9).
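The following Python sketch illustrates this ISTA solution under the assumption that the kernel values k(X, D_j) and k(D_i, D_j) have already been computed with the Stein kernel; it is a schematic implementation, not the authors' code.

import numpy as np

def soft_threshold(v, tau):
    """Shrinkage operator of Equation (10) in its componentwise form."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def kernel_sparse_coding(k_XD, K_DD, lam=0.1, step=None, n_iter=200):
    """Minimize k(X,X) - 2 y.k_XD + y^T K_DD y + lam*||y||_1 over y (Equation (8))."""
    n_atoms = k_XD.shape[0]
    if step is None:
        # 1/L with L the Lipschitz constant of the smooth part (2 * largest eigenvalue of K_DD)
        step = 1.0 / (2.0 * np.linalg.norm(K_DD, 2) + 1e-12)
    y = np.zeros(n_atoms)
    for _ in range(n_iter):
        grad = -2.0 * k_XD + 2.0 * K_DD @ y                 # Equations (13) + (14)
        y = soft_threshold(y - step * grad, lam * step)     # update of Equation (9)
    return y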

3.2.2. Kernel Dictionary Learning

First, the initial dictionary is obtained by k-means on Riemannian manifolds using the Fréchet mean [29]. It randomly selects k points from the training data as the initial cluster centers. Then, with distances measured by the Stein divergence, every point is allocated to its closest cluster center, and each cluster center is recomputed iteratively as the Fréchet mean via the following equation:

C^{*} = \arg\min_{C} \sum_{X_i \in \mathcal{X}} S(C, X_i), (15)

where \mathcal{X} is the set of SPD matrices assigned to the cluster and C^{*} is the updated cluster center. Ultimately, the k cluster centers form the initial codebook \mathbb{D}^{(0)}.
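As an illustration of this initialization, the sketch below runs k-means with Stein-divergence assignments and uses a common fixed-point iteration as a stand-in for the Fréchet mean of Equation (15); it is a rough sketch rather than the authors' implementation.

import numpy as np

def stein_div(X, Y):
    _, ld_m = np.linalg.slogdet((X + Y) / 2.0)
    _, ld_x = np.linalg.slogdet(X)
    _, ld_y = np.linalg.slogdet(Y)
    return ld_m - 0.5 * (ld_x + ld_y)

def stein_mean(mats, n_iter=20):
    """Fixed-point iteration for the Frechet mean under the Stein divergence (an assumed scheme)."""
    C = sum(mats) / len(mats)                   # start from the arithmetic mean
    for _ in range(n_iter):
        C = np.linalg.inv(sum(np.linalg.inv((X + C) / 2.0) for X in mats) / len(mats))
    return C

def riemannian_kmeans(samples, k, n_iter=10, seed=0):
    rng = np.random.default_rng(seed)
    centers = [samples[i] for i in rng.choice(len(samples), size=k, replace=False)]
    for _ in range(n_iter):
        # assign every SPD matrix to its closest center under the Stein divergence
        labels = [int(np.argmin([stein_div(X, C) for C in centers])) for X in samples]
        # recompute each center as the (approximate) Frechet mean of its cluster
        centers = [stein_mean([X for X, l in zip(samples, labels) if l == c] or [centers[c]])
                   for c in range(k)]
    return centers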

When the sparse coefficients are fixed, the dictionary is updated atom by atom, and the dictionary learning problem is formulated as follows:

\min_{\mathbb{D}} J(\mathbb{D}) = \sum_{i=1}^{m} \left\| \phi(X_i) - \sum_{j=1}^{N} y_{ij}\, \phi(D_j) \right\|_{\mathcal{H}}^{2}, (16)

where y_{ij} is the j-th sparse coefficient of the i-th sample.

We use ISTA [30] to update the dictionary as well. The Euclidean gradient of J with respect to the r-th atom D_r is

\nabla^{E}_{D_r} J = \sum_{i=1}^{m} \left( -2\, y_{ir}\, \frac{\partial k(X_i, D_r)}{\partial D_r} + 2\, y_{ir} \sum_{l \neq r} y_{il}\, \frac{\partial k(D_l, D_r)}{\partial D_r} \right), (17)

where, for the Stein kernel, \partial k(X, D)/\partial D = \frac{\beta}{2}\, k(X, D) \left( D^{-1} - \left( \frac{X + D}{2} \right)^{-1} \right).

As proved in [22], the corresponding Riemannian gradient on the SPD manifold is

\nabla^{R}_{D_r} J = D_r\, \mathrm{sym}\!\left( \nabla^{E}_{D_r} J \right) D_r, (18)

where \mathrm{sym}(A) = (A + A^{T})/2.

The r-th atom is updated as follows:

D_r^{(t+1)} = D_r^{(t)} - \eta\, \nabla^{R}_{D_r} J\!\left( D_r^{(t)} \right), (19)

where \eta is the step size and D_r^{(t)} represents the r-th atom at the t-th iteration.
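A hedged sketch of one such atom update is given below; euclidean_grad stands for the Euclidean gradient of Equation (17) supplied by the caller, and the final eigenvalue clipping is an added safeguard not taken from the paper.

import numpy as np

def riemannian_atom_step(D_r, euclidean_grad, eta=0.01, eps=1e-8):
    """One descent step D_r <- D_r - eta * D_r sym(grad_E) D_r (Equations (18) and (19))."""
    sym_grad = 0.5 * (euclidean_grad + euclidean_grad.T)   # symmetrize, Equation (18)
    riem_grad = D_r @ sym_grad @ D_r
    D_new = D_r - eta * riem_grad                          # gradient step, Equation (19)
    # clip eigenvalues so the updated atom stays on the SPD manifold (pragmatic safeguard)
    w, V = np.linalg.eigh(0.5 * (D_new + D_new.T))
    return V @ np.diag(np.maximum(w, eps)) @ V.T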

3.3. Classification

After the dictionaries are learned from the training set, the sparse coefficients and the reconstruction errors (RE) of the testing samples are computed to predict the cloud type. Algorithm 2 details the classification procedure.

Input: testing sample X and the learned class dictionaries \{\mathbb{D}_1, \ldots, \mathbb{D}_L\}
Output: predicted cloud type
for c = 1 to L do
 Kernel Sparse Coding: compute the sparse coefficient y_c of X on \mathbb{D}_c with Equation (20)
 Computing Reconstruction Error: \mathrm{RE}_c = \| \phi(X) - \sum_{j=1}^{N} y_{c,j}\, \phi(D_{c,j}) \|_{\mathcal{H}}^{2}
end
Return the cloud type \hat{c} = \arg\min_{c} \mathrm{RE}_c

Algorithm 2: Classification based on the learned dictionaries.

Since there are L types of samples, L class-specific dictionaries are learned, each updated independently as detailed in Section 3.2.2.

The sparse coefficient of the testing sample on the c-th dictionary is computed as follows:

y_c = \arg\min_{y} \left\| \phi(X) - \sum_{j=1}^{N} y_j\, \phi(D_{c,j}) \right\|_{\mathcal{H}}^{2} + \lambda \|y\|_{1}. (20)

With the sparse coefficients on each class-specific dictionary, the smallest reconstruction error indicates the class to which the testing sample belongs.

In this paper, both datasets contain 5 cloud classes, so the reconstruction errors on the 5 class-specific dictionaries are compared to decide the cloud type.
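For illustration, the classification rule can be sketched as follows, reusing the stein_kernel and kernel_sparse_coding helpers sketched in the earlier sections; class_dicts is an assumed structure holding the list of SPD atoms learned for each class.

import numpy as np

def predict_class(X, class_dicts, beta=0.5, lam=0.1):
    errors = []
    for atoms in class_dicts:
        k_XD = np.array([stein_kernel(X, D, beta) for D in atoms])                    # k(X, D_j)
        K_DD = np.array([[stein_kernel(Di, Dj, beta) for Dj in atoms] for Di in atoms])
        y = kernel_sparse_coding(k_XD, K_DD, lam=lam)      # sparse code on this class dictionary
        # reconstruction error in the RKHS: k(X,X) - 2 y.k_XD + y^T K_DD y
        re = stein_kernel(X, X, beta) - 2.0 * y @ k_XD + y @ K_DD @ y
        errors.append(re)
    return int(np.argmin(errors))                          # the smallest RE decides the class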

4. Results and Discussions

In this section, the performance of MKSCDL is evaluated against the baselines [11, 12, 27] with the same experimental setup on two different image datasets: the zenithal dataset, captured by WSIRCMS, and the SWIMCAT dataset, gathered by WAHRSIS. Each experiment is repeated three times with 10-fold cross validation, and the average values are reported as the final results.
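The evaluation protocol can be sketched as below; the use of stratified folds and the train_and_test callback are assumptions for illustration, not details taken from the paper.

import numpy as np
from sklearn.model_selection import StratifiedKFold

def repeated_cv_accuracy(labels, train_and_test, n_repeats=3, n_folds=10):
    """train_and_test(train_idx, test_idx) should return the accuracy of one fold."""
    labels = np.asarray(labels)
    accs = []
    for rep in range(n_repeats):
        skf = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=rep)
        for train_idx, test_idx in skf.split(np.zeros((len(labels), 1)), labels):
            accs.append(train_and_test(train_idx, test_idx))
    return float(np.mean(accs))   # average over all folds and repetitions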

Note that the feature mapping \varphi is defined separately for the two datasets owing to the different nature of grayscale and colour images. For the grayscale images in the zenithal dataset, \varphi is defined as in Equation (2), while for the RGB images in the SWIMCAT dataset the three channel intensities are incorporated, where R, G, and B represent the intensity images in the three channels, respectively. The remaining parameters, such as the regularization parameter \lambda and the kernel parameter \beta, are set empirically.

4.1. Results of the Zenithal Dataset
4.1.1. Performance of MKSCDL

The first experiment is carried out on the zenithal dataset. With different choices of the atom number N, the performance of MKSCDL varies correspondingly. Figure 4 shows the overall accuracy of the updated dictionary with different sizes. When the dictionary size equals 14, the overall accuracy reaches 96.33%, outperforming the other settings.

Figure 5 reports the confusion matrix when N = 14. The element in row i and column j of the confusion matrix is the percentage of the i-th class recognized as the j-th class; thus, the diagonal elements correspond to the recognition rates of all categories. All of the stratiform cloud images and 99.3% of the clear-sky images are identified correctly, which means that these two types possess prominent characteristics for discrimination. Likewise, the recognition rates of cumuliform, waveform, and cirriform clouds all exceed 93%. On the whole, MKSCDL identifies the ground-based grayscale cloud images with rather high accuracy.

4.1.2. Performance Comparison with the Baselines

To assess the effectiveness of MKSCDL further, WLBP [11] and BC [12] are applied for comparison:
(i) WLBP [11] fuses the variance of a local patch into LBP; the KNN classifier based on the chi-square distance is employed for cloud classification.
(ii) BC [12] integrates statistical and local texture features and adopts the Bayesian classifier with regularized discriminant analysis. Note that the statistical features have only 8 dimensions because the infrared images lack colour information.

Table 1 presents the comparison results of 10-fold cross validation. The performance of MKSCDL exceeds that of the other two methods, especially for cumuliform, waveform, and cirriform clouds, and MKSCDL achieves the highest overall accuracy of 96.33%. This indicates that the dictionaries are learned well and each sample is described adequately by the dictionary of its own class rather than by those of other classes, which contributes to the competitive performance.

4.2. Results of the SWIMCAT Dataset
4.2.1. Performance of MKSCDL

The second experiment is conducted on the SWIMCAT dataset. Similar to the first experiment, Figure 6 exhibits the overall accuracy of the learned dictionary with different sizes. As the dictionary size increases, the overall accuracy improves in general. When N is 20, MKSCDL performs best with an overall accuracy of 98.34%. Considering both the classification performance and the computational cost of MKSCDL, N = 20 satisfies the experimental requirement.

Figure 7 demonstrates the confusion matrix when N = 20. Patterned clouds and thick white clouds possess obvious characteristics for discrimination, with an accuracy of 100%. Likewise, the results for the clear sky and thick dark clouds exceed 98%. In addition, the challenging veil clouds, which resemble clear sky [27], attain a decent result of 92.94%. The misclassification rate for each class is rather low, which shows that, on the whole, the proposed method works well in categorizing the ground-based RGB images.

4.2.2. Performance Comparison with the Baselines

Besides the two baselines mentioned in Section 4.1.2, the texton-based method [27], which integrates both colour and texture features, is used for comparison as well. Note that, different from the grayscale images, the statistical features extracted from the colour images in the BC method have 12 dimensions.

Table 2 presents the comparison results of 10-fold cross validation. MKSCDL performs better than WLBP and BC overall, showing that it is highly effective for cloud categorization on the SWIMCAT dataset.

To compare with the texton-based approach directly, we also implement the experiment with the same configuration as in [27], which randomly chooses 40 images per category for training and another 45 for testing (40/45). Table 3 lists the results. Compared with Table 2, the overall performance of the different methods changes little. The texton-based method achieves perfect classification accuracy for all categories except veil clouds, whose accuracy remains to be improved, whereas MKSCDL attains a fairly high recognition rate for every cloud type.

Overall, in comparison with the other three methods, MKSCDL is validated to be the most effective in recognizing ground-based colour images.

5. Conclusions

In this paper, a novel cloud classification method named “MKSCDL” on manifolds is proposed. The SPD matrix is extracted from each image and acts as the image feature. To preserve the non-Euclidean geometric characteristics of the SPD matrix, kernel sparse coding and dictionary learning are conducted to obtain a representative dictionary. The testing sample’s reconstruction errors on the class-specific dictionaries are calculated to identify the cloud type. Comparing recent methods on the grayscale and colour datasets, it is interesting to find that WLBP performs better on grayscale images, while BC performs better on colour images. Comparatively speaking, MKSCDL is suitable for both grayscale and colour images and shows strong capability for automatic ground-based cloud classification.

The proposed MKSCDL method can be applied in practice, with the results of visual observation adopted as the reference for evaluating the automatic classification. With limited images, it may not produce perfect performance in cloud-type recognition; as the dataset becomes more representative and adequate, it is expected to work better and satisfy the task of cloud classification well. In the future, multiple improvements should be considered for better automatic cloud-type classification. For feature extraction, more features such as the gray-level cooccurrence matrix and Gabor filtering could be fused into the feature mapping to describe the image better. In terms of dictionary learning, the interclass difference can be taken into consideration to obtain a more discriminative dictionary. Moreover, the samples’ sparse coefficients can be applied to build an SVM model. In addition, complex sky conditions containing multiple cloud categories deserve attention in future work.

Data Availability

The SWIMCAT dataset is linked to http://vintage.winklerbros.net/swimcat.html, and the zenithal dataset used to support the findings of this study is available from the first author ([email protected]) upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The authors would like to thank Prof. Yee Hui Lee for providing the SWIMCAT dataset and Dr. Lei Liu for polishing the manuscript. This work was supported by the National Natural Science Foundation of China (Grant nos. 61473310, 41174164, and 41575024).