Abstract

In recent years, dictionary learning has received increasing attention in the study of face recognition. However, most dictionary learning algorithms directly use the original training samples to learn the dictionary, ignoring the noise present in the training samples. For example, there are differences between different images of the same subject due to changes in illumination, expression, etc. To address this problem, this paper proposes a dictionary relearning algorithm (DRLA) based on locality constraint and label embedding, which can effectively reduce the influence of noise on dictionary learning. In our proposed algorithm, first, the initial dictionary and coding coefficient matrix are obtained directly from the training samples; then, the original training samples are reconstructed as the product of the initial dictionary and the coding coefficient matrix; finally, the dictionary learning algorithm is applied again to obtain a new dictionary and coding coefficient matrix, which are used for the subsequent image classification. The reconstruction can partially eliminate the noise in the original training samples, so the proposed algorithm obtains more robust classification results. The experimental results demonstrate that the proposed algorithm achieves higher recognition accuracy than several state-of-the-art algorithms.

1. Introduction

In recent years, dictionary learning has been widely applied in various fields due to its excellent performance, such as face recognition [1–3], image denoising [4, 5] and deblurring [6, 7], image segmentation [8, 9], and image recognition [10]. For face recognition [11, 12], the conventional dictionary learning method first learns a dictionary from the training samples. Then, given a test image, the image is represented by the atoms of the dictionary. Finally, the image is classified according to the result of the representation.

According to previous studies, using a dictionary learned from the training samples to represent and classify test samples leads to higher accuracy than directly using the training samples themselves [13–15]. Dictionary learning has achieved such significant performance in face recognition applications that researchers have proposed various dictionary learning methods. Luo et al. proposed a multiresolution dictionary learning method [16], which mainly resolves the problem that conventional dictionary learning only focuses on a single resolution; the method can effectively reduce the influence of noise, has better robustness, and also greatly improves face recognition performance. Because multiview-like multiple vector representations can provide supplementary information about the represented object, a robust dictionary learning method based on them was proposed in [17]. Wang et al. used the low-rank structure of the training data to construct a dictionary and proposed a new dictionary learning method, namely, Discriminative and Common hybrid Dictionary Learning (DCDL) [18]. The method overcomes two limiting assumptions of Sparse Representation-based Classification (SRC): one is that the training samples must not be severely corrupted, and the other is that each subject must have enough training samples. Foroughi et al. proposed a multimodal structured low-rank dictionary learning method [19], which also achieves a good recognition rate and robustness under severe illumination changes and occlusions. In addition, sparse representation has achieved significant performance in face recognition. For example, Xu et al. proposed an improved collaborative representation [20]; the novel image representation and score fusion scheme is effective for classifying deformable objects such as face images. Qian proposed a method for merging virtual images and training samples [21]. Tang et al. proposed a novel algorithm, the Distance Weighted Linear Regression Classifier (DWLRC) [22], which solves the problem that current algorithms cannot fully utilize sample information.

To improve the discriminative ability of the dictionary, researchers have proposed various constraint models, the most important of which are the local structure constraint model and the label constraint model [23]. In the field of sparse coding and dictionary learning, exploiting the local structure of data is considered an important way to improve performance. In fact, the locality of the data can lead to sparsity, but not vice versa [24]. Therefore, many dictionary learning methods add locality constraints. For example, to maintain the structural characteristics of the training samples, Zheng et al. [25] proposed the GraphSC algorithm, which generates a Laplacian graph of the training samples and uses it to design the discriminative term of the dictionary. Haghiri et al. [26] also proposed a discriminative dictionary learning method that preserves the local structure of the training samples. Zhang et al. [27] proposed the Locality-Constrained Projective Dictionary Learning (LC-PDL) algorithm, which adds a locality constraint on the atoms to maintain local information. Similarly, the label constraint also plays an important role in dictionary-based classification, and more and more dictionary learning algorithms improve performance by adding label constraints. For example, Shrivastava et al. [28] proposed a nonlinear discriminative dictionary learning method that can learn a dictionary from both labeled and unlabeled data. The D-KSVD [29] algorithm adds the class labels of the training samples to the discriminative term, which ensures that the coding coefficients corresponding to training samples of the same class are similar; the algorithm greatly improves the discriminative ability of the dictionary. At the same time, the class labels of the atoms have also received attention from researchers. The LC-KSVD [30] algorithm associates each atom with label information and introduces a new label consistency constraint. Considering the importance of both the local structure and the label constraint, Li et al. [31] proposed the Locality-Constrained and Label Embedding Dictionary Learning (LCLE-DL) algorithm.

At present, most dictionary learning algorithms use the original training samples to generate a dictionary directly. However, the original training samples are generally collected under different angles, illuminations, and facial expressions. In other words, there are differences between different face images of the same subject, and we can regard these differences as noise in the original images, which may reduce the discriminative ability of the dictionary. Indeed, noise almost always exists [32]. Therefore, this paper proposes the dictionary relearning algorithm (DRLA) based on the Locality-Constrained and Label Embedding Dictionary Learning (LCLE-DL) algorithm, which can partially eliminate the noise in the original images. The dictionary reconstruction method is as follows: first, the initial dictionary and coding coefficient matrix are obtained; then, the product of the initial dictionary and the coding coefficient matrix replaces the original training sample matrix; finally, the dictionary learning algorithm is applied to the updated training sample matrix to obtain the relearned dictionary and coding coefficient matrix, which are used for subsequent image classification. We explain the proposed method in detail in the following sections. The results of face recognition experiments show that the proposed method is feasible and that it can reduce the influence of noise in the images and improve the accuracy of face recognition.

The rest of the paper is organized as follows. Section 2 introduces the work related to the proposed algorithm. Section 3 explains the complete steps of the dictionary relearning algorithm (DRLA). Section 4 presents and analyzes the experimental results on several face databases. Section 5 provides the conclusions.

2. Related Work

In this section, we introduce the work related to the dictionary relearning algorithm proposed in this paper. It mainly explains how to construct the locality constraint and the label embedding constraint. Besides, we introduce a discriminant dictionary model with both the atomic locality constraint and the label embedding constraint [31]. We also define some important symbols. We represent the training samples as Y = [y_1, y_2, …, y_N] ∈ R^(n×N), where C represents the number of classes of training samples, N is the number of training samples, and the dimension of each training sample is n. The dictionary is represented as D = [d_1, d_2, …, d_K] ∈ R^(n×K), where K is the number of atoms. The coding coefficient matrix is recorded as X = [x_1, x_2, …, x_N] ∈ R^(K×N), where x_i represents the coding coefficient corresponding to the training sample y_i, and Y ≈ DX. We refer to the transpose of the coefficient matrix as the profiles matrix, denoted as Q = X^T; that is, each row of the coding coefficient matrix represents a profile [33].

2.1. Locality Constraint of Atoms

Previous research results [27, 30, 31] show that the locality constraint of atoms can effectively improve the discriminative ability of the dictionary and increase its robustness. There is a similarity relationship between atoms and profiles: if two atoms are similar, then their corresponding profiles are similar, and vice versa. Moreover, all elements in the same profile correspond to the same atom. If the values of the i-th and j-th elements of the coding coefficient vector x_m are not zero, then the training sample y_m is reconstructed by the atoms d_i and d_j, and other atoms do not participate in the process. If the profile q_j is a nonzero vector whose i-th and m-th elements are not zero, this means that the training samples y_i and y_m are both represented by the atom d_j. To maintain the local structural characteristics of the atoms and inherit the structural features of the training samples, the paper [31] designed a discriminant dictionary model based on the locality constraint of atoms. Since a graph can effectively represent the similarity relationships between atoms and profiles, a nearest-neighbor graph G can be constructed from the atoms. We can calculate the weight matrix W of graph G by equation (1) [34]:

w_ij = exp(−‖d_i − d_j‖² / σ), if d_j ∈ N_k(d_i) or d_i ∈ N_k(d_j); w_ij = 0, otherwise, (1)

where w_ij refers to the weight between the atoms d_i and d_j, which represents the degree of similarity between the two atoms, N_k(d_i) represents the k atoms closest to the atom d_i, σ is the kernel width, and k is a parameter that is equal to 4 in all experiments. Through the previously constructed neighbor graph G, we can generate a Laplacian graph L. The equation for constructing the Laplacian graph is as follows:

L = E − W, (2)

where E is a diagonal matrix, denoted as E = diag(e_1, e_2, …, e_K), and e_i = Σ_j w_ij.
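To make the graph construction concrete, the following is a minimal NumPy sketch of equations (1) and (2). The heat-kernel width sigma and the symmetrization of the neighborhood relation are our assumptions; the text above fixes only the neighborhood size k = 4.

```python
import numpy as np

def atom_graph_laplacian(D, k=4, sigma=1.0):
    """Weight matrix W of the atom graph (equation (1)) and the Laplacian
    L = E - W (equation (2)).

    D     : (n, K) dictionary with one atom per column.
    k     : number of nearest neighbors (4 in all experiments).
    sigma : heat-kernel width (an assumed value, not given in the text).
    """
    K = D.shape[1]
    # Pairwise squared Euclidean distances between atoms.
    sq = np.sum(D ** 2, axis=0)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (D.T @ D), 0.0)
    np.fill_diagonal(dist2, np.inf)        # an atom is not its own neighbor

    W = np.zeros((K, K))
    for i in range(K):
        idx = np.argsort(dist2[i])[:k]     # the k atoms closest to d_i
        W[i, idx] = np.exp(-dist2[i, idx] / sigma)
    W = np.maximum(W, W.T)                 # edge kept if either atom is a neighbor of the other

    E = np.diag(W.sum(axis=1))             # diagonal degree matrix, e_i = sum_j w_ij
    return W, E - W                        # L = E - W
```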

As the Laplacian graph is generated from the dictionary, it has better robustness and can better reflect the similarity of atoms [31]. According to the similarity relationship between atoms and profiles, the profiles matrix and the Laplacian graph can be used to construct the discriminative term of the dictionary, which can effectively maintain the local structure of atoms. The construction method of the discriminant dictionary model based on the locality constraint of atoms is as follows [31]:

min_{D,X} ‖Y − DX‖_F² + α tr(X^T L X), (3)

where α is a balance parameter and the locality term satisfies tr(X^T L X) = (1/2) Σ_{i,j} w_ij ‖q_i − q_j‖², so the profiles corresponding to similar atoms are forced to be close to each other.

2.2. Label Embedding of Atoms

The label constraint of atoms can improve the discriminative ability of the dictionary, which has been proved in previous studies [30, 31]. The atoms in the dictionary play different roles when reconstructing the training samples. If some atoms reconstruct only one class of the training samples, then these atoms can be considered to belong to that class [35]. The K-SVD algorithm can generate a class-specific dictionary for each class of the training samples. Therefore, applying K-SVD to the training samples can generate a dictionary containing the atoms of the C classes, denoted as D = [D_1, D_2, …, D_C]. If the atom d_i belongs to D_c, then the atom d_i can be assigned a class label vector r_i, where r_i is a C-dimensional row vector whose c-th element is one and whose other elements are zero. Finally, we can get the class label vector of each atom and construct the label matrix of dictionary D, denoted as R = [r_1^T, r_2^T, …, r_K^T] ∈ R^(C×K), where K is the number of atoms in the dictionary.

After obtaining the label matrix R, the weighted label matrix H of dictionary D can be obtained according to the following equation [36]:

H = (R R^T)^(−1/2) R, (4)

where R R^T = diag(K_1, K_2, …, K_C) and K_c is the number of atoms belonging to the c-th class, so the label vector of each atom is scaled by 1/√K_c.

According to [31], the profiles matrix and the weighted label matrix are used to construct the label embedding constraint model of the atoms as follows:

min_{D,V} ‖Y − DV‖_F² + β ‖V − U‖_F², with U = H^T T, (5)

where V is a coding coefficient matrix, T ∈ R^(C×N) is the label matrix of the training samples, and U is called the scaled label matrix. From equation (5), we can see that U ∈ R^(K×N). Furthermore, when the atoms and the training samples are arranged by class, U is a block diagonal matrix, which causes matrix V to have a similar structure. This structure increases the discrimination of the coding coefficients. Hence, the label embedding constraint of atoms promotes the similarity of the profiles corresponding to atoms of the same class.
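As a sketch, the label matrices of this subsection can be assembled as follows in NumPy. The closed forms H = (R R^T)^(−1/2) R and U = H^T T follow the reconstruction above and should be read as our interpretation rather than the definitive construction; we also assume that every class contributes at least one atom.

```python
import numpy as np

def label_matrices(atom_labels, sample_labels, C):
    """Label matrix R, weighted label matrix H (equation (4)), and scaled
    label matrix U (equation (5), read as U = H^T T).

    atom_labels   : length-K integer array with the class (0..C-1) of each atom.
    sample_labels : length-N integer array with the class of each training sample.
    """
    atom_labels = np.asarray(atom_labels)
    sample_labels = np.asarray(sample_labels)
    K, N = atom_labels.size, sample_labels.size

    R = np.zeros((C, K))
    R[atom_labels, np.arange(K)] = 1.0     # one-hot label per atom (columns)
    T = np.zeros((C, N))
    T[sample_labels, np.arange(N)] = 1.0   # one-hot label per training sample

    # (R R^T)^(-1/2) is diagonal with entries 1/sqrt(K_c), where K_c is the
    # number of atoms of class c (assumed nonzero for every class).
    H = R / np.sqrt(R.sum(axis=1, keepdims=True))

    U = H.T @ T  # (K, N); block diagonal when atoms and samples are sorted by class
    return R, T, H, U
```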

Through the above locality and label embedding constraints, a discriminant dictionary model with both constraints can be constructed as follows:

min_{D,X,V} ‖Y − DX‖_F² + α tr(X^T L X) + ‖Y − DV‖_F² + β ‖V − U‖_F² + γ ‖X − V‖_F², (6)

where α, β, and γ are parameters. The second and fourth terms correspond to the locality constraint and the label embedding constraint of atoms, respectively. The fifth term ensures that the coding coefficient matrices X and V are as close as possible so that the structural features and the discriminating information can be transferred to each other.
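For reference, the value of the reconstructed objective (6) can be computed directly; the sketch below simply adds up the five terms and assumes the shapes defined in Section 2 (Y: n×N, D: n×K, X and V: K×N, U: K×N, L: K×K).

```python
import numpy as np

def lcle_objective(Y, D, X, V, U, L, alpha, beta, gamma):
    """Value of the discriminant model in equation (6)."""
    fit_x = np.linalg.norm(Y - D @ X, 'fro') ** 2        # first term
    locality = alpha * np.sum(X * (L @ X))               # alpha * tr(X^T L X)
    fit_v = np.linalg.norm(Y - D @ V, 'fro') ** 2        # third term
    label = beta * np.linalg.norm(V - U, 'fro') ** 2     # fourth term
    close = gamma * np.linalg.norm(X - V, 'fro') ** 2    # fifth term
    return fit_x + locality + fit_v + label + close
```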

3. The Dictionary Relearning Algorithm

In this section, we will explain in detail the proposed method, i.e., the dictionary relearning algorithm (DRLA).

The specific steps of the algorithm are as follows:

Step 1: the K-SVD algorithm is exploited to obtain the initial dictionary D and the coding coefficient matrix X from the training samples Y. In particular, we apply the K-SVD algorithm to the training samples of each class to generate a class-specific dictionary D_c. Therefore, dictionary D is defined as D = [D_1, D_2, …, D_C] ∈ R^(n×K), where C is the number of classes and K is the number of atoms. Meanwhile, the coding coefficient matrix obtained by the K-SVD algorithm can be expressed as X = [x_1, x_2, …, x_N] ∈ R^(K×N), where N is the number of training samples.

Step 2: we record the label matrix of the training samples as T = [t_1, t_2, …, t_N] ∈ R^(C×N), where t_i is a C-dimensional column vector and N is the number of training samples. t_i represents the class label vector of training sample y_i: if y_i is a member of the training sample set of the c-th class, then only the value of the c-th element of t_i is one, and the values of the other elements are zero. When using the K-SVD algorithm to generate a class-specific dictionary for each class of the training samples, we construct the class labels of the atoms of this class dictionary through the matrix T. Finally, we can get the label matrix R of dictionary D, whose structure is described in Section 2.2.

Step 3: we construct the weighted label matrix H of the dictionary D according to the label matrix R. The construction formula is equation (4), that is, H = (R R^T)^(−1/2) R. At the same time, the scaled label matrix U is obtained, and the calculation method is U = H^T T.

Step 4: we construct the Laplacian graph L from the initialized dictionary D. The construction methods are equations (1) and (2).

Step 5: in this step, we explain how to update the dictionary D and the coding coefficient matrices X and V. According to the objective function (6) with the locality constraint and the label embedding constraint, we first regard only the coding coefficient matrix V as a variable, and the other variables in the equation are regarded as constants. Therefore, the formula for calculating V can be solved as follows:

V = (D^T D + (β + γ) I)^(−1) (D^T Y + β U + γ X). (7)

Here, I ∈ R^(K×K) is an identity matrix. Similarly, we can get the update formulas of the dictionary D and the coding coefficient matrix X as follows:

D = Y (X + V)^T (X X^T + V V^T)^(−1), (8)

X = (D^T D + α L + γ I)^(−1) (D^T Y + γ V). (9)

After obtaining the updated dictionary D, the Laplacian graph L is updated by equations (1) and (2). This step is performed repeatedly until the termination condition is satisfied.

Step 6 (reconstruction of the training samples): the reconstruction method is implemented as follows:

Z = D X, (10)

where Z is the reconstructed training sample matrix.

Step 7: using the reconstructed training sample matrix Z in place of Y, we reobtain the dictionary D and the coding coefficient matrices X and V by equations (7)–(9), respectively (see the sketch after this list).

Step 8: the classification parameter B is calculated, and the test samples are classified. We use the label matrix T of the training samples together with the coding coefficient matrix X reobtained above to calculate the classification parameter B. The calculation equation is as follows:

B = T X^T (X X^T + λ I)^(−1), (11)

where λ is a small positive regularization constant.
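Below is a minimal NumPy sketch of the update rules (7)–(9) and of the relearning pass on Z = DX (Steps 5–7). The fixed inner iteration count, the small ridge added before the matrix inversion in equation (8), and the exact schedule for refreshing the Laplacian are our assumptions; atom_graph_laplacian is the graph sketch from Section 2.1.

```python
import numpy as np

def update_V(Y, D, X, U, beta, gamma):
    """Equation (7): V = (D^T D + (beta + gamma) I)^(-1) (D^T Y + beta U + gamma X)."""
    K = D.shape[1]
    return np.linalg.solve(D.T @ D + (beta + gamma) * np.eye(K),
                           D.T @ Y + beta * U + gamma * X)

def update_D(Y, X, V, ridge=1e-6):
    """Equation (8): D = Y (X + V)^T (X X^T + V V^T)^(-1); the small ridge
    keeps the inverted matrix well conditioned (our addition)."""
    K = X.shape[0]
    G = X @ X.T + V @ V.T + ridge * np.eye(K)
    return Y @ (X + V).T @ np.linalg.inv(G)

def update_X(Y, D, V, L, alpha, gamma):
    """Equation (9): X = (D^T D + alpha L + gamma I)^(-1) (D^T Y + gamma V)."""
    K = D.shape[1]
    return np.linalg.solve(D.T @ D + alpha * L + gamma * np.eye(K),
                           D.T @ Y + gamma * V)

def drla_relearn(Y, D, X, U, alpha, beta, gamma, iters=10):
    """Steps 5-7: learn on Y, rebuild Z = D X (Step 6), then relearn on Z."""
    for stage in range(2):                   # stage 0 uses Y, stage 1 uses Z = D X
        Ytrain = Y if stage == 0 else D @ X  # Step 6: reconstructed samples
        for _ in range(iters):
            L = atom_graph_laplacian(D)[1]   # refresh L from the current atoms
            V = update_V(Ytrain, D, X, U, beta, gamma)
            D = update_D(Ytrain, X, V)
            X = update_X(Ytrain, D, V, L, alpha, gamma)
    return D, X, V
```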

For each test sample y, we use the OMP algorithm [37] to obtain its corresponding coding coefficient, denoted as x̂. The classification result vector F is calculated by the following equation:

F = B x̂. (12)

Suppose that F = [f_1, f_2, …, f_C]^T. If f_c is the maximum value of F, then the test sample is classified into the c-th class. The algorithm only needs to iterate from Step 1 to Step 8 once to obtain the classification result.
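Finally, Step 8 and the decision rule (equations (11) and (12)) can be sketched as follows. The ridge constant lam and the OMP sparsity level n_nonzero are assumed values, and scikit-learn's orthogonal_mp stands in for the OMP solver of [37].

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

def classification_parameter(T, X, lam=1e-3):
    """Equation (11): B = T X^T (X X^T + lam I)^(-1), computed from the
    training label matrix T and the relearned coefficient matrix X."""
    K = X.shape[0]
    return T @ X.T @ np.linalg.inv(X @ X.T + lam * np.eye(K))

def classify(y, D, B, n_nonzero=30):
    """Equation (12): code the test sample y with OMP, then pick the class
    with the largest entry of F = B x_hat."""
    x_hat = orthogonal_mp(D, y, n_nonzero_coefs=n_nonzero)  # coding coefficient of y
    F = B @ x_hat
    return int(np.argmax(F))                                # index of the winning class
```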

The scheme presented above for reconstructing the training samples is inspired by the algorithm proposed in [38]. However, the algorithm in [38] was proposed for rerepresenting a test sample by virtue of all training samples, whereas our scheme is devised for reconstructing the training samples. In other words, the algorithm proposed in [38] uses the rerepresentation of a test sample for sparse representation, while the reconstruction of the training samples in our algorithm is used for dictionary learning.

4. Experiments and Results

In this section, we conduct experiments on several widely used face databases: the AR face database (AR) [39], the Extended Yale B face database [40], the Labeled Faces in the Wild database (LFW) [41], and the CMU PIE face database (PIE) [42]. To better reflect the advantage of dictionary reconstruction, we compare the proposed algorithm with other excellent algorithms, including classical dictionary learning and sparse representation algorithms. In all experiments, the proposed algorithm is compared with seven algorithms: LCLE-DL [31], LC-KSVD2 [30], D-KSVD [29], K-SVD [43], LLC [44], the Sparse Representation-based Classification algorithm (SRC) [45], and the Linear Regression Classification algorithm (LRC) [46]. The following subsections present the experimental results and analysis on these face databases.

In our experiments, all algorithms were executed ten times, and then the average recognition rate and standard deviation were calculated. In the result tables, the number in parentheses following an algorithm indicates the number of atoms, and the value following the symbol ± is the standard deviation of the average recognition rate.

4.1. Experiments on the AR Face Database

The AR face database contains more than 4,000 color images of 126 people. The images were obtained under different angles, lighting conditions, and facial expressions. Besides, the images of each subject are divided into occluded and unoccluded ones, that is, images in which the subject does or does not wear glasses or a scarf. In this experiment, we selected a subset of the AR face database containing 3,120 images of 120 subjects, with 26 images per subject. Figure 1 shows examples of face images of a subject in the AR database.

In the experiment, for each subject, we randomly selected eight images (including the first five images) as training samples, and the remaining images of each subject were used as test samples. The parameters α, β, and γ of the algorithm were set to fixed values. The average recognition rates of the different algorithms are shown in Table 1 for the cases where the number of atoms is 840 and 960.

From Table 1, we can see that DRLA achieves the best classification performance when the number of atoms is 960 or 840. For example, when the number of atoms is 960, the average recognition rate of DRLA is 83.1%, which is 3.7% and 4.3% higher than those of LCLE-DL and K-SVD, respectively. Similarly, DRLA is superior to the other algorithms in terms of average recognition rate. When the number of atoms is 840, the average recognition rate of DRLA is 82.6%, whereas the average recognition rates of the LCLE-DL, LC-KSVD2, D-KSVD, and K-SVD algorithms are 79.3%, 78.5%, 69.3%, and 25.5%, respectively. It can be seen that DRLA achieves better classification performance. Although DRLA can be viewed as an improvement of the locality-constrained and label embedding dictionary learning (LCLE-DL) algorithm, it attains clearly better recognition results.

Figure 2 shows the average recognition rates of the DRLA, LCLE-DL, LC-KSVD2, D-KSVD, and K-SVD algorithms as the number of atoms increases (K = 120, 240, …, 600, 720). It can be clearly seen that the DRLA algorithm obtains the best recognition accuracy.

4.2. Experiments on the LFW Face Database

In this section, we select a subset of the LFW database as the experimental dataset. The subset contains 1,215 images of 86 people, with approximately 11–20 images per person. We resized each image to 32 × 32 pixels. Figure 3 shows examples of face images in the LFW database.

In the experiment, we selected the first five images of each subject as training samples and randomly selected three images from the remaining images of each subject to join the training set. In other words, each subject has eight training samples, and the remaining samples are used as test images. The parameters α, β, and γ of the algorithm were set to fixed values. The experimental results on the LFW database are shown in Table 2, which lists the average recognition rates of the different algorithms. Table 2 mainly shows the results when the number of atoms is 688; we also give the average recognition rate of the proposed algorithm when the number of atoms is 602.

From Table 2, we can see that when the number of atoms is 688, the average recognition rate of DRLA is higher than those of the other algorithms. For example, the average recognition rate of DRLA is 38.9%, which is slightly higher than the 38.8% of the LCLE-DL algorithm and significantly higher than the 32.2% of LC-KSVD2, 33.4% of D-KSVD, and 34.8% of LLC. In addition, when the number of atoms is 602, the average recognition rate of DRLA is 38.4%, which is 1.6% higher than that of LCLE-DL. These results show that DRLA has good classification performance.

When the number of atoms changes (K = 86, 172, …, 430, 516), the average recognition rates of the DRLA, LCLE-DL, LC-KSVD2, D-KSVD, and K-SVD algorithms are shown in Figure 4. Obviously, the DRLA algorithm performs better than the other four algorithms.

4.3. Experiments on the Yale B Face Database

The images of the Extended Yale B database were collected under different lighting conditions and facial expressions. The database contains 2,414 images of 38 people, with approximately 59–64 images per person. Each image was resized to 32 × 32 pixels. Figure 5 shows examples of face images in the Yale B database.

In the experiment, for each subject, we randomly selected 20 images (including the first five images) as training samples, and the remaining images of each subject were used for testing. The parameters α, β, and γ of the algorithm were set to fixed values. The average recognition rates of the different algorithms on the Yale B database are shown in Table 3. We mainly compare the case where the number of atoms is 760.

From Table 3, when the number of atoms is 760 or 722, the average recognition rate of DRLA is improved by 0.2% compared with LCLE-DL. Likewise, the recognition rate of DRLA remains the highest among the compared algorithms; for example, when the number of atoms is 760, DRLA is 0.7% and 3.6% higher than SRC and LRC, respectively.

Figure 6 shows the average recognition rates of the DRLA, LCLE-DL, LC-KSVD2, D-KSVD, and K-SVD algorithms with different numbers of atoms (K = 38, 76, …, 760). As the number of atoms increases, the average recognition rates of the DRLA, LCLE-DL, and LC-KSVD2 algorithms also gradually increase, whereas K-SVD performs unstably.

4.4. Experiments on the CMU PIE Face Database

We selected the C05, C07, C09, C27, and C29 subsets of the PIE database as experimental data. This subset contains 68 people in total, each with 170 images. The images of each subject were collected under different lighting conditions and facial expressions. We resized all images to 32 × 32 pixels. Figure 7 shows examples of face images in the PIE database.

In the experiment, we randomly selected ten images (including the first five images) of each subject as training samples and used the rest as test samples. The parameters α, β, and γ of the algorithm were set to fixed values. The average recognition rates of the different algorithms are shown in Table 4. When the number of atoms is 680, the DRLA algorithm achieves the best performance, improving the average recognition rate by 5.4% compared with LCLE-DL.

As the number of atoms increases (K = 68, 136, …, 408, 476), the average recognition rates of the DRLA, LCLE-DL, LC-KSVD2, D-KSVD, and K-SVD algorithms are shown in Figure 8. Compared with the other four algorithms, the DRLA algorithm is clearly superior.

5. Conclusions

Noise is almost always present in the original training samples. For example, the differences among face images of the same subject caused by changes in lighting, facial expression, and posture can also be viewed as noise from the viewpoint of pattern classification. We believe that such noise may reduce the discriminative ability of the dictionary. To reduce its influence on dictionary performance, this paper proposes a dictionary relearning algorithm, which is implemented based on the locality constraint and the label embedding constraint. Extensive face recognition experiments show that the proposed algorithm can eliminate some of the noise in the original training samples. The algorithm not only effectively improves the discriminative ability of the dictionary but also enhances the robustness of dictionary learning. In addition, we believe that the idea of dictionary reconstruction is quite general and can be applied to a variety of dictionary learning algorithms; it is also an effective way to reduce the impact of image noise on algorithm performance.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the Research Foundation for Advanced Talents of Guizhou University under Grant (2016) no. 49, Key Disciplines of Guizhou Province-Computer Science and Technology (no. ZDXK [2018]007), Key Supported Disciplines of Guizhou Province-Computer Application Technology (no. QianXueWeiHeZi ZDXK[2016]20), and National Natural Science Foundation of China (nos. 61462013 and 61661010).