Mathematical Problems in Engineering


Research Article | Open Access


E. Zhu, M. Xu, D. Pi, "A Novel Robust Principal Component Analysis Algorithm of Nonconvex Rank Approximation", Mathematical Problems in Engineering, vol. 2020, Article ID 9356935, 17 pages, 2020.

A Novel Robust Principal Component Analysis Algorithm of Nonconvex Rank Approximation

Academic Editor: Francesc Pozo
Received: 21 Jun 2020
Revised: 04 Sep 2020
Accepted: 17 Sep 2020
Published: 30 Sep 2020


In low-rank matrix recovery, the noise term may itself be low rank or fail to be sparse, and the nuclear norm is not an accurate rank approximation of a low-rank matrix. To address these problems, the present study proposes a novel nonconvex approximation function of the rank of a low-rank matrix. Based on this nonconvex rank approximation function, a novel robust principal component analysis model is then proposed. The model is solved with the alternating direction method, and its convergence is verified theoretically. Background separation experiments were then performed on the Wallflower and SBMnet datasets, and the effectiveness of the novel model was further verified by numerical experiments.

1. Introduction

Fuelled by the advancement of the big data and artificial intelligence industries, data in image processing, semantic analysis, and other application fields exhibit extremely large scale and dimension, making data processing and analysis significantly difficult. Research has shown that such data are commonly characterized by low-rank subspaces [1]. For this reason, problems in the mentioned fields can be transformed into problems such as low-rank matrix representation [2] or recovery [3]. In industrial production, the collected data usually contain outliers or noise owing to factors such as the environment and equipment. For instance, geomorphological maps collected by sensors contain shadows caused by cloud cover or other occlusions; face images exhibit differences in brightness under varying illumination; and the quality of nuclear magnetic resonance images degrades because of object motion, defects of the imaging system, and the inherent noise and external interference of the recording equipment. Driven by these practical problems, image denoising, image reconstruction, image restoration, and image completion have become research hotspots.

Image reconstruction and image denoising aim at discovering the low-rank part of the image data, i.e., the background in the image. Moreover, image recognition tasks such as face recognition and motion segmentation can exploit subspace clustering methods to recover the low-rank part of the image data, i.e., to seek a low-rank matrix representation of the data. From a mathematical perspective, seeking the low-rank part or the low-rank matrix representation of data, as well as denoising and completion, can essentially be regarded as low-rank matrix recovery. Accordingly, the low-rank matrix recovery model provides a unified framework for the mentioned image processing problems.

From the mathematical perspective, each piece of data is arranged into a column vector, and a data matrix D is thereby formed. It is assumed that D can be decomposed as

D = Y + E,     (1)

where D, Y, E ∈ R^(m×n); Y denotes the low-rank matrix, representing the intrinsic low-rank subspace characteristics of the real data; and E is the sparse matrix, representing the outliers or noise in the real data.

To solve the low-rank matrix recovery problem, classical principal component analysis and robust principal component analysis can be conducted, whereas numerous problems may exist in the problem-solving process.

Problem 1. The classical principal component analysis method [4, 5] solves the model

min_Y ||D − Y||_F^2  s.t.  rank(Y) ≤ r,     (2)

where ||·||_F denotes the Frobenius norm of a matrix and rank(·) represents the rank of a matrix. First, the rank of the low-rank matrix must be assessed in advance and bounded by r. Second, E = D − Y should consist only of small independent identically distributed Gaussian noise. Equation (2) is solved via the singular value decomposition of the data matrix D: the r largest singular values and the corresponding left singular vectors are retained and employed to represent the major features for analysis or pattern recognition. Principal component analysis (PCA), as an effective tool for data analysis and dimensionality reduction, has been extensively applied in various fields. However, if the data are seriously polluted (e.g., by large Gaussian noise or sparse non-Gaussian noise), conventional PCA fails.
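As a concrete illustration of the PCA model in equation (2), the best rank-r approximation in Frobenius norm is obtained from a truncated singular value decomposition; a minimal numpy sketch on synthetic data (sizes and noise level are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data: a rank-2 signal plus small i.i.d. Gaussian noise,
# matching the assumptions under which equation (2) works.
Y_true = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 40))
D = Y_true + 0.01 * rng.standard_normal((50, 40))

# Best rank-r approximation (Eckart-Young): keep the r largest
# singular values and the corresponding singular vectors.
U, s, Vt = np.linalg.svd(D, full_matrices=False)
r = 2
Y_hat = U[:, :r] * s[:r] @ Vt[:r, :]

rel_err = np.linalg.norm(D - Y_hat) / np.linalg.norm(D)
print(rel_err)  # small, since the noise level is small
```

With heavy or sparse non-Gaussian corruption, the same truncated SVD would absorb the corruption into the leading components, which is exactly the failure mode the text describes.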

Problem 2. Among conventional subspace learning models, principal component analysis (PCA), linear discriminant analysis (LDA), and independent component analysis (ICA) [6] can effectively remove small Gaussian noise contained in data, whereas they are very sensitive to data containing outliers or sparse large noise. To recover the low-rank structure from data containing sparse large noise, Chandrasekaran et al. [7] and Wright et al. [8] built a robust principal component analysis model, which is expressed as

min_{Y,E} rank(Y) + λ||E||_0  s.t.  D = Y + E,     (3)

where ||E||_0 denotes the l0-norm of matrix E, i.e., the number of nonzero elements in E, and λ represents a positive trade-off parameter.
The objective function of equation (3) is a nonlinear nonconvex function, and solving it is an NP-hard problem. Compressed sensing theory was therefore introduced to solve the convex relaxation of this problem:

min_{Y,E} ||Y||_* + λ||E||_1  s.t.  D = Y + E,     (4)

where ||Y||_* denotes the nuclear norm of matrix Y (also called the trace norm or Ky Fan norm), i.e., ||Y||_* = Σ_{i=1}^{r} σ_i(Y), where r denotes the rank of Y and σ_i(Y) is the i-th singular value of Y; ||E||_1 represents the l1-norm of matrix E, i.e., the sum of the absolute values of all elements of E.
When the singular vectors of the low-rank matrix are reasonably distributed and the nonzero elements of the sparse matrix are distributed uniformly, equation (4) recovers the low-rank part and the noisy part from the real data matrix with probability close to 1. However, the nuclear norm does not treat the singular values in the same way as the rank does. In the rank function, every nonzero singular value increases the rank by 1 regardless of its magnitude. In the nuclear norm, by contrast, the contribution of a nonzero singular value is directly determined by its magnitude. Accordingly, the nuclear norm is commonly dominated by a few large singular values, whose effect cannot be overlooked even when they are few in number. Thus, in the presence of a large singular value, the nuclear norm deviates from the rank of the low-rank matrix and cannot accurately approximate it, and the degree of inaccuracy grows with the singular value.
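The dominance of large singular values in the nuclear norm can be checked numerically; a small sketch with an explicit diagonal matrix whose singular values are chosen by hand:

```python
import numpy as np

# A diagonal matrix makes the singular values explicit: one large value
# and two small ones. The rank is 3, but the nuclear norm is dominated
# by the single large singular value, illustrating the bias described above.
Y = np.diag([100.0, 1.0, 1.0])

svals = np.linalg.svd(Y, compute_uv=False)
rank = int(np.sum(svals > 1e-10))
nuclear_norm = svals.sum()

print(rank)          # 3
print(nuclear_norm)  # 102.0: the large singular value contributes 100/102 of it
```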

Problem 3. The situation in which the nuclear norm cannot well approximate the rank of a matrix can be considered a problem inherited from compressed sensing. Since the l1-norm is the minimum convex envelope of the l0-norm [9], conventional compressed sensing exploits the l1-norm instead of the l0-norm to recover the original data. However, under several conditions, the recovered matrix is both low rank and sparse, or the sparse matrix itself exhibits low-rank properties. It has been demonstrated that the l1-norm is not an exact approximation of the l0-norm [10, 11], and in most cases this approximation bias is not negligible.
Over the past few years, to improve the rank approximation provided by the nuclear norm, numerous scholars have developed nonconvex rank approximation functions [12–14] to remedy the defects of the convex nuclear norm approximation, and these have been extensively employed. Considerable experimental evidence shows that nonconvex rank approximations can recover the low-rank part of the data more accurately and more quickly than the nuclear norm. For instance, Sun et al. [11] proposed a nonconvex robust principal component analysis model with a capped trace norm and a capped l1-norm; Kang et al. [15] built a more accurate nonconvex rank approximation function, termed the log-determinant function. In [16], Kang et al. defined a novel norm as a nonconvex rank approximation, and the experimental results revealed that it is a better rank approximation than the nuclear norm. Xie et al. [17] built a nonconvex regularization tool, termed the weighted Schatten p-norm, which acted as a rank approximation achieving better accuracy. Lu et al. [12] applied a class of nonconvex surrogate functions of the l0-norm to the singular values of low-rank matrices and built the corresponding nonconvex low-rank minimization problem. Yu et al. [18] derived a threshold representation as a recursive function for general regularization problems and subsequently designed a filtering algorithm suitable for image reconstruction. Xu et al. [19] introduced a background subtraction algorithm based on low-rank and structured sparse decomposition. The algorithm follows the framework of Gao's block-sparse RPCA [20]; the difference between the two algorithms is that, in the first-pass RPCA stage, a structured sparse norm is introduced to account for the structural continuity of the foreground.
It utilizes an adaptive regularization parameter, adjusted by a motion detection method, for the group-sparsity operation, and thereby obtains better results. Liu et al. [21] proposed an efficient numerical solution, Grassmannian Online Subspace Updates with Structured-Sparsity (GOSUS), for online subspace learning in the context of sequential observations involving structured perturbations. They modelled both homogeneous perturbations of the subspace and structural contiguities of outliers within a Grassmannian framework and, after certain manipulations, solved the problem via an alternating direction method of multipliers. Javed et al. [22] introduced a spatiotemporal low-rank modelling method to overcome occlusions by foreground objects and redundancy in video data. The proposed method encodes spatiotemporal constraints by regularizing spectral graphs and uses optical flow information to remove redundant data and to create a set of dynamic frames from the input video sequence. Their experiments confirm that the algorithm is highly efficient at background reconstruction. However, the 11 parameters required by their two models need to be optimized, which greatly affects the robustness of the algorithm.
Since the rank approximation of the low-rank matrix is the critical problem in low-rank matrix recovery, how to build a more accurate approximation function deserves study [12, 13, 23]; the corresponding low-rank matrix recovery model built on that basis is of great practical significance for solving image processing problems (e.g., image reconstruction, image denoising, and image recognition).
Based on the mentioned inspirations, starting from the construction of a nonconvex rank approximation function of the low-rank matrix, the present study builds a low-rank matrix recovery model suitable for nonconvex approximation in different image processing problems and then designs the corresponding algorithm to obtain a more accurate low-rank part from the original data, as an attempt to solve specific problems (e.g., image reconstruction and image denoising).

2. Relevant Work

The so-called low-rank matrix is defined as follows: assume X ∈ R^(m×n) with rank(X) = r; if r ≪ min(m, n), then X is low rank. Each row and column of a low-rank matrix carries considerable redundant information, which can be exploited for image recovery and image feature extraction.
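A minimal numerical illustration of this definition (the sizes and the rank are arbitrary): a product of an m × r factor and an r × n factor has rank at most r, so choosing r far below min(m, n) yields a low-rank matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
# Outer-product construction: X = B @ C with B of shape (m, r) and C of
# shape (r, n) has rank at most r, so r << min(m, n) makes X low rank.
m, n, r = 200, 150, 5
X = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

print(np.linalg.matrix_rank(X))  # 5, far below min(m, n) = 150
```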

2.1. Evolution of Low-Rank Matrix Recovery Model

The present study involves applications of low-rank representation, so its model evolution and solving algorithms are presented first. Starting from the simplest case, assume that the data are clean, i.e., not polluted or covered by any noise and outliers, and consider the following rank minimization problem:

min_Z rank(Z)  s.t.  D = AZ,     (5)

where A is termed the dictionary matrix and can also be considered a basis matrix of the data space, and Z denotes the low-rank representation matrix; the lowest-rank representation of the data matrix D is obtained by solving equation (5). The general solution approach for equation (5) is to replace the rank with the nuclear norm, which transforms it into the following problem:

min_Z ||Z||_*  s.t.  D = AZ.     (6)

Liu et al. proved the properties of the solution of equation (6) in literature [24].

2.1.1. Optimal Solution Is Unique

Since the nuclear norm is not a strictly convex function, equation (6) might be expected to admit multiple optimal solutions; nevertheless, the theorem below proves that equation (6) has a unique optimal solution.

Theorem 1 (see [24]). Assume A ≠ 0 and D = AZ has feasible solutions, i.e., D ∈ span(A). Then

Z* = A†D     (7)

is the unique optimal solution to equation (6), where A† denotes the generalized (Moore–Penrose) inverse of A.

2.1.2. Optimal Solution Is Block Diagonal

With a right choice of dictionary, the lowest-rank representation reveals the true segmentation result. In other words, when the columns of A and D are indeed drawn from independent subspaces, the optimal solution of equation (6) reveals the subspace relationships between sample points. This conclusion is expressed by the theorem below.

Theorem 2 (see [24]). Let {S_i, i = 1, …, k} be a set of subspaces, where the rank (or dimension) of S_i is r_i > 0. Assume that A_i denotes a set of sample points from subspace S_i, D_i represents a set of sample points from S_i, and A = [A_1, A_2, …, A_k], D = [D_1, D_2, …, D_k]. If the subspaces are independent, the optimal solution Z* of problem (6) is block diagonal:

Z* = diag(Z*_1, Z*_2, …, Z*_k),     (8)

where Z*_i is the coefficient matrix of D_i in the dictionary A_i.

The model above is built for clean data. A more general situation is given below: for data with noise, the following rank minimization model should be built:

min_{Z,E} rank(Z) + λ||E||_l  s.t.  D = AZ + E,     (9)

where λ denotes a positive compromise factor, i.e., the equilibrium parameter; ||·||_l represents an l-norm whose index l is a parameter of the norm, e.g., l can be taken as (1, 2), (2, 1), etc.; and E denotes the noise matrix. Obviously, when A = I (where I denotes the identity matrix), equation (9) is converted into the form of the robust principal component analysis model of equation (4). Accordingly, low-rank representation can be considered a generalization of robust principal component analysis with an orthonormal basis as the dictionary. By taking a proper basis, data from different subspaces can be extracted with low-rank representation, thereby remedying the defect that robust principal component analysis can only process data from the identical subspace.

For equation (9), the nuclear norm acts as the convex approximation of the rank function, and the l2,1-norm is adopted because it is more robust to sample-specific noise and outliers; the model is thus converted into:

min_{Z,E} ||Z||_* + λ||E||_{2,1}  s.t.  D = AZ + E.     (10)

Equation (10) can be solved by the augmented Lagrange multiplier method. To separate the variables, a novel variable matrix J is introduced together with the constraint Z = J. Equation (10) then has the following equivalent form:

min_{Z,E,J} ||J||_* + λ||E||_{2,1}  s.t.  D = AZ + E, Z = J.     (11)

The augmented Lagrange function of equation (11) is

L(J, Z, E, Y_1, Y_2) = ||J||_* + λ||E||_{2,1} + ⟨Y_1, D − AZ − E⟩ + ⟨Y_2, Z − J⟩ + (μ/2)(||D − AZ − E||_F^2 + ||Z − J||_F^2),     (12)

where Y_1 and Y_2 denote the Lagrange multiplier matrices and μ > 0 is the penalty parameter.

Equation (11) can be solved with the exact or inexact augmented Lagrange multiplier method. The convergence of the former has been verified in [25]. For the latter, convergence is not guaranteed when there are more than two blocks of variables. The primary steps to solve equation (12) are presented below.

Step 1: with the singular value thresholding operator [26] (SVT operator), the optimal solution J_{k+1} of the (k + 1)-th iteration is solved by

J_{k+1} = argmin_J (1/μ_k)||J||_* + (1/2)||J − (Z_k + Y_{2,k}/μ_k)||_F^2.     (13)

Step 2: the optimal solution Z_{k+1} of the (k + 1)-th iteration is solved by

Z_{k+1} = (I + AᵀA)^{−1}(Aᵀ(D − E_k) + J_{k+1} + (AᵀY_{1,k} − Y_{2,k})/μ_k).     (14)

Step 3: based on the conclusion in [27], the optimal solution E_{k+1} of the (k + 1)-th iteration is solved by

E_{k+1} = argmin_E (λ/μ_k)||E||_{2,1} + (1/2)||E − (D − AZ_{k+1} + Y_{1,k}/μ_k)||_F^2.     (15)

Step 4: the multiplier matrices Y_1 and Y_2 and the parameter μ are updated:

Y_{1,k+1} = Y_{1,k} + μ_k(D − AZ_{k+1} − E_{k+1}),
Y_{2,k+1} = Y_{2,k} + μ_k(Z_{k+1} − J_{k+1}),
μ_{k+1} = min(ρμ_k, μ_max),     (16)

where ρ and μ_max denote predetermined positive numbers; generally, ρ = 1.1 and μ_max = 10^6.

According to equations (13)–(16), each variable is alternately updated until the termination condition of the algorithm is reached. For instance, for a previously selected small positive number ε (generally, ε = 10^−7 or 10^−8), the stopping condition can take the following form [24]: ||D − AZ − E||_∞ < ε and ||Z − J||_∞ < ε.
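Steps 1–4 above can be sketched in numpy. This is a minimal transcription of the standard inexact ALM iteration for equation (10) under the common choice of the l2,1-norm for E; the parameter values and the synthetic data are illustrative, not the paper's:

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: the proximal operator of tau * ||.||_* (Step 1)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def l21_shrink(M, tau):
    """Column-wise shrinkage: the proximal operator of tau * ||.||_{2,1} (Step 3)."""
    norms = np.linalg.norm(M, axis=0)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return M * scale

def lrr_ialm(D, A, lam=0.1, rho=1.1, mu_max=1e6, tol=1e-7, max_iter=500):
    """Inexact ALM for min ||J||_* + lam*||E||_{2,1} s.t. D = A Z + E, Z = J."""
    m, n = D.shape
    k = A.shape[1]
    Z = np.zeros((k, n)); J = np.zeros((k, n)); E = np.zeros((m, n))
    Y1 = np.zeros((m, n)); Y2 = np.zeros((k, n))
    mu = 1e-2
    inv = np.linalg.inv(np.eye(k) + A.T @ A)                 # used in Step 2
    for _ in range(max_iter):
        J = svt(Z + Y2 / mu, 1.0 / mu)                               # Step 1
        Z = inv @ (A.T @ (D - E) + J + (A.T @ Y1 - Y2) / mu)         # Step 2
        E = l21_shrink(D - A @ Z + Y1 / mu, lam / mu)                # Step 3
        R1 = D - A @ Z - E; R2 = Z - J
        Y1 += mu * R1; Y2 += mu * R2                                  # Step 4
        mu = min(rho * mu, mu_max)
        if max(np.abs(R1).max(), np.abs(R2).max()) < tol:            # stopping rule
            break
    return Z, E

# Usage on synthetic data with A = I (the RPCA special case):
rng = np.random.default_rng(0)
L_true = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 40))
D = L_true.copy()
D[:, :3] += 5.0 * rng.standard_normal((30, 3))   # corrupt three columns
Z, E = lrr_ialm(D, np.eye(30), lam=0.5)
res = np.abs(D - np.eye(30) @ Z - E).max()
```

Here `svt` implements the SVT operator of Step 1 and `l21_shrink` the column-wise shrinkage of Step 3; the l2,1 penalty drives the corruption into whole columns of E.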

2.2. Principle of Constructing Nonconvex Rank Approximation Function

Since the nuclear norm is the minimum convex envelope of the rank function, the conventional low-rank matrix recovery model employs the nuclear norm as the rank approximation function of the low-rank matrix; as a result, the problem is transformed into a convex optimization problem, e.g., equations (4), (6), and (10). However, the nuclear norm has defects as an approximation of the rank function, so the conventional low-rank matrix recovery model should be improved. Likewise, since the l1-norm is only a loose approximation of the l0-norm, only a suboptimal solution of the original problem can be obtained, and a function better than the l1-norm should be developed to replace it. In this context, scholars have proposed numerous nonconvex approximations of the l0-norm; some nonconvex approximate functions are presented in [12], e.g., the lp-norm [18] (0 < p < 1), Geman and Young [28], smoothly clipped absolute deviation (SCAD) [29], minimax concave penalty (MCP) [30], Laplace [31], capped l1 [32], exponential-type penalty (ETP) [33], and the logarithmic function [34]. The specific function forms are listed in Table 1.

Name of alternate function | Expression of alternate function (λ > 0, γ > 0, θ ≥ 0)

lp [18] | λθ^p (0 < p < 1)
Geman and Young [28] | λθ / (θ + γ)
SCAD [29] | λθ if θ ≤ λ; (−θ² + 2γλθ − λ²) / (2(γ − 1)) if λ < θ ≤ γλ; λ²(γ + 1)/2 if θ > γλ
MCP [30] | λθ − θ²/(2γ) if θ < γλ; γλ²/2 if θ ≥ γλ
Laplace [31] | λ(1 − e^(−θ/γ))
Capped l1 [32] | λθ if θ < γ; λγ if θ ≥ γ
ETP [33] | λ(1 − e^(−γθ)) / (1 − e^(−γ))
Logarithm [34] | λ log(γθ + 1) / log(γ + 1)

The images of the eight nonconvex approximate functions are shown in Figure 1.

As shown in Figure 1, the graphs of the mentioned functions show a similar trend, which also inspires us to develop novel nonconvex approximate functions. Moreover, combining this with the problem of the nuclear norm's rank approximation, to develop an approximation of the rank function more accurate than the nuclear norm, our design principle aims to satisfy the following properties:
(1) The contribution of a singular value with a large value to the rank approximation function should be bounded, and significantly reduced compared with the nuclear norm; i.e., it satisfies lim_{σ→∞} f(σ) = c, where f(σ) denotes the function of the singular value σ, representing the contribution of the singular value, and c is a positive constant.
(2) f(σ) is a nondecreasing function of the singular value σ.
(3) When the singular value σ = 0, the contribution is 0, i.e., f(0) = 0.
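These three properties can be checked numerically for any candidate from Table 1; a sketch using the Laplace penalty as an example (the parameter values are arbitrary):

```python
import numpy as np

# Check the three design properties numerically for one candidate from
# Table 1, the Laplace penalty f(s) = lam * (1 - exp(-s / gamma)).
lam, gamma = 1.0, 2.0
f = lambda s: lam * (1.0 - np.exp(-s / gamma))

s = np.linspace(0.0, 1000.0, 10001)
vals = f(s)

bounded = vals.max() <= lam + 1e-12          # property (1): contribution bounded by lam
nondecreasing = np.all(np.diff(vals) >= 0)   # property (2): nondecreasing in s
zero_at_zero = abs(f(0.0)) < 1e-12           # property (3): f(0) = 0
print(bounded, nondecreasing, zero_at_zero)  # True True True
```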

2.3. A Novel Low-Rank Matrix Recovery Model with Nonconvex Rank Approximation

According to the definition of the nuclear norm, when a certain singular value is too large, the rank estimate becomes excessively large, degrading the approximation effect. The present section gives a novel nonconvex function to approximate the rank function of a matrix. Lu et al. [35] summarized the existing nonconvex approximation models and developed the following general form:

min_Y Σ_i g(σ_i(Y)) + h(Y),     (18)

where g denotes a monotonically increasing continuous concave function defined on [0, ∞), and the loss term h has a Lipschitz continuous gradient.

For a matrix Y, define ||Y||_f = Σ_i f(σ_i(Y)), where f denotes a certain nonconvex approximation function. The rank approximation function in equations (3) and (5) can then be approximately replaced with ||Y||_f. Thus, a model based on nonconvex rank approximation can be developed.

For the robust principal component analysis model, the nonconvex rank approximation version of equation (3) is expressed as

min_{Y,E} ||Y||_f + λ||E||_1  s.t.  D = Y + E,     (19)

where D, Y, E ∈ R^(m×n); ||E||_1 denotes the l1-norm of matrix E; and λ is the positive compromise factor.

The above gives a general low-rank matrix recovery model based on nonconvex rank approximation. For instance, equation (4) indicates a model with sparse noise only in the original data, while equation (5) builds the model with no noise in the original data. For low-rank matrix recovery models under other types of noise, the rank function of the low-rank matrix can likewise be replaced by an appropriate nonconvex rank approximation function.

In the nonconvex rank approximation model, by adopting a nonconvex rank approximation function exhibiting higher accuracy than the nuclear norm, the low-rank part of the original data can be restored more effectively, which enhances the application effect of the original low-rank matrix recovery model in image processing (e.g., image reconstruction, image denoising, and image recognition).

Accordingly, a novel low-rank matrix recovery model with nonconvex rank approximation is proposed below: the rank function of the low-rank matrix in the existing model is substituted with the nonconvex rank approximation function, and a corresponding algorithm is then developed to solve the problem.

2.4. Nonconvex Rank Approximation Function

Consistent with the l0-norm approximation function given by Geman and Young [28], a novel nonconvex rank approximation function f was developed in the present study (equation (20)), with one parameter denoting the equilibrium factor and the other the convergence rate parameter of f. It is next verified that f satisfies the three properties of the previous section:
(1) Let σ be a singular value of matrix Y; the limit of f(σ) as σ → ∞ exists and is finite. In other words, the contribution of a singular value with a large value to the rank approximation function is bounded.
(2) The derivative of f is nonnegative; in other words, f(σ) is a nondecreasing function of the singular value σ.
(3) f(0) = 0.

That is, for a zero singular value, the contribution is zero.

Thus, f satisfies the properties of a nonconvex approximate rank function. Based on the function in equation (20), the following norm can be defined.

Definition 1. For Y ∈ R^(m×n), with the parameters of f fixed, define

||Y||_f = Σ_i f(σ_i(Y)),     (24)

where σ_i(Y) denotes the i-th singular value of matrix Y.
When the parameter takes different values, the true rank of the matrix is approximated by the nonconvex function of Definition 1, as presented in Figure 2.
In Figure 2, the horizontal coordinate denotes the singular value. The true rank is 1 for a nonzero singular value and 0 for a zero singular value; thus, it is depicted as a horizontal line at height 1. The curves indicate the nonconvex approximation function f. As the singular value increases, the function value does not surge, remaining relatively consistent with the real rank; the difference from the rank indicates the approximation quality. As the singular value increases, the value of f approaches 1, which differs only slightly from the real rank. Accordingly, as a rank approximation function, its approximation effect is very good. For relatively small values of the parameter, a relatively large difference is observed between the curve and the real rank; however, as the parameter increases, the curve approximates the real rank more closely.
Moreover, Figure 2 presents the simplified case with only one singular value. The nuclear norm then corresponds to a line through the origin with slope 1: since the nuclear norm is the sum of all singular values, with only one singular value, the nuclear norm is exactly that singular value. Furthermore, the figure indicates the obvious defect of the nuclear norm as a rank approximation: the greater the singular value, the greater the deviation of the nuclear norm from the true rank. Thus, when the nuclear norm is adopted to approximate the rank, even if there are only a few large singular values, the approximation deviation cannot be overlooked.
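The contrast in Figure 2 can be reproduced numerically with a single singular value. The Geman-type function s / (s + γ) below is an illustrative stand-in for a bounded surrogate, not the paper's exact function of equation (20):

```python
import numpy as np

# For a matrix with a single nonzero singular value s, the nuclear norm
# equals s and grows without bound, while a bounded Geman-type surrogate
# f(s) = s / (s + gamma) stays close to the true rank, which is 1.
gamma = 0.1
for s in [1.0, 10.0, 100.0]:
    nuclear = s                # deviates from rank 1 as s grows
    approx = s / (s + gamma)   # stays near 1
    print(s, nuclear, round(approx, 4))
```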
Next, this study proves theoretically, and tests on experimental results, that equation (24) is a more accurate nonconvex rank approximation function. Note that the norm in equation (24) does not satisfy the homogeneity of a matrix norm (for an arbitrary constant c, ||cY||_f ≠ |c| ||Y||_f in general), so it is only a quasi-norm, or pseudonorm. However, it can be verified that it satisfies the properties below.

Property 1. ||·||_f defined in equation (24) exhibits the following properties:
(1) In the limit of the parameter, ||Y||_f equals rank(Y).
(2) Unitary invariance: any two orthogonal matrices P and Q of appropriate sizes satisfy ||PYQ||_f = ||Y||_f.
(3) Positive definiteness: for any Y, ||Y||_f ≥ 0, and ||Y||_f = 0 if and only if Y = 0.

Proof. (1) In the limit of the parameter, the function value f(σ) is 1 for each nonzero singular value and 0 for a zero singular value. Correspondingly, in the rank of the matrix, each nonzero singular value increases the rank by 1, while each zero singular value does not. Since ||Y||_f is obtained by accumulating the function values of all singular values, and the number of nonzero singular values is rank(Y), the formula is proved.
(2) Let I represent the identity matrix of the appropriate order. Comparing the characteristic polynomials, the matrices YᵀY and (PYQ)ᵀ(PYQ) have the identical characteristic polynomial, so they have the same eigenvalues. The singular values of Y and PYQ are the nonnegative square roots of these eigenvalues, respectively, so the singular values of PYQ are equal to those of Y. Thus ||PYQ||_f = ||Y||_f holds, and ||·||_f is unitarily invariant.
(3) First, each singular value is the nonnegative square root of an eigenvalue of YᵀY, so σ_i ≥ 0 and f(σ_i) ≥ 0; hence ||Y||_f ≥ 0. Next, ||Y||_f = 0 if and only if f(σ_i) = 0 for every i, i.e., every σ_i = 0, i.e., Y = 0. End.

2.5. RPCA of New Nonconvex Rank Approximation

Motivated by the limitations of PCA and RPCA, in the present study, a novel nonconvex rank approximation robust principal component analysis (NNCRPCA) was developed, capable of effectively dealing with gross corruption and exhibiting a good generalization ability. For any novel data damaged by gross errors, the learned projection can effectively eliminate the possible damage by projecting the data into its low-rank subspace. As revealed by the results of considerable experiments performed on three static continuous image datasets and one dynamic video dataset, NNCRPCA is robust to serious corruption and can effectively process novel data.

In the low-rank matrix recovery model, the rank function is approximately replaced by the norm of equation (24) to obtain the following model:

min_{Y,E} ||Y||_f + λ||E||_1  s.t.  D = Y + E,     (31)

where D, Y, E ∈ R^(m×n) and λ is a positive compromise factor.

Equation (31) can be changed into an unconstrained optimization form by relaxing the constraint into a penalty:

min_{Y,E} ||Y||_f + λ||E||_1 + (β/2)||D − Y − E||_F^2,     (32)

where λ and β are nonnegative parameters used to balance the regularization terms and data consistency.

The unconstrained optimization problem (32) can be solved with the alternating direction method, which is widely used for the large-scale optimization problems commonly encountered in machine learning.

Using the alternating direction method, in the (k + 1)-th iteration, equation (32) requires solving the following two subproblems:

Y_{k+1} = argmin_Y ||Y||_f + (β/2)||D − Y − E_k||_F^2,     (33)
E_{k+1} = argmin_E λ||E||_1 + (β/2)||D − Y_{k+1} − E||_F^2.     (34)

To solve equation (33), first note that its objective function is the sum of a nonconvex function and a convex function: the first term is nonconvex, and the second term is convex. Thus, the solution can be obtained with the help of difference-of-convex (DC) programming [36]. The DC algorithm proposed by Tao in [37] is based on duality theory and the local optimality conditions of convex optimization, and its convergence proof is presented in [36].

Lemma 1. The subdifferential of ||Y||_f is given by equation (35), where the columns of U and V, respectively, denote the left and right singular vectors of Y, and the formula in equation (36) follows from equation (25).

According to Lemma 1, the (k + 1)-th iterate of Y in equation (33) is written as

In equation (37), the partial derivative of the objective function with respect to Y is taken and set to zero, so the following formula is obtained:

Equation (38) refers to the iterative formula for solving Y.

Next, the solution of E is discussed. A soft thresholding operator can be applied iteratively. Accordingly, the (k + 1)-th iteration in equation (34) can be expressed as equation (39), with the corresponding threshold and auxiliary matrix.
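The elementwise soft-thresholding (shrinkage) operator used for the sparse step can be written compactly; a minimal sketch:

```python
import numpy as np

def soft_threshold(M, tau):
    """Elementwise soft thresholding: the proximal operator of tau * ||.||_1.
    Every entry is shrunk toward zero by tau; entries within tau of zero vanish."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

M = np.array([[3.0, -0.5],
              [0.2, -2.0]])
print(soft_threshold(M, 1.0))  # 3.0 -> 2.0, -2.0 -> -1.0, small entries -> 0
```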

The iterative formula of the auxiliary matrix is given in equation (40).

In equation (40), to maintain data consistency, the new iterate is derived by subtracting the residual from the previous one.

The solution process above is summarized in Algorithm 1.

Algorithm 1.
Input: initial data matrix D; dictionary matrix; thresholds.
Initialization: Y_0 = 0, E_0 = 0, k = 0.
Output: Y, E.
Iteration steps:
(1) Update Y according to equation (38).
(2) Update E according to equation (39).
(3) Update the auxiliary matrix according to equation (40).
(4) Stopping condition: while the stopping criterion is not met, repeat Steps 1–3; otherwise, stop the iteration and return Y, E.

2.6. Algorithm Convergence

In the present section, the convergence proof of the algorithm is provided.

Lemma 2. Let {Y_k} and {E_k} be the sequences generated by the algorithm; then {Y_k} and {E_k} are bounded.

Proof. First, denote the objective function of equation (32) as F(Y, E) (equation (41)). Based on equations (33) and (34),

F(Y_{k+1}, E_{k+1}) ≤ F(Y_{k+1}, E_k) ≤ F(Y_k, E_k).     (43)

Thus, the sequence {F(Y_k, E_k)} is nonincreasing.
Next, it is verified that the sequences {Y_k} and {E_k} are bounded; the boundedness is proved by contradiction.
Assume {E_k} is unbounded. Then for any real number M > 0, there is always a positive integer N such that, for k > N and any norm, ||E_k|| > M. Accordingly, the first term of equation (41), λ||E_k||_1, is unbounded.
As the second term in equation (41) is the pseudonorm ||Y_k||_f, its boundedness is discussed under the following two conditions:
(1) If {E_k} is unbounded and {Y_k} is also unbounded, then according to equation (24), recalling that the nonconvex rank approximation is designed so that each f(σ) approaches a finite limit, ||Y_k||_f approaches a bounded value, so it is bounded in this case.
(2) If {E_k} is unbounded while {Y_k} is bounded, then according to equation (24), ||Y_k||_f is obviously also bounded.
In conclusion, ||Y_k||_f is always bounded.
By the same argument, since {E_k} is assumed unbounded, the first and third terms in equation (41) are unbounded: λ||E_k||_1 and the data-consistency term ||D − Y_k − E_k||_F^2 are unbounded.
In brief, both terms are unbounded, so F(Y_k, E_k) is unbounded. Combined with its nonnegativity, this gives F(Y_k, E_k) → ∞, which contradicts the conclusion in equation (43) that the sequence {F(Y_k, E_k)} is nonincreasing. Accordingly, the sequences {Y_k} and {E_k} are bounded.
Based on the mentioned lemma, the below theorem can be developed.

Theorem 3. Assume that {Y_k} and {E_k} are the sequences generated by the algorithm; then any accumulation point of the sequence {(Y_k, E_k)} is a local minimum point of equation (32).

Proof. Based on Lemma 2, {(Y_k, E_k)} is bounded, so it has at least one accumulation point, denoted (Y*, E*). Combining this with the conclusion in equation (43) that {F(Y_k, E_k)} is a nonincreasing sequence, the limit of the objective values exists, and the following holds:

F(Y*, E*) = lim_{k→∞} F(Y_k, E_k) ≤ F(Y_k, E_k) for all k.     (45)

Thus, (Y*, E*) is a local minimum point of problem (32).

3. Results and Discussion

In the present study, using the Wallflower and SBMnet datasets, the novel model was adopted to solve the background separation problem; it was then compared with the existing GoDec [38], ALM, RegL1-ALM [24], NCRPCA, GOSUS [19], and SLM [22] methods. Video background separation, a process that removes the foreground and separates the background from video shot with a fixed camera, has been extensively employed in computer moving-target detection. The experimental operating environment is an Intel(R) Core(TM) i7-8565U CPU @ 1.80 GHz and 16.00 GB of memory. All algorithms are implemented in MATLAB R2018b.

3.1. Parameter Analysis

The novel algorithm was tested and analysed on Wallflower and SBMnet, respectively. Referencing the stopping criteria of the comparison algorithms, the relative error value of this algorithm in the Wallflower background separation experiment and in the SBMnet background separation experiment is expressed as

For Wallflower, the effect of the parameter values on the experimental results was analysed. Specifically, a common quantitative index, the root-mean-square error (RMSE), was adopted as the measurement standard, defined in the following equation, where D, Y, and E, respectively, denote the original observation data matrix, the low-rank matrix, and the sparse matrix. The encoding operator is taken as the identity matrix.
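Assuming the conventional elementwise form of the RMSE over the decomposition residual D − Y − E (the exact normalization used in the paper's equation is not reproduced here), a minimal sketch:

```python
import numpy as np

def rmse(D, Y, E):
    """Root-mean-square error of the decomposition residual D - Y - E.
    This elementwise convention is an assumption; the paper's exact
    normalization is given in its RMSE equation."""
    R = D - Y - E
    return np.sqrt(np.mean(R ** 2))

D = np.ones((4, 4))
Y = 0.5 * np.ones((4, 4))
E = 0.5 * np.ones((4, 4))
print(rmse(D, Y, E))  # 0.0 for an exact decomposition D = Y + E
```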

Next, with Wallflower as an example, the relationship between the parameter value and the RMSE is illustrated. Starting from the initial value with a step size of 0.01, the sweep stops at the upper limit.
(1) In one parameter range, the RMSE approaches 1, suggesting that the algorithm cannot recover the low-rank matrix from the data, i.e., the experimental effect is not ideal.
(2) In the other range, the RMSE remains stable between 0.5 and 0.9, and the experimental effect of the algorithm is ideal.

Accordingly, it was also concluded that the larger the parameter, the better the approximation degree of the approximate rank, whereas the image reconstruction effect of the algorithm in the actual implementation becomes unsatisfactory. Among the experiments on Wallflower, the chosen parameter value yields the smallest RMSE, and the rank approximation is then close to optimal.
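The parameter sweep described above can be sketched as follows; `solve_model`, standing in for one run of the proposed algorithm at a given parameter setting, is a hypothetical placeholder:

```python
import numpy as np

def sweep_parameter(D, solve_model, start=0.01, step=0.01, stop=1.0):
    """Step the parameter from `start` to `stop` in increments of `step`,
    record the RMSE at each setting, and return the best (parameter, RMSE)."""
    results = []
    t = start
    while t <= stop + 1e-12:
        Y, E = solve_model(D, t)        # low-rank and sparse estimates
        R = D - Y - E
        results.append((t, float(np.sqrt(np.mean(R ** 2)))))
        t += step
    return min(results, key=lambda p: p[1])
```

With the real solver plugged in, the returned setting is the one minimizing RMSE, matching how the near-optimal parameter was selected for Wallflower.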

The statistical information of Wallflower exploited in the experiment is listed in Table 2 (e.g., image dimension, image frame number, and dimension of stretch matrix).

Datasets | Dimensions | Frames | Dimension of stretch matrix

Bootstrap | 160 × 120 | 3055 | 19200 × 3055
Camouflage | 160 × 120 | 353 | 19200 × 353
Lightswitch | 160 × 120 | 2715 | 19200 × 2715
Waving trees | 160 × 120 | 287 | 19200 × 287
Time of day | 160 × 120 | 5890 | 19200 × 5890
Moved object | 160 × 120 | 1745 | 19200 × 1745
Foreground aperture | 160 × 120 | 2113 | 19200 × 2113

The statistical information of SBMnet exploited in the experiment is listed in Table 3 (e.g., image dimension, image frame number, and dimension of stretch matrix).

Category name | Challenges | Frames | Dimension of stretch matrix

Highway | Cars moving | 1700 | 320 × 240 × 1700
Bus station | Background objects moving away, objects stopping for a short while, and then moving away | 617 | 360 × 240 × 617
Group campus | Clutter videos | 300 | 320 × 240 × 300
Camera parameter | Videos with strong illumination changes | 451 | 320 × 240 × 451
Advertisement board | Videos exhibiting dynamic background motion | 598 | 504 × 336 × 598
Bus stop morning | Videos containing more than 3,500 frames | 3766 | 320 × 240 × 3766
Toscana | Videos containing a limited number of frames (fewer than 20) with a very low framerate | 6 | 800 × 600 × 6

3.2. Experimental Results

Besides the experimental renderings, the quantitative indexes used to test the algorithms comprise the calculation time, the number of iterations, and other common indexes. Moreover, since the low-rank matrix recovery model adopted for image reconstruction presumes that the matrix corresponding to the background part is low rank, the rank of the matrix Y recovered by the algorithm should be as small as possible; the most ideal result is a rank of 1. By equation (33), the RMSE was calculated after applying the various algorithms to Wallflower and SBMnet, so as to indicate the background separation effect. Besides the mentioned indexes, the relative error (RE) is another vital index reflecting the accuracy of the algorithms.
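The rank index reported in Tables 4 and 5 can be read as the numerical rank of the recovered matrix Y, i.e., the count of singular values above a small tolerance; a sketch, assuming this standard convention:

```python
import numpy as np

def numerical_rank(Y, tol=1e-8):
    """Count singular values above a relative tolerance; an ideal static
    background, repeated identically in every frame, has rank 1."""
    s = np.linalg.svd(Y, compute_uv=False)
    if s.size == 0 or s[0] == 0:
        return 0
    return int(np.sum(s > tol * s[0]))

# A stretch matrix whose columns are all the same frame is exactly rank 1.
frame = np.linspace(0.0, 1.0, 19200)
Y = np.column_stack([frame] * 50)
print(numerical_rank(Y))  # 1
```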

Let us start with Wallflower. Many challenges are built into the dataset to evaluate the performance of the algorithms. Bootstrap is a scene of people taking food in a restaurant; the people in the scene change constantly, move frequently, and are numerous. Camouflage is a scene of a computer screen; while it was recorded, the screen refreshed at a certain frequency and a person flashed by in front of it, so the background is difficult to separate. LightSwitch captures both the computer refreshing and the activity of people, with lighting changes added during shooting. Waving Trees is a scene in which a man passes through shaking trees. Foreground Aperture is a scene of a person sleeping on a table in a low-light environment. Moved Object is a scene in which a person enters a classroom, picks up a phone to talk, and leaves. Time of Day is a scene in which a living room passes from night to day and then from day to night. We applied the algorithms (i.e., GoDec, ALM, RegL1-ALM, NCRPCA, GOSUS, SLM, and NNCRPCA) to background separation experiments on these datasets and evaluated their performance. Table 4 lists the algorithms' quantitative indexes for Wallflower, and Figure 3 illustrates the algorithms' renderings for the background separation experiment.

Algorithms | Evaluation indexes | BS | CF | LS | WT | TOD | MO | FA

GoDec [38] | RE | 2.325e−03 | 3.625e−03 | 2.402e−05 | 1.0996e−04 | 2.5267e−03 | 6.7650e−04 | 8.2345e−03
GoDec [38] | Rank | 13 | 3 | 4 | 8 | 16 | 7 | 5
GoDec [38] | CPU time (s) | 28.670 | 23.5344 | 19.2457 | 0.66918 | 15.4668 | 4.3642 | 6.2168

ALM | RE | 3.5567e−04 | 3.5567e−06 | 1.4453e−05 | 2.2310e−06 | 2.2121e−04 | 2.2223e−05 | 2.3512e−05
ALM | Rank | 66 | 34 | 22 | 68 | 32 | 13 | 56
ALM | CPU time (s) | 241.5925 | 34.5612 | 46.7471 | 18.5634 | 568.3478 | 167.2456 | 356.2323

RegL1-ALM [24] | RE | 2.426e−05 | 1.390e−05 | 1.255e−05 | 3.5624e−07 | 3.4011e−04 | 2.5144e−07 | 2.8995e−07
RegL1-ALM [24] | Rank | 589 | 237 | 164 | 98 | 983 | 11 | 53
RegL1-ALM [24] | CPU time (s) | 231.5002 | 28.2874 | 24.642 | 0.5867 | 430.5623 | 205.8264 | 284.3467

NCRPCA | RE | 1.0996e−07 | 1.2994e−07 | 2.1757e−07 | 3.6489e−08 | 5.3841e−07 | 2.3394e−07 | 2.2697e−07
NCRPCA | Rank | 8 | 3 | 4 | 1 | 2 | 3 | 4
NCRPCA | CPU time (s) | 83.837 | 61.4279 | 39.7935 | 1.0232 | 128.5847 | 22.719 | 54.980

GOSUS [19] | RE | 7.2312e−02 | 5.2562e−03 | 4.2231e−02 | 3.231e−01 | 3.2548e−00 | 4.4423e−03 | 2.4323e−02
GOSUS [19] | Rank | 6 | 5 | 6 | 3 | 18 | 5 | 6
GOSUS [19] | CPU time (s) | 2.2312 | 2.8270 | 1.2432 | 1.4058 | 13.7578 | 4.6285 | 5.1747

SLM [22] | RE | 9.6320e−08 | 6.2156e−08 | 7.5632e−09 | 6.5432e−08 | 4.5621e−08 | 7.3230e−08 | 2.3120e−08
SLM [22] | Rank | 14 | 5 | 8 | 2 | 3 | 2 | 2
SLM [22] | CPU time (s) | 29.2312 | 2.3210 | 10.2354 | 3.3421 | 38.2654 | 13.6218 | 24.6654

NNCRPCA | RE | 9.990e−08 | 7.6623e−07 | 3.2243e−09 | 2.5643e−08 | 3.5478e−09 | 2.7865e−08 | 3.6542e−09
NNCRPCA | Rank | 6 | 4 | 3 | 1 | 2 | 2 | 3
NNCRPCA | CPU time (s) | 51.4532 | 2.5426 | 14.6745 | 4.2214 | 48.5523 | 23.2246 | 35.6240

BS: Bootstrap; CF: Camouflage; LS: LightSwitch; WT: Waving Trees; TOD: Time of Day; MO: Moved Object; FA: Foreground Aperture. The bold represents the minimum value of the same index in the same subject experiment, and the underline represents the maximum value.

Table 4 indicates that the performance of the above algorithms differs considerably in the background separation experiment. The earlier algorithms, ALM and RegL1-ALM, require more iterations and longer running times, but they achieve better relative error and RMSE than GoDec and GOSUS. GoDec and GOSUS are superior in execution time and require fewer iterations. SLM performs well in all aspects of the background separation experiment, with fewer iterations and less execution time. NCRPCA is inferior to SLM in execution time and iterations. Although NNCRPCA is worse than SLM in execution time, it is better than SLM in the other aspects: its RMSE and relative error remain at a good level, and in particular the rank of the recovered matrix is kept lower than that of the other algorithms. Figure 3 shows the background separation results of the different algorithms and suggests that the background computed by the NNCRPCA algorithm is clearer.

In real life, influenced by variations in the number of objects, lighting, weather, and other conditions, the video background changes constantly. The rank of the background matrix then exceeds 1, and the separation process is relatively complex, which is beyond the reach of ordinary algorithms. In the following section, the approximate model proposed in the present study is applied to SBMnet, and the feasibility of the model is verified by numerical experiments.

The SBMnet dataset contains the many challenging scenarios described in Table 3, which impose great demands on the algorithms. Through these experiments, we can observe how the backgrounds of the scenarios are separated by the above algorithms, and the performance of the algorithms is verified again. The algorithms' quantitative indexes for the SBMnet background separation experiment are listed in Table 5, and Figure 4 shows the algorithms' renderings for the background separation. The Toscana dataset has only 6 frames, which prevents GOSUS from running.

Algorithms | Evaluation indexes | AB | BS | CP | GC | HH | To

GoDec [38] | RE | 3.6720e−05 | 3.6753e−05 | 7.6534e−04 | 9.6543e−05 | 0.5632e−04 | 0.5634e−03
GoDec [38] | Rank | 57 | 68 | 52 | 26 | 89 | 4
GoDec [38] | CPU time (s) | 13.582 | 27.146 | 4.3845 | 2.7945 | 17.9202 | 0.32478

ALM | RE | 3.8976e−06 | 9.6735e−06 | 3.7865e−06 | 3.8732e−06 | 4.7650e−06 | 8.6723e−06
ALM | Rank | 335 | 438 | 310 | 169 | 1080 | 6
ALM | CPU time (s) | 2819.9870 | 5816.2657 | 4890.2370 | 872.6000 | 7623.8876 | 520.2010

RegL1-ALM [24] | RE | 6.582e−06 | 7.4981e−06 | 4.159e−06 | 3.4045e−06 | 7.5781e−08 | 4.7509e−07
RegL1-ALM [24] | Rank | 296 | 321 | 244 | 164 | 953 | 3
RegL1-ALM [24] | CPU time (s) | 240.8317 | 111.1153 | 63.8633 | 9.3734 | 458.8552 | 2.4931

NCRPCA | RE | 6.9508e−10 | 2.0525e−09 | 2.128e−09 | 1.5526e−10 | 7.9364e−10 | 1.8266e−11
NCRPCA | Rank | 25 | 7 | 13 | 2 | 2 | 3
NCRPCA | CPU time (s) | 30.0727 | 15.4769 | 8.8945 | 5.0345 | 82.1707 | 0.28524

GOSUS [19] | RE | 3.45e−01 | 1.69e−01 | 2.32e+00 | 3.05e−01 | 2.71e−01 | NaN
GOSUS [19] | Rank | 237 | 308 | 209 | 153 | 389 | NaN
GOSUS [19] | CPU time (s) | 18.298 | 6.915 | 4.6814 | 3.7101 | 14.7244 | NaN

SLM [22] | RE | 3.8908e−08 | 7.6754e−09 | 8.6543e−08 | 8.3276e−07 | 8.3326e−08 | 0.6732e−07
SLM [22] | Rank | 48 | 59 | 43 | 32 | 69 | 4
SLM [22] | CPU time (s) | 23.3612 | 48.3233 | 40.6398 | 42.1232 | 46.2219 | 3.2134

NNCRPCA | RE | 5.7823e−10 | 7.6765e−11 | 3.6745e−11 | 5.7628e−12 | 6.3224e−10 | 9.7862e−12
NNCRPCA | Rank | 2 | 4 | 4 | 8 | 18 | 3
NNCRPCA | CPU time (s) | 28.3789 | 56.2231 | 47.3897 | 49.2238 | 52.5487 | 6.1020

AB: Advertisement Board; BS: Bus Station; CP: Camera Parameter; GC: Group Campus; HH: highway; To: Toscana.

According to Table 5, the performance of NNCRPCA is better than that of the other algorithms. In terms of relative error and RMSE, the results of NNCRPCA are lower than or close to those of SLM and lower than those of the other algorithms. NNCRPCA also achieves a lower rank approximation than the other algorithms, including SLM, although its CPU time remains inferior to that of some algorithms. Figure 4 shows the background separation results of the different algorithms and suggests that the background computed by the NNCRPCA algorithm is clearer.

Tables 4 and 5 suggest that the NNCRPCA algorithm achieves prominent experimental results for RE and RMSE in the background separation application. The NNCRPCA algorithm replaces the existing convex approximation, which employs the nuclear norm as a surrogate for the rank of the low-rank matrix, with a more accurate nonconvex rank approximation function. In the application of background separation, its effect exceeds that of the nuclear norm, and it is significantly improved in time and rank. Besides, the improvement in RMSE and relative error is also the result of adopting a better rank approximation. The low rank achieved by the NNCRPCA algorithm demonstrates that the low-rank matrix extracting the background part of the image is more accurate, so the extraction of the sparse part in the experiment is more complete. As revealed by Figures 3 and 4, the low-rank part extracted by our algorithm is clearer, and the information of the sparse part is more complete.

4. Conclusions

A novel algorithm for robust principal component analysis based on a nonconvex approximate rank function was proposed. Unlike previous models, a novel nonconvex approximate rank function acted as the rank approximation of the low-rank matrix. Subsequently, the robust principal component analysis model based on this nonconvex rank approximation function was established, and the NNCRPCA algorithm based on the alternating direction method was developed. Next, numerical experiments were performed on Wallflower as well as SBMnet. From the perspective of image quality, as well as the main quantitative indexes (e.g., running time, number of iterations, RMSE, and RE), the algorithm effectively improved the image background separation effect. Our major contributions are the following three aspects:
(1) A novel nonconvex rank approximation function was constructed. Different from the conventional methods in which the nuclear norm acts as the convex approximation, the new rank approximation function approximates the rank of the low-rank matrix very well; the construction and solution of the novel robust principal component analysis algorithm were achieved, and the convergence of the algorithm was proved theoretically.
(2) The alternating direction method was combined with the difference-of-convex technique. Since the objective function of the optimization problem to be solved is the combination of a convex function and a nonconvex function, the difference-of-convex decomposition is an effective technique for solving this type of optimization problem.
(3) The low-rank matrix recovery model was applied to background separation, and the experimental results demonstrated the prospect of applying the nonconvex approximation method to this problem. Moreover, in terms of the data processing mode, the background separation performed in the present study requires further study.
With the continuous increase in the data size of video datasets, the model and algorithm proposed in this study will exhibit broader application prospects.
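For readers who want a runnable starting point, the classical nuclear-norm RPCA baseline (the ALM method compared against above) can be sketched as below. NNCRPCA replaces the singular-value soft-thresholding step with its nonconvex rank approximation; that exact update is not reproduced here, so this is only the convex baseline:

```python
import numpy as np

def soft_threshold(X, tau):
    """Entrywise shrinkage operator used for the sparse (foreground) part."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca_ialm(D, lam=None, tol=1e-7, max_iter=500):
    """Inexact ALM for min ||L||_* + lam * ||S||_1  s.t.  D = L + S."""
    m, n = D.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = 1.25 / np.linalg.norm(D, 2)   # penalty parameter
    Y = np.zeros_like(D)               # Lagrange multiplier
    S = np.zeros_like(D)
    L = np.zeros_like(D)
    for _ in range(max_iter):
        # Low-rank update: singular-value thresholding (the step NNCRPCA
        # replaces with its nonconvex rank-approximation shrinkage).
        U, s, Vt = np.linalg.svd(D - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # Sparse update: entrywise soft-thresholding.
        S = soft_threshold(D - L + Y / mu, lam / mu)
        # Dual update and penalty growth.
        R = D - L - S
        Y = Y + mu * R
        mu = min(mu * 1.5, 1e10)
        if np.linalg.norm(R) <= tol * np.linalg.norm(D):
            break
    return L, S
```

Applied to a stretch matrix D, L estimates the background and S the moving foreground.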

Data Availability

The data used to support the findings of the study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.


Acknowledgments

Some of the authors of this publication are also working on these related projects: (1) Higher Vocational Education Teaching Fusion Production Integration Platform Construction Projects of Jiangsu Province under grant no. 2019(26); (2) Natural Science Fund of Jiangsu Province under grant no. BK20131097; (3) "Qin Lan Project" teaching team in Colleges and Universities of Jiangsu Province under grant no. 2017(15); (4) High Level of Jiangsu Province Key Construction Project funding under grant no. 2017(17). This research was supported by the "Geometry problem geometry" project (National Natural Science Foundation of China, grant no. 61073086).


References

  1. D. Xia, "Confidence region of singular subspaces for low-rank matrix regression," IEEE Transactions on Information Theory, vol. 65, no. 11, pp. 7437–7459, 2019.
  2. H. Yuan, J. Li, and L. L. Lai, "Low-rank matrix regression for image feature extraction and feature selection," Information Sciences, vol. 522, pp. 214–226, 2020.
  3. P. Wang, C. D. Lin, C. Lin, X. Yang, and S. Xiong, "Low-rank and sparse matrix recovery from noisy observations via 3-block ADMM algorithm," Journal of Applied Analysis & Computation, vol. 10, no. 3, pp. 1024–1037, 2020.
  4. H. Hotelling, "Analysis of a complex of statistical variables into principal components," Journal of Educational Psychology, vol. 24, no. 6, pp. 417–441, 1933.
  5. I. Jolliffe, Principal Component Analysis, Springer, Berlin, Germany, 1986.
  6. D. Efimova, Encyclopedia of Social Network Analysis and Mining, Springer, Berlin, Germany, 2014.
  7. V. Chandrasekaran, S. Sanghavi, P. Parrilo et al., "Sparse and low-rank matrix decompositions," in Proceedings of the 47th Annual Allerton Conference on Communication, Control, and Computing, pp. 962–967, IEEE, Monticello, IL, USA, September 2009.
  8. J. Wright, A. Ganesh, S. Rao et al., "Robust principal component analysis: exact recovery of corrupted low-rank matrices via convex optimization," in Proceedings of Advances in Neural Information Processing Systems (NIPS), pp. 2080–2088, Vancouver, Canada, December 2009.
  9. D. L. Donoho, "De-noising by soft-thresholding," IEEE Transactions on Information Theory, vol. 41, no. 3, pp. 613–627, 1995.
  10. D. L. Donoho, "For most large underdetermined systems of linear equations the minimal ℓ1-norm solution is also the sparsest solution," Communications on Pure and Applied Mathematics, vol. 59, no. 6, pp. 797–829, 2006.
  11. Q. Sun, S. Xiang, and J. Ye, "Robust principal component analysis via capped norms," in Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 311–319, Chicago, IL, USA, August 2013.
  12. C. Lu, J. Tang, S. Yan, and Z. Lin, "Nonconvex nonsmooth low rank minimization via iteratively reweighted nuclear norm," IEEE Transactions on Image Processing, vol. 25, no. 2, pp. 829–839, 2016.
  13. C. Gao, N. Wang, Q. Yu et al., "A feasible nonconvex relaxation approach to feature selection," in Proceedings of the 25th AAAI Conference on Artificial Intelligence (AAAI), pp. 356–361, San Francisco, CA, USA, August 2011.
  14. J. Fan and R. Li, "Variable selection via nonconcave penalized likelihood and its oracle properties," Journal of the American Statistical Association, vol. 96, no. 456, pp. 1348–1360, 2001.
  15. Z. Kang, C. Peng, J. Cheng et al., "Logdet rank minimization with application to subspace clustering," Computational Intelligence and Neuroscience, vol. 2015, 2015.
  16. Z. Kang, C. Peng, and Q. Cheng, "Robust PCA via nonconvex rank approximation," in Proceedings of the IEEE International Conference on Data Mining (ICDM), pp. 211–220, IEEE, Atlantic City, NJ, USA, November 2015.
  17. Y. Xie, Y. Qu, D. Tao et al., "Hyperspectral image restoration via iteratively regularized weighted Schatten p-norm minimization," IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 8, pp. 4642–4659, 2016.
  18. H. Yu and C. Miao, "General thresholding representation for regularization," IEEE Transactions on Image Processing, vol. 24, no. 12, pp. 5455–5468, 2015.
  19. J. Xu, V. Ithapu, L. Mukherjee, J. M. Rehg, and V. Singh, "GOSUS: Grassmannian online subspace updates with structured-sparsity," in Proceedings of the 2013 IEEE International Conference on Computer Vision, pp. 3376–3383, Sydney, Australia, December 2013.
  20. Z. Gao, L. F. Cheong, and Y. X. Wang, "Block-sparse RPCA for salient motion detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 10, pp. 1975–1987, 2014.
  21. X. Liu, G. Zhao, J. Yao, and S. K. Jung, "Background subtraction based on low-rank and structured sparse decomposition," IEEE Transactions on Image Processing, vol. 24, no. 8, pp. 2502–2514, 2015.
  22. S. Javed, L.-F. Cheong, Y.-X. Wang et al., "Spatiotemporal low-rank modeling for complex scene background initialization," IEEE Transactions on Circuits and Systems for Video Technology, vol. 28, no. 6, pp. 1315–1329, 2018.
  23. T. Zhang, "Analysis of multi-stage convex relaxation for sparse regularization," Journal of Machine Learning Research, vol. 11, pp. 1081–1107, 2010.
  24. G. Liu, Z. Lin, S. Yan et al., "Robust recovery of subspace structures by low-rank representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 171–184, 2013.
  25. D. P. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods, Academic Press, Cambridge, MA, USA, 1982.
  26. J.-F. Cai, E. J. Candès, and Z. Shen, "A singular value thresholding algorithm for matrix completion," SIAM Journal on Optimization, vol. 20, no. 4, pp. 1956–1982, 2010.
  27. J. Yang, W. Yin, Y. Zhang, and Y. Wang, "A fast algorithm for edge-preserving variational multichannel image restoration," SIAM Journal on Imaging Sciences, vol. 2, no. 2, pp. 569–592, 2009.
  28. D. Geman and C. Yang, "Nonlinear image recovery with half-quadratic regularization," IEEE Transactions on Image Processing, vol. 4, no. 7, pp. 932–946, 1995.
  29. C.-H. Zhang, "Hyperspectral image denoising with cubic total variation model," ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 7, pp. 95–98, 2012.
  30. C.-H. Zhang, "Nearly unbiased variable selection under minimax concave penalty," The Annals of Statistics, vol. 38, no. 2, pp. 894–942, 2010.
  31. J. Trzasko and A. Manduca, "Highly undersampled magnetic resonance image reconstruction via homotopic ℓ0-minimization," IEEE Transactions on Medical Imaging, vol. 28, no. 1, pp. 106–121, 2009.
  32. T. Zhang, "Analysis of multi-stage convex relaxation for sparse regularization," Journal of Machine Learning Research, vol. 11, pp. 1081–1107, 2010.
  33. H. Othman and S. E. Qian, "Noise reduction of hyperspectral imagery using hybrid spatial-spectral derivative-domain wavelet shrinkage," IEEE Transactions on Geoscience and Remote Sensing, vol. 44, no. 2, pp. 397–408, 2006.
  34. J. H. Friedman, "Fast sparse regression and classification," International Journal of Forecasting, vol. 28, no. 3, pp. 722–738, 2012.
  35. C. Y. Lu, J. H. Tang, S. C. Yan et al., "Nonconvex nonsmooth low rank minimization via iteratively reweighted nuclear norm," IEEE Transactions on Image Processing, vol. 25, no. 2, pp. 829–839, 2016.
  36. P. D. Tao and H. A. Thi, "Convex analysis approach to D.C. programming: theory, algorithms and applications," Acta Mathematica Vietnamica, vol. 22, no. 1, pp. 289–355, 1997.
  37. P. D. Tao, "Duality in D.C. (difference of convex functions) optimization. Subgradient methods," Trends in Mathematical Optimization, Birkhäuser, Basel, Switzerland, 1998.
  38. T. Zhou and D. Tao, "GoDec: randomized low-rank & sparse matrix decomposition in noisy case," in Proceedings of the 28th International Conference on Machine Learning (ICML), Bellevue, WA, USA, June 2011.

Copyright © 2020 E. Zhu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
