Abstract

The constrained rank minimization problem has applications in many fields, including machine learning, control, and signal processing. In this paper, we consider the convex constrained rank minimization problem. By introducing a new variable and penalizing the equality constraint in the objective function, we use the closed-form solution of the resulting subproblem to reformulate the convex objective with a rank constraint as a difference of convex functions, that is, as a DC program. A stepwise linear approximative algorithm is provided for solving the reformulated model. The performance of our method is tested by applying it to affine rank minimization problems and max-cut problems. Numerical results demonstrate that the method is effective and has high recoverability on affine rank minimization problems, and the results on max-cut show that the method is feasible: it provides better lower bounds and lower rank solutions than an improved approximation algorithm based on semidefinite programming, and its results are close to those of the latest research.

1. Introduction

In recent decades, with the growth of data acquisition systems and data services, the explosion of data has posed challenges to storage, transmission, processing, and device design. The theory of sparse recovery reveals an outstanding ability to sense large-scale, high-dimensional data. Recently, a new theory called compressed sensing (or compressive sampling, CS) has appeared and has been celebrated in the signal processing community. This approach acquires a signal while compressing it: its sampling frequency can be lower than the Nyquist rate, which makes the collection of high resolution signals possible. One noticeable merit of this approach is its ability to merge traditional data collection and data compression into a single step when the signal admits a sparse representation. As a consequence, the sampling frequency, the time and computational cost of data processing, and the expense of data storage and transmission are all greatly reduced, leading signal processing into a new age. Like CS, low rank matrix theory can also solve signal reconstruction problems of both theoretical and practical interest. It likewise applies dimension reduction to unknown multidimensional signals and then uses sparsity in a broader sense: the original high-dimensional signal array is reconstructed completely or approximately from a small number of values obtained by sparse (dimension-reducing) sampling. Considering that both theories rely on the intrinsic sparsity of signals, we generally view them as sparse recovery problems. The main goal of sparse recovery is to use a small number of linear measurements to recover a high-dimensional sparse signal. Its core includes three steps: sparse representation of the signal, linear dimension-reducing measurement, and nonlinear reconstruction of the signal. These three parts are closely related, and each has a significant impact on the quality of signal reconstruction. The nonlinear reconstruction step, in particular, can be seen as a special case of the rank minimization problem considered here.

Constrained rank minimization problems, which aim to find low rank solutions of a system, have attracted a great deal of attention in recent years. They are widely applicable in a variety of fields including machine learning [1, 2], control [3, 4], Euclidean embedding [5], image processing [6], and finance [7, 8], to name but a few. There are some typical examples of constrained rank minimization problems. One is the affine matrix rank minimization problem [5, 9, 10]
min rank(X) s.t. 𝒜(X) = b, (1)
where X ∈ ℝ^{m×n} is the decision variable and the linear map 𝒜: ℝ^{m×n} → ℝ^p and the vector b ∈ ℝ^p are known. Problem (1) includes matrix completion and compressed sensing as special cases. Matrix completion is widely used in collaborative filtering [11, 12], system identification [13, 14], sensor networks [15, 16], image processing [17, 18], sparse channel estimation [19, 20], spectrum sensing [21, 22], and multimedia coding [23, 24]. Another is the max-cut problem [25] in combinatorial optimization,
min -(1/4)⟨L, X⟩ s.t. diag(X) = e, X ⪰ 0, rank(X) = 1, (2)
where e = (1, …, 1)^T, X = xx^T with x ∈ {-1, 1}^n, L = Diag(We) - W is the graph Laplacian, and W is the weight matrix; and so on. As a matter of fact, these problems can be written as a common convex constrained rank minimization model,
min f(X) s.t. rank(X) ≤ r, X ∈ Ω, (3)
where f is a continuously differentiable convex function, rank(X) is the rank of a matrix X, r is a given integer, and Ω ⊆ ℝ^{m×n} is a closed (unitarily invariant) convex set. According to [26], a set 𝓜 ⊆ ℝ^{m×n} is a unitarily invariant set if
{UXV : U ∈ 𝒰^m, V ∈ 𝒰^n, X ∈ 𝓜} = 𝓜, (4)
where 𝒰^m denotes the set of all unitary matrices in ℝ^{m×m}. Various types of methods have been proposed to solve (3) or its special cases with different types of Ω.

When Ω is the whole space ℝ^{m×n}, (3) is the unconstrained rank minimization problem. One approach is to solve this case by convex relaxation of the rank function. Cai et al. [9] proposed the singular value thresholding (SVT) algorithm; Ma et al. [10] proposed the fixed point continuation (FPC) algorithm. Another approach is to handle the rank constraint directly. Jain et al. [27] proposed the simple and efficient Singular Value Projection (SVP) algorithm based on projected gradients. Haldar and Hernando [28] proposed an alternating least squares approach. Keshavan et al. [29] proposed an algorithm based on optimization over a Grassmann manifold.

When Ω is a general convex set, (3) is a convex constrained rank minimization problem. One approach is to replace the rank constraint with other constraints, such as trace or norm constraints. Based on this, several heuristic methods have been proposed in the literature; for example, Nesterov et al. [30] proposed interior-point polynomial algorithms; Weimin [31] proposed an adaptive seminuclear norm regularization approach; in addition, semidefinite programming (SDP) has also been applied; see [32]. Another approach is to handle the rank constraint directly. Grigoriadis and Beran [33] proposed alternating projection algorithms for linear matrix inequality problems. Mohseni et al. [24] proposed a penalty decomposition (PD) method based on penalty functions. Gao and Sun [34] recently proposed a majorized penalty approach. Burer et al. [35] proposed a nonlinear programming (NLP) reformulation approach.

In this paper, we consider problem (3). We introduce a new variable and penalize the equality constraint in the objective function. With this technique, we obtain a closed-form solution of the inner subproblem and use it to reformulate the objective function of (3) as a difference of convex functions; (3) is thus reformulated as a DC program, which can be solved via a linear approximation method. A function is called DC if it can be represented as the difference of two convex functions, and mathematical programming problems dealing with DC functions are called DC programming problems. Our method differs from the original PD method [26] in two respects: first, we penalize only the equality constraint in the objective function and keep the other constraints, while PD penalizes everything except the rank constraint, including equality and inequality constraints; second, each subproblem of PD is approximately solved by a block coordinate descent method, while we approximate problem (3) by a sequence of convex optimization problems. Compared with the PD method, our method uses the closed-form solution to remove the rank constraint, while PD has to deal with the rank constraint in each iteration; the rank constraint is nonconvex and discontinuous, which is not easy to handle. Because the remaining constraints are convex, the reformulated program can be solved by solving a series of convex programs: in our method, the last convex function is replaced by its linearization, which turns each subproblem into a convex program. We test the performance of our method by applying it to affine rank minimization problems and max-cut problems. Numerical results show that the algorithm is feasible and effective.

The rest of this paper is organized as follows. In Section 1.1, we introduce the notation that is used throughout the paper. Our method is shown in Section 2. In Section 3, we apply it to some practical problems to test its performance.

1.1. Notations

In this paper, the symbol ℝ^n denotes the n-dimensional Euclidean space, and the set of all m × n matrices with real entries is denoted by ℝ^{m×n}. Given matrices X and Y in ℝ^{m×n}, the standard inner product is defined by ⟨X, Y⟩ = tr(X^T Y), where tr(·) denotes the trace of a matrix. The Frobenius norm of a real matrix X is defined as ||X||_F = (tr(X^T X))^{1/2}. ||X||_* denotes the nuclear norm of X, that is, the sum of the singular values of X. The rank of a matrix X is denoted by rank(X). We use I_n to denote the identity matrix of dimension n. Throughout this paper, we always assume that the singular values are arranged in nonincreasing order, that is, σ_1(X) ≥ σ_2(X) ≥ ⋯ ≥ 0. ∂f denotes the subdifferential of the function f. We denote the adjoint of the linear map 𝒜 by 𝒜*. ||·||_(k) is used to denote the Ky Fan k-norm. Let P_Ω be the projection onto the closed convex set Ω.

2. An Efficient Algorithm Based on DC Programming

In this section, we first reformulate problem (3) as a DC programming problem over the convex constraint set, which can be solved via a stepwise linear approximation method. Then, convergence is analyzed.

2.1. Model Transformation

Let Y = X. Problem (3) is rewritten as
min f(X) s.t. X = Y, rank(Y) ≤ r, X ∈ Ω. (5)
Adopting the penalty function method to move the equality constraint into the objective and choosing a proper penalty parameter ρ > 0, (5) can be reformulated as
min f(X) + ρ||X - Y||_F^2 s.t. rank(Y) ≤ r, X ∈ Ω. (6)
Solving for Y with X fixed, (6) can be treated as
min_{X ∈ Ω} f(X) + ρ min_{rank(Y) ≤ r} ||X - Y||_F^2. (7)
In fact, given the singular value decomposition X = UΣV^T with singular values σ_1(X) ≥ σ_2(X) ≥ ⋯, where U and V are orthogonal matrices, the inner minimization with respect to Y with X fixed in (7), that is, min_{rank(Y) ≤ r} ||X - Y||_F^2, has the closed-form solution
Y* = Σ_{i=1}^{r} σ_i(X) u_i v_i^T, (8)
where u_i and v_i are the ith columns of U and V. Thus, by this closed-form solution and the unitary invariance of the Frobenius norm, we have
||X - Y*||_F^2 = Σ_{i>r} σ_i^2(X) = ||X||_F^2 - Σ_{i=1}^{r} σ_i^2(X). (9)
Substituting the above into problem (7), problem (3) is reformulated as
min_{X ∈ Ω} f(X) + ρ(||X||_F^2 - Σ_{i=1}^{r} σ_i^2(X)). (10)
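To make the closed-form step concrete, the following short MATLAB snippet (an illustration only; the test matrix, the rank, and the variable names are our own choices and not taken from the paper) checks numerically that the truncated SVD solves the inner subproblem and that its residual equals the tail energy appearing in (9) and (10):

% Illustrative check of the closed-form solution (8)-(9): the best rank-r
% approximation of X keeps the r largest singular values, and the squared
% residual equals ||X||_F^2 - sum_{i<=r} sigma_i(X)^2.
X = randn(46, 81);                              % an arbitrary test matrix
r = 5;
[U, S, V] = svd(X, 'econ');
Y = U(:, 1:r) * S(1:r, 1:r) * V(:, 1:r)';       % closed-form minimizer Y*
s = diag(S);
residual    = norm(X - Y, 'fro')^2;             % ||X - Y*||_F^2
tail_energy = norm(X, 'fro')^2 - sum(s(1:r).^2);
fprintf('residual = %.6e, tail energy = %.6e\n', residual, tail_energy);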

Clearly, f(X) + ρ||X||_F^2 is a convex function of X. Define the function Γ(X) = Σ_{i=1}^{r} σ_i^2(X) for X ∈ ℝ^{m×n}. Next, we study the properties of Γ. Problem (3) would be reformulated as a DC programming problem with convex constraints if Γ were a convex function, and such a problem can be solved via a stepwise linear approximation method. In fact, this does hold, and Γ equals a trace function.

Theorem 1. Let D = diag(I_r, 0), where 0 denotes a zero matrix whose dimension is determined depending on the context; then, for any X ∈ ℝ^{m×n}, the following hold: (1) Γ(X) = Σ_{i=1}^{r} σ_i^2(X) is a convex function; (2) Γ(X) equals a trace function, namely Γ(X) = max{tr(D U^T X^T X U) : U ∈ 𝒰^n}.

Proof. (1) Since are eigenvalues of , then , where is Ky Fan -norms (defined in [36]). Clearly, is a convex function. Let . For and   we haveNote that ; thus they have the same nonzero singular values, and then holds. Together with (12), we have It is known that for arbitrary matrices , Further, That is, where and are symmetric matrices. It is also known that, for arbitrary Hermitian matrices , , the following holds (see Exercise II.1.15 in [24]) Thus Together with (13) and (16), (18) yields Hence, is a convex function.
(2) Let and is the rank of , and we have Moreoverwhere .
Thus, holds.
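The convexity claim in part (1) can also be checked empirically. The MATLAB fragment below (a sanity check written with our own notation; it is not part of the paper) samples random matrix pairs and verifies the convexity inequality for Γ(X) = Σ_{i≤r} σ_i^2(X):

% Numerical sanity check of Theorem 1(1): Gamma(X) = sum of squares of the
% r largest singular values satisfies the convexity inequality.
r = 3; m = 8; n = 10; ok = true;
for t = 1:1000
    A = randn(m, n); B = randn(m, n); lam = rand;
    sA = svd(A); sB = svd(B); sC = svd(lam*A + (1 - lam)*B);
    gA = sum(sA(1:r).^2); gB = sum(sB(1:r).^2); gC = sum(sC(1:r).^2);
    if gC > lam*gA + (1 - lam)*gB + 1e-8      % convexity violated?
        ok = false; break;
    end
end
disp(ok)   % expected output: 1 (no violation found)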

According to the above theorem, (10) can be rewritten as
min_{X ∈ Ω} [f(X) + ρ||X||_F^2] - ρΓ(X). (23)
Since the objective function of (23) is a difference of convex functions and the constraint set is closed and convex, (23) is a DC program. Next, we use the stepwise linear approximation method to solve it.

2.2. Stepwise Linear Approximative Algorithm

In this subsection, we use the stepwise linear approximation method [37] to solve the reformulated model (23). Let F(X) = f(X) + ρ||X||_F^2 - ρΓ(X) denote the objective function of model (23). The linear approximation of F at a point X^k can be defined as
F_k(X) = f(X) + ρ||X||_F^2 - ρΓ(X^k) - ⟨W^k, X - X^k⟩,
where W^k ∈ ∂(ρΓ)(X^k) (see 5.2.2 in [36]).

The stepwise linear approximative algorithm for solving model (23) can be described as follows.

Given an initial matrix X^0, a penalty parameter ρ > 0, the rank bound r, and a tolerance ε > 0, set k = 0.

Step 1. Compute the singular value decomposition of X^k, X^k = U^k Σ^k (V^k)^T, and form a subgradient W^k ∈ ∂(ρΓ)(X^k) from the r leading singular values and singular vectors of X^k.

Step 2. Update X^{k+1} by solving the convex subproblem
X^{k+1} ∈ argmin_{X ∈ Ω} f(X) + ρ||X||_F^2 - ⟨W^k, X⟩. (26)

Step 3. If the stopping criterion is satisfied, stop; else set k = k + 1, update the penalty parameter if required, and go to Step 1.

Remark. (1) Subproblem (26) is a convex program, and efficient methods for convex programming are plentiful and come with solid theoretical guarantees. (a) When Ω is the whole space, closed-form solutions can be obtained; for example, when solving model (1), the iterates are obtained from the first-order optimality conditions of (26). (b) When Ω is a general convex set, a closed-form solution is hard to obtain, but subproblem (26) can be solved by convex optimization software (such as CVX and CPLEX). According to the particular closed convex constraint set, a suitable convex programming algorithm can be chosen and combined with our method; the toolboxes listed in this paper are just one possible choice. (2) The stopping rule adopted here is that the change between successive iterates falls below some small enough tolerance.
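As an illustration of the overall iteration, the MATLAB sketch below instantiates the algorithm for the matrix completion special case, in which subproblem (26) has an entrywise closed-form solution. It is a minimal sketch under our own assumptions (a data-fit term ||P_Theta(X - M)||_F^2, Ω equal to the whole space, a fixed penalty parameter rho, and a fixed rank bound r); it is not the authors' code.

% Stepwise linear approximative (DC) iteration, matrix completion case.
% M     : matrix holding the observed entries (other entries arbitrary)
% Theta : logical mask of observed entries
function X = dc_rank_mc(M, Theta, r, rho, maxit, tol)
X = zeros(size(M));
X(Theta) = M(Theta);                                 % simple initialization
for k = 1:maxit
    % Step 1: SVD of the current iterate; subgradient of the concave part
    [U, S, V] = svd(X, 'econ');
    W = 2 * rho * (U(:, 1:r) * S(1:r, 1:r) * V(:, 1:r)');
    % Step 2: closed-form solution of the linearized convex subproblem
    %   min_X ||P_Theta(X - M)||_F^2 + rho*||X||_F^2 - <W, X>
    Xnew = W / (2 * rho);                            % unobserved entries
    Xnew(Theta) = (2 * M(Theta) + W(Theta)) / (2 + 2 * rho);   % observed entries
    % Step 3: stop when successive iterates are close
    if norm(Xnew - X, 'fro') <= tol * max(1, norm(X, 'fro'))
        X = Xnew; return;
    end
    X = Xnew;
end
end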

2.3. Convergence Analysis

In this subsection, we analyze the convergence properties of the sequence generated by the above algorithm. Before proving the main convergence result, we give the following lemma, which is analogous to Lemma 1 in [10].

Lemma 2. For any X and Y the following bound holds, where X_r denotes the approximation of X obtained by keeping its first r singular values (i.e., its best rank-r approximation).

Proof. Without loss of generality, we assume the singular value decompositions with the stated structure, where the factors are as defined above. Note that the factors involved are orthogonal matrices; we have the corresponding identities, and we note the analogous factorizations, where the new factors are also orthogonal matrices.
Next we estimate an upper bound. According to Theorem 7.4.9 in [38], an orthogonal matrix is a maximizing matrix for the trace maximization problem if and only if the corresponding product is a positive semidefinite matrix. Thus, the three terms above achieve their maxima if and only if the corresponding products are all positive semidefinite. It is known that (32) holds when the matrix in question is positive semidefinite. Applying (32) to the above three terms and assuming, without loss of generality, the stated ordering, we obtain the bound, which implies that the claim of the lemma holds.

Next, we show that the objective values along the iterations of the stepwise linear approximative algorithm are nonincreasing.

Theorem 3. Let {X^k} be the iterative sequence generated by the stepwise linear approximative algorithm. Then {F(X^k)} is monotonically nonincreasing.

Proof. Since X^{k+1} minimizes the linearized function F_k, we have F_k(X^{k+1}) ≤ F_k(X^k) = F(X^k), and W^k is in the subdifferential of ρΓ at X^k, which implies ρΓ(X) ≥ ρΓ(X^k) + ⟨W^k, X - X^k⟩ for all X. Taking X = X^{k+1} in the above inequality, we have F(X^{k+1}) ≤ F_k(X^{k+1}). Therefore, F(X^{k+1}) ≤ F_k(X^{k+1}) ≤ F_k(X^k) = F(X^k). Consequently, the objective values are nonincreasing, and we immediately obtain the conclusion as desired.

Next, we show the convergence of the iterative sequence generated by the stepwise linear approximative algorithm when solving (23). Theorem 4 establishes convergence when (23) is solved without the constraint set.

Theorem 4. Let 𝒳 be the set of stable points of problem (23) with no constraints; then the iterative sequence {X^k} generated by the stepwise linear approximative algorithm converges to some X* ∈ 𝒳 as k → ∞.

Proof. Assume that X* is any optimal solution of problem (23) with no constraints; according to the first-order optimality conditions, we have (40). Since X^{k+1} minimizes the linearized subproblem (26), we similarly have (41). Subtracting (40) from (41) gives (42).
By Lemma 2, we have the corresponding bound. Taking norms on both sides of (42) and using the triangle inequality, we obtain an estimate which implies that the sequence of distances to X* is monotonically nonincreasing when ρ is chosen suitably. We then observe that this sequence is bounded, and hence its limit exists; denote the limit point of {X^k} by X̄. Letting k → ∞ in (41), by continuity we get that the limit satisfies the optimality conditions, which implies that X̄ ∈ 𝒳. Setting X* = X̄ completes the proof.

Now, we are ready to give the convergence of the iterative sequence when solving (23) over a general closed convex set Ω.

Theorem 5. Let 𝒳 be the set of optimal solutions of problem (23); then the iterative sequence {X^k} generated by the stepwise linear approximative algorithm converges to some X* ∈ 𝒳 as k → ∞.

Proof. Generally speaking, subproblem (26) can be solved by projecting the solution of the corresponding unconstrained problem onto the closed convex set Ω; that is, the constrained iterate is the projection P_Ω of the unconstrained one. According to Theorem 4, the unconstrained sequence is monotonically nonincreasing in the sense of the preceding proof and convergent; together with the well known fact that P_Ω is a nonexpansive operator, we obtain the corresponding bound, in which the limit of the unconstrained sequence is projected onto Ω. Clearly, this projection lies in Ω.
Hence, the sequence converges, which implies that {X^k} converges to some X* ∈ 𝒳. We immediately obtain the conclusion.

3. Numerical Results

In this section, we demonstrate the performance of the method proposed in Section 2 by applying it to affine rank minimization problems (image reconstruction and matrix completion) and max-cut problems. The code used in this section is written in MATLAB, and all experiments are performed in MATLAB R2013a running on Windows 7.

3.1. Image Reconstruction Problems

In this subsection, we apply our method to one case of affine rank minimization problems, image reconstruction, which can be formulated as
min ||𝒜(X) - b||_2^2 s.t. rank(X) ≤ r, (47)
where X is the decision variable and the linear map 𝒜 and the vector b are known. Clearly, (47) is a special case of problem (3). Hence, our method can be suitably applied to (47).

Recht et al. [39] have demonstrated that linear maps sampled from certain classes of probability distributions obey the Restricted Isometry Property with high probability. There are two ingredients for a random linear map to be nearly isometric: first, it must be isometric in expectation; second, the probability of large distortions of length must be exponentially small. In our numerical experiments, we use four nearly isometric measurement ensembles. The first is the ensemble with independent, identically distributed (i.i.d.) Gaussian entries. The second has entries sampled from an i.i.d. symmetric Bernoulli distribution, which we call Bernoulli. The third has zeros in two-thirds of the entries and the rest sampled from an i.i.d. symmetric Bernoulli distribution, which we call Sparse Bernoulli. The last one has orthonormal rows, which we call Random Projection.
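For concreteness, the following MATLAB fragment sketches one way (our own construction, with assumed scalings) to generate the four nearly isometric ensembles as p-by-(m·n) matrices acting on vec(X):

% Sketch of the four measurement ensembles (assumed, standard constructions).
m = 46; n = 81; p = 1200; N = m * n;
A_gauss  = randn(p, N) / sqrt(p);                     % i.i.d. Gaussian entries
A_bern   = sign(randn(p, N)) / sqrt(p);               % i.i.d. symmetric Bernoulli (+/-1)
mask     = rand(p, N) < 1/3;                          % roughly 1/3 nonzero entries
A_sparse = sqrt(3 / p) * (mask .* sign(randn(p, N))); % sparse Bernoulli (2/3 zeros)
[Q, ~]   = qr(randn(N, p), 0);                        % economy QR: orthonormal columns
A_proj   = Q';                                        % random projection (orthonormal rows)
X0 = randn(m, n);                                     % placeholder for the test image
b  = A_gauss * X0(:);                                 % measurements b = A(vec(X0))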

Now we conduct numerical experiments to test the performance of our method for solving (47). To illustrate the scaling of low rank recovery for a particular matrix, consider the MIT logo presented in [39]. The image has 46 rows and 81 columns (the total number of pixels is 46 × 81 = 3726). Since the logo only has 5 distinct rows, it has rank 5. We sample it using Gaussian i.i.d., Bernoulli i.i.d., Sparse Bernoulli i.i.d., and Random Projection measurement matrices with the number of linear constraints ranging between 700 and 2400. We declared the MIT logo to be recovered if the reconstruction error was small enough that SNR ≥ 60 dB. Figures 1 and 2 show the results of our numerical experiments.
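Since the recovery criterion is stated in terms of SNR, we record one standard definition (an assumption on our part; the paper does not spell out its formula), under which SNR ≥ 60 dB corresponds to a relative Frobenius error of at most 1e-3:

% Recovery SNR in dB between the true image X0 and the recovered image Xrec
% (hypothetical variable names). SNR >= 60 dB <=> ||Xrec-X0||_F/||X0||_F <= 1e-3.
function s = recovery_snr(X0, Xrec)
s = 20 * log10(norm(X0, 'fro') / norm(Xrec - X0, 'fro'));
end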

Figure 1 plots the SNR curves for the four measurement ensembles mentioned above as the number of linear constraints varies. The values on the x-axis represent the number of linear constraints, that is, the number of measurements, while those on the y-axis represent the Signal to Noise Ratio (SNR) between the true original image and the image recovered by our method. Noticeably, Gauss and Sparse Binary successfully reconstruct the MIT logo from about 1000 constraints, while Random Projection requires about 1200 constraints; Binary requires the fewest constraints, about 950. The Gauss and Sparse Binary measurements reach their maximal SNR of 220 dB at around 1500 constraints; the Binary measurements reach their maximal SNR of 180 dB at around 1400 constraints. Random Projection has a larger phase transition point but the best recovery error. We observe a sharp transition to perfect recovery near 1000 measurements. In Figure 2, which shows the results of the nuclear norm heuristic method of [39], a convex relaxation, the sharp transition to perfect recovery occurs near 1200 measurements. The comparison shows that our method has better performance in that it can recover an image from fewer constraints; namely, it needs fewer samples. In addition, we observe that the SNR of the nuclear norm heuristic method reaches 120 dB when the number of measurements is approximately 1500, while our method needs only about 1200 measurements, a decrease of 20% compared with 1500. If the number of measurements is 1500, the accuracy of our method is higher: among the precisions of the four measurement matrices, the smallest is that of Random Projection, yet even this value is higher than 120 dB. This illustrates that our method is more accurate. Observing the curves between 1200 and 1500 measurements, we find that our curves rise smoothly compared with the curve in the same interval in Figure 2, which shows that our method is stable for image reconstruction.

Figure 3 shows the recovered images under the four measurement matrices with 800, 900, and 1000 linear constraints, and Figure 4 shows the images recovered by our method and by SeDuMi. We chose this interior-point solver because it yields the highest accuracy. Distinctly, the images recovered by SeDuMi show some shadowing, while ours are clear. Moreover, as far as time is concerned, our method is faster, and its computational time is not very sensitive to the number of measurements compared with SeDuMi. The reason is that closed-form solutions are available when we solve this problem. Comparisons are shown in Table 1.

3.2. Matrix Completion Problems

In this subsection, we apply our method to matrix completion problems, which can be reformulated as
min_X Σ_{(i,j) ∈ Θ} (X_{ij} - M_{ij})^2 s.t. rank(X) ≤ r, (48)
where X and M are both m × n matrices and Θ is a subset of index pairs (i, j). Up to now, numerous methods have been proposed to solve the matrix completion problem (see, e.g., [9, 10, 29, 31]). It is easy to see that problem (48) is a special case of the general rank minimization problem (3).

In our experiments, we first created random matrices M_L ∈ ℝ^{m×r} and M_R ∈ ℝ^{n×r} with i.i.d. Gaussian entries and then set M = M_L M_R^T. We sampled a subset Θ of entries uniformly at random. We compare against the singular value thresholding (SVT) method [9], FPC, FPCA [10], and the OptSpace (OPT) method [29], using a common stopping criterion. The maximum number of iterations is no more than 1500 in all methods; other parameters are set to their defaults. We report results averaged over 10 runs.

We use SR to denote the sampling ratio (the number of sampled entries divided by mn) and FR to represent the dimension of the set of rank-r matrices, r(m + n - r), divided by the number of measurements. NS and AT denote the number of matrices that are recovered successfully and the average time (in seconds) over the successfully solved instances, respectively. RE represents the average relative error of the successfully recovered matrices. We use the relative error ||X* - M||_F / ||M||_F to estimate the closeness of X* to M, where X* is the optimal solution to (48) produced by our method. We declared M to be recovered if the relative error was less than a prescribed tolerance.
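The quantities just defined can be computed as in the following MATLAB fragment, which also shows the assumed random test-problem construction (our own sketch; sizes and names are illustrative):

% Random rank-r test matrix, uniform sampling, and the reported quantities.
m = 100; n = 100; r = 10;
ML = randn(m, r); MR = randn(n, r);
M  = ML * MR';                                 % rank-r test matrix
p  = round(0.5 * m * n);                       % number of sampled entries
idx = randperm(m * n, p);                      % entries sampled uniformly at random
Theta = false(m, n); Theta(idx) = true;
SR = p / (m * n);                              % sampling ratio
FR = r * (m + n - r) / p;                      % dof of rank-r matrices / measurements
% after computing a solution Xsol, e.g., with dc_rank_mc above:
% RE = norm(Xsol - M, 'fro') / norm(M, 'fro'); % relative error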

Table 2 shows that our method performs close to OptSpace in the NS column, and both perform better than SVT, FPC, and FPCA. However, SVT, FPC, and FPCA cannot obtain the optimal matrix within 1500 iterations, which shows that our method has an advantage in finding the neighborhood of the optimal solution; it does not mean that these algorithms are invalid, only that the maximum number of iterations is limited to 1500. When the rank of the matrix is large, the average time of our algorithm is less than that of OptSpace. In some cases, the relative errors of DC are the smallest. When the dimension of the matrix becomes larger, the DC method has better NS performance and relative errors similar to those of OptSpace. In terms of time, our method is slower than OptSpace but faster than FPC and SVT. Tables 3 and 4 show these results. In conclusion, our method performs better on "hard" problems, that is, problems involving matrices that are not of very low rank, according to [10].

Table 5 gives the numerical results for problems with large dimensions. In these experiments, the maximum number of iterations is the default of each compared method. We observe that our method recovers the problems as successfully as the other four methods, although it becomes slower as the dimension increases. However, our method is faster than FPC. Though SVT, FPCA, and OptSpace outperform our method in terms of speed, they are methods developed specifically for matrix completion, while our method can be applied to a broader class of problems. To demonstrate this, we next apply our method to max-cut problems.

3.3. Max-Cut Problems

In this subsection, we apply our method to max-cut problems, which can be formulated as follows:
min -(1/4)⟨L, X⟩ s.t. diag(X) = e, X ⪰ 0, rank(X) = 1, (51)
where e = (1, 1, …, 1)^T, X = xx^T with x ∈ {-1, 1}^n, L = Diag(We) - W is the graph Laplacian, and W is the weight matrix. Problem (51) is a special case of problem (3). Hence, our method can be suitably applied to (51).
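To illustrate the formulation, the small MATLAB example below (a toy instance of our own, not from the paper) builds the Laplacian of a three-node weighted graph and evaluates the objective at a rank-one feasible point, recovering the weight of the corresponding cut:

% Toy max-cut instance: Laplacian and cut value at a rank-one feasible point.
W = [0 1 2; 1 0 3; 2 3 0];              % symmetric weight matrix
L = diag(W * ones(3, 1)) - W;           % graph Laplacian L = Diag(We) - W
x = [1; -1; -1];                        % partition {1} versus {2,3}
X = x * x';                             % diag(X) = e, X psd, rank(X) = 1
cut_value = trace(L * X) / 4;           % = W(1,2) + W(1,3) = 3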

We now address the initialization and termination criteria of our method when applied to (51). We set the initial point and the rank bound as dictated by (51); in addition, we choose the initial penalty parameter to be 0.01 and set its update factor to 1.1. We terminate when the change between successive iterates falls below a small tolerance.

In our numerical experiments, we first test several small scale max-cut problems; then we apply our method to the standard “G” set problems generated by the graph generator of Professor Rinaldi [40]. In all, we consider 10 problems ranging in size from 800 to 1000 variables. The results are shown in Tables 6 and 7, in which “n” and “r” represent the dimension of the graph and the rank of the solution, respectively. The column “obv” gives the objective values of the optimal matrix solutions obtained by our method.

In Table 6, “KV” denotes the known optimal objective values and “UB” gives upper bounds for the max-cut obtained by semidefinite relaxation. It is easy to observe that our method is effective for solving these problems; it computed the global optimal solutions of all of them. Moreover, we obtain lower rank solutions than the semidefinite relaxation method.

Table 7 shows the numerical results for the standard “G” set problems. “SUB” denotes the upper bounds given by the spectral bundle method proposed by Helmberg and Rendl [41]. “GW-cut” denotes the approximation algorithm proposed by Goemans and Williamson [42]. “DC” denotes the results generated by our method, the next column reports the results of DC combined with a neighborhood approximation algorithm, and the rightmost column, titled BK, gives the best results known to us according to [43, 44]. We use “—” to indicate that we did not obtain the rank. From Table 7 we note that our optimal values are close to the results of the latest research, better than the values given by GW-cut, and all less than the corresponding upper bounds. For these problems we always obtain lower rank solutions than GW-cut; moreover, most solutions satisfy the rank constraint. These results show that our algorithm is effective for solving max-cut problems.

4. Concluding Remarks

In this paper we used the closed-form solution of a penalized subproblem to reformulate convex constrained rank minimization problems as DC programs, which can be solved by a stepwise linear approximative method. Under suitable assumptions, we showed that the accumulation point of the sequence generated by the stepwise linear approximative method is a stable point. The numerical results on affine rank minimization and max-cut problems demonstrate that our method is generally comparable or superior to some existing methods in terms of solution quality, and it can be applied to a wide class of problems. However, the speed of our method needs to be improved.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

The authors gratefully acknowledge the following funding. Wanping Yang and Jinkai Zhao are supported by Key Projects of Philosophy and Social Sciences Research (Ministry of Education, China, 15JZD012), the Soft Science Fund of Shaanxi Province of China (2015KRM114), and the Fundamental Research Funds for the Central Universities. Fengmin Xu is supported by the Fundamental Research Funds for the Central Universities and China NSFC Project no. 11101325.