Mathematical Problems in Engineering
Volume 2014, Article ID 937560, 17 pages
http://dx.doi.org/10.1155/2014/937560
Research Article

Truncated Nuclear Norm Minimization for Image Restoration Based on Iterative Support Detection

Yilun Wang and Xinhua Su

School of Mathematical Sciences, University of Electronic Science and Technology of China, Chengdu 611731, China

Received 20 June 2014; Accepted 21 August 2014; Published 17 November 2014

Academic Editor: Chien-Yu Lu

Copyright © 2014 Yilun Wang and Xinhua Su. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Recovering a large matrix from limited measurements is a challenging task arising in many real applications, such as image inpainting, compressive sensing, and medical imaging, and these kinds of problems are mostly formulated as low-rank matrix approximation problems. Because the rank operator is nonconvex and discontinuous, most recent theoretical studies use the nuclear norm as a convex relaxation, and the low-rank matrix recovery problem is solved by minimizing a nuclear norm regularized problem. However, a major limitation of nuclear norm minimization is that all the singular values are minimized simultaneously, so the rank may not be well approximated (Hu et al., 2013). Correspondingly, in this paper, we propose a new multistage algorithm, which makes use of the concept of Truncated Nuclear Norm Regularization (TNNR) proposed by Hu et al., 2013, and iterative support detection (ISD) proposed by Wang and Yin, 2010, to overcome the above limitation. Besides the matrix completion problems considered by Hu et al., 2013, the proposed method can also be extended to general low-rank matrix recovery problems. Extensive experiments validate the superiority of our new algorithms over other state-of-the-art methods.

1. Introduction

In many real applications such as machine learning [1–3], computer vision [4], and control [5], we seek to recover an unknown (approximately) low-rank matrix from limited information. This problem can be naturally formulated as the following model:
$$\min_{X \in \mathbb{R}^{m \times n}} \ \operatorname{rank}(X) \quad \text{s.t.} \quad \mathcal{A}(X) = b, \tag{1}$$
where $X$ is the decision variable and the linear map $\mathcal{A}: \mathbb{R}^{m \times n} \to \mathbb{R}^{p}$ and vector $b \in \mathbb{R}^{p}$ are given. However, this is usually NP-hard due to the nonconvexity and discontinuous nature of the rank function. In [6], Fazel et al. first solved the rank minimization problem by approximating the rank function with the nuclear norm (i.e., the sum of the singular values of a matrix). Moreover, theoretical studies show that the nuclear norm is the tightest convex lower bound of the rank function of matrices [7]. Thus, an unknown (approximately) low-rank matrix can be perfectly recovered by solving the optimization problem
$$\min_{X \in \mathbb{R}^{m \times n}} \ \|X\|_* \quad \text{s.t.} \quad \mathcal{A}(X) = b, \tag{2}$$
where $\|X\|_* = \sum_{i=1}^{\min(m,n)} \sigma_i(X)$ is the nuclear norm and $\sigma_i(X)$ is the $i$th largest singular value of $X$, under some conditions on the linear transformation $\mathcal{A}$.

As a special case, the problem (2) reduces to the well-known matrix completion problem (3) [8, 9] when $\mathcal{A}$ is a sampling (or projection/restriction) operator. Consider
$$\min_{X \in \mathbb{R}^{m \times n}} \ \|X\|_* \quad \text{s.t.} \quad X_{ij} = M_{ij}, \ (i,j) \in \Omega, \tag{3}$$
where $M$ is the incomplete data matrix and $\Omega$ is the set of locations corresponding to the observed entries. To solve this kind of problem, we refer to [8–13] for some breakthrough results. Nevertheless, these methods may obtain suboptimal performance in real applications because the nuclear norm may not be a good approximation to the rank operator: in the rank operator all the nonzero singular values have equal contribution, whereas the nuclear norm treats them differently by adding them together. Thus, to overcome this weakness of the nuclear norm, Truncated Nuclear Norm Regularization (TNNR) was proposed for matrix completion, which only minimizes the smallest singular values [14]. A similar truncation idea was also proposed in our previous work [15]. Correspondingly, the problem can be formulated as
$$\min_{X \in \mathbb{R}^{m \times n}} \ \|X\|_r \quad \text{s.t.} \quad X_{ij} = M_{ij}, \ (i,j) \in \Omega, \tag{4}$$
where $\|X\|_r = \sum_{i=r+1}^{\min(m,n)} \sigma_i(X)$ is defined as the sum of the $\min(m,n)-r$ minimum singular values. In this way, one can obtain a more accurate and robust approximation to the rank operator on both synthetic and real visual data sets.
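As a concrete illustration of the truncated nuclear norm $\|X\|_r$ above, the following minimal numpy sketch (our own code, not the authors'; the function name and the test matrix are illustrative assumptions) sums all but the $r$ largest singular values:

```python
import numpy as np

def truncated_nuclear_norm(X, r):
    """||X||_r = sum of all but the r largest singular values of X."""
    sigma = np.linalg.svd(X, compute_uv=False)  # sorted in decreasing order
    return sigma[r:].sum()

# Example: for an exactly rank-3 matrix, ||X||_3 vanishes (up to round-off),
# while the plain nuclear norm ||X||_* = ||X||_0 is strictly positive.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 40))
print(truncated_nuclear_norm(X, 3))   # ~0
print(truncated_nuclear_norm(X, 0))   # nuclear norm, > 0
```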

In this paper, we aim to extend the idea of TNNR from the special matrix completion problem to the general problem (2) and give the corresponding fast algorithms. More importantly, we consider how to quickly estimate $r$, which is usually unavailable in practice.

Throughout this paper, we use the following notation. We let $\langle X, Y \rangle = \operatorname{tr}(X^T Y)$ be the standard inner product between two matrices in a finite dimensional Euclidean space, $\|\cdot\|_2$ the 2-norm, and $\|\cdot\|_F$ the Frobenius norm for matrix variables. The projection operator under the Euclidean distance measure is denoted by $P(\cdot)$ and the transpose of a real matrix $X$ by $X^T$. Let $X = U \Sigma V^T$ be the singular value decomposition (SVD) of $X \in \mathbb{R}^{m \times n}$, where $U \in \mathbb{R}^{m \times m}$, $V \in \mathbb{R}^{n \times n}$, and $\Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_{\min(m,n)})$.

1.1. Related Work

The low-rank optimization problem (2) has attracted increasing interest in the development of customized algorithms, particularly for large-scale cases. We now briefly review some influential approaches to these problems.

The convex problem (2) can be easily reformulated into a semidefinite programming (SDP) problem [16, 17] to make use of generic SDP solvers such as SDPT3 [18] and SeDuMi [19], which are based on the interior-point method. However, the interior-point approaches cannot handle large-scale problems effectively, as mentioned in [7, 20, 21]. The problem (2) can also be solved through a projected subgradient approach [7], whose major computation is the singular value decomposition. That method can handle large-scale instances of (2), but its convergence may be slow, especially when high accuracy is required. In [7, 22], a low-rank factorization parameterization is applied to general low-rank matrix reconstruction problems: the low-rank matrix is decomposed into the form $X = L R^T$, where $L \in \mathbb{R}^{m \times r}$ and $R \in \mathbb{R}^{n \times r}$ are tall and thin matrices, which reduces the dimensionality from $mn$ to $(m+n)r$. However, if the rank and the size are large, the computational cost may also be very high. Moreover, the rank is not known as prior information in most applications, and it has to be estimated or dynamically adjusted, which might be difficult to realize. More recently, the augmented Lagrangian method (ALM) [23, 24] and the alternating direction method of multipliers (ADMM) [25] have proved very efficient for convex programming problems arising from various applications. In [26], ADMM is applied to solve (2) with a general linear mapping $\mathcal{A}$.

As an important special case of problem (2), the matrix completion problem (3) has been widely studied. Cai et al. [12, 13] used the shrinkage operator to solve the nuclear norm minimization effectively. The shrinkage operator applies a soft-thresholding rule to the singular values and acts as a sparsity-promoting operator on matrices; it is widely used in many other approaches as well. However, due to the abovementioned limitation of the nuclear norm, TNNR (4) was proposed to replace the nuclear norm. Since $\|X\|_r$ in (4) is nonconvex, it cannot be solved simply and effectively, so how to convert (4) into a tractable convex problem is critical. It is noted that
$$\sum_{i=1}^{r} \sigma_i(X) = \max_{A A^T = I,\ B B^T = I} \operatorname{tr}(A X B^T),$$
where $X = U \Sigma V^T$ is the SVD of $X$, $A = (u_1, \dots, u_r)^T \in \mathbb{R}^{r \times m}$, and $B = (v_1, \dots, v_r)^T \in \mathbb{R}^{r \times n}$. Then $\|X\|_r = \|X\|_* - \max_{A A^T = I, B B^T = I} \operatorname{tr}(A X B^T)$, and the optimization problem (4) can be rewritten as
$$\min_{X} \ \|X\|_* - \max_{A A^T = I,\ B B^T = I} \operatorname{tr}(A X B^T) \quad \text{s.t.} \quad X_{ij} = M_{ij}, \ (i,j) \in \Omega. \tag{5}$$
While the problem (5) is still nonconvex, a local minimum can be obtained by an iterative procedure proposed in [14]; we review this procedure in more detail later.
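For intuition on the identity above, the following numpy sketch (our own illustration, not the authors' code) builds $A$ and $B$ from the top-$r$ singular vectors and checks that $\operatorname{tr}(A X B^T)$ equals the sum of the $r$ largest singular values:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((60, 40))
r = 5

U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
A = U[:, :r].T          # r x m, rows are the top-r left singular vectors
B = Vt[:r, :]           # r x n, rows are the top-r right singular vectors

lhs = np.trace(A @ X @ B.T)
rhs = sigma[:r].sum()
print(np.isclose(lhs, rhs))   # True: tr(A X B^T) = sum of the r largest singular values
```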

A similar idea of truncation, in the context of sparse vectors, was implemented on sparse signals in our previous work [15], which tries to adaptively learn the locations of the nonzeros of the unknown true signal. Specifically, we presented a sparse signal reconstruction method, iterative support detection (ISD, for short), aiming to achieve fast reconstruction and a reduced requirement on the number of measurements compared with the classical $\ell_1$ minimization approach. ISD alternately calls its two components: support detection and signal reconstruction. From an incorrect reconstruction, support detection identifies an index set $T$ containing some elements of $\operatorname{supp}(\bar{x}) = \{i : \bar{x}_i \neq 0\}$, and signal reconstruction solves
$$\min_{x} \ \|x_{T^c}\|_1 \quad \text{s.t.} \quad Ax = b, \tag{6}$$
where $T^c = \{1, \dots, n\} \setminus T$ and $\|x_{T^c}\|_1 = \sum_{i \in T^c} |x_i|$. To obtain reliable support detection from inexact reconstructions, ISD must take advantage of the features and prior information about the true signal $\bar{x}$. In [15], sparse or compressible signals whose nonzero components have a fast decaying distribution are considered.
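The truncated $\ell_1$ subproblem (6) is a linear program once $x$ is split into positive and negative parts. The sketch below is a minimal illustration under our own naming, using scipy's generic LP solver rather than the specialized solvers of [15]:

```python
import numpy as np
from scipy.optimize import linprog

def truncated_l1(A, b, T):
    """Solve min sum_{i not in T} |x_i|  s.t. A x = b, via the split x = u - v, u, v >= 0."""
    m, n = A.shape
    w = np.ones(n)
    w[list(T)] = 0.0                      # detected entries are not penalized
    c = np.concatenate([w, w])            # objective weights for [u; v]
    A_eq = np.hstack([A, -A])             # A (u - v) = b
    res = linprog(c, A_eq=A_eq, b_eq=b, bounds=(0, None), method="highs")
    u, v = res.x[:n], res.x[n:]
    return u - v

# Small example with a known 3-sparse signal and one detected index
rng = np.random.default_rng(2)
n, m = 40, 20
x_true = np.zeros(n); x_true[[3, 7, 25]] = [5.0, -2.0, 1.0]
A = rng.standard_normal((m, n))
b = A @ x_true
x_hat = truncated_l1(A, b, T={3})
print(np.round(x_hat[[3, 7, 25]], 3))
```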

1.2. Contributions and Paper Organization

Our first contribution is the estimation of $r$, on which TNNR heavily depends (through $\|X\|_r$). Hu et al. [14] seek the best $r$ by trying all possible values, which leads to high computational cost. In this paper, motivated by Wang and Yin [15], we propose a singular value estimation (SVE) method to obtain the best $r$, which can be considered a special implementation of the iterative support detection of [15] in the case of matrices.

Our second contribution is to extend TNNR from matrix completion to general low-rank matrix recovery problems; in [14], only the matrix completion problem was considered.

The third contribution builds on the above two. In particular, a new efficient algorithmic framework is proposed for the low-rank matrix recovery problem. We name it LRISD; it iteratively calls its two components: SVE and the solution of the TNNR based low-rank matrix reconstruction model.

The rest of this paper is organized as follows. In Section 2, the computing framework of LRISD and the theory behind SVE are introduced. In Sections 3, 4, and 5, we introduce three algorithms to solve the problems mentioned in Section 2.1. Experimental results are presented in Section 6. Finally, conclusions are drawn in Section 7.

2. Iterative Support Detection for Low-Rank Problems

In this section, we first give the outline of the proposed algorithm LRISD and then elaborate on the proposed SVE method, which is a main component of LRISD.

2.1. Algorithm Outline

The main purpose of LRISD is to provide a better approximation to model (1) than the common convex relaxation model (2). The key idea is to make use of the Truncated Nuclear Norm Regularization (TNNR) defined in (4) and its variant (5) [14]. While one can passively try all possible values of $r$, which is the number of the largest few singular values, we propose to actively estimate the value of $r$.

In addition, we consider general low-rank recovery problems beyond the matrix completion problem. Specifically, we solve three models, the equality model (7), the unconstrained model (8), and the inequality model (9):
$$\min_{X} \ \|X\|_r \quad \text{s.t.} \quad \mathcal{A}(X) = b, \tag{7}$$
$$\min_{X} \ \|X\|_r + \frac{\lambda}{2}\|\mathcal{A}(X) - b\|_2^2, \tag{8}$$
$$\min_{X} \ \|X\|_r \quad \text{s.t.} \quad \|\mathcal{A}(X) - b\|_2 \le \delta, \tag{9}$$
where $\lambda$ and $\delta$ are parameters reflecting the level of noise. Models (8) and (9) consider the case with noisy data. Here $\mathcal{A}$ is a linear mapping such as a partial discrete cosine transform (DCT), a partial discrete Walsh-Hadamard transform (DWHT), or a discrete Fourier transform (DFT).

The general framework of LRISD, as an iterative procedure, starts from an initial recovery obtained by solving the plain nuclear norm minimization problem (2) and then estimates $r$ based on the recovered result. Based on the estimated $r$, we solve the resulting TNNR model (7), (8), or (9). Using the new recovered result, we update the value of $r$ and solve a new TNNR model (7), (8), or (9). Our algorithm thus alternates between the estimation of $r$ and the solution of the TNNR model.

As for solving (7), (8), and (9), we follow the idea of [14]. Specifically, a simple but efficient iterative procedure is adopted to decouple $X$, $A$, and $B$. We set an initial guess $X_1$. In the $l$th iteration, we first fix $X_l$ and compute $A_l$ and $B_l$ as described in (5), based on the SVD of $X_l$. Then we fix $A_l$ and $B_l$ and update $X_{l+1}$ by solving the following problems, respectively:
$$\min_{X} \ \|X\|_* - \operatorname{tr}(A_l X B_l^T) \quad \text{s.t.} \quad \mathcal{A}(X) = b, \tag{10}$$
$$\min_{X} \ \|X\|_* - \operatorname{tr}(A_l X B_l^T) + \frac{\lambda}{2}\|\mathcal{A}(X) - b\|_2^2, \tag{11}$$
$$\min_{X} \ \|X\|_* - \operatorname{tr}(A_l X B_l^T) \quad \text{s.t.} \quad \|\mathcal{A}(X) - b\|_2 \le \delta. \tag{12}$$
In [14], the authors studied only the special case of matrix completion. For the general problems (10), (11), and (12), we extend the current state-of-the-art algorithms to solve them in Sections 3, 4, and 5, respectively.

The overall procedure of LRISD, as a new multistage algorithm, is summarized in Algorithm 1. By alternately running SVE and solving the corresponding TNNR model, the iterative scheme converges to a solution of a TNNR model, which is expected to be better than the solution of the plain nuclear norm minimization model (2).

Algorithm 1 (LRISD based on (10), (11), and (12)).
Initialization: set $X^*$ to the solution of the pure nuclear norm regularized model (2). Repeat the following steps until the estimated $r$ reaches a stable value.
Step 1. Estimate $r$ via SVE on $X^*$.
Step 2. Initialization: set $X_0$ (the matrix form of the current estimate). In the $l$th iteration:
(a) compute $A_l$ and $B_l$ of (10) ((11) or (12)) according to the current $X_l$, in the same way as described for (5);
(b) solve the model (10) ((11) or (12)) and obtain $X_{l+1}$; go to (a) until the stopping criterion is met;
(c) set $X^* = X_{l+1}$.
Step 3. Set $X_{\mathrm{rec}} = X^*$. Return the recovered matrix $X_{\mathrm{rec}}$.
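The following Python sketch captures only the control flow of Algorithm 1; `solve_nuclear_norm_model`, `sve_estimate_r`, and `solve_tnnr_subproblem` are user-supplied placeholders for model (2), Step 1, and Step 2(b) (the actual solvers are developed in Sections 3-5), and all names are our own assumptions:

```python
import numpy as np

def lrisd(solve_nuclear_norm_model, sve_estimate_r, solve_tnnr_subproblem,
          max_stages=10, inner_tol=1e-4, max_inner=200):
    """Outer LRISD loop: alternate rank estimation (SVE) with solving
    the TNNR model for the current estimate of r."""
    X = solve_nuclear_norm_model()            # stage 0: plain nuclear norm model (2)
    r_old = 0
    for _ in range(max_stages):
        r = sve_estimate_r(X)                 # Step 1: estimate r from the current recovery
        if r == r_old:                        # stop once r is stable
            break
        X_l = X
        for _ in range(max_inner):            # Step 2: inner TNNR iterations
            U, _, Vt = np.linalg.svd(X_l, full_matrices=False)
            A_l, B_l = U[:, :r].T, Vt[:r, :]  # (a) A_l, B_l from the SVD of X_l
            X_next = solve_tnnr_subproblem(A_l, B_l)   # (b) solve (10), (11), or (12)
            done = np.linalg.norm(X_next - X_l, 'fro') <= inner_tol
            X_l = X_next
            if done:
                break
        X, r_old = X_l, r
    return X, r_old
```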

In the following, we explain the implementation of SVE in Step 1 in more detail and extend the existing algorithms for nuclear norm regularized models to the TNNR based models (10)–(12) in procedure (b) of Step 2.

2.2. Singular Value Estimate

In this subsection, we mainly focus on Step 1 of LRISD, that is, the process of SVE to estimate $r$, the number of the largest few singular values. While it is feasible to find the best $r$ by trying all possible values, as done in [14], this procedure is not computationally efficient. Thus, we aim to quickly give an estimate of the best $r$.

As is well known, for (approximately) low-rank matrices or images, the singular values have a fast decaying distribution (as shown in Figure 1). To take advantage of this feature, we extend our previous work ISD [15] from detecting the large components of sparse vectors to detecting the large singular values of low-rank matrices. In particular, SVE is nothing but a specific implementation of support detection in the case of low-rank matrices, with the aim of estimating the true $r$.

Figure 1: (a) An example color image. (b) The singular values of the red channel. (c) The singular values of the green channel. (d) The singular values of the blue channel. To illustrate the distribution clearly, panels (b), (c), and (d) show the magnitudes of the singular values from the 4th to the 100th in each channel.

Now we present the process of SVE and its effectiveness. Note that, as shown in Algorithm 1, SVE is repeated several times until a stable estimate is obtained. Each time, given the reference image $X^*$, we obtain its singular value vector $s$ by SVD. A natural way to find the positions of the true large singular values based on $s$, which is considered an estimate of the singular value vector of the true matrix, is thresholding, due to the fast decaying property of the singular values. The choice of the threshold should be case-dependent. In the spirit of ISD, one can use the so-called "last significant jump" rule to set the threshold value, so as to detect the large singular values while minimizing false detections, assuming the components of $s$ are sorted from large to small. The straightforward way to apply the "last significant jump" rule is to look for the largest $i$ such that
$$|s_{i+1} - s_i| > \tau,$$
where $\tau$ is a prescribed value and $|s_{i+1} - s_i|$ is the absolute value of the first order difference of $s$. This amounts to sweeping the decreasing sequence and looking for the last jump larger than $\tau$. If the selected index is $i$, we then set $r = i$.

However, in this paper, unlike the original ISD paper [15], we propose to apply the "last significant jump" rule to the absolute values of the first order differences of $s$, that is, to $d$ with $d_i = |s_{i+1} - s_i|$, instead of to $s$ itself. Specifically, we look for the largest $i$ such that
$$|d_{i+1} - d_i| > \tau,$$
where $\tau$ is selected below and $|d_{i+1} - d_i|$ is the absolute value of the second order difference of $s$. This amounts to sweeping the decreasing sequence of first order differences and looking for the last jump larger than $\tau$. We set the estimated rank to be the cardinality of the detected index set, or a number close to it.

Specifically, $d$ is computed to obtain the jump sizes, which measure the change between two neighboring components of $s$. Then, to reflect the stability of these jumps, the difference of $d$ needs to be considered, as we just do, because the few largest singular values jump actively while the small singular values do not change much. The cut-off threshold $\tau$ is determined via certain heuristic methods in our experiments on synthetic and real visual data sets. Note that, in Section 6.6, we present a reliable rule for determining the threshold value $\tau$.
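A minimal sketch of this "last significant jump" rule on the second order differences follows (our own illustration; the threshold rule of Section 6.6 is case-dependent, so `tau` is passed in as an assumption):

```python
import numpy as np

def sve_estimate_r(X, tau):
    """Estimate r from the singular values of X using the
    'last significant jump' rule on second order differences."""
    s = np.linalg.svd(X, compute_uv=False)   # decreasing singular values
    d = np.abs(np.diff(s))                   # first order differences (jump sizes)
    jumps = np.abs(np.diff(d))               # second order differences
    idx = np.where(jumps > tau)[0]           # indices of "significant" jumps
    if idx.size == 0:
        return 0                             # no jump detected: fall back to r = 0
    return int(idx[-1] + 1)                  # position of the last significant jump

# Example: a noisy rank-5 matrix
rng = np.random.default_rng(3)
X = rng.standard_normal((80, 5)) @ rng.standard_normal((5, 60))
X += 0.01 * rng.standard_normal(X.shape)
print(sve_estimate_r(X, tau=0.5))            # expected to print 5 for this example
```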

3. TNNR-ADMM for (10) and (12)

In this section, we extend the existing ADMM method in [25], originally developed for the nuclear norm regularized model, to solve (10) and (12) under a common linear mapping $\mathcal{A}$ and give closed-form solutions for the resulting subproblems. The extended version of ADMM is named TNNR-ADMM, and the original ADMM for the corresponding nuclear norm regularized low-rank matrix recovery model is denoted by LR-ADMM. The resulting subproblems are simple enough to have closed-form solutions and can easily be solved to high precision. We start this section with some preliminaries that are convenient for the presentation of the algorithms later.

When $\mathcal{A}\mathcal{A}^* = \mathcal{I}$, we have the following conclusion [26]:
$$(\mathcal{I} + \beta \mathcal{A}^* \mathcal{A})^{-1} = \mathcal{I} - \frac{\beta}{1+\beta}\, \mathcal{A}^* \mathcal{A},$$
where $(\cdot)^{-1}$ denotes the inverse operator and $\beta > 0$.

Definition 2 (see [26]). For $\delta \ge 0$ and $b \in \mathbb{R}^{p}$, let $\mathcal{B} = \{z \in \mathbb{R}^{p} : \|z - b\|_2 \le \delta\}$. The projection of $x$ onto $\mathcal{B}$ under the Euclidean distance is
$$P_{\mathcal{B}}(x) = b + \min\!\left(1, \frac{\delta}{\|x - b\|_2}\right)(x - b).$$
In particular, when $\delta = 0$, $\mathcal{B} = \{b\}$ and $P_{\mathcal{B}}(x) = b$.
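A small numpy sketch of this ball projection (our own helper name; the formula reduces to returning $b$ when $\delta = 0$):

```python
import numpy as np

def project_ball(x, b, delta):
    """Euclidean projection of x onto {z : ||z - b||_2 <= delta}."""
    d = x - b
    nrm = np.linalg.norm(d)
    if nrm <= delta or nrm == 0.0:
        return x.copy()                  # already inside the ball
    return b + (delta / nrm) * d         # scale back onto the boundary

b = np.array([1.0, 0.0])
print(project_ball(np.array([4.0, 0.0]), b, delta=1.0))   # -> [2., 0.]
print(project_ball(np.array([4.0, 0.0]), b, delta=0.0))   # -> [1., 0.] (equals b)
```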

Definition 3. For a matrix $X \in \mathbb{R}^{m \times n}$ with singular value decomposition $X = U \Sigma V^T$, $\Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_{\min(m,n)})$, the shrinkage operator is defined as
$$\mathcal{D}_{\tau}(X) = U \operatorname{diag}\big(\max(\sigma_i - \tau, 0)\big) V^T,$$
where $\tau > 0$.

Theorem 4 (see [13]). For each $\tau > 0$ and $Y \in \mathbb{R}^{m \times n}$, one has
$$\mathcal{D}_{\tau}(Y) = \arg\min_{X} \ \tau \|X\|_* + \frac{1}{2}\|X - Y\|_F^2.$$
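A small numpy sketch of the shrinkage operator $\mathcal{D}_\tau$ (our own code, for illustration only); it soft-thresholds the singular values, which by Theorem 4 is the proximal operator of $\tau\|\cdot\|_*$:

```python
import numpy as np

def svt(Y, tau):
    """Singular value shrinkage D_tau(Y) = U diag(max(sigma - tau, 0)) V^T."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(4)
Y = rng.standard_normal((30, 20))
X = svt(Y, tau=2.0)
print(np.linalg.svd(Y, compute_uv=False)[:5].round(2))
print(np.linalg.svd(X, compute_uv=False)[:5].round(2))  # each reduced by 2, floored at 0
```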

Definition 5. Let $\mathcal{A}^*$ be the adjoint operator of $\mathcal{A}$, satisfying
$$\langle \mathcal{A}(X), y \rangle = \langle X, \mathcal{A}^*(y) \rangle \quad \text{for all } X \in \mathbb{R}^{m \times n},\ y \in \mathbb{R}^{p}. \tag{24}$$

3.1. Algorithmic Framework

The problems (10) and (12) can be easily reformulated into the following linearly constrained convex problem:
$$\min_{X, y} \ \|X\|_* - \operatorname{tr}(A_l X B_l^T) \quad \text{s.t.} \quad \mathcal{A}(X) = y, \ y \in \mathcal{B}, \tag{25}$$
where $\mathcal{B} = \{z : \|z - b\|_2 \le \delta\}$. In particular, the above formulation is equivalent to (10) when $\delta = 0$. The augmented Lagrangian function of (25) is
$$\mathcal{L}(X, y, w) = \|X\|_* - \operatorname{tr}(A_l X B_l^T) - \langle w, \mathcal{A}(X) - y \rangle + \frac{\beta}{2}\|\mathcal{A}(X) - y\|_2^2, \tag{26}$$
where $w$ is the Lagrange multiplier of the linear constraint and $\beta > 0$ is the penalty parameter for the violation of the linear constraint.

The idea of ADMM is to decompose the minimization task in (26) into three easier and smaller subproblems such that the involved variables $X$ and $y$ can be minimized separately and alternately. In particular, applying ADMM to (26) yields the following iterative scheme:
$$\begin{aligned} X_{k+1} &= \arg\min_{X} \ \mathcal{L}(X, y_k, w_k), \\ y_{k+1} &= \arg\min_{y \in \mathcal{B}} \ \mathcal{L}(X_{k+1}, y, w_k), \\ w_{k+1} &= w_k - \beta\big(\mathcal{A}(X_{k+1}) - y_{k+1}\big). \end{aligned} \tag{27}$$

Ignoring constant terms and deriving the optimality conditions for the subproblems in (27), we can easily verify that the iterative scheme of the TNNR-ADMM approach for (10) and (12) is as follows.

Algorithm 6 (TNNR-ADMM for (10) and (12)).
(1) Initialization: set $X_0$ (the matrix form of the initial estimate), $y_0$, $w_0$, and input $\beta > 0$.
(2) For $k = 0, 1, \dots, K$ (maximum number of iterations), do the following:
(i) update $X_{k+1}$ by solving the subproblem (28);
(ii) update $y_{k+1}$ by solving the subproblem (29);
(iii) update the multiplier $w_{k+1}$.
(3) End the iteration when the stopping criterion is met.

3.2. The Analysis of Subproblems

According to the analysis above, the computation of each iteration of the TNNR-ADMM approach for (10) and (12) is dominated by solving the subproblems (28) and (29). We now elaborate on the strategies for solving these subproblems based on the abovementioned preliminaries.

First, the solution of (28) can be obtained explicitly via Theorem 4 as a singular value shrinkage step, which gives its closed-form solution.

Second, the subproblem (29) reduces to a projection: combining (32) with Definition 2, we obtain its final closed-form solution as a projection onto $\mathcal{B}$. When $\delta = 0$, this corresponds to the particular case (10), where the projection simply returns $b$.

Therefore, when TNNR-ADMM is applied to solve (10) and (12), the generated subproblems all have closed-form solutions. Besides, some remarks are in order. (i) The multiplier update can include a relaxation factor $\gamma \in (0, (1+\sqrt{5})/2)$, as studied in [27–29]; we use a fixed value of $\gamma$ in our algorithms. (ii) The convergence of the iterative scheme is well studied in [30]; here, we omit the convergence analysis.

4. TNNR-APGL for (11)

In this section, we consider model (11), which has attracted a lot of attention in certain multitask learning problems [21, 31–33]. While TNNR-ADMM could be applied to this model, it is better suited to noiseless problems. For the simpler version of model (11), that is, the one based on the common nuclear norm regularization, many accelerated gradient techniques [34, 35] based on [36] have been proposed. Among them, the accelerated proximal gradient line search (APGL) method proposed by Beck and Teboulle [35] has been extended to solve the TNNR based matrix completion model in [14]. In this paper, we extend APGL to solve the more general TNNR based low-rank recovery problem (11).

4.1. TNNR-APGL with Noisy Data

For completeness, we give a short overview of the APGL method. The original method aims to solve the following problem:
$$\min_{X} \ F(X) = g(X) + f(X),$$
where $g$ and $f$ meet these conditions: (i) $g$ is a continuous convex function, possibly nondifferentiable; (ii) $f$ is a convex and continuously differentiable function with Lipschitz continuous gradient ($L_f$ denotes the Lipschitz constant of $\nabla f$).

By linearizing $f$ at a point $Y$ and adding a proximal term, APGL constructs an approximation of $F$. More specifically,
$$Q(X, Y) = g(X) + f(Y) + \langle \nabla f(Y), X - Y \rangle + \frac{1}{2t}\|X - Y\|_F^2,$$
where $t$ is a proximal (step size) parameter and $\nabla f(Y)$ is the gradient of $f$ at $Y$.
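To illustrate this proximal linearization, the sketch below applies a FISTA-type accelerated proximal gradient iteration to a plain nuclear norm least squares problem, a stand-in for the structure of (11) with the truncated trace term omitted; all names, the regularization weight, and the test data are our own assumptions, and the `svt` shrinkage from Theorem 4 is reused:

```python
import numpy as np

def svt(Y, tau):
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def apg_nuclear(M_obs, mask, mu=1.0, L=1.0, iters=200):
    """Accelerated proximal gradient for min_X mu*||X||_* + 1/2*||mask*(X - M_obs)||_F^2.
    The smooth part has gradient mask*(Y - M_obs), with Lipschitz constant 1."""
    X = np.zeros_like(M_obs); Y = X.copy(); t = 1.0
    for _ in range(iters):
        grad = mask * (Y - M_obs)
        X_new = svt(Y - grad / L, mu / L)        # proximal step = singular value shrinkage
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2 # Nesterov momentum weight
        Y = X_new + ((t - 1) / t_new) * (X_new - X)
        X, t = X_new, t_new
    return X

# Tiny matrix completion example
rng = np.random.default_rng(5)
M = rng.standard_normal((40, 3)) @ rng.standard_normal((3, 30))
mask = (rng.random(M.shape) < 0.6).astype(float)
X_hat = apg_nuclear(mask * M, mask, mu=0.5)
print(np.linalg.svd(X_hat, compute_uv=False)[:6].round(2))  # leading singular values of the estimate
```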

4.1.1. Algorithmic Framework

For convenience, we present model (11) again and define $g$ and $f$ as follows:
$$g(X) = \|X\|_*, \qquad f(X) = -\operatorname{tr}(A_l X B_l^T) + \frac{\lambda}{2}\|\mathcal{A}(X) - b\|_2^2.$$

Then we can conclude that each iteration of TNNR-APGL for solving model (11) requires solving a subproblem of the form
$$X_{k+1} = \arg\min_{X} \ \|X\|_* + \frac{1}{2 t_k}\Big\|X - \big(Y_k - t_k \nabla f(Y_k)\big)\Big\|_F^2,$$
which follows from minimizing the approximation $Q(X, Y_k)$.

During the above iterative scheme, we update $t_k$ and $Y_k$ via the approaches mentioned in [20, 35]. Then, based on (40), we can easily derive the TNNR-APGL algorithmic framework as follows.

Algorithm 7 (TNNR-APGL for (11)).
(1) Initialization: set $X_0$ (the matrix form of the initial estimate), $Y_0 = X_0$, and $t_0$.
(2) For $k = 0, 1, \dots, K$ (maximum number of iterations), do the following:
(i) update $X_{k+1}$ by solving (41);
(ii) update $t_{k+1}$;
(iii) update $Y_{k+1}$.
(3) End the iteration when the stopping criterion is met.

4.1.2. The Analysis of Subproblems

Obviously, the computation of each iteration of the TNNR-APGL approach for (11) is dominated by the subproblem (41). According to Theorem 4, its solution is given in closed form by a singular value shrinkage of the gradient step.

By now, we have applied TNNR-APGL to solve the problem (11) and obtained closed-form solutions for its subproblems. In addition, the convergence of APGL is well studied in [35], where a convergence rate of $O(1/k^2)$ is established; we again omit the convergence analysis.

5. TNNR-ADMMAP for (10) and (12)

While TNNR-ADMM is usually very efficient for solving the TNNR based models (10) and (12), its convergence can become slower when more constraints are present [37]. Inspired by [14], the alternating direction method of multipliers with adaptive penalty (ADMMAP) is applied to reduce the number of constraints, and an adaptive penalty [38] is used to speed up the convergence. The resulting algorithm is named "TNNR-ADMMAP," and its subproblems also have closed-form solutions.

5.1. Algorithm Framework

Two kinds of constraints have been mentioned before: the linear constraint involving $\mathcal{A}$ and the constraint restricting the auxiliary variable to the set $\mathcal{B}$. Our goal is to transform (10) and (12) into the following form:
$$\min_{x, y} \ f(x) + g(y) \quad \text{s.t.} \quad \mathcal{P}(x) + \mathcal{Q}(y) = c,$$
where $\mathcal{P}$ and $\mathcal{Q}$ are linear mappings, $x$, $y$, and $c$ could be either vectors or matrices, and $f$ and $g$ are convex functions.

In order to handle the problems easily, $x$ is taken to be a vector formed by stacking the columns of a matrix; conversely, if $\mathcal{A}$ is a linear mapping containing a sampling process, the vector can be rearranged into a matrix over the sample set. Correspondingly, we flexibly change the form between matrices and vectors in the calculation process. Here, we only outline the idea and process of TNNR-ADMMAP. Matching the relevant functions and constraints of (10) and (12) to the general form above determines $f$, $g$, $\mathcal{P}$, $\mathcal{Q}$, and $c$; when $\delta = 0$, this reflects the problem (10).

Then, the problems (10) and (12) can be equivalently transformed to the two-block form (49), and the augmented Lagrangian function (50) of (49) is defined with a Lagrange multiplier for the combined linear constraint and a penalty parameter $\beta$.

The Lagrangian form can be solved via linearized ADMM, and a dynamic penalty parameter is preferred in [38]. In particular, due to the special structure of $\mathcal{P}$ and $\mathcal{Q}$, here we use ADMMAP in order to handle the problem (50) easily. Similarly, we use the following adaptive updating rule on $\beta$ [38]:
$$\beta_{k+1} = \min(\beta_{\max}, \rho\, \beta_k),$$
where $\beta_{\max}$ is an upper bound of $\{\beta_k\}$. The value of $\rho$ is set to a constant $\rho_0 > 1$ when the change of the iterates in the latest iteration, measured relative to a proximal parameter, is sufficiently small, and to $1$ otherwise.
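A hedged sketch of this adaptive penalty update follows (our own simplification of the rule in [38]; the constants and the "small change" test are assumptions, not the exact criterion used in the paper):

```python
def update_penalty(beta, rel_change, beta_max=1e6, rho0=1.5, eps=1e-3):
    """Increase the penalty by a factor rho0 only when the iterates have
    nearly stopped moving (rel_change small); cap the penalty at beta_max."""
    rho = rho0 if rel_change < eps else 1.0
    return min(beta_max, rho * beta)

beta = 1.0
for rel_change in [0.1, 0.01, 5e-4, 2e-4]:   # shrinking iterate changes
    beta = update_penalty(beta, rel_change)
    print(beta)   # stays 1.0 until the change drops below eps, then grows
```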

In summary, the iterative scheme of the TNNR-ADMMAP is as follows.

Algorithm 8 (TNNR-ADMMAP for (10) and (12)).
(1) Initialization: set $X_0$ (the matrix form of the initial estimate), $y_0$, $w_0$, and input $\beta_0 > 0$.
(2) For $k = 0, 1, \dots, K$ (maximum number of iterations), do the following:
(i) update $X_{k+1}$ by solving (54);
(ii) update $y_{k+1}$ by solving (55);
(iii) update the multiplier $w_{k+1}$;
(iv) compute $\rho$ from the latest iterates and update $\beta_{k+1} = \min(\beta_{\max}, \rho\,\beta_k)$.
(3) End the iteration when the stopping criterion is met.

5.2. The Analysis of Subproblems

Since the computation of each iteration of the TNNR-ADMMAP method is dominated by solving subproblems (54) and (55), we now elaborate on the strategies for solving these subproblems.

First, we compute $X_{k+1}$ from (54). Due to the special forms of the objective and the linear mappings, its solution can be given in closed form (via Theorem 4) after ignoring constant terms.

Second, we concentrate on computing the update of the second block variable from (55). It obeys the corresponding first-order optimality condition, which can be solved in closed form using $\mathcal{A}^*$, the adjoint operator of $\mathcal{A}$ defined in (24).

Using (24), the adjoint operator $\mathcal{A}^*$ is derived explicitly. The left side of (60) can then be rewritten accordingly: applying the linear mapping $\mathcal{A}$ to both sides of (64) and combining (64) and (66) gives its transformed expression. Similarly, using the property of $\mathcal{A}$ in (64), the right side of (60) is transformed as in (68). Based on the above, combining (64) and (68) yields the closed-form update for the subproblem (55).

Some remarks are in order. (i) The derivation above is presented for the general case; the problem matches (12) when $\delta > 0$ and reduces to (10) when $\delta = 0$. (ii) The convergence of the iterative scheme of ADMMAP is well studied in [38, 39]. In this paper, we omit the convergence analysis.

Overall, when TNNR-ADMM and TNNR-APGL are applied to solve (10)–(12), the generated subproblems all have closed-form solutions. As mentioned before, TNNR-ADMMAP is used to speed up the convergence for (10) and (12) when there are many constraints. When a problem can be solved by all three algorithms, TNNR-ADMMAP is in general more efficient, both in the matrix completion case [14] and in our test problems.

6. Experiments and Results

In this section, we present numerical results to validate the effectiveness of SVE and LRISD. The experiments address two questions. On one hand, we illustrate the effectiveness of SVE on both real visual and synthetic data sets. On the other hand, we illustrate the effectiveness of LRISD, which solves TNNR based low-rank matrix recovery problems, on both synthetic and real visual data sets. Since space is limited, we only discuss model (10) using LRISD-ADMM in our experiments. The reader can refer to [14] for extensive numerical results showing that ADMMAP is much faster than APGL and ADMM without sacrificing recovery accuracy in the matrix completion case; for general low-rank matrix recovery, our experiments lead to the same conclusion. Since the main aim of the paper is to demonstrate the effectiveness of SVE and LRISD, we omit the detailed comparison here.

All experiments were performed under Windows 7 and Matlab v7.10 (R2010a), running on an HP laptop with an Intel Core 2 Duo CPU at 2.4 GHz and 2 GB of memory.

6.1. Experiments and Implementation Details

We conduct the numerical experiments in the following four classes, where two representative linear mappings $\mathcal{A}$, matrix completion (sampling) and partial DCT, are used. The first two cases illustrate the effectiveness of SVE. Here we compare our algorithm LRISD-ADMM with the method proposed in [14], which we name "TNNR-ADMM-TRY," on the matrix completion problem. The main difference between LRISD-ADMM and TNNR-ADMM-TRY is the way of determining the best $r$: the former estimates the best $r$ via SVE, while the latter tries all possible values and picks the one with the best performance.

The last two show the better recovery quality of LRISD-ADMM compared with the solution of the common nuclear norm regularized low-rank recovery model, for example, (2), whose corresponding algorithm is denoted by LR-ADMM as above.
(1) Compare LRISD-ADMM with TNNR-ADMM-TRY on matrix completion problems. These experiments are conducted on real visual data sets.
(2) Compare the real rank with the rank estimated by SVE under different settings, where $\mathcal{A}$ is a two-dimensional partial DCT operator. These experiments are conducted on synthetic data sets.
(3) Compare LRISD-ADMM with LR-ADMM in the generic low-rank situation, where $\mathcal{A}$ is also a two-dimensional partial DCT operator. These experiments are conducted on synthetic data sets under different problem settings.
(4) Compare LRISD-ADMM with LR-ADMM in the generic low-rank situation, where $\mathcal{A}$ is also a two-dimensional partial DCT operator. These experiments are conducted on real visual data sets.

In all synthetic data experiments, we generate the sample data as $b = \mathcal{A}(M) + \omega$, where $\omega$ is Gaussian white noise with mean zero and standard deviation $\sigma$. The test matrix $M$ is generated randomly with a prefixed integer rank $r$. Moreover, we generate the index set $\Omega$ randomly in the matrix completion experiments, and the partial DCT operator is also generated randomly.
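Since the original Matlab script is not reproduced above, here is a hedged Python sketch of one common way to generate such data, a rank-$r$ Gaussian factor product plus white noise; the construction and all names are our assumptions, not necessarily the authors' exact script:

```python
import numpy as np

def make_synthetic(m, n, r, sigma, sample_ratio, rng):
    """Random rank-r matrix M, a random sampling mask, and noisy samples b."""
    M = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))   # exact rank r
    mask = rng.random((m, n)) < sample_ratio                         # observed entries
    b = M[mask] + sigma * rng.standard_normal(mask.sum())            # b = A(M) + noise
    return M, mask, b

M, mask, b = make_synthetic(200, 150, 10, sigma=0.1, sample_ratio=0.5,
                            rng=np.random.default_rng(6))
print(M.shape, mask.mean().round(2), b.shape)
```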

In the implementation of all the experiments, we use a relative change criterion to terminate the iteration of Step 2 (in Algorithm 1) in LR-ADMM and LRISD-ADMM, and we terminate the iteration of (b) in Step 2 by a similar criterion on successive iterates; the tolerance is set empirically, which works quite well for the tested problems. The other parameters in TNNR-ADMM are set to their default values (we use the Matlab code provided online by the authors of [14]). Besides, we use the PSNR (Peak Signal to Noise Ratio) to evaluate the quality of an image. As color images have three channels (red, green, and blue), PSNR is defined as $\mathrm{PSNR} = 10 \log_{10}\!\big(255^2 / \mathrm{MSE}\big)$ with $\mathrm{MSE} = (\mathrm{SE}_{\mathrm{red}} + \mathrm{SE}_{\mathrm{green}} + \mathrm{SE}_{\mathrm{blue}})/(3N)$, where $\mathrm{SE}$ denotes the squared error over the missing pixels of a channel and $N$ is the total number of missing pixels. For grayscale images, PSNR has a similar definition.
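A small sketch of this PSNR computation over the missing pixels of a color image (our own helper, assuming 8-bit intensities with peak value 255):

```python
import numpy as np

def psnr_missing(original, recovered, missing_mask):
    """PSNR restricted to the missing pixels of a color image.
    original, recovered: (H, W, 3) arrays in [0, 255]; missing_mask: (H, W) bool."""
    diff = (original.astype(float) - recovered.astype(float))[missing_mask]  # (N, 3)
    mse = (diff ** 2).sum() / (3 * missing_mask.sum())
    return 10 * np.log10(255.0 ** 2 / mse)

rng = np.random.default_rng(7)
img = rng.integers(0, 256, (64, 64, 3))
rec = np.clip(img + rng.normal(0, 5, img.shape), 0, 255)
mask = rng.random((64, 64)) < 0.3
print(round(psnr_missing(img, rec, mask), 2))   # roughly 34 dB for noise std 5
```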

6.2. The Comparison between LRISD-ADMM and TNNR-ADMM-TRY on Matrix Completion

In this subsection, to evaluate the effectiveness of SVE, we compare the proposed LRISD-ADMM with TNNR-ADMM-TRY as well as LR-ADMM on matrix completion problems. As the better recovery quality of TNNR-ADMM-TRY compared with LR-ADMM on matrix completion has been demonstrated in [14], we show that the final $r$ estimated by LRISD-ADMM via SVE is very close to the best $r$ found by TNNR-ADMM-TRY.

We test three real images and present the input images, the masked images, and the results computed by three different algorithms: LR-ADMM, TNNR-ADMM-TRY, and LRISD-ADMM. The recovered images are shown in Figure 2, and the numerical comparisons of time and PSNR are shown in Table 1. We can see that, compared with LR-ADMM, both TNNR-ADMM-TRY and LRISD-ADMM achieve better recovery quality, as expected. While TNNR-ADMM-TRY achieves the best recovery quality, since it tries every possible value of $r$, its running time is much longer than that of LR-ADMM. Our proposed LRISD-ADMM achieves almost the same recovery quality as TNNR-ADMM-TRY, but with a significant reduction of computational cost. In fact, if the best precision is required, the $r$ estimated by LRISD-ADMM can be used as a reference around which to search for the best $r$, with a reasonable extra computational cost. For convenience of notation, we call this process LRISD-ADMM-ADJUST in Table 1 and Figure 2.

Table 1: Comparison of PSNR and running time between LR-ADMM, TNNR-ADMM-TRY, LRISD-ADMM, and LRISD-ADMM-ADJUST. In TNNR-ADMM-TRY, the best $r$ is found by testing all possible values of $r$. In LRISD-ADMM, the estimated rank is used directly as the best $r$. In LRISD-ADMM-ADJUST, the estimated rank is used as a reference around which the best $r$ is searched.
Figure 2: Comparison results of LR-ADMM, TNNR-ADMM-TRY, LRISD-ADMM, and LRISD-ADMM-ADJUST on three images. The first column shows the original images. The second column shows the masked images, obtained by covering pixels of the original images in our test. The third column depicts images recovered by LR-ADMM. The fourth column depicts images recovered by TNNR-ADMM-TRY and LRISD-ADMM-ADJUST. The fifth column depicts images recovered by LRISD-ADMM, where the estimated $r$ is used directly. Note that the fourth column contains the same image for both methods, TNNR-ADMM-TRY and LRISD-ADMM-ADJUST, because the values of $r$ found by the two methods are the same; only the procedure by which they find $r$ differs. TNNR-ADMM-TRY searches for the best $r$ by testing all possible values (1–20), whereas LRISD-ADMM-ADJUST uses the estimated rank as a reference and searches around it for the best $r$.
6.3. The Effectiveness Analysis of SVE in Two-Dimensional Partial DCT: Synthetic Data

In Section 6.2, we showed that the $r$ estimated by LRISD-ADMM is very close to the best $r$ found by TNNR-ADMM-TRY on matrix completion problems. Here we further confirm the effectiveness of the proposed SVE of LRISD-ADMM by conducting experiments with a generic low-rank operator on synthetic data, where $\mathcal{A}$ is a two-dimensional partial DCT (discrete cosine transform) operator. We compare the best $r$ (the true rank) with the estimated rank under different settings.

In the results below, $r$, $r^*$, and $sr$ denote the rank of the matrix $M$, the estimated rank, and the sample ratio, respectively. We fix a nonzero noise level and the sample ratio in all of the tests; the noise is kept in order to illustrate the robustness of SVE to noise. Next, we compare $r$ with $r^*$ under different settings. For each scenario, we generate the model 3 times and report the results.
(i) We fix the matrix size and run LRISD-ADMM to show the relationship between the sample ratio and $r^*$. The results are shown in Figure 3(a).
(ii) We fix the matrix size and run LRISD-ADMM under different ranks $r$. The results are shown in Figure 3(b).
(iii) We fix $r$ and $sr$ and run LRISD-ADMM under different matrix sizes. The results are shown in Figure 3(c).

Figure 3: (a) The relationship between the sample ratio and the estimated rank (tested three times). Comparisons between the true rank and the estimated rank under different ranks are shown in chart (b), and under different matrix sizes in chart (c).

As shown in Figure 3, the proposed SVE is a rational and effective way to estimate $r$ in Step 1 of Algorithm 1. Even when there is considerable noise, the method remains valid; namely, $r^*$ is (approximately) equal to the real rank. That is to say, the proposed SVE is quite robust to noise corruption in the sample data. In practice, we obtain similarly good results in other settings; to save space, we only illustrate the effectiveness of SVE in the situations above.

6.4. The Comparison between LRISD-ADMM and LR-ADMM: Synthetic Data

In this subsection, we compare the proposed LRISD-ADMM with LR-ADMM on partial DCT data in general low-rank matrix recovery cases. We present numerical results showing the advantage of the proposed LRISD-ADMM in terms of recovery quality.

We evaluate the recovery performance by the Relative Error $\|X_{\mathrm{rec}} - M\|_F / \|M\|_F$. We compare the reconstruction error under different conditions, namely different noise levels ($\sigma$) and different sample ratios ($sr$), shown in Figures 4(a) and 4(b), respectively. In addition, we compare the ranks recovered by the two algorithms in Figure 4(c). For each scenario, we generate the model 10 times and report the average results.
(i) We fix the matrix size, the rank $r$, and the sample ratio $sr$ and run LRISD-ADMM and LR-ADMM under different noise levels $\sigma$. The results are shown in Figure 4(a).
(ii) We fix the matrix size and the rank $r$ and run LRISD-ADMM and LR-ADMM under different sample ratios $sr$. The results are shown in Figure 4(b).
(iii) We fix the matrix size, the sample ratio $sr$, and the noise level $\sigma$ and run LRISD-ADMM and LR-ADMM under different ranks $r$. The results are shown in Figure 4(c).

Figure 4: Comparison results of LRISD-ADMM and LR-ADMM on synthetic data. (a) Results under different noise levels ($\sigma$). (b) Results under different sample ratios ($sr$). (c) Recovered ranks under different true ranks ($r$).

It is easy to see from Figure 4(a) that, as the noise level increases, the relative error becomes larger. Even so, LRISD-ADMM achieves much better recovery performance than LR-ADMM, because the LRISD model approximates the rank function better than the nuclear norm does. Thus, LRISD-ADMM is more robust to noise when recovering low-rank matrices. As the sample ratio $sr$ increases, the relative error decreases in Figure 4(b). In general, LRISD-ADMM does better than LR-ADMM, because LRISD-ADMM can approximately recover the rank of the matrix, as shown in Figure 4(c).

6.5. The Comparison between LRISD-ADMM and LR-ADMM: Real Visual Data

In this subsection, we test three images, door, window, and sea, and compare the images recovered by the general LR-ADMM and by LRISD-ADMM under the partial DCT operator. Besides, we use the best $r$ in LRISD-ADMM. In all tests, we fix the sample ratio and the noise level. The SVE process during the different stages used to obtain $r^*$ is depicted in Figure 5. For the three images, we set the threshold and generate the estimates $r^*$ in LRISD-ADMM; the best $r$ can then be obtained. Figure 6 shows the recovery results of the two algorithms.

Figure 5: The process of SVE in LRISD-ADMM for the three images in Figure 6. Note that the first and the second singular values are much larger than the others, as are the corresponding difference values. To make the results clearer, we omit the first and the second singular values (and the corresponding differences) in each figure. The observed estimated ranks are 7, 9, and 8, while the best values of $r$ are 8, 10, and 7; the estimated $r^*$ is thus approximately equal to the best $r$.
Figure 6: Comparison results of LR-ADMM and LRISD-ADMM on three images. The masked images are obtained by applying the partial DCT to the original images. Images recovered by LR-ADMM and LRISD-ADMM are displayed in the third and fourth columns, respectively.

As illustrated in Figure 5, SVE returns a stable $r^*$ in merely three iterations, and the estimated $r^*$ is a good estimate of the number of the largest few singular values. From Figure 6, it can be seen that LRISD-ADMM outperforms the general LR-ADMM in terms of higher PSNR. More importantly, by visual inspection, the recoveries of LRISD-ADMM show better fidelity to the true images, in particular recovering sharp edges better.

6.6. A Note on the Threshold $\tau$

We note that the threshold $\tau$ plays a critical role in the efficiency of the proposed SVE. For real visual data and for synthetic data we use two different heuristic choices of $\tau$. These heuristics, which work well in our experiments, are certainly not necessarily optimal; on the other hand, it has been observed that LRISD is not very sensitive to $\tau$. Of course, the "last significant jump" based thresholding is only one way to estimate the number of the true nonzero (or large) singular values, and one can try other available effective jump detection methods [15, 40, 41].

7. Conclusion

This paper introduces singular value estimation (SVE) to estimate an appropriate $r$ (in $\|X\|_r$) such that the estimated rank is (approximately) equal to the best rank. In addition, we extend TNNR from matrix completion to general low-rank recovery (we call the resulting framework LRISD). Both synthetic and real visual data sets are discussed. Notice that SVE is not limited to thresholding: effective support detection guarantees the good performance of LRISD. Therefore, future research includes studying specific signal classes and developing more effective support detection methods.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work was supported by the Natural Science Foundation of China (Grant nos. 11201054 and 91330201) and by the Fundamental Research Funds for the Central Universities (ZYGX2012J118 and ZYGX2013Z005).

References

  1. J. Abernethy, F. Bach, T. Evgeniou, and J.-P. Vert, “Low-rank matrix factorization with attributes,” Tech. Rep. N-24/06/MM, 2006.
  2. Y. Amit, M. Fink, N. Srebro, and S. Ullman, “Uncovering shared structures in multiclass classification,” in Proceedings of the 24th International Conference on Machine Learning (ICML '07), pp. 17–24, June 2007.
  3. A. Evgeniou and M. Pontil, “Multi-task feature learning,” in Proceedings of the Conference on Advances in Neural Information Processing Systems, vol. 19, p. 41, The MIT Press, 2007.
  4. C. Tomasi and T. Kanade, “Shape and motion from image streams under orthography: a factorization method,” International Journal of Computer Vision, vol. 9, no. 2, pp. 137–154, 1992.
  5. M. Mesbahi, “On the rank minimization problem and its control applications,” Systems and Control Letters, vol. 33, no. 1, pp. 31–36, 1998.
  6. M. Fazel, H. Hindi, and S. P. Boyd, “Log-det heuristic for matrix rank minimization with applications to Hankel and Euclidean distance matrices,” in Proceedings of the American Control Conference, vol. 3, pp. 2156–2162, IEEE, June 2003.
  7. B. Recht, M. Fazel, and P. A. Parrilo, “Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization,” SIAM Review, vol. 52, no. 3, pp. 471–501, 2010.
  8. E. J. Candès and B. Recht, “Exact matrix completion via convex optimization,” Foundations of Computational Mathematics, vol. 9, no. 6, pp. 717–772, 2009.
  9. E. J. Candès and T. Tao, “The power of convex relaxation: near-optimal matrix completion,” IEEE Transactions on Information Theory, vol. 56, no. 5, pp. 2053–2080, 2010.
  10. J. Wright, Y. Peng, Y. Ma, A. Ganesh, and S. Rao, “Robust principal component analysis: exact recovery of corrupted low-rank matrices by convex optimization,” in Proceedings of the 23rd Annual Conference on Neural Information Processing Systems (NIPS '09), pp. 2080–2088, Vancouver, Canada, December 2009.
  11. K.-C. Toh and S. Yun, “An accelerated proximal gradient algorithm for nuclear norm regularized linear least squares problems,” Pacific Journal of Optimization, vol. 6, no. 3, pp. 615–640, 2010.
  12. J.-F. Cai, R. Chan, L. Shen, and Z. Shen, “Restoration of chopped and nodded images by framelets,” SIAM Journal on Scientific Computing, vol. 30, no. 3, pp. 1205–1227, 2008.
  13. J.-F. Cai, E. J. Candès, and Z. Shen, “A singular value thresholding algorithm for matrix completion,” SIAM Journal on Optimization, vol. 20, no. 4, pp. 1956–1982, 2010.
  14. Y. Hu, D. Zhang, J. Ye, X. Li, and X. He, “Fast and accurate matrix completion via truncated nuclear norm regularization,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 9, pp. 2117–2130, 2013.
  15. Y. Wang and W. Yin, “Sparse signal reconstruction via iterative support detection,” SIAM Journal on Imaging Sciences, vol. 3, no. 3, pp. 462–491, 2010.
  16. M. Fazel, H. Hindi, and S. P. Boyd, “A rank minimization heuristic with application to minimum order system approximation,” in Proceedings of the American Control Conference (ACC '01), vol. 6, pp. 4734–4739, June 2001.
  17. N. Srebro, J. D. M. Rennie, and T. S. Jaakkola, “Maximum-margin matrix factorization,” in Proceedings of the 18th Annual Conference on Neural Information Processing Systems (NIPS '04), pp. 1329–1336, December 2004.
  18. X. Yuan and J. Yang, “Sparse and low-rank matrix decomposition via alternating direction methods,” Tech. Rep., Nanjing University, Nanjing, China, 2009, http://math.nju.edu.cn/~jfyang/files/LRSD-09.pdf.
  19. J. F. Sturm, “Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones,” Optimization Methods and Software, vol. 11-12, no. 1–4, pp. 625–653, 1999.
  20. S. Ji and J. Ye, “An accelerated gradient method for trace norm minimization,” in Proceedings of the 26th International Conference on Machine Learning (ICML '09), pp. 457–464, ACM, June 2009.
  21. T. K. Pong, P. Tseng, S. Ji, and J. Ye, “Trace norm regularization: reformulations, algorithms, and multi-task learning,” SIAM Journal on Optimization, vol. 20, no. 6, pp. 3465–3489, 2010.
  22. Z. Wen, D. Goldfarb, and W. Yin, “Alternating direction augmented Lagrangian methods for semidefinite programming,” Mathematical Programming Computation, vol. 2, no. 3-4, pp. 203–230, 2010.
  23. M. R. Hestenes, “Multiplier and gradient methods,” Journal of Optimization Theory and Applications, vol. 4, no. 5, pp. 303–320, 1969.
  24. M. J. Powell, “A method for nonlinear constraints in minimization problems,” in Optimization, R. Fletcher, Ed., pp. 283–298, Academic Press, New York, NY, USA, 1969.
  25. D. Gabay and B. Mercier, “A dual algorithm for the solution of nonlinear variational problems via finite element approximation,” Computers & Mathematics with Applications, vol. 2, no. 1, pp. 17–40, 1976.
  26. J. Yang and X. Yuan, “Linearized augmented Lagrangian and alternating direction methods for nuclear norm minimization,” Mathematics of Computation, vol. 82, no. 281, pp. 301–329, 2013.
  27. R. Glowinski and P. Le Tallec, Augmented Lagrangian and Operator-Splitting Methods in Nonlinear Mechanics, vol. 9 of Studies in Applied and Numerical Mathematics, SIAM, 1989.
  28. R. Glowinski, Lectures on Numerical Methods for Non-Linear Variational Problems, Springer, 2008.
  29. C. Chen, B. He, and X. Yuan, “Matrix completion via an alternating direction method,” IMA Journal of Numerical Analysis, vol. 32, no. 1, pp. 227–245, 2012.
  30. S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, 2010.
  31. A. Argyriou, T. Evgeniou, and M. Pontil, “Convex multi-task feature learning,” Machine Learning, vol. 73, no. 3, pp. 243–272, 2008.
  32. J. Abernethy, F. Bach, T. Evgeniou, and J.-P. Vert, “A new approach to collaborative filtering: operator estimation with spectral regularization,” Journal of Machine Learning Research, vol. 10, pp. 803–826, 2009.
  33. G. Obozinski, B. Taskar, and M. I. Jordan, “Joint covariate selection and joint subspace selection for multiple classification problems,” Statistics and Computing, vol. 20, no. 2, pp. 231–252, 2010.
  34. Y. Nesterov, “Gradient methods for minimizing composite objective function,” ECORE Discussion Paper 2007076, 2007.
  35. A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM Journal on Imaging Sciences, vol. 2, no. 1, pp. 183–202, 2009.
  36. Y. Nesterov, “A method of solving a convex programming problem with convergence rate O(1/k2),” Soviet Mathematics Doklady, vol. 27, pp. 372–376, 1983.
  37. B. He, M. Tao, and X. Yuan, “Alternating direction method with Gaussian back substitution for separable convex programming,” SIAM Journal on Optimization, vol. 22, no. 2, pp. 313–340, 2012.
  38. Z. Lin, R. Liu, and Z. Su, “Linearized alternating direction method with adaptive penalty for low-rank representation,” in Proceedings of the Advances in Neural Information Processing Systems (NIPS '11), pp. 612–620, 2011.
  39. B. S. He, H. Yang, and S. L. Wang, “Alternating direction method with self-adaptive penalty parameters for monotone variational inequalities,” Journal of Optimization Theory and Applications, vol. 106, no. 2, pp. 337–356, 2000.
  40. A. K. Yasakov, “Method for detection of jump-like change points in optical data using approximations with distribution functions,” in Ocean Optics XII, Proceedings of SPIE, pp. 605–612, Bergen, Norway, 1994.
  41. J.-H. Jung and V. R. Durante, “An iterative adaptive multiquadric radial basis function method for the detection of local jump discontinuities,” Applied Numerical Mathematics, vol. 59, no. 7, pp. 1449–1466, 2009.