#### Abstract

The phase retrieval (PR), reconstructing an object from its Fourier magnitudes, is equivalent to a nonlinear inverse problem. In this paper, we proposed a two-step algorithm that traditional ER/HIO iteration plays as the coarse feature reconstruction, whereas the KSVD-based inpainting technique deals with the fine feature set accordingly. Since the KSVD allows the content of oversampled dictionary with sparse representation to adaptively fit a given set of object examples, as long as the ER/HIO algorithms provide decent object estimation at early stage, the pixels violating the object constraint can be restored with superior image quality. The numerical analyses demonstrated the effectiveness of ER + KSVD and HIO + KSVD through multiple independent initial Fourier phases. With its versatility and simplicity, the proposed method can be generalized to be implemented with more PR state-of-the-arts.

#### 1. Introduction

In most optical imaging applications, the globally spectral phase encodes critical structural information about the object. However, the optical detection devices merely record the magnitude squared of the field from a diffraction pattern and cause to loss the phase information. In order to recover an object field from its diffraction intensity, phases in the spectral domain need to be estimated. This is a basic definition about the phase retrieval (PR) problem. Such problem has been an active research topic in numerous engineering and scientific applications [1–8]. Generally, PR is a nonlinear inverse ill-posed problem; therefore, prior information of the underlying object is necessary to enable its recovery. The existing PR algorithms can be roughly categorized into the following classes: general iteration [9–18], sparse-based and compressed sensing methods [19–23], and convex and nonconvex optimization methods [24–27], respectively.

The classical PR algorithm subject to the general iteration was proposed by Gerchberg and Saxton [9], Yang and Gu [10], and Fienup [11, 12], respectively. The core idea is to start with an initial guess about the Fourier phase and then project the estimate onto the object domain (subject to spatial constraint) and the Fourier domain (subject to Fourier constraint) alternatively until the convergence criterion is reached. The Fourier constraint guarantees that the spectrum of the reconstructed object has the same Fourier magnitude as the measurement. Meanwhile, finitely supported, real-valued, and nonnegative priors of the object are usually imposed as the object constraints. The most popular and simplest PR iterative algorithms for a single measurement problem are the error-reduction (ER) and the hybrid input-output (HIO) algorithms, respectively. To successfully retrieve the Fourier phase, the oversampling condition, i.e., finer than the Nyquist interval, should be fulfilled in the Fourier dimension in both ER and HIO [1, 17]. Various extended versions, such as relaxed averaged alternating reflection algorithm [18] and difference map algorithm [13], were subsequently developed to improve either the numerical robustness or estimation performance. Recently, inspired by the compressed sensing [28], the sparsity priors were incorporated into the object constraint. Experimental results demonstrated that the sparsity-based iterative methods have superior performance over the traditional ones with no sparsity priors [20, 21]. Since the sparse prior requires an overcomplete representation, the dictionary learning for the PR algorithm was proposed based on adaptive sparse representation and self-learning [22]. To reduce the learning time, the transfer orthogonal sparsifying transform learning method was developed to avoid inefficient updates at the early iterations [23]. The last point of view about PR algorithms is to formulate the PR as a nonconvex problem. The gradient descent method is employed to solve such formulated problem [24]. These schemes can guarantee the convergence to a stationary solution. On the other hand, nonconvex problems can be transformed into convex relaxations and be solved with convex optimization methods [25–27]. However, the mathematical theory is not yet developed to guarantee convergence for the nonconvex constraints. Interested readers can refer to [29] for an extensive bibliography and complete discussion about the PR problem.

In order to exhibit the effectiveness of sparse representation, in this paper, we proposed a novel algorithm associated with image inpainting technique as the extensive versions of two classical iterative PR algorithms (ER and HIO), respectively. The activation of object constraints in ER is periodically imported into the ER and HIO algorithms so that pixels of the estimated object violating the object constraints will be assigned zero values. Zero values in an image can be taken as the missing pixels. In order to recover missing pixels, the image inpainting technique was developed. Here, the *K*-singular value decomposition (KSVD) method [30], a dictionary learning algorithm for image inpainting, is adopted to restore these missing pixels. As a consequence, the ER and HIO algorithms combining with the KSVD algorithm (ER + KSVD and HIO + KSVD) were proposed. Since all of the ER, HIO, and KSVD algorithms had been proved to converge [11, 30], the proposed methods can guarantee the convergence evidently. The cost function of relative squared error in numerical simulation validates that our method is able to avoid stagnation, get lower error values, and outperform the ER and HIO benchmark.

The rest of this paper is organized as follows. In Section 2, related work and the KSVD method are briefly reviewed. Our main framework is also provided. In Section 3, the performance of the proposed methods is compared with classical PR methods, whereas further discussion is also addressed. Finally, conclusions and future works are summarized in Section 4.

#### 2. Prior Works and Our Main Framework

For the purpose of completeness of this paper, we firstly defined the PR problem from a single intensity measurement and two classical algorithms, ER and HIO, respectively. Subsequently, the sparse representation model for restoring missing pixels though the KSVD method was introduced. Finally, performance and evaluation through the proposed algorithms ER + KSVD and HIO + KSVD were discussed accordingly.

##### 2.1. Problem Formulation

The PR problem, aiming to reconstruct an object from the measured Fourier magnitude, attracts a growing interest in science and engineering [1–8]. For the optical detection devices (e.g., charge-coupled device cameras), only the intensity distribution is available. The measured Fourier magnitude can always be obtained through the square root of the intensity and related to its counterpart in the spatial domain by the Fourier transform:where , , and represent a two-dimensional (2D) spatial coordinate, a 2D frequency coordinate, and the Fourier transform operator, respectively. Basically, PR problems focus on how to find the phase information in the Fourier domain from the Fourier measurement . Once the 2D Fourier phase is recovered, accompanying the measurement , we can apply the inverse Fourier transform :to reconstruct the original object accordingly.

##### 2.2. ER and HIO Algorithms

The ER and HIO algorithms involve alternate Fourier transform between the spatial and Fourier domains; both fields are subject to either prior constraints (e.g., real and nonnegative) or the measured data, as depicted in Figure 1.

Given a measured Fourier magnitude and a set of object constraints, the procedure of the ER algorithm consists of four steps as follows:where is the iteration and is a set of pixels at which satisfies the object constraints. In this article, we assumed the object constraint as a positive real-valued object within a finite supported region, i.e., the estimate outbound the known diameter of the object will be assigned zero. In Figure 1 and equations (3a)–(3d), and represent the first/final approximation in one iteration cycle to the original object and Fourier spectrum , respectively. is the estimate of the Fourier phase of the original object. The four steps complete one iteration cycle and continue until reaching either the maximum number of iterations or the cost function below the threshold value.

Based on the ER algorithm which discards all the violating pixel values, Fienup proposed the HIO algorithm with a memory mechanism to speed up the convergence. The iterative process of HIO at iteration is the same as the ER algorithm except for the third step in equation (3c):where is a constant feedback rate and usually selected to be less than 1, making the estimated object smoothly approaches toward the desired solution. An important feature of HIO lies in that its ability to converge to a global optimum instead of a local optimum. However, being vulnerable to noise, the image quality of recovery through the HIO algorithm is sometimes not satisfactory.

Since the Fourier magnitude is available in measurement, the error is accessible in the Fourier domain, whereas the convergence of both ER and HIO algorithms can be monitored by the relative squared error (RSE):which is used to evaluate how well the estimated Fourier magnitude matching the given Fourier magnitude . According to Fienup’s work [11], the error by which the estimated Fourier magnitude violates the given Fourier magnitude is monotonically decreasing. Therefore, the ER and HIO algorithms are convergent.

In Figure 1, the activations for forcing the estimated object to conform the object constraints in ER and HIO algorithms are denoted by and , respectively. In this paper, we regarded the pixel values violating the object constraints in equation (3c) as the missing pixels. Rather than totally discard (ER) or modification (HIO), we try to extract the useful image features through the sparse representation from the overcomplete dictionary. Such scheme has been well adopted in various image inpainting techniques.

##### 2.3. KSVD Method for Image Restoration

The image inpainting technique was developed to restore missing or damaged areas in an image. Among various image inpainting methods [30, 31], here we selected the KSVD method to restore the missing pixels which were those violating the object constraints in the PR problem. KSVD, generalizing the *K*-means clustering process, is able to represent signals as linear combinations of a few atoms from a dictionary matrix. The atom in compressive sensing is usually represented as a column vector of the dictionary matrix. A proper dictionary matrix can be obtained from updating dictionary atoms at each iteration to better fit the data, whereas the initial dictionary matrix usually utilizes the 2D discrete cosine transform. To derive combinations of a few atoms, sparse representations via pursuit algorithm are introduced. Therefore, the KSVD algorithm alternates between the sparse coding and dictionary update stages. The properties of the updated dictionary atoms set the limits on the sparse coefficient vector. The outline of the whole KSVD process is shown in Figure 2.

For the sake of paper completeness and to clarify the significance of KSVD for sparse representation, here we briefly review its math framework. For further details, interested readers can refer to the literature [30]. Using an overcomplete dictionary matrix that contains prototype atoms for columns , a signal can be represented as a sparse linear combination of these atoms in the latent space. The representation of is approximated as . The vector contains the representation coefficients of the signal . Since is smaller than , there are infinite solutions for this representation problem. With appropriate regularization [30], the imposition of the fewest number of nonzero coefficients set on the solution is likely to have a satisfactory representation.

When an image contains some missing pixels, the image can be firstly divided into patches of the same size and make up an example set , as shown in Figure 3. The extracted patches can be overlapping (or nonoverlapping) and converted to the column vectors. Each small patch , with dimension , can be approximately represented as the linear combinations of a few atoms from a dictionary matrix . The representation error per is defined as , where is the coefficient vector of linear combinations. Then, the mean square error (MSE) of the overall representation is given bywhere and is the Frobenius norm. To search for the best possible dictionary for the sparse representation of the example set , the inpainting problem is therefore formulated as follows:where denotes the number of nonzero elements in a vector. As the regularization, the coefficient vector has a maximum number of nonzero entries .

To solve the objective function expressed in equation (7), the KSVD algorithm iteratively alternates between the sparse coding and dictionary update stages, as shown in Figure 2. For the sparse coding stage, the dictionary matrix is fixed and the pursuit algorithms are employed to adequately compute the representation vector for each example [32, 33]. Subsequently, the process of updating the dictionary matrix can be done by using the SVD. Since the dictionary matrix consists of columns, KSVD updates the learning dictionary by computing SVD times, each determining one column of the matrix. Carrying out a series of steps has capability to make MSE monotonically decreasing; therefore, the convergence of KSVD can be guaranteed [30]. Here, we exhibited an example of recovering missing pixels in an image by using the KSVD method. With 25% missing pixels and PSNR = 13.77 dB shown in Figure 4(b), we successfully filled the zero flawless and improve the image quality to PSNR = 31.56 dB. The KSVD method can effectively recover missing pixels as well as retrieve the latent features from a noisy image with sparse representation. Since the missing pixels can be associated with those pixels violating the object constraints in ER, we applied the KSVD method upon ER/HIO iterations and examined its effectiveness in the PR problem.

**(a)**

**(b)**

**(c)**

**(d)**

##### 2.4. Motivation and Our Methods

Typically, a reconstruction with the least RSE is not always representative in terms of image. In this paper, in order to ensure that the ER and HIO algorithms will not stick in the minimum error, the image inpainting technique is periodically applied to perturb the convergence behavior of these two PR algorithms. Introducing the image inpainting technique can provide the PR algorithm with an ability to escape the local optimums. In addition, the recovered missing pixels using the image inpainting technique in the current iteration can render a finer representing feature set of the object for the next PR iteration. The core ideas of connecting the missing pixels with the ER and HIO algorithms, referred to ER + KSVD and HIO + KSVD, are, respectively, discussed as follows.

For the ER algorithm, the pixels in which the first estimated object violates the object constraints in ER criterion will be set to zeros, as expressed in equation (3c). These zero values in the estimated object were viewed as the missing pixels, marked red in Figure 5(a). Then we carried out the KSVD method to restore these missing pixels. The impact of the image inpainting method on the ER algorithm is exhibited in Figure 5. The object was not only inpainted with higher PSNR but also provide finer features for the next PR iteration. The procedure of the ER algorithm combining the KSVD method (ER + KSVD) is illustrated in Figure 6. Here, is the maximum number of iterations for the ER algorithm, is the iteration, and indicates that KSVD is executed every iterations for iterations. The ER + KSVD algorithm turns to the red path every iterations for inpainting pixels in the object , which is obtained by passing the estimated object through the activation and hence has missing pixels. Taking the object as an example set , the small patch can be approximately represented as linear combinations of a few atoms, . Assembling all small patches corresponding to their sparse coefficient vectors , for , the missing pixels in can be retrieved by the multiplication . Coefficients of sparse representations are summarized in the matrix , and the dictionary matrix is learned from the given example set at each iteration. Therefore, a well feature set of the object, represented as , is then transformed back to the Fourier domain for the next PR iteration. It is noted that only the pixels within the known support were recovered by KSVD. Zero pixels were preserved outside the support. The other iteration steps were kept the same as the ER algorithm, and we repeated equations (3a)–(3d) and then periodically carried out the aforementioned KSVD until reaching a satisfactory minimal RSE or the given maximal number of PR iterations.

**(a)**

**(b)**

The mechanism of passing through KSVD can be viewed as an additional perturbation of the magnitude in the spatial domain, and its mathematical expression is as follows:where is a recovered missing pixel as well as denoised pixel by means of the KSVD. The proposed method with spatial magnitude perturbation could be effective to make the PR algorithm avoid converging to a local optimum.

Similarly, we employed the HIO + KSVD algorithm to examine its effectiveness in the PR problem. The activation was introduced in every iterations, and then, the violating pixels were equivalently defined as the missing pixels before executing the KSVD algorithm, as illustrated in Figure 6.

Again, we connected the proposed KSVD scheme to the traditional PR iteration of equations (3a), (3b), (5), and (3d) until reaching a satisfactory minimal RSE or the given maximal number of PR iterations. The only difference between the ER + KSVD and HIO + KSVD algorithms is the activation in the object constraints during the PR iteration. Since the KSVD allows the content of oversampled dictionary matrix with sparse representation to adaptively fit a given set of object examples, the proposed method can be generalized to be implemented with many PR state-of-the-arts.

#### 3. Numerical Simulations

In this section, we compared the performance of the five PR algorithms: ER, HIO, ER + HIO, ER + KSVD, and HIO + KSVD, and demonstrated the effectiveness of image inpainting technique during the PR iteration. All the cases discussed here are a single measurement problem, i.e., only the Fourier magnitude and real, nonnegative, and finitely supported object constraints are available. A well-known objective image quality metric, peak signal-to-noise ratio (PSNR), was utilized to evaluate the performance of PR algorithms between the reconstructed image and ground truth.

In order to retrieve phase information in the Fourier domain, the oversampling condition should be fulfilled [1, 15, 17], as shown in Figure 7(b). The sampling rate, defined asis to be at least the Nyquist rate. Figure 7(a) shows that the original object image size was zero-padded to avoid aliasing effects. Under this condition, the support of the object image must be well known to reconstruct the object from the oversampled diffraction pattern in the Fourier domain. In order to quickly examine the proposed method, here we predefined locations of the object image’s boundary edges to get the tight support region, as indicated by the red rectangle in Figure 7(a). One can also use the autocorrelation of the object to get a support region. This finite support is finally used in the object constraint to force the estimated object to zero when it locates outside the support region during the PR iteration.

**(a)**

**(b)**

Given the oversampled diffraction pattern in the Fourier domain and the object constraints (real, nonnegative, and finitely supported) in the spatial domain, the simulation results of four test cases through the five PR algorithms are shown in Figure 8. The maximal number of PR iterations was set as a stopping criterion for the five PR algorithms, and KSVD is executed every 160 iterations . In the ER + HIO algorithm, ER and HIO alternatively execute 200 iterations in each cycle until the maximum number of PR iterations is reached. Note that the five PR algorithms were given the same initial Fourier phase at the baseline. With a decent initial phase, the classical ER or HIO algorithm can provide well appearance of the object at early stage, and the proposed methods (ER + KSVD and HIO + KSVD) subject to this reconstructed object (after 160 iterations in ER or HIO) can further improve image quality with additional KSVD scheme. Nonetheless, the results with ER + KSVD are likely to fail, as shown in the third and fourth rows in Figure 8. No matter how we increased the iteration number, the result with ER + KSVD is still poor. The reason lies in that the ER + KSVD highly depends on the ER at earlier stages. If ER could not provide finer features of the object before executing KSVD, KSVD dictionary cannot self-learn from such poor data sample. In order to overcome this issue, we can impose another initial random guess. Likewise, the performance of HIO + KSVD depends on the HIO, which provides the prominent feature set of the object at earlier stages.

The measured diffraction pattern can be set as the ground truth, and we examined the RSE evolution during the PR iteration, as shown in Figure 9. Under the test objects “NCTU”, RSEs with KSVD-based have periodical spikes, whereas the perturbation enables the algorithm to jump out of the stagnant local minima. The results agree with the aforementioned statement that the KSVD can further improve the image quality with moderate initial conditions. To further manifest the advantage of the KSVD-based scheme, we applied another metric, structural similarity index (SSIM), to examine the reconstructed image quality. As shown in Figure 9(b), SSIM computes the structural similarity index for the estimated Fourier magnitude using the measured Fourier magnitude as the reference image. Since KSVD was developed to attain the image features through a self-learning scheme, it is no surprise that SSIM is significantly increased as KSVD was executed at each iteration cycle.

**(a)**

**(b)**

Another advantage of the proposed KSVD-based scheme lies in its robustness against the noise. When noise was added to contaminate the diffraction intensity, the noise redistributed in the spatial domain would lead to reconstruction failure [17]. To demonstrate the KSVD-based method is superior to the classical PR algorithms under the noise case, we refer to the noise level defined in Rodriguez’s work:where represents the noise-free Fourier magnitudes and the Fourier magnitudes with noise [15]. In this study, Poisson noise is added to the diffraction intensity with about . Various reconstruction results obtained from the noisy diffraction patterns are shown in Figure 10. All the five PR algorithms were given the same initial Fourier phase for fair comparison. The maximal number of PR iterations and the iteration cycle for executing KSVD are set to be and , respectively. In the ER + HIO algorithm, ER and HIO alternatively execute 150 iterations in each cycle until the maximum number of PR iterations is reached. According to PSNR of the reconstructed images, the additional KSVD function effectively improved the classical ER and HIO methods.

At this moment, we were wondering which value of the iteration cycle is the best to periodically apply KSVD in our proposed methods. Given the same initial Fourier phases, reconstruction results by using the ER + KSVD and HIO + KSVD algorithms with different iteration cycles (from 16 to 160) are shown in Figure 11.

For the ER + KSVD method, no significant improvement was observed under different iteration cycles. On the other hand, it is obvious that the best image quality is obtained when in the HIO + KSVD method. The reason lies in that HIO has higher reconstruction performance than ER; moreover, the estimated objects violating object constraints should be fixed by KSVD immediately when they were recognized at the early stage. In this work, the intrinsic object was restored after several times of the classical PR iteration rather than alternating the PR and KSVD algorithms [20]. The reason for such two-step procedure lies in that the KSVD is time-consuming, and we suggest to allow HIO/ER iteration as the coarse feature reconstruction, whereas KSVD deals with the fine feature set accordingly.

Figure 12 shows the final RSEs (corresponding to reconstructed objects in Figure 11) under different iteration cycles. In each case, the ER + KSVD and HIO + KSVD algorithms reach the maximum number of PR iterations . Note that since the initial Fourier phase is randomly generated, the reconstruction results of the portrait from the ER + KSVD and HIO + KSVD algorithms using the same iteration cycle in Figures 8 and 11 have different reconstructed image quality. Besides, executing the KSVD algorithm is time-consuming, so we compare the computation time of the proposed methods for different iteration cycles to trade off the speed and reconstructed image quality. As shown in Figure 12(a), the ER + KSVD method using different iteration cycles unfortunately has no explicit varying trend of RSE (monotonically increasing or decreasing) to determine the best iteration cycle . This result makes sense because the reconstructed image quality from ER + KSVD has no significant improvement. In contrast, the HIO + KSVD method has a performance plateau which is reached at about .

**(a)**

**(b)**

#### 4. Conclusion

In this paper, we presented a new framework with ER + KSVD and HIO + KSVD to improve the classical ER and HIO algorithms for a single measurement PR problem. The core concept lies in that the estimated object is periodically processed through the activation , whereas the pixels violating object constraint were defined as the missing pixels. These missing pixels were then inpainted through the KSVD method, which is a sparse dictionary learning-based approach for image inpainting. The proposed two-step procedure lies in that the KSVD is time-consuming, and we suggest to allow HIO/ER iteration as the coarse feature reconstruction, whereas KSVD deals with the fine feature set accordingly. Since the KSVD allows the content of oversampled dictionary matrix with sparse representation to adaptively fit a given set of object examples, as long as the ER/HIO algorithms provide decent object estimation at early stage, the recovered missing pixels through KSVD can be restored with finer features of the object information. In this paper, we examined the effectiveness of the proposed scheme with more than 20 different and independent initial Fourier phases. The numerical analyses exhibited superior PR capability over the traditional ER or HIO algorithm alone. There were two possibilities for future work: one is to apply the state-of-the-art image inpainting techniques upon the PR process and extract more object features. The other one is with complex-valued objects, which would change the object constraint from this work (real, nonnegative, finitely supported) to a more complete discussion.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This work was financially supported by the Ministry of Science and Technology (MOST) of Taiwan Government under grant no. 107-2221-E-009-115-MY3.