Abstract

Image super-resolution (SR) is a classical ill-posed image processing problem that aims to generate high-resolution (HR) images from given low-resolution (LR) instances. Recent SR works aim to find an elaborate convolutional neural network (CNN) design and regard it as an end-to-end filter that maps images from the LR space to the HR space. However, few of them concentrate on the mathematical justification of the network design or consider the problem from an optimization perspective. In this paper, we investigate image SR based on the Landweber iteration method, an effective optimization method for finding feasible solutions to ill-posed problems. By considering the issue from the optimization perspective, we design a corresponding Landweber iteration-inspired network that adaptively learns the parameters and finds the HR results. Experimental results show that the proposed network achieves competitive or better subjective and objective performance than other state-of-the-art methods with fewer parameters and computational costs.

1. Introduction

Single image super-resolution (SISR), one of the classical image processing issues, has been widely investigated in recent years. Given a low-resolution (LR) image, the task of SISR is to generate a corresponding high-resolution (HR) instance with satisfactory visual quality [1]. Image super-resolution (SR) has been widely applied in numerous scenarios, such as image inpainting [2], self-driving [3], pose detection [4], underwater image enhancement [5], video deinterlacing [6], and recognition [7]. Figure 1 shows an example of image super-resolution: Figure 1(a) shows the given low-resolution image and the original high-resolution instance, and Figure 1(b) shows the restored SR images.

Traditional SISR methods usually find a feasible solution by interpolation or compressed sensing. Zhang et al. proposed a contourlet-based interpolation method for SISR [8]. There are also works conducting interpolation with the help of the wavelet transform [9]. Recently, rational fractal interpolation has also been considered for image SR [10]. These works aim to design a filter to process the image but lack a learning step for finding the statistical correlation between LR and HR images. Compressed sensing for SISR, which learns a mapping between LR and HR images from training data, has also been fully investigated by researchers. Chen et al. utilized K-SVD for dictionary learning and achieved good restoration performance [11]. Zhao et al. considered local manifold projection in compressed sensing for more accurate results [12]. Compressed sensing-based methods highly depend on the choice of the regularization term. Furthermore, their hyper-parameters require manual intervention and lack generality.

With the rapid development of deep learning, convolutional neural networks (CNNs) specially designed for SISR have recently achieved state-of-the-art performance. SRCNN [13] is the first CNN-based method for SISR, with a three-layer neural network that simulates the steps of compressed sensing. After SRCNN, numerous works have devised elaborate blocks to build CNNs deeper and wider. VDSR [14] proposed a very deep network with residual connections to improve performance. EDSR [15] removed batch normalization and built the network with residual blocks [16]. In recent years, RDN [17], DRN [18], RFANet [19], and other works have designed effective blocks to achieve state-of-the-art performance. However, these works concentrate only on the network architecture and almost entirely neglect the mathematical justification of the design.

There are also works considering SISR from the optimization perspective. IRCNN [20] analyzed image restoration with the half-quadratic splitting (HQS) strategy and designed an end-to-end network for multitask image restoration. Hu et al. devised an end-to-end network inspired by the alternating direction method of multipliers (ADMM) [21]. Liu et al. also proposed a novel network termed ISRN [22] for restoration, which is inspired by the HQS strategy. However, these works only use stacked blocks to build the solver of the iterative formulations, and do not let the optimization method guide the block design itself.

In this paper, we design a novel Landweber iteration [23] inspired network for SISR, termed LandNet. Different from other iterative optimization-inspired CNN-based methods, the proposed LandNet unfolds the iteration steps into sequential network blocks and forms a straightforward network that generates HR images from LR instances. Specifically, we devise a novel block inspired by the formulation of the Landweber iteration and use convolutional layers to simulate the optimization step of each iteration. Based on the unfolded blocks, an end-to-end network is established for effective restoration. Experimental results show that the proposed LandNet achieves competitive or better subjective and objective performance than other state-of-the-art methods with fewer parameters and computational cost.

The contributions of this paper can be summarized as follows:
(i) We analyze image super-resolution from the optimization perspective and design a novel block for restoration inspired by the Landweber iteration method.
(ii) We design an unfolding network for end-to-end image super-resolution based on the proposed optimization block, termed LandNet.
(iii) Experimental results show that the proposed LandNet achieves competitive or better subjective and objective performance than other works with fewer parameters and computational cost.

2. Related Works

Image processing and analysis is an important task in signal processing [24, 25]. As a classical issue in image processing, the task of image super-resolution (SR) is to generate high-resolution (HR) images from low-resolution (LR) instances [26–28]. In recent years, convolutional neural networks (CNNs) have demonstrated remarkable performance on image SR. SRCNN [13] is the first CNN-based image super-resolution network and achieves a large improvement over traditional works. FSRCNN [29] improves the speed of SRCNN and utilizes a deeper network to obtain better performance. ESPCN [30] proposes a subpixel convolution strategy to resize the feature map and substitutes it for the deconvolutional layer; this strategy has been widely used in recent works. VDSR [14], EDSR [15], and SRDenseNet [31] build networks deeper and wider for better performance. In recent years, more and more elaborate blocks have been proposed for effective restoration. RDN [17] combines residual [16] and dense [32] connections and designs a novel residual dense block for image restoration. RCAN [33] investigates a residual-in-residual block with channel-wise attention [34] and achieves state-of-the-art performance. SAN [35] utilizes a second-order attention network to focus on the inherent correlations among features. RFANet [19] aggregates residual information to improve the network representation. Based on residual aggregation, RFDN [36] is also proposed for effective lightweight image super-resolution. These works build networks with well-designed blocks and achieve good performance. Cross-SRN [7] designs a cross convolution to enhance edge information recovery. Zhu et al. devised a GAN-based method for perceptual-oriented image super-resolution [37]. Recently, there have been effective networks specially designed for image and video super-resolution. Mei et al. combined nonlocal attention and sparse representation and proposed a novel nonlocal sparse attention with a dynamic sparse attention pattern for image super-resolution, which achieved state-of-the-art performance [38]. Jiang et al. investigated effective connections between different network modules and proposed a hierarchical dense recursive network for image super-resolution [39]. Zhang et al. considered attention over the context and developed a context reasoning attention network for image super-resolution [40]. Progressive exploration and generative adversarial networks were also investigated by Yi et al. for super-resolution [41]. Zhang et al. also analyzed image super-resolution from the fluid micelle perspective and proposed FMNet for image super-resolution [42]. However, these works do not concentrate on the mathematical analysis of image super-resolution and only use end-to-end CNNs to map low-resolution (LR) images to high-resolution (HR) instances.

Recently, researchers have begun to investigate the interpretation of neural networks and to design networks based on mathematical analysis. ADMM-Net [43] provides an optimization-perspective analysis of compressive sensing MRI and designs an end-to-end network to simulate the alternating direction method of multipliers (ADMM) operation. IRCNN [20] proposes a general analysis of the image restoration issue and designs a denoiser prior for solving the problem based on the half-quadratic splitting (HQS) strategy. Ma et al. propose an ADMM-based unfolding network for image super-resolution [44]. Tuo et al. also consider real aperture super-resolution with an ADMM-based solver [45]. ISRN [22], investigated recently for single image super-resolution, utilizes HQS to build an end-to-end iterative network. Zhang et al. rethink the degradation model of image super-resolution and propose a plug-and-play super-resolution network for arbitrary downsampling situations [46]. However, these works only regard the iterative scheme as guidance for designing the network pipeline and neglect the block architecture itself.

3. Methodology

In this section, we use the Landweber iteration method [23] to analyze the image SR issue and design an end-to-end network to restore the image. Given a low-resolution (LR) image $y \in \mathbb{R}^{h \times w \times c}$, the task of image super-resolution is to find a corresponding high-resolution (HR) image $x \in \mathbb{R}^{sh \times sw \times c}$ satisfying

$$y = Ax, \tag{1}$$

where $A$ is the degradation operation, $h$, $w$, and $c$ are the height, width, and channels of the image, respectively, and $s$ is the scaling factor. Usually $A$ can be represented as a matrix. If $A$ is a nonsingular matrix, then there is a specific solution for $x$:

$$x = A^{-1}y. \tag{2}$$

However, due to the highly ill-conditioned property of the image SR problem, the solution of (1) is sensitive to noise. In particular, if $A$ is a singular matrix, there is no exact solution for $x$. As such, we try to find a least-squares super-resolution solution $x^*$ that satisfies

$$x^* = \arg\min_{x} \|Ax - y\|_2^2. \tag{3}$$

To solve this issue, we consider the fixed-point equation

$$x = x + A^*(y - Ax), \tag{4}$$

where $A^*$ is the adjoint (conjugate transpose) of $A$. The fixed-point equation converges at the point where $A^*(y - Ax) = 0$, i.e., where $x$ satisfies $A^*Ax = A^*y$.

As such, we use the iterative method to gradually update $x$ and find a feasible solution as

$$x_{k+1} = x_k + \lambda A^*(y - Ax_k), \tag{5}$$

where $x_k$ is the result of the $k$-th iteration and $\lambda$ is the step size.

Equation (5) can be rewritten as

$$x_{k+1} = (I - \lambda A^*A)x_k + \lambda A^*y, \tag{6}$$

which can be regarded as a gradient descent step on the least-squares objective (3). As the iteration steps $k$ accumulate, the accuracy of the solution becomes higher.
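To make this concrete, the following toy example (our illustrative sketch, not the authors' code) runs the Landweber iteration (5) against a 1-D average-downsampling matrix $A$; the signal, step size $\lambda$, and iteration count are arbitrary choices for demonstration.

```python
import numpy as np

# Toy illustration of the Landweber iteration (5) on a 1-D
# super-resolution problem. A is a 2x average-downsampling matrix,
# so it is wide (more unknowns than measurements) and has no inverse.
n, s = 64, 2                                  # HR length, scaling factor
A = np.zeros((n // s, n))
for i in range(n // s):
    A[i, s * i : s * (i + 1)] = 1.0 / s       # each LR sample averages s HR samples

rng = np.random.default_rng(0)
x_true = np.cumsum(rng.standard_normal(n))    # smooth-ish HR signal
y = A @ x_true                                # LR observation

lam = 1.0 / np.linalg.norm(A, 2) ** 2         # step size; 0 < lam < 2/||A||^2 ensures convergence
x = np.zeros(n)                               # x_0
for k in range(200):
    x = x + lam * A.T @ (y - A @ x)           # x_{k+1} = x_k + lam * A^T (y - A x_k)

print("residual ||Ax - y||:", np.linalg.norm(A @ x - y))  # decreases toward 0
```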

The iteration mechanism inspires us to design a network to solve the problem, which learns the matrix representation from the training data pairs. Our block is designed based on (6). Figure 2 shows the block design for the Landweber iteration step. Three residual blocks [16], two convolutional layers, and one ReLU activation are designed to calculate the $\lambda A^*Ax_k$ term. The residual block follows the same design as EDSR [15], which is composed of two convolutional layers, one ReLU activation, and a skip connection. After the calculation, the $\lambda A^*Ax_k$ term is subtracted from the input, and the $\lambda A^*y$ term is added.
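Under this reading of Figure 2, a minimal PyTorch sketch of the LandBlock could look as follows; the class names, layer ordering, and the argument `g` (the $\lambda A^*y$ term, computed once from the head features, see (8) below) are our assumptions.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """EDSR-style residual block: conv-ReLU-conv plus a skip connection."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)

class LandBlock(nn.Module):
    """One Landweber step (6): F_out = F_in - H(F_in) + G, where H(.)
    simulates lambda * A^* A and G simulates lambda * A^* y."""
    def __init__(self, channels: int):
        super().__init__()
        # Three residual blocks, two convolutions, and one ReLU compute H(F_in);
        # the exact ordering is our assumption.
        self.H = nn.Sequential(
            ResBlock(channels), ResBlock(channels), ResBlock(channels),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, f, g):
        return f - self.H(f) + g  # subtract the computed term, add the A^* y term
```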

Figure 3 shows the entire network design. First, one convolutional layer processes the input $y$ and gets the extracted feature map $F_0$, which is described as

$$F_0 = f_{\mathrm{conv}}(y). \tag{7}$$

Then, four convolutional layers and three ReLU activation layers are utilized to calculate the $\lambda A^*y$ term $G$ from the extracted feature map as

$$G = f_{A^*y}(F_0), \tag{8}$$

where $f_{A^*y}(\cdot)$ denotes the CNN layers.

There are $T$ Landweber iteration blocks (LandBlocks) in the network to perform the optimization according to (6). For the $k$-th iteration, there is

$$F_k = \mathcal{B}_k(F_{k-1}, G) = F_{k-1} - \mathcal{H}_k(F_{k-1}) + G, \tag{9}$$

where $\mathcal{B}_k$ denotes the $k$-th Landweber iteration block and $\mathcal{H}_k$ simulates the $\lambda A^*A$ operation.

After the $T$-th iteration, there is a padding structure composed of two convolutional layers and a ReLU activation to process the feature map. Then, a skip connection is designed to learn the residual information and improve the gradient transmission [16] as

$$F_{\mathrm{out}} = f_{\mathrm{pad}}(F_T) + F_0. \tag{10}$$

Finally, there is an upscale module to restore the SR image from the feature map, which is described as

$$x_{\mathrm{SR}} = f_{\mathrm{up}}(F_{\mathrm{out}}), \tag{11}$$

where $f_{\mathrm{up}}(\cdot)$ is the upscale module composed of one convolutional layer and a subpixel convolution [30].
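Assembling (7)–(11), a hedged PyTorch sketch of the whole pipeline might look as follows; the channel number, the number of blocks, and all layer names are our assumptions, and the `ResBlock`/`LandBlock` classes are repeated from the previous sketch so this listing runs standalone.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):                     # as in the previous sketch
    def __init__(self, c):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(c, c, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(c, c, 3, padding=1))
    def forward(self, x):
        return x + self.body(x)

class LandBlock(nn.Module):                    # one step of (6): F - H(F) + G
    def __init__(self, c):
        super().__init__()
        self.H = nn.Sequential(
            ResBlock(c), ResBlock(c), ResBlock(c),
            nn.Conv2d(c, c, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(c, c, 3, padding=1))
    def forward(self, f, g):
        return f - self.H(f) + g

class LandNet(nn.Module):
    """Sketch of the pipeline (7)-(11); channels and num_blocks are assumed values."""
    def __init__(self, scale=4, channels=64, num_blocks=4, img_channels=3):
        super().__init__()
        self.head = nn.Conv2d(img_channels, channels, 3, padding=1)        # (7)
        self.g_branch = nn.Sequential(                                     # (8): 4 convs, 3 ReLUs
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1))
        self.blocks = nn.ModuleList(                                       # (9): T LandBlocks
            LandBlock(channels) for _ in range(num_blocks))
        self.pad = nn.Sequential(                                          # (10): 2 convs, 1 ReLU
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1))
        self.upscale = nn.Sequential(                                      # (11): conv + subpixel
            nn.Conv2d(channels, img_channels * scale ** 2, 3, padding=1),
            nn.PixelShuffle(scale))

    def forward(self, y):
        f0 = self.head(y)                 # F_0
        g = self.g_branch(f0)             # simulates lambda * A^* y
        f = f0
        for block in self.blocks:         # T Landweber iterations
            f = block(f, g)
        f = self.pad(f) + f0              # padding structure + skip connection
        return self.upscale(f)            # SR image

print(LandNet()(torch.randn(1, 3, 48, 48)).shape)  # torch.Size([1, 3, 192, 192])
```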

The implementation details of LandNet are as follows. The number of iteration blocks is set as $T$. All convolutional layers share the same channel number except for those in the upscale module. The loss function is chosen as the $\ell_1$-norm.

4. Experiments

We train the network on the DIV2K [47] data set, which contains 800 images for training and 100 images for validation. The images of the DIV2K data set have near-2K resolution and are widely used in recent image SR works [7, 15, 17, 22]. We update our network for 1000 epochs with the Adam optimizer [48], whose learning rate is adaptively adjusted (see Section 5.2). The training patches are cropped from the LR inputs, and all other settings are the same as RDN [17]. The testing benchmarks are chosen as Set5 [49], Set14 [50], B100 [51], Urban100 [52], and Manga109 [53].

4.1. Ablation Study
4.1.1. Investigation on the Computational Complexity

To show the effectiveness of our LandNet, we compare the performance, computational complexity, and parameter count with recent works: OISR [54], MSRN [55], MemNet [56], EDSR [15], and DBPN [57]. The computational complexity is modeled as the number of multiply-accumulate operations (MACs), calculated by restoring a 720P image at the given scaling factor. Table 1 shows the MACs, parameters, and PSNR comparisons. From the table, we can find that our network achieves better performance than OISR and MSRN with fewer parameters and MACs. When compared with EDSR and DBPN, LandNet achieves competitive performance with far fewer parameters and MACs. Specifically, LandNet holds nearly 10% of the parameters and MACs of EDSR and only drops 0.08 dB PSNR on Set5. Similarly, LandNet holds nearly 47% of the parameters and 5% of the MACs of DBPN while only suffering a PSNR decrease of nearly 0.07 dB. From this point of view, LandNet is proved to be an effective design for image super-resolution. Furthermore, we also compare our method with state-of-the-art methods (NLSN [38], CRAN [40]). From the table, we can find that NLSN has nearly 10 times the MACs and parameters of LandNet with only a 0.1 dB PSNR improvement on Set14. Similarly, CRAN has nearly 3 times the MACs and parameters of LandNet, but only gains a 0.3 dB PSNR improvement on Set5 and Set14. From this point of view, LandNet is an effective method for image super-resolution under restricted parameters and complexity.
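For reference, the MACs of a single convolution are conventionally counted as $k^2 \cdot C_{\mathrm{in}} \cdot C_{\mathrm{out}} \cdot H_{\mathrm{out}} \cdot W_{\mathrm{out}}$; the short sketch below (our illustration, not the evaluation script used in the paper) shows how the 720P (1280 × 720) figure can be tallied layer by layer.

```python
def conv_macs(k: int, c_in: int, c_out: int, h: int, w: int) -> int:
    """Multiply-accumulate operations of one k x k convolution
    producing a c_out x h x w output."""
    return k * k * c_in * c_out * h * w

# Example: one 3x3, 64->64 convolution applied at 720P (1280 x 720) resolution.
# A network total sums this over all layers; layers before the upscale module
# operate at the lower (720/s x 1280/s) resolution for scaling factor s.
print(f"{conv_macs(3, 64, 64, 720, 1280) / 1e9:.1f} GMACs")  # ~34.0 GMACs
```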

4.1.2. Investigation on the Iterative Mechanism

To further investigate the effectiveness of the iterative blocks, we demonstrate the results from different iterations. Figure 4 shows the outputs with different numbers of iteration blocks. From the figure, we can find that as the number of iteration blocks $T$ increases, the artifacts are increasingly suppressed and the visual quality becomes better. This is in accordance with the mathematical analysis.

Furthermore, we also compare the objective performance of different numbers of iteration blocks. For a fair comparison, we train the different networks under the same protocol for 200 epochs. Table 2 shows the average PSNR/SSIM results of different iterations. We can find that as the number of iteration blocks $T$ increases, the PSNR/SSIM results gradually increase. This is in accordance with the Landweber iteration, where more iteration steps lead to a more accurate solution.

4.1.3. Investigation on the Block Design

In this paper, we specially design the LandBlock to perform the iteration steps. To show the effectiveness of the block design, we compare our network with a modified version that substitutes classical residual blocks for the LandBlock. For a fair comparison, we train the different networks under the same protocol for 200 epochs. Table 3 shows the PSNR/SSIM comparison. We can find that the LandBlock brings a significant improvement on all testing benchmarks, which demonstrates that LandNet is an effective design.

4.2. Comparison with State-of-the-Art Methods

To show the performance of our LandNet, we compare our network with several traditional and recent works: SRCNN [13], FSRCNN [29], VDSR [14], DRCN [58], LapSRN [59], SelNet [60], RAN [61], DNCL [62], FilterNet [63], MRFN [64], SeaNet [65], DEGREE [66], FSN [67], MFSR [68], DSRLN [69], and MemNet [56]. The indicators are chosen as the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM). Table 4 shows the average PSNR/SSIM under the bicubic-down (BI) degradation model at three scaling factors on the five benchmarks. The best performances are shown in bold. A dash means the corresponding paper does not report that performance.

From the table, we can find that the proposed LandNet achieves the best performance on all five testing benchmarks and all three scaling factors. At one scaling factor, LandNet achieves 0.4 dB and 0.3 dB PSNR improvements over MFSR on the Urban100 and Manga109 benchmarks, respectively; at the other two scaling factors, our network also achieves significant improvements on Urban100 and Manga109. It should be noticed that Urban100 is a data set of buildings and Manga109 is a data set of comics, both of which contain a large amount of structural information. From this point of view, LandNet recovers high-frequency edge and structural information more effectively than other works.

Figure 5 shows visual comparisons on the Urban100 data set. We mainly compare our model with two representative methods: LapSRN [59] and Cross-SRN [7]. LapSRN is inspired by the Laplacian pyramid and utilizes a multiscale architecture to restore structural information. Cross-SRN is specially designed to concentrate on edges and lines. In the figure, we can find that LandNet achieves the best PSNR performance among the compared works, with nearly 0.8 dB improvement on the two testing instances. From this point of view, LandNet has a superior structural-information recovery capacity. In the visual comparison, LandNet recovers more accurate lines, and its results are closest to the ground truth.

Furthermore, we statistically verify the effectiveness of our method. We conduct pairwise two-sample t-tests among LR, VDSR, and LandNet to show the superiority of our method. Table 5 shows the two-sample t-test comparisons. In the table, 0 means the left method performs statistically the same as the upper method, −1 means the left method performs worse than the upper method, and 1 means the left method performs better. From the table, we can find that LandNet statistically improves the image quality and performs better than VDSR.
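For reproducibility, such a pairwise decision can be computed with a standard statistics package; the sketch below is our illustration with hypothetical per-image PSNR values and a conventional 0.05 significance level, not the paper's exact evaluation script.

```python
from scipy import stats

def compare(psnr_a, psnr_b, alpha=0.05):
    """Two-sample t-test decision: 1 if method A is statistically better
    than method B, -1 if worse, 0 if no significant difference."""
    t, p = stats.ttest_ind(psnr_a, psnr_b)
    if p >= alpha:
        return 0
    return 1 if t > 0 else -1

psnr_landnet = [32.1, 31.8, 33.0, 30.9, 32.4]   # hypothetical per-image PSNRs
psnr_vdsr    = [30.2, 29.9, 31.1, 29.2, 30.5]   # hypothetical per-image PSNRs
print(compare(psnr_landnet, psnr_vdsr))          # 1 -> LandNet statistically better
```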

5. Discussion

5.1. Discussion on the Landweber Iteration

Landweber iteration is an effective method to solve highly ill-conditioned problems such as image reconstruction [70–72]. It has been proved to converge strongly, and its precision becomes higher as the number of iteration steps increases [73]. Image reconstruction can be generally described by (1), with the degradation $A$ substituted by different operations. There are numerous optimization methods that utilize the Landweber iteration for image restoration [71, 74]. However, they highly rely on an explicit representation of the degradation $A$, which is challenging to obtain for the image super-resolution task. As such, we propose a Landweber iteration-inspired network to adaptively learn the degradation representation and the solution from LR-HR pairs.

Figure 4 shows the effectiveness of the Landweber iteration. The images in the figure are results from different iterations. We can find that as the number of iterations increases, the image becomes clearer and the quality is boosted. This is in accordance with the Landweber iteration, whose result converges as the iteration number $k$ increases.

Different from straightforward end-to-end CNN methods for image reconstruction [13, 14, 29, 59], the Landweber iteration finds a more precise solution as the number of iterations increases. The optimization can adaptively adjust the descent direction with the help of the input. From this point of view, the Landweber iteration is more robust for finding a feasible solution, which makes it more suitable for the highly ill-conditioned image reconstruction issue. For the nonconvex optimization issue, the Landweber iteration can find a good solution with higher PSNR/SSIM performance, as shown in Table 4.

5.2. Optimization Details

The parameter optimization is conducted by the Adam optimizer [48], which maintains momentum estimates for updating the network parameters and adaptively adjusts the learning rate during optimization. We calculate the gradients of the parameters by the back-propagation algorithm and use the mean absolute error (MAE) to measure the distance between the predicted result and the label. We train our network for 1000 epochs. The batch size is set as 16, and the training patches are cropped from the LR inputs.
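The described optimization step corresponds to a standard PyTorch training loop; the sketch below is illustrative only (the stand-in model, learning rate value, and random data are our assumptions, with the real setup using the LandNet sketch above and a DIV2K LR/HR patch loader).

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 3, 3, padding=1)  # stand-in for LandNet, to keep this runnable
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # lr value is an assumption
criterion = nn.L1Loss()  # mean absolute error (MAE) between prediction and label

lr_batch = torch.randn(16, 3, 48, 48)  # hypothetical LR patches, batch size 16
hr_batch = torch.randn(16, 3, 48, 48)  # matching HR labels (same size for the stand-in)

for epoch in range(2):                 # the paper trains for 1000 epochs
    optimizer.zero_grad()
    loss = criterion(model(lr_batch), hr_batch)
    loss.backward()                    # gradients via back-propagation
    optimizer.step()                   # Adam update with momentum estimates
    print(epoch, loss.item())
```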

6. Conclusion

In this paper, we proposed an end-to-end network for single image super-resolution. We investigated the image super-resolution problem from the optimization perspective and derived an iterative scheme to solve it based on the Landweber iteration. According to the mathematical analysis, we devised a convolutional neural network block to simulate the iterative step and perform the optimization. Based on this block, an end-to-end network, termed LandNet, is designed for image restoration. Experimental results show that the proposed LandNet achieves competitive or better subjective and objective performance than state-of-the-art methods with fewer parameters and computational cost.

Data Availability

All data, models, and code generated or used during the study are available from the corresponding author upon request ([email protected]).

Conflicts of Interest

The authors declare no conflicts of interest in this paper.

Acknowledgments

This research was supported in part by the National Natural Science Foundation of China (No. 61802161) and the Department of Education of Liaoning Province (No. JZL202015402).