Journal of Sensors

Volume 2016, Article ID 6370845, 12 pages

http://dx.doi.org/10.1155/2016/6370845

## An Efficient Image Enlargement Method for Image Sensors of Mobile in Embedded Systems

^{1}College of Electronics and Information Engineering, Sichuan University, Chengdu, Sichuan 610064, China

^{2}College of Electrical and Engineering Information, Sichuan University, Chengdu, Sichuan 610064, China

^{3}School of Software Engineering, Beijing Jiaotong University, Beijing 100044, China

Received 20 November 2015; Revised 5 March 2016; Accepted 21 March 2016

Academic Editor: Marco Anisetti

Copyright © 2016 Hua Hua et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

The main challenges for image enlargement methods in embedded systems come from the requirements of good performance, low computational cost, and low memory usage. This paper proposes an efficient image enlargement method that meets these requirements. First, to improve enlargement performance, the method extracts different kinds of features for different morphologies using different approaches. Various dictionaries are then learned from the different kinds of features, so that the image is represented in a more efficient manner. Second, to accelerate enlargement and reduce memory usage, the method divides the atoms of each dictionary into several clusters and calculates a separate projection matrix for each cluster. The problem is thereby reformulated as a least squares regression, and the high-resolution (HR) images can be reconstructed from a few projection matrixes. Numerous experimental results show that the method is efficient, runs in real time, and has a low memory cost. These advantages make the method easy to implement in mobile embedded systems.

#### 1. Introduction

Over the last few decades, mobile phones have become a widely adopted part of daily life. By 2017, the number of mobile phone users is expected to reach almost 5.3 billion. For many users, the mobile phone is not only a means of spoken communication but also a tool for capturing images. Mobile phones offer great benefits by enabling photography and video recording anytime and anywhere. Unfortunately, many of the images taken with mobile phones are low in resolution because of the low-quality image sensor. There are two ways to obtain high-resolution images: (1) replace the mobile phone with a more powerful one; (2) use an enlargement method to upscale the images. Most users prefer enlarging the image to replacing the phone. Many efforts have been devoted to image enlargement methods in the past decade. However, these methods face three challenges when applied in embedded systems: (1) the performance requirement, (2) the real-time requirement, and (3) the constraint on memory consumption.

Superresolution (SR) is one of the most promising image enlargement methods. Existing SR methods can be divided into three categories: interpolation-based methods [1, 2], reconstruction-based methods [3–5], and example-based methods [6–10].

The interpolation-based methods [1, 2] use the correlation of neighboring image pixels to approximate the underlying HR pixels. These methods have low computational complexity. However, interpolation does not add any new detail to the enlarged image, so the quality of the enlarged image remains unsatisfactory, and interpolation may introduce aliasing into it. Although interpolation-based methods run fast and need little memory, their poor performance limits their use in embedded systems.

Reconstruction-based methods [3–7] require several LR images of the same scene taken from slightly shifted viewpoints, so that the LR images differ from each other by subpixel shifts. This category of methods exploits the additional information in a sequence of successive LR images of the same scene to synthesize an HR image. Compared with interpolation-based methods, reconstruction-based methods perform better when the desired magnification factor is small. However, their performance degrades rapidly as the magnification factor becomes large. Reconstruction-based methods also need to store all of the LR images in the sequence, which imposes a high memory requirement. For these reasons, reconstruction-based methods are not well suited to embedded systems.

Single image SR methods such as neighbor embedding-based methods [7, 8], regression-based methods [9, 10], and sparse representation-based methods [11–15] have been explored in recent years. These methods presume that the high-frequency details lost in the LR images can be predicted by learning the cooccurrence relationship between LR training patches and their corresponding HR patches. Recently, sparse representation-based methods have proven effective for image superresolution. Yang et al. [16] proposed an approach based on sparse representation, with the assumption that the HR and LR images share the same set of sparse coefficients. Therefore, the HR image can be reconstructed by combining the trained HR dictionary with the sparse coefficients of the corresponding LR image. Although sparse representation-based methods offer good performance, the optimization involved in dictionary learning and image reconstruction is computationally intensive. In addition, these methods must reserve memory to store both the HR dictionary and the LR dictionary, so the memory usage grows with the size of the dictionaries. Both time complexity and memory usage are therefore key limiting factors for applying these methods in embedded systems. Zhan et al. [17] proposed a fast multiclass dictionary learning method for MRI reconstruction. Timofte et al. [18] constructed a set of mapping relationships between the LR and HR patches using a learned LR-HR dictionary. The Anchored Neighborhood Regression method [18] reformulates the problem as a least squares regression, which leads to a vast computational speedup while keeping the same accuracy as previous methods. The Anchored Neighborhood Regression method calculates the mapping matrix based on a universal dictionary. However, a large number of different structural patterns exist in an image, and one dictionary is not capable of capturing all of the different morphologies. Moreover, the Anchored Neighborhood Regression method still needs to store a separate projection matrix for each dictionary atom, which leads to high memory usage.

The existing sparse representation-based SR methods suffer from three main problems in embedded systems. First, their performance is limited, since they generally use only one approach to extract the features that represent the LR image. However, morphologies vary significantly across images, and different patches prefer different features for accurately representing different morphologies. A single feature extraction approach cannot represent the image accurately. Therefore, jointly representing an image with different kinds of features is important. Second, the optimization involved in dictionary learning and image reconstruction is computationally intensive, so time complexity is a key limiting factor. Third, sparse representation-based methods need to reserve memory to store the HR dictionary and the LR dictionary, so memory usage grows with the size of the dictionaries.

In summary, this study makes the following three main contributions. (1) Jointly representing an image with different types of features is proposed in the feature extraction stage. To represent different morphologies accurately, images (or patches) prefer different types of features extracted by different approaches, since a single feature extraction approach cannot accurately capture the essential features of the image. (2) Multiple dictionaries are learned based on different types of features in the sparse representation stage, since one dictionary with a single type of feature is inadequate for capturing all of the different morphologies of the image. To capture these morphologies more accurately, multifeature dictionaries, which consist of different dictionaries with different features, are learned. (3) To reduce the computational cost and memory usage, we propose an Anchored Cluster Regression method. The method divides the dictionary atoms into several clusters and calculates a projection matrix for each cluster. Each HR patch can then be reconstructed by the projection matrix of its corresponding cluster. The Anchored Cluster Regression method reformulates the problem as a least squares regression and only needs to store the projection matrix of each cluster, which leads to a vast computational speedup and a lower memory requirement.

#### 2. Sparse Representation-Based SR Method

Superresolution aims to reconstruct the HR image from the LR image. The degradation can be formulated as follows:

$$Y = SHX, \tag{1}$$

where $Y$ is the observed low-resolution (LR) image and $X$ is its corresponding high-resolution (HR) image of the same scene. $Y$ is a downsampled and blurred version of $X$. $S$ denotes a downsampling operator and $H$ is the blur operator.
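As a rough illustration of this degradation model, the following sketch blurs an HR image and then decimates it. The box blur, the decimation by pixel skipping, and the names `blur`, `downsample`, and `degrade` are assumptions made for illustration; the text does not specify the blur kernel or the downsampling scheme.

```python
import numpy as np

def blur(image, radius=1):
    """Blur operator H (assumed box filter): mean over a (2*radius+1)^2 window."""
    size = 2 * radius + 1
    padded = np.pad(image, radius, mode="edge")
    out = np.zeros_like(image, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / size**2

def downsample(image, factor=2):
    """Downsampling operator S (assumed decimation): keep every factor-th pixel."""
    return image[::factor, ::factor]

def degrade(hr_image, factor=2, radius=1):
    """LR image as a blurred, then downsampled, version of the HR image."""
    return downsample(blur(hr_image, radius), factor)

X = np.arange(64, dtype=float).reshape(8, 8)   # toy HR image
Y = degrade(X, factor=2)                       # toy LR image
```

With a magnification factor of 2, the toy 8 × 8 HR image yields a 4 × 4 LR image.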

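Let $y_k$ be an LR patch of the LR image $Y$ with the size $\sqrt{n} \times \sqrt{n}$ at the location $k$. Then, we have

$$y_k = R_k Y, \tag{2}$$

where $R_k$ is an operator that extracts a patch at position $k$ from the LR image $Y$.

Similarly, the corresponding HR patch is $x_k$ with the size $\sqrt{m} \times \sqrt{m}$ at the location $k$ of the HR image $X$. And we have

$$x_k = R_k X. \tag{3}$$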
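The patch extraction operator can be sketched as plain array slicing. The function names `extract_patch` and `extract_patches`, the row-major vectorization, and the `step` parameter are illustrative assumptions rather than details from the paper.

```python
import numpy as np

def extract_patch(image, top, left, size):
    """Return the size x size patch whose top-left corner is (top, left), as a vector."""
    return image[top:top + size, left:left + size].reshape(-1)

def extract_patches(image, size, step):
    """Extract all patches on a regular grid; one patch per column."""
    positions = [(r, c)
                 for r in range(0, image.shape[0] - size + 1, step)
                 for c in range(0, image.shape[1] - size + 1, step)]
    patches = np.stack([extract_patch(image, r, c, size) for r, c in positions],
                       axis=1)
    return patches, positions

image = np.arange(36, dtype=float).reshape(6, 6)
patches, positions = extract_patches(image, size=3, step=1)
```

A `step` smaller than `size` produces overlapping patches, which is what the later averaging of overlaps relies on.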

With LR patch $y_k$, $y_k^f$ is the feature extracted from $y_k$. The feature can be expressed as

$$y_k^f = F_l\, y_k, \tag{4}$$

where $F_l$ refers to the LR feature extraction operator.

Subsequently, the corresponding HR feature $x_k^f$ is extracted from the HR patch $x_k$:

$$x_k^f = F_h\, x_k, \tag{5}$$

where $F_h$ refers to the HR feature extraction operator, which usually takes the difference between the LR image and its corresponding HR image.

With the sparse generating model, each LR patch feature $y_k^f$ can be projected over the LR dictionary $D_l$, which characterizes the LR patches. This projection produces a sparse representation of $y_k^f$ via $D_l$:

$$y_k^f = D_l \alpha, \tag{6}$$

where $D_l$ and $\alpha$ are the LR dictionary and the sparse representation of $y_k^f$, respectively. Generally, in order to obtain an optimal $\alpha$ that has the fewest nonzero elements, we should solve the following optimization problem:

$$\min_{\alpha} \|\alpha\|_0 \quad \text{s.t.} \quad \|y_k^f - D_l \alpha\|_2^2 \le \epsilon, \tag{7}$$

where $\epsilon$ is a constant.
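Sparse coding problems of this kind are commonly approximated with a greedy pursuit. The following is a minimal sketch of orthogonal matching pursuit, assuming unit-norm dictionary columns; the function name `omp`, the atom budget `max_atoms`, and the tolerance `tol` are illustrative choices, not parameters given in the paper.

```python
import numpy as np

def omp(D, y, max_atoms, tol=1e-10):
    """Greedy sparse coding: pick atoms one by one, refit by least squares."""
    residual = y.astype(float).copy()
    support = []
    sol = np.zeros(0)
    for _ in range(max_atoms):
        if np.linalg.norm(residual) ** 2 <= tol:
            break                                        # error constraint met
        j = int(np.argmax(np.abs(D.T @ residual)))       # most correlated atom
        if j in support:
            break                                        # no further progress
        support.append(j)
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol               # refit on the support
    coeffs = np.zeros(D.shape[1])
    if support:
        coeffs[support] = sol
    return coeffs

D = np.eye(6)                        # toy dictionary with orthonormal atoms
y = 2.0 * D[:, 1] - 3.0 * D[:, 4]    # signal built from two atoms
alpha = omp(D, y, max_atoms=2)
```

In the toy example the signal is built from two atoms, so two iterations recover its coefficients exactly.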

Similarly, we have the sparse representation of the HR patch:

$$x_k^f = D_h \alpha_h, \tag{8}$$

where $D_h$ is the HR dictionary. Conventional sparse representation-based methods assume that the LR patch and its corresponding HR version share the same sparse coefficients in relation to their own dictionaries; namely, $\alpha_h = \alpha$. Therefore,

$$x_k^f \approx D_h \alpha. \tag{9}$$

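The HR dictionary is defined as

$$D_h = \arg\min_{D_h} \sum_k \|x_k^f - D_h \alpha_k\|_2^2, \tag{10}$$

where $\alpha_k$ is the sparse coefficient vector of the $k$th training patch. The sizes of the dictionaries $D_l$ and $D_h$ are $n_l \times K$ and $n_h \times K$, respectively, where $K$ is the number of atoms in each dictionary. $n_l$ is the dimension of each atom in the LR dictionary while $n_h$ is the dimension of each atom in the HR dictionary.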

It is clear that the sparse representation is a bridge between low-resolution and high-resolution patches. To generate such a sparse representation, both the LR dictionary $D_l$ and the HR dictionary $D_h$ play a key role. The dictionaries $D_l$ and $D_h$ can be generated from a set of training samples by methods such as OMP [13].

Once the sparse coefficients for each LR patch are obtained, we can use this sparse representation to recover the corresponding HR patch. Once all the HR patches have been reconstructed, the HR image is recovered by averaging the reconstructed patches over the regions where they overlap.
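The averaging of overlapping patches can be sketched with an accumulator and a per-pixel counter. The function name `assemble_image` and the column-per-patch layout are assumptions made for illustration.

```python
import numpy as np

def assemble_image(patches, positions, size, shape):
    """Average vectorized size x size patches back into an image of `shape`."""
    total = np.zeros(shape)
    count = np.zeros(shape)
    for col, (r, c) in enumerate(positions):
        total[r:r + size, c:c + size] += patches[:, col].reshape(size, size)
        count[r:r + size, c:c + size] += 1
    # Pixels never covered by a patch stay at zero instead of dividing by zero.
    return total / np.maximum(count, 1)
```

Each pixel of the result is the mean of all patch values that cover it, so patches cut from an image and reassembled unchanged reproduce that image.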

#### 3. The Proposed SR Method

The proposed method can be divided into three steps: (a) learning different dictionaries based on different morphologies, (b) calculating the projection matrixes, and (c) reconstructing the HR mobile sensor image.

##### 3.1. Learning the Dictionaries Based on Different Features

Most existing sparse representation-based SR methods use only derivative features to represent the morphologies of the LR image. However, artifacts occur when inappropriate features are used. An explanation for this phenomenon is that a dictionary learned from only one kind of feature cannot represent the essential morphologies of the images. Since morphologies can vary significantly across images, different patches prefer different features for representing their morphology accurately. As such, a multifeature treatment can help represent the image in a more efficient manner. We propose a method that represents the image with different dictionaries based on different features.

For LR patch $y_k$, different types of features can be adopted to represent it:

$$y_k^i = F_i\, y_k, \quad i = 1, 2, \ldots, N, \tag{11}$$

where $y_k^i$ is the $i$th kind of feature of $y_k$. $F_i$ denotes extracting the $i$th kind of feature.
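Extracting several kinds of features from one patch can be sketched as a list of extractor functions applied in turn. The two extractors below (first-order and second-order derivatives computed with `np.gradient`) are assumptions chosen for illustration; the text here does not list which feature types the method actually uses.

```python
import numpy as np

def first_order_features(patch):
    """Horizontal and vertical first derivatives, concatenated into one vector."""
    gy, gx = np.gradient(patch.astype(float))
    return np.concatenate([gx.ravel(), gy.ravel()])

def second_order_features(patch):
    """Horizontal and vertical second derivatives, concatenated into one vector."""
    gy, gx = np.gradient(patch.astype(float))
    gxx = np.gradient(gx, axis=1)
    gyy = np.gradient(gy, axis=0)
    return np.concatenate([gxx.ravel(), gyy.ravel()])

# Hypothetical feature set; one entry per kind of feature.
FEATURE_EXTRACTORS = [first_order_features, second_order_features]

def extract_features(patch):
    """Return one feature vector per extractor for a 2-D patch."""
    return [extract(patch) for extract in FEATURE_EXTRACTORS]

ramp = np.tile(np.arange(4.0), (4, 1))   # intensity grows by 1 per column
features = extract_features(ramp)
```

For the toy ramp patch, the first-order horizontal derivative is 1 everywhere and all second-order features vanish.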

Similarly, for the HR patch $x_k$, $x_k^f$ is its feature.

Given HR patch $x_k$ and LR patch $y_k$, we can obtain $N$ kinds of LR and HR patch pairs $\{y_k^i, x_k^f\}$, $i = 1, 2, \ldots, N$, for training.

Based on the $N$ kinds of LR and HR training sets prepared above, the LR and HR dictionaries of these training sets are learned from the following models.

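The $K$-SVD dictionary training is applied to the set of patches $\{y_k^i\}$:

$$D_l^i, \{\alpha_k^i\} = \arg\min_{D_l^i, \{\alpha_k^i\}} \sum_k \|y_k^i - D_l^i \alpha_k^i\|_2^2 \quad \text{s.t.} \quad \|\alpha_k^i\|_0 \le L \ \ \forall k, \tag{12}$$

where $\alpha_k^i$ are the sparse coefficient vectors of $y_k^i$, $\|\cdot\|_0$ is the $\ell_0$ norm counting the nonzero entries of a vector, and $L$ is the maximum number of nonzero entries allowed. Most sparse representation-based SR methods rely on the assumption that the HR and LR images share the same set of sparse coefficients. Therefore, the HR image can be reconstructed by combining the HR dictionary and the sparse coefficients of the corresponding LR image. Thus, the HR patch can be recovered by approximation as $x_k^f \approx D_h^i \alpha_k^i$. $D_h^i$ can be calculated by minimizing the following mean approximation error; that is,

$$D_h^i = \arg\min_{D_h^i} \sum_k \|x_k^f - D_h^i \alpha_k^i\|_2^2. \tag{13}$$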
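Minimizing the mean approximation error over the HR dictionary is an ordinary least squares problem, so it has a closed-form solution through the pseudo-inverse of the coefficient matrix. The sketch below assumes the HR features are stacked as columns of `hr_features` and the sparse coefficients as columns of `coefficients`; the function name is hypothetical.

```python
import numpy as np

def learn_hr_dictionary(hr_features, coefficients):
    """Least squares fit of D_h so that hr_features is close to D_h @ coefficients."""
    return hr_features @ np.linalg.pinv(coefficients)

rng = np.random.default_rng(0)
D_true = rng.normal(size=(9, 5))        # toy HR dictionary, 5 atoms
A = rng.normal(size=(5, 50))            # toy coefficients for 50 patches
D_h = learn_hr_dictionary(D_true @ A, A)
```

When the coefficient matrix has full row rank, as in this toy example, the fit recovers the dictionary that generated the data.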

##### 3.2. Calculating the Projection Matrixes

Although sparse representation-based methods offer good performance, the optimization involved in dictionary learning and image reconstruction is computationally intensive. In addition, these methods need to reserve memory to store the HR dictionary and the LR dictionary, so memory usage grows with the size of the dictionaries. Both time complexity and memory usage are key limiting factors for applying these methods in embedded systems.

Timofte et al. [19] proposed an Anchored Neighborhood Regression method, which constructed a set of mapping matrixes between the LR and HR patches using learned LR and HR dictionaries.

Based on the multiple dictionaries obtained in Section 3.1, the Anchored Neighborhood Regression method solves this problem as follows: for each dictionary, to calculate the sparse representation of $y_k^i$, problem (7) is reformulated as a least squares regression regularized by the $\ell_2$-norm of the coefficients [19]:

$$\min_{\alpha^i} \|y_k^i - D_l^i \alpha^i\|_2^2 + \lambda \|\alpha^i\|_2^2, \tag{14}$$

where $D_l^i$ is the LR dictionary of the $i$th type of feature, $D_h^i$ is the corresponding HR dictionary of $D_l^i$, $\alpha^i$ is the coefficient vector of $y_k^i$, and $\lambda$ is the regularization weight.

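Then, Ridge Regression is employed to solve the problem. The algebraic solution [19] is given as

$$\alpha^i = \left( (D_l^i)^T D_l^i + \lambda I \right)^{-1} (D_l^i)^T y_k^i. \tag{15}$$

Since sparse representation-based SR methods assume that the HR and LR images share the same set of sparse coefficients, the HR patches can be reconstructed from the coefficients $\alpha^i$ of the LR image and the corresponding HR dictionary $D_h^i$:

$$\hat{x}_k^i = D_h^i \alpha^i. \tag{16}$$

We can obtain $N$ mapping matrixes between the LR and HR patches:

$$P^i = D_h^i \left( (D_l^i)^T D_l^i + \lambda I \right)^{-1} (D_l^i)^T, \quad i = 1, 2, \ldots, N. \tag{17}$$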

Equation (17) means that we can precalculate a mapping matrix for each dictionary. Inferring the HR patch then reduces to one matrix multiplication per input patch. The mapping matrix can be computed offline and saved as a simple matrix to be applied to new image patches, which yields a vast computational speedup while keeping the same accuracy as previous methods.
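The offline precomputation can be sketched in a few lines. The function name `projection_matrix` and the default regularization weight `lam=0.1` are assumptions; the linear system is solved directly instead of forming an explicit inverse.

```python
import numpy as np

def projection_matrix(D_l, D_h, lam=0.1):
    """Ridge regression mapping from LR features to HR patches for one dictionary."""
    n_atoms = D_l.shape[1]
    gram = D_l.T @ D_l + lam * np.eye(n_atoms)
    return D_h @ np.linalg.solve(gram, D_l.T)

rng = np.random.default_rng(0)
D_l = rng.normal(size=(6, 10))     # toy LR dictionary: 10 atoms of dimension 6
D_h = rng.normal(size=(9, 10))     # toy HR dictionary: 10 atoms of dimension 9
P = projection_matrix(D_l, D_h)    # computed once, offline

y_feat = rng.normal(size=6)
x_hat = P @ y_feat                 # online step: a single matrix-vector product
```

At run time no optimization is solved; each HR patch estimate costs one multiplication by the stored matrix.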

Timofte et al. [18] group the dictionary atoms into neighborhoods. More specifically, for each atom in the dictionary, they compute its nearest neighbors, which represent its neighborhood. Once the neighborhoods are defined, the Anchored Neighborhood Regression method calculates a separate projection matrix for each dictionary atom based on its own neighborhood. The SR problem can then be solved by finding the nearest atom in the dictionary for each input patch feature. Then, the HR patch can be reconstructed using the projection matrix of the nearest atom:

$$\hat{x}_k^i = P_j^i\, y_k^i, \qquad P_j^i = N_{h,j}^i \left( (N_{l,j}^i)^T N_{l,j}^i + \lambda I \right)^{-1} (N_{l,j}^i)^T, \tag{18}$$

where $P_j^i$ is the projection matrix of the atom $d_j$ and $d_j$ is the nearest atom of $y_k^i$ in the LR dictionary $D_l^i$. $N_{l,j}^i$ is the neighborhood set of the atom $d_j$. $N_{h,j}^i$ is the corresponding set of the HR dictionary $D_h^i$.
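A minimal sketch of the per-atom scheme follows. It assumes unit-norm LR atoms and uses absolute correlation as the similarity measure for both the neighborhoods and the nearest-atom search; the function names, `n_neighbors`, and `lam` are illustrative assumptions.

```python
import numpy as np

def anr_projections(D_l, D_h, n_neighbors, lam=0.1):
    """One projection matrix per atom, built from that atom's nearest neighbors."""
    similarity = np.abs(D_l.T @ D_l)            # atom-to-atom correlation
    mats = []
    for j in range(D_l.shape[1]):
        idx = np.argsort(-similarity[j])[:n_neighbors]
        N_l, N_h = D_l[:, idx], D_h[:, idx]
        gram = N_l.T @ N_l + lam * np.eye(len(idx))
        mats.append(N_h @ np.linalg.solve(gram, N_l.T))
    return np.stack(mats)                       # shape: (atoms, n_h, n_l)

def anr_reconstruct(y_feat, D_l, mats):
    """Pick the most correlated atom and apply its stored projection matrix."""
    j = int(np.argmax(np.abs(D_l.T @ y_feat)))
    return mats[j] @ y_feat
```

The stacked result holds one matrix per atom, which is the storage cost that the cluster-based variant aims to reduce.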

The Anchored Neighborhood Regression method reformulates the SR problem as a least squares regression, which leads to a vast computational speedup. However, it still needs to store a separate projection matrix for each dictionary atom, which leads to high memory usage. Memory usage is therefore a key limiting factor for applying the Anchored Neighborhood Regression method in embedded systems.

To reduce the memory usage, we propose an Anchored Cluster Regression method. This method divides the atoms of each dictionary into several clusters by $k$-means clustering. Then, a separate projection matrix is calculated for each cluster. Finally, the projection matrix of the nearest cluster is used to reconstruct the HR patch:

$$\hat{x}_k^i = P_c^i\, y_k^i, \qquad P_c^i = C_{h,c}^i \left( (C_{l,c}^i)^T C_{l,c}^i + \lambda I \right)^{-1} (C_{l,c}^i)^T, \tag{19}$$

where $C_{l,c}^i$ is the set of atoms in the cluster $c$ of the LR dictionary $D_l^i$. $C_{h,c}^i$ is the corresponding set of the HR dictionary $D_h^i$.
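The cluster-based scheme can be sketched as follows. The plain Lloyd's $k$-means, the Euclidean nearest-centroid rule used to pick a cluster, the zero matrix kept for an empty cluster, and all function names and default values are assumptions made for illustration, not details confirmed by the paper.

```python
import numpy as np

def nearest(points, centers):
    """Index of the closest center (squared Euclidean distance) for each row."""
    dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans(points, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd's k-means on the rows of `points`."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(points), n_clusters, replace=False)
    centers = points[start].astype(float)
    for _ in range(n_iter):
        labels = nearest(points, centers)
        for c in range(n_clusters):
            if np.any(labels == c):             # keep the old center if empty
                centers[c] = points[labels == c].mean(axis=0)
    return centers, nearest(points, centers)

def cluster_projections(D_l, D_h, n_clusters, lam=0.1):
    """One ridge regression projection matrix per cluster of LR atoms."""
    centers, labels = kmeans(D_l.T, n_clusters)
    mats = np.zeros((n_clusters, D_h.shape[0], D_l.shape[0]))
    for c in range(n_clusters):
        C_l, C_h = D_l[:, labels == c], D_h[:, labels == c]
        if C_l.shape[1] == 0:
            continue                            # empty cluster: zero matrix
        gram = C_l.T @ C_l + lam * np.eye(C_l.shape[1])
        mats[c] = C_h @ np.linalg.solve(gram, C_l.T)
    return centers, mats

def reconstruct_patch(y_feat, centers, mats):
    """Apply the projection matrix of the cluster nearest to the LR feature."""
    c = int(nearest(y_feat[None, :], centers)[0])
    return mats[c] @ y_feat
```

Only the cluster centers and one matrix per cluster need to be kept at run time, independent of how many atoms the dictionary contains.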

The Anchored Cluster Regression method only needs to store the projection matrix of each cluster rather than the projection matrix of each atom. If $K$ atoms are divided into $M$ clusters, the Anchored Cluster Regression method only needs to store $M$ projection matrixes, one per cluster, while Anchored Neighborhood Regression needs to store $K$ projection matrixes, one per atom. Since $M$ is much smaller than $K$, Anchored Cluster Regression significantly reduces the memory. Furthermore, the computational complexity of the Anchored Cluster Regression method is $O(M)$, while the complexity of Anchored Neighborhood Regression is $O(K)$, where $K$ is the number of atoms and $M$ is the number of clusters. Anchored Cluster Regression therefore also significantly reduces the computation.
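The memory saving can be made concrete with a short calculation. All numbers below (1024 atoms, 16 clusters, HR dimension 81, LR feature dimension 30, 4-byte values) are hypothetical and chosen only to illustrate the ratio; they are not settings reported in the paper.

```python
def projection_memory_bytes(n_matrices, n_h, n_l, bytes_per_value=4):
    """Storage needed for n_matrices projection matrixes of size n_h x n_l."""
    return n_matrices * n_h * n_l * bytes_per_value

n_atoms, n_clusters = 1024, 16    # hypothetical dictionary and cluster counts
n_h, n_l = 81, 30                 # hypothetical HR and LR feature dimensions

per_atom_bytes = projection_memory_bytes(n_atoms, n_h, n_l)
per_cluster_bytes = projection_memory_bytes(n_clusters, n_h, n_l)
saving_factor = per_atom_bytes // per_cluster_bytes
```

Under these assumed sizes, storing one matrix per atom takes about 9.95 MB per dictionary, while storing one matrix per cluster takes about 0.16 MB, a factor equal to the ratio of atoms to clusters.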

##### 3.3. Reconstructing the HR Image

Given an LR patch, we can get different HR patches based on different projection matrixes. These different HR patches are integrated to generate the final reconstructed HR image.

For an LR patch $y_k$, we get the $N$ kinds of features $y_k^i$. For the features $y_k^i$, we can find the nearest cluster of the $i$th LR dictionary $D_l^i$; then, we can obtain the $i$th estimated HR patch $\hat{x}_k^i$ of the HR patch $x_k$ based on the projection matrix $P_c^i$ by (19).

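Those different estimated HR patches are fused together to get a final reconstruction $\hat{x}_k$ of the HR patch $x_k$ [20]:

$$\hat{x}_k = \sum_{i=1}^{N} w_i\, \hat{x}_k^i, \tag{20}$$

where $w_i$ is the weight of the $i$th estimate. According to the weight $w_i$, this study fuses the different estimated HR patches together to get the final reconstructed HR patch $\hat{x}_k$:

$$\hat{x}_k = \sum_{i=1}^{N} w_i(e_i)\, \hat{x}_k^i, \tag{21}$$

where $e_i$ is the representation error function. $e_i$ reflects the accuracy of the sparse representation:

$$e_i = \|y_k^i - D_l^i \alpha^i\|_2^2, \tag{22}$$

where the smaller $e_i$ is, the more similar $D_l^i \alpha^i$ is to $y_k^i$.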
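The fusion step can be sketched as a normalized weighted average of the per-feature estimates. The specific weight function used here, an exponential of the negative representation error normalized to sum to one, is an assumption: the text only indicates that the weights depend on the representation error, and the exact form is not recoverable from it.

```python
import numpy as np

def fuse_estimates(estimates, errors):
    """Weighted average of HR patch estimates; smaller error gives larger weight."""
    errors = np.asarray(errors, dtype=float)
    # Assumed weighting: exp(-error), shifted by the minimum for numerical safety.
    weights = np.exp(-(errors - errors.min()))
    weights /= weights.sum()
    fused = weights @ np.stack(estimates)
    return fused, weights

estimates = [np.zeros(3), np.ones(3)]           # two toy HR patch estimates
fused, weights = fuse_estimates(estimates, errors=[1.0, 1.0])
```

With equal errors the two toy estimates are weighted equally, so the fused patch is their plain average.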

##### 3.4. Summary of Proposed Algorithm

The proposed method contains two phases: a learning phase and a reconstruction phase. In the learning phase, features of different morphologies are extracted from the training images. Then, multiple dictionaries are learned based on the different morphologies. For each dictionary, the atoms are divided into multiple clusters. The projection matrix of each cluster is calculated by (19)–(21).

In the reconstruction phase, features of different morphologies are first extracted for each LR patch. Then, for each type of feature, the nearest cluster in its corresponding morphology dictionary is found. Based on the projection matrixes of these clusters, multiple estimated HR patches are reconstructed. The final HR patches are then generated by taking a weighted average of all estimated HR patches. Ultimately, the HR image is composed by averaging the overlapping reconstructed patches. The algorithm is illustrated in Figure 1.