Abstract

Two truly blind/no-reference (NR) image quality assessment (IQA) algorithms operating in the spatial domain are developed. To measure image quality, the approach gathers a set of novel features based on the edges of natural scenes, motivated by the enhanced sensitivity of the human eye to the information carried by the edges and contours of an image. The effectiveness of the proposed technique in quantifying image quality is studied. The gathered features are formed using Weibull distribution statistics and two sharpness functions, respectively, which yields two separate NR IQA algorithms. The presented algorithms need neither training on databases of human judgments nor prior knowledge of the expected distortions, so they are truly NR IQA algorithms. In contrast to most NR IQA methods, the model is generic and is not specific to any particular distortion type. When tested on the LIVE database, the proposed algorithms correlate well with subjective opinion scores. The experiments also show that the introduced methods significantly outperform the popular full-reference peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) methods, as well as the recently developed NR natural image quality evaluator (NIQE) model.

1. Introduction

Image processing techniques such as acquisition, transmission, compression, restoration, and enhancement are a focus of current research, so methods for assessing the quality of their output are in demand as well. Humans are the ultimate judges of image quality; however, their judgment is time consuming, subjective, and, at times, impractical. Hence, there is a need for automatic assessment, which is referred to as objective assessment.

Objective assessment can be categorized into full-reference (FR), reduced-reference (RR), and no-reference (NR) image quality assessment. FR models assess image quality with full access to the original image. RR models assess image quality using only some features extracted from the reference image. Although FR and RR IQA methods provide a useful and effective way to measure the quality of distorted images, a full or even partial reference image may not be available. In addition, the purity of the reference images themselves can be uncertain. In these situations an NR IQA method is the only available choice. For example, a perfect noise-free image is not available when assessing the quality of a denoising algorithm on a real-world database.

Most existing NR IQA methods rely on prior knowledge of the type of distortion and are therefore called “distortion-specific NR IQA” [1–5]. This specificity limits the application of such algorithms. NR IQA algorithms that are not distortion-specific are known as “general distortion algorithms.” General distortion algorithms that are trained on a collection of distorted images with co-registered human scores are opinion aware (OA) [6–8]. Algorithms that do not need training on databases of human judgments of distorted images, on the other hand, are opinion unaware (OU) [9]. Moreover, distorted images may not be available when an OU model is constructed or trained; OU models that require no knowledge of the anticipated distortions are distortion unaware (DU) [10].

A new model for no-reference image quality measurement using novel features of natural scenes is developed in this study. It introduces an original concept for gathering the effective features of an image based on its edges. This choice is supported by the enhanced sensitivity of the human eye to the information carried by the edges and contours of an image: edges and contours convey a significant amount of the image’s structure and are crucial for the human eye to grasp the scene [11]. This knowledge is used to build two separate, truly NR IQA algorithms. The model uses Weibull distribution statistics for the first algorithm and two sharpness functions for the second as the features from which the algorithms are constructed.

Weibull statistics are used to form the features of the first proposed algorithm. These statistics facilitate efficient and rapid extraction of a scene’s gist [12]; the authors of [12] also found that, for natural images, a large amount of visual gist information is contained in Weibull contrast statistics. Furthermore, the spatial structure of uniform textures of many different origins can be completely characterized by Weibull distribution parameters [13]. The Weibull distribution has also been used for defect detection in textures [14] and, in [15], to construct the learning mapping.

The sharper an image is, the better its quality, as claimed by Singh and Chandler [16]. Moreover, humans appear to weight their judgments of image quality more heavily on the sharp regions of an image [10]. Building on this knowledge, the model applies two sharpness functions whose outputs serve as the extracted features of the second proposed algorithm.

Unlike other research in the field of IQA, which focuses on improving prediction accuracy and ignores algorithmic and microarchitectural efficiency, this study considers both issues. As IQA algorithms move from the research environment to the application stage, efficiency concerns such as execution speed and memory bandwidth requirements emerge as equally important criteria. Inefficient algorithms require relatively large amounts of memory and runtimes on the order of seconds even for modest-sized images (e.g., 1 MPixel) [17]. Such algorithms typically apply two key stages. The first stage is a local frequency-based decomposition of the input image, which can require a considerable amount of computation and memory bandwidth, especially when a large number of frequency bands are analyzed and when the decomposition has to be applied to each image as a whole. The second stage would seem to require more computation. The model presented in this paper avoids both complexities: it works in the spatial domain, and no transforms (e.g., DCT and wavelet) are required.

2. Material and Method

The devised features are a set of natural low-level features composed of locally normalized luminance and contrast values. These features are modeled as pointwise statistics for single pixels, and pairwise log-derivative statistics are obtained for the relations between adjacent pixels (Figure 1). Only the features corresponding to patches rich in edges are gathered, and a multivariate Gaussian (MVG) model is fitted to them. The distance between the MVG fit of the features extracted from the distorted image and the MVG model of the features extracted from natural (pristine) images is assigned as the quality score of the distorted image.

The model used in this research utilizes two kinds of features, which form two separate algorithms. The features are generated using Weibull statistics and two sharpness functions, as discussed in the following sections.

2.1. Normalized Luminance and Contrast Coefficients and Their Log-Derivatives

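The model divides the image into fixed-size patches and computes the normalized luminance and the contrast of the distorted and the natural image for each patch. The normalized luminance $\hat{I}(i,j)$ of both images is computed through local mean subtraction and contrast divisive normalization (MSCN) [18], defined as

$$\hat{I}(i,j) = \frac{I(i,j) - \mu(i,j)}{\sigma(i,j) + 1}, \tag{1}$$

$$\mu(i,j) = \sum_{k=-K}^{K} \sum_{l=-L}^{L} w_{k,l}\, I(i+k,\, j+l), \tag{2}$$

$$\sigma(i,j) = \sqrt{\sum_{k=-K}^{K} \sum_{l=-L}^{L} w_{k,l} \left[ I(i+k,\, j+l) - \mu(i,j) \right]^{2}}, \tag{3}$$

where $i \in \{1, \ldots, M\}$ and $j \in \{1, \ldots, N\}$ are spatial domain indices, $M$ and $N$ are the dimensions of the image, $\mu(i,j)$ and $\sigma(i,j)$ are the estimated local mean and local contrast, respectively, and $w = \{w_{k,l} \mid k = -K, \ldots, K;\ l = -L, \ldots, L\}$ is a 2D circularly symmetric Gaussian weighting function sampled out to three standard deviations and rescaled to unit volume. After the MSCN coefficients (1) and the contrast coefficients (3) are computed, features are calculated from these coefficients for each patch by means of log-derivative statistics [19].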
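
As an illustration, the MSCN and local-contrast computation described above can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the kernel standard deviation of one pixel with a radius of three pixels (three standard deviations), the reflected image borders, and the stabilizing constant `c = 1` in the denominator are all choices made for the example.

```python
import numpy as np

def gaussian_weights(sigma=1.0, radius=3):
    """1D Gaussian weights out to `radius` pixels, rescaled to sum to one.
    sigma=1 and radius=3 (three standard deviations) are assumed values."""
    x = np.arange(-radius, radius + 1, dtype=float)
    w = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return w / w.sum()

def smooth(image, weights):
    """Separable 2D weighted average with reflected borders (assumed)."""
    r = len(weights) // 2
    padded = np.pad(image, r, mode="reflect")
    # Filter along rows, then along columns; 'valid' trims the padding again.
    rows = np.apply_along_axis(np.convolve, 1, padded, weights, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, weights, mode="valid")

def mscn(image, c=1.0):
    """Return (MSCN coefficients, local contrast) for a grayscale image."""
    image = np.asarray(image, dtype=float)
    w = gaussian_weights()
    mu = smooth(image, w)                          # local mean
    var = smooth(image * image, w) - mu * mu       # weighted local variance
    sigma = np.sqrt(np.maximum(var, 0.0))          # local contrast
    return (image - mu) / (sigma + c), sigma
```

Because the weights sum to one, the weighted variance can be computed as the weighted mean of squares minus the squared weighted mean, which avoids a second pass over every window.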

To obtain the log-derivatives, the logarithm of the MSCN and contrast coefficients is first computed using (4), which creates a new image subband for each.

The small constant ɛ is set to 0.1 to avoid taking the logarithm of zero. Five types of log-derivatives are then computed: horizontal, vertical, main-diagonal, secondary-diagonal, and combined-diagonal, as given in (5).
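
The five log-derivatives can be sketched as simple array differences. The exact forms of (4) and (5) are not reproduced here, so the sketch below rests on assumptions: the logarithm is applied to the coefficient magnitude plus ɛ, each derivative is a forward difference between neighboring pixels, and the combined-diagonal derivative is taken over a 2 × 2 neighborhood. The function name and dictionary keys are hypothetical.

```python
import numpy as np

def log_derivatives(coeffs, eps=0.1):
    """Five log-derivative maps of a coefficient image (assumed definitions)."""
    # Assumption: log of the magnitude plus eps, so the argument is never zero.
    j = np.log(np.abs(np.asarray(coeffs, dtype=float)) + eps)
    return {
        # J(i, j+1) - J(i, j)
        "horizontal": j[:, 1:] - j[:, :-1],
        # J(i+1, j) - J(i, j)
        "vertical": j[1:, :] - j[:-1, :],
        # J(i+1, j+1) - J(i, j)
        "main_diagonal": j[1:, 1:] - j[:-1, :-1],
        # J(i+1, j-1) - J(i, j)
        "secondary_diagonal": j[1:, :-1] - j[:-1, 1:],
        # J(i, j) + J(i+1, j+1) - J(i, j+1) - J(i+1, j)
        "combined_diagonal": j[:-1, :-1] + j[1:, 1:] - j[:-1, 1:] - j[1:, :-1],
    }
```

Each map is slightly smaller than the input because border pixels lack the required neighbor.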

In the spatial domain, the statistics of the MSCN coefficients and of their log-derivatives change significantly in the presence of distortion [6, 20]. This is the main premise of the proposed algorithms. The effectiveness of these statistics in modeling natural images, and their variation under different types of distortion, is examined in this study.

2.2. The Extracted Features

The two proposed algorithms differ in the type of features they extract: the first uses Weibull statistics and the second uses two sharpness functions, as described below.

The Weibull Statistics Based Extracted Features. Weibull statistics are used to construct the first algorithm. The MSCN coefficients (1), the contrast coefficients (3), and their five log-derivatives are each modeled using the Weibull distribution (6) (Figure 1). This gives 24 features: the MSCN and contrast coefficients each provide two features, and their five log-derivatives provide twenty more. To describe multiscale behavior, the features are computed at two scales by low-pass filtering and downsampling by a factor of two, which adds another 24 features at the second scale and yields 48 overall. All features are extracted in the spatial domain. The probability density function (PDF) of the Weibull distribution is given by

$$f(x) = \frac{\gamma}{\beta} \left( \frac{x - x_0}{\beta} \right)^{\gamma - 1} \exp\!\left[ -\left( \frac{x - x_0}{\beta} \right)^{\gamma} \right], \tag{6}$$

where $\beta$ is the scale parameter, $\gamma$ is the shape parameter, and $x_0$ is the origin of the contrast distribution. For natural images, as in this study, the origin is usually close to zero; moreover, this parameter is eliminated by stretching the contrast [14].
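
The text does not state how the two Weibull parameters are estimated from a coefficient map. The following is a minimal sketch, assuming a two-parameter Weibull with the origin fixed at zero, maximum-likelihood estimation with bisection on the shape parameter, and fitting to coefficient magnitudes because the Weibull density is defined for nonnegative values. The name `fit_weibull` and its arguments are hypothetical.

```python
import numpy as np

def fit_weibull(samples, k_lo=0.05, k_hi=50.0, iters=100):
    """Maximum-likelihood (scale, shape) of a zero-origin Weibull.
    Assumption: the fit is applied to the magnitudes of the samples."""
    x = np.abs(np.asarray(samples, dtype=float)).ravel()
    x = x[x > 0]                      # log(0) is undefined
    x_max = x.max()
    xn = x / x_max                    # rescale to avoid overflow in x ** k
    log_x = np.log(xn)
    mean_log = log_x.mean()

    def g(k):
        # Shape likelihood equation; it is increasing in k, with root at the MLE.
        xk = xn ** k
        return (xk * log_x).sum() / xk.sum() - 1.0 / k - mean_log

    for _ in range(iters):            # plain bisection on the shape parameter
        k_mid = 0.5 * (k_lo + k_hi)
        if g(k_mid) > 0:
            k_hi = k_mid
        else:
            k_lo = k_mid
    shape = 0.5 * (k_lo + k_hi)
    scale = x_max * np.mean(xn ** shape) ** (1.0 / shape)
    return scale, shape
```

Applying such a fit to the MSCN map, the contrast map, and each of their log-derivative maps would give two numbers per map, which matches the feature count of 24 per scale stated above.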

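The features obtained with (6) for the image patches are fitted with the MVG density (7) to give a rich representation of them [10]:

$$f_X(x_1, \ldots, x_k) = \frac{1}{(2\pi)^{k/2}\, |\Sigma|^{1/2}} \exp\!\left( -\frac{1}{2} (x - \nu)^{T} \Sigma^{-1} (x - \nu) \right), \tag{7}$$

where $(x_1, \ldots, x_k)$ are the features, and $\nu$ and $\Sigma$ are the mean and covariance matrix of the MVG model, respectively.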
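
Fitting an MVG model to a set of patch features amounts to estimating a mean vector and a covariance matrix. A minimal sketch follows, assuming the features are stacked as one row per patch and that the unbiased sample covariance is an acceptable estimator; the function names are hypothetical.

```python
import numpy as np

def fit_mvg(features):
    """Mean vector and covariance matrix of a (patches x features) array."""
    f = np.asarray(features, dtype=float)
    nu = f.mean(axis=0)
    sigma = np.cov(f, rowvar=False)   # unbiased estimate (divides by n - 1)
    return nu, sigma

def mvg_log_density(x, nu, sigma):
    """Log of the multivariate Gaussian density at feature vector x."""
    nu = np.asarray(nu, dtype=float)
    diff = np.asarray(x, dtype=float) - nu
    _, logdet = np.linalg.slogdet(sigma)
    maha = diff @ np.linalg.solve(sigma, diff)   # squared Mahalanobis distance
    return -0.5 * (nu.size * np.log(2.0 * np.pi) + logdet + maha)
```

Working with the log-density avoids numerical underflow when the number of features is large.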

The Two Sharpness Functions Based Extracted Features. To form the second algorithm, the MSCN coefficients in (1) and their log-derivatives are modeled with two sharpness functions (Figure 1): grey level “amplitude” and grey level “variance” (8) [21]. Applying each sharpness function to the MSCN coefficients and their five log-derivatives yields 12 features. These features are computed at two scales to portray multiscale behavior, by low-pass filtering and downsampling by a factor of two, which leads to a set of 24 features overall. All features are extracted in the spatial domain:

$$F_{\mathrm{amp}} = \frac{1}{PQ} \sum_{m=1}^{P} \sum_{n=1}^{Q} \left| g(m,n) - \bar{g} \right|, \qquad F_{\mathrm{var}} = \frac{1}{PQ} \sum_{m=1}^{P} \sum_{n=1}^{Q} \left( g(m,n) - \bar{g} \right)^{2}, \tag{8}$$

where $g(m,n)$ is the value at position $(m,n)$ of a patch, and $\bar{g}$ and $P \times Q$ are the mean and dimensions of the patch, respectively.
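
The two sharpness measures are simple to compute per patch. The sketch below assumes the usual autofocus definitions of grey level amplitude (mean absolute deviation from the patch mean) and grey level variance (mean squared deviation from the patch mean); the function names are hypothetical.

```python
import numpy as np

def grey_level_amplitude(patch):
    """Mean absolute deviation of the patch values from the patch mean."""
    p = np.asarray(patch, dtype=float)
    return float(np.abs(p - p.mean()).mean())

def grey_level_variance(patch):
    """Mean squared deviation of the patch values from the patch mean."""
    p = np.asarray(patch, dtype=float)
    return float(((p - p.mean()) ** 2).mean())

def sharpness_features(maps):
    """Apply both functions to each map, e.g. MSCN plus five log-derivatives."""
    return np.array([f(m) for m in maps
                     for f in (grey_level_amplitude, grey_level_variance)])
```

With six input maps, `sharpness_features` returns 12 values, which matches the per-scale feature count given in the text.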

The proposed algorithm is based on the hypothesis that the sharper an image is, the better its quality [16]. This is supported by the observation that humans weight their judgments of image quality more heavily on the sharp regions of an image [10].

The features obtained with (8) for the image patches are fitted with the MVG density (7) to give a rich representation of them [10].

2.3. Edges Based Natural Scene Statistic Model

The natural scene statistic (NSS) model was computed from 125 natural images, which were selected from Flickr data and from the Berkeley image segmentation database [22]. Only the features corresponding to patches with plenty of edges are selected. Each patch is divided into fixed-size subpatches, and only the subpatches that are rich in edges (effective subpatches) count toward their parent patch. The number of effective subpatches in each patch is then computed. Patches whose number of effective subpatches exceeds 75% of the peak number over the image are selected. The features corresponding to the selected patches are gathered and then fitted to the MVG model (7).
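
The patch selection rule can be sketched as follows, starting from a binary edge map produced by any edge detector. Several details are assumptions for the sake of the example and are not taken from the text: the patch and subpatch sizes, the rule that a subpatch is "rich in edges" when its fraction of edge pixels exceeds `min_edge_fraction`, and all function and parameter names. Only the 75% threshold comes from the description above.

```python
import numpy as np

def select_edge_rich_patches(edge_map, patch=96, sub=8,
                             min_edge_fraction=0.1, keep_ratio=0.75):
    """Return (selected mask, effective-subpatch counts), one entry per patch.
    patch, sub and min_edge_fraction are hypothetical values."""
    edge_map = np.asarray(edge_map, dtype=bool)
    rows, cols = edge_map.shape[0] // patch, edge_map.shape[1] // patch
    per_side = patch // sub
    counts = np.zeros((rows, cols), dtype=int)
    for r in range(rows):
        for c in range(cols):
            block = edge_map[r * patch:(r + 1) * patch,
                             c * patch:(c + 1) * patch]
            block = block[:per_side * sub, :per_side * sub]
            # Split the patch into subpatches and measure edge density in each.
            subs = block.reshape(per_side, sub, per_side, sub)
            fraction = subs.mean(axis=(1, 3))
            counts[r, c] = int((fraction > min_edge_fraction).sum())
    # Keep patches whose count exceeds 75% of the peak count over the image.
    selected = counts > keep_ratio * counts.max()
    return selected, counts
```

If the image contains no effective subpatch at all, the peak count is zero and no patch is selected.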

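To compute the quality score according to the procedure described above, (9) is used:

$$D(\nu_1, \nu_2, \Sigma_1, \Sigma_2) = \sqrt{ (\nu_1 - \nu_2)^{T} \left( \frac{\Sigma_1 + \Sigma_2}{2} \right)^{-1} (\nu_1 - \nu_2) }, \tag{9}$$

where $\nu_1$, $\Sigma_1$ and $\nu_2$, $\Sigma_2$ are the mean vectors and covariance matrices of the NSS MVG model and the tested image MVG model, respectively.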
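
A minimal sketch of the score computation follows, assuming that (9) takes the same form as the distance used by the NIQE model [10], that is, a Mahalanobis-like distance between the two mean vectors under the average of the two covariance matrices. The function name is hypothetical, and the pseudo-inverse is a choice made here to tolerate a singular pooled covariance.

```python
import numpy as np

def quality_distance(nu1, sigma1, nu2, sigma2):
    """Distance between two MVG models; larger means further from pristine."""
    diff = np.asarray(nu1, dtype=float) - np.asarray(nu2, dtype=float)
    pooled = 0.5 * (np.asarray(sigma1, dtype=float)
                    + np.asarray(sigma2, dtype=float))
    # Pseudo-inverse (an assumption) keeps the score defined if pooled is singular.
    squared = diff @ np.linalg.pinv(pooled) @ diff
    return float(np.sqrt(max(squared, 0.0)))
```

Under this form, a tested image whose MVG fit has the same mean vector as the NSS model receives a score of zero, and the score grows as the feature statistics drift away from those of pristine images.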

3. The Results and Discussion

The effectiveness of the proposed features (based on Weibull statistics and on the two sharpness functions) in modeling pristine natural scenes for quality measurement is investigated by comparing the statistics of these features with the statistics of the features extracted from each distortion type, as shown in Figures 2 and 3. These plots show how the features of natural (pristine) images change under the various distortion types. The proposed model follows these changes and measures them to quantify the quality of the distorted image. Figure 2(a) shows that the plots of the pristine images and of the reference image do not overlap completely; the quality score of the reference image comes out as 3.30. This result highlights the uncertainty about the purity of reference images, which are supposed to be ideal. The same can be observed in Figure 3(a). Figure 4 shows the lighthouse reference image and its five distorted versions from the LIVE database; the features of these images are plotted in Figures 2 and 3.

To calibrate the proposed algorithms and test their performance, the LIVE (Laboratory for Image and Video Engineering) IQA database [23], which contains 29 reference images and 779 distorted images, is used. The distorted images fall into five distortion types: JPEG compression, JPEG2000 (JPEG2K) compression, Gaussian blur (Gblur), fast fading (FF), which results from transmitting the image through a Rayleigh channel, and additive white Gaussian noise (WN), one of the most common types of distortion.

Figure 5 shows scatter plots of differential mean opinion score (DMOS) versus peak signal-to-noise ratio (PSNR) (a), DMOS versus SSIM [24] index (b), DMOS versus the proposed algorithm based on Weibull statistics features (c), and DMOS versus the proposed algorithm based on the two sharpness functions features (d). The figure indicates that the introduced methods correlate with DMOS better than PSNR and SSIM.

To assess prediction monotonicity, Spearman’s rank-order correlation coefficient (SROCC) is used, while Pearson’s linear correlation coefficient (PLCC) is employed to evaluate the prediction accuracy of the proposed algorithms. Before PLCC is calculated, the objective scores are passed through a nonlinear logistic function [25], whose parameters are found numerically with the MATLAB function “fminsearch”, to maximize the correlation between subjective and objective scores.
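
The two correlation measures can be sketched with NumPy alone. This is an illustrative sketch, not the evaluation code used in the study: the function names are hypothetical, and the logistic mapping applied before PLCC is omitted because its exact form is given in [25] rather than in the text.

```python
import numpy as np

def rank(values):
    """Ranks starting at 1; tied values share their average rank."""
    v = np.asarray(values, dtype=float)
    order = np.argsort(v, kind="mergesort")
    ranks = np.empty(len(v), dtype=float)
    ranks[order] = np.arange(1, len(v) + 1)
    for val in np.unique(v):
        tied = v == val
        ranks[tied] = ranks[tied].mean()
    return ranks

def plcc(a, b):
    """Pearson's linear correlation coefficient (prediction accuracy)."""
    return float(np.corrcoef(a, b)[0, 1])

def srocc(a, b):
    """Spearman's rank-order correlation (prediction monotonicity)."""
    return plcc(rank(a), rank(b))
```

SROCC depends only on the ordering of the scores, so it is unaffected by any monotonic mapping, whereas PLCC is sensitive to the nonlinearity between objective scores and DMOS, which is why the logistic mapping is applied before PLCC only.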

Tables 1 and 2 indicate that the proposed algorithms perform better than the FR PSNR and SSIM methods. It is also clearly shown that the new methods outperform the recently developed NR OU-DU NIQE model [10].

4. Conclusions

For practical reasons, researchers should aim to develop perceptual no-reference models that are not trained on features extracted from distorted images or on human opinion scores. Choosing appropriate features, and the way they are collected, plays a significant role in IQA. In this study, a new technique for gathering novel low-level features is devised, and the validity of these features for measuring image quality is investigated. Two new NR OU-DU estimators based on edges are built using these features. The new NR OU-DU methods have low computational complexity and extract their features in the spatial domain, so no transforms (e.g., DCT and wavelet) are required. The results show that the introduced methods perform well.

One challenge that no-reference IQA still faces is how to take the human visual system into account in the design [17]. This research takes a step in that direction by gathering natural features from image regions rich in edges to build OU-DU no-reference IQA algorithms.

Comparing the two proposed algorithms, the features gathered according to Weibull statistics predict the rank ordering of the opinion scores better than the sharpness-function features, as shown in Table 1. Table 2, in turn, shows that the best prediction accuracy, which measures how well an algorithm’s predictions correlate with DMOS values, is achieved by the sharpness-function features. The latter observation is also illustrated in Figure 5.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.