WO2018058090A1 - Method for no-reference image quality assessment - Google Patents

Method for no-reference image quality assessment

Info

Publication number
WO2018058090A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
electronic image
computing
values
data structure
Application number
PCT/US2017/053393
Other languages
French (fr)
Inventor
Dapeng Oliver Wu
Ruigang FANG
Original Assignee
University Of Florida Research Foundation Incorporated
Application filed by University Of Florida Research Foundation Incorporated filed Critical University Of Florida Research Foundation Incorporated
Publication of WO2018058090A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30168 Image quality inspection

Definitions

  • An NR scheme is disclosed which combines the best features of both the AS and NAS methods. This is accomplished by developing three artifact-specific metrics and nonlinearly combining them.
  • The effect of a single type of artifact (i.e., blockiness, blurriness, or noisiness) on the Laplace distribution is similar across images and independent of image content.
  • BNB metrics: blockiness, noisiness, and blurriness
  • HVS: human visual system
  • Section 2 of this disclosure explores several key properties of the Laplace distribution for natural scene images, which are the basis for the design of our method.
  • In Section 3 we describe the three BNB metrics.
  • In Section 4 we verify our metrics using two aspects of experimentation and provide detailed results. The algorithm developed and used to perform supervised learning to predict the perceptual value is discussed in Section 5.
  • Section 7 concludes this paper with final remarks.
  • Property 1: For the difference set, D, of any natural scene image, any two equally down-sampled subsets, D_1 and D_2, have the same statistical properties.
  • FIG. 2 displays an experimental result: the original image is shown in FIG. 2A, and the Laplace distributions of sets I_0 and I_1, having variances 130.56 and 131.05, respectively, are shown in FIGs. 2B and 2C. A small numpy check of this property is sketched below.
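  • As a minimal sketch (our own illustration, not code from the disclosure), Property 1 can be checked by splitting the difference set D into two equally down-sampled subsets and comparing their variances:

```python
import numpy as np

def property1_check(image: np.ndarray) -> tuple:
    """Compare variances of two equally down-sampled subsets of the difference set D."""
    gray = image.astype(np.float64)
    d = (gray[:, 1:] - gray[:, :-1]).ravel()  # difference set D (horizontal neighbors)
    d1, d2 = d[0::2], d[1::2]                 # two equally down-sampled subsets of D
    return d1.var(), d2.var()                 # nearly equal for a natural scene image
```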
  • Property 2: For any two pixels of a natural scene image, the difference in pixel values follows a Laplace distribution that is related to the spatial distance between the pixels; an increased distance corresponds to a larger variance in the Laplace distribution.
  • FIG. 3B visualizes the Laplace distributions for several such spatial distances d between pixels.
  • Property 3: After convolving a natural scene image with a low-pass filter, f_1, the difference between values of the same pixel in the original image and the processed image will also follow a Laplace distribution.
  • f_1 is a simple low-pass filter which can both lessen Gaussian noise and blur an image. Processing an image with this filter causes the variance of the difference of two adjacent pixel values to decrease.
  • Setting a threshold in the distribution, P is defined as the probability that the difference of adjacent pixel values is larger than the threshold (FIG. 4B). P becomes smaller when the image is convolved with f_1, as the sketch below illustrates.
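  • The following sketch (our own) measures the threshold probability P of Property 3 before and after low-pass filtering; a 3x3 mean filter is used as a stand-in for f_1, whose exact coefficients are not reproduced in this text:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def threshold_probability(image: np.ndarray, threshold: float = 10.0) -> tuple:
    """Return P before and after low-pass filtering, where P is the probability
    that an adjacent-pixel difference exceeds the threshold (FIG. 4B).

    A 3x3 mean filter stands in for the low-pass filter f_1; per Property 3,
    P is expected to shrink after filtering."""
    def p_of(img: np.ndarray) -> float:
        d = np.abs(img[:, 1:] - img[:, :-1])  # adjacent-pixel differences
        return float(np.mean(d > threshold))

    gray = image.astype(np.float64)
    return p_of(gray), p_of(uniform_filter(gray, size=3))
```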
  • Processing I with f_1 produces a filtered value x'_0 for the center pixel x_0 of the 3x3 window of FIG. 4A. The difference between x_0 and x'_0 follows a new Laplace distribution with zero mean and a variance proportional to that of the adjacent-pixel difference distribution.
  • Property 4: By processing a natural scene image with a high-pass filter f_2 (a 3x3 kernel whose first row is 0, -1, 0), any pixel value from the processed image will also follow a Laplace distribution.
  • f_2 is another important tool, which can be used to find the high-frequency content of images.
  • The pixel x_i is relabeled x'_i after filtering the image I with f_2. The value x'_0 will follow another Laplace distribution with zero mean and a variance proportional to that of the adjacent-pixel difference distribution.
  • As can be observed in FIG. 5A, although different image content produces various specific relationships between V and blurriness, we notice a regularity: increasing blurriness is accompanied by decreasing V. Having identified this regularity, we use V as a feature to model the blurriness of images containing the same content, since it does not necessarily differentiate blurriness well for images with different content. To provide a more robust feature for handling different image content, we adopt an alternative feature: V - V_1. Given any image I, we blur it using f_1 to obtain the blurred image I_1. We form V_1 by subsequently filtering I_1 with f_2 and calculating the variance of the resulting pixel values.
  • V - V_1 is a better blurriness feature than V since it reduces the scale along the feature axis; however, it is still not robust to image content. Normalizing V - V_1 by V, we obtain our desired blurriness feature, as defined in (7).
  • A visualization of this feature is shown in FIG. 5C: the curves representing different image content have a more regular relationship between blurriness and our blurriness feature. This denser coupling signifies that this feature can help alleviate the image content issue discussed earlier.
  • In Section 4 we use the LIVE image database to show that our feature provides a better characterization of blurriness. Independent of the content of an image, we observe a decrease in our blurriness feature as the blurriness of the image increases.
  • We obtain the variance of the coefficients, V, by processing a noisy image I with filter f_2. Further processing I with filter f_1 yields I_1, a denoised version of I.
  • V_1 is calculated as the variance of the pixel values after processing I_1 with f_2. Since I_1 has less noise than image I, V will be larger than V_1.
  • V - V_1 serves as a noise feature, which maintains a good response to noise for images with the same content. Experimentation verifies that the larger the value of V - V_1, the noisier the image. For images containing different content, the value of V - V_1 will not always be the same even when images have the same noisiness or level of human perception. To address this problem, we apply normalization to obtain the noisiness feature defined in (8). A combined sketch of the blurriness and noisiness features appears below.
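  • A combined sketch of these features (our own illustration): a 3x3 mean filter stands in for f_1, and a 3x3 Laplacian kernel, consistent with the partial first row 0, -1, 0 quoted under Property 4 but otherwise an assumption, stands in for f_2:

```python
import numpy as np
from scipy.ndimage import convolve, uniform_filter

# Stand-in kernels: a 3x3 Laplacian for the high-pass f_2 (assumed; only its
# first row, 0 -1 0, is quoted under Property 4) and a 3x3 mean for f_1.
F2 = np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], dtype=np.float64)

def bn_features(image: np.ndarray) -> tuple:
    """Return (blurriness feature per (7), raw noisiness quantity V - V_1)."""
    gray = image.astype(np.float64)
    v = convolve(gray, F2).var()        # V: variance after high-pass filtering
    i1 = uniform_filter(gray, size=3)   # I_1: low-pass filtered version of I
    v1 = convolve(i1, F2).var()         # V_1: variance of f_2(I_1)
    return (v - v1) / v, v - v1         # (7) normalizes by V; (8) normalizes V - V_1
```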
  • Blockiness appears at a block boundary as a byproduct of encoding, decoding, or transmission. If block-like artifacts appear in a frame, the statistical relationship between two adjacent pixels in the same block will differ from that of two adjacent pixels from different blocks, violating the equality predicted by Property 1. To make use of this statistical property, the image is partitioned into b_s x b_s blocks and sampled in the horizontal and vertical directions as shown in (9) and (10). Methods for constructing these two types of down-sampling are shown below.
  • In FIG. 6, the dark symbols inside a grid correspond to pixels in the resulting sampled sub-images; different symbols correspond to different sub-images.
  • Another two data sets, D_1 and D_2, can be obtained by taking the difference of sub-images s_7 and s_6 and the difference of sub-images s_0 and s_7, respectively. If blockiness is not present in the image, the pixel values of data sets D_1 and D_2 should follow a similar Laplace distribution. If we set the same threshold in the two Laplace distributions and let N_i denote the number of pixels in D_i that exceed the threshold, the values of N_1 and N_2 should be close for a non-blocky image.
  • A tuning parameter is introduced to allow f_blockiness to be tailored for a variety of situations; it is chosen to be 1 in our experimentation. For a non-blocky image, the value of f_blockiness should be close to one, and introducing blockiness into a frame will increase the value of f_blockiness.
  • In FIG. 7, the five curves represent five different images. We randomly add different percentages of blockiness, denoted λ, while measuring the value of f_blockiness; f_blockiness increases with respect to increases in λ. The relationship between f_blockiness and λ is modeled as a quadratic function, as shown in (12). A hedged sketch of a blockiness feature of this kind follows.
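  • Since the exact form of (11) is not reproduced in this text, the sketch below shows one plausible ratio-style blockiness feature consistent with the described behavior: it compares the rates at which adjacent-pixel differences exceed a threshold across block boundaries versus inside blocks, is close to one for a non-blocky image, and grows as blockiness is introduced:

```python
import numpy as np

def blockiness_feature(image: np.ndarray, block: int = 8,
                       threshold: float = 10.0, alpha: float = 1.0) -> float:
    """Plausible ratio-style blockiness feature; the disclosure's exact (11) may differ.

    Compares rates at which horizontal adjacent-pixel differences exceed a
    threshold across block boundaries versus inside blocks; close to 1 for a
    non-blocky image. alpha is the tuning parameter (1 in the experiments)."""
    gray = image.astype(np.float64)
    d = np.abs(gray[:, 1:] - gray[:, :-1])    # horizontal adjacent differences
    cols = np.arange(d.shape[1])
    boundary = (cols % block) == (block - 1)  # differences straddling a block edge
    n1 = np.mean(d[:, boundary] > threshold)  # exceedance rate across boundaries
    n2 = np.mean(d[:, ~boundary] > threshold) # exceedance rate inside blocks
    return (n1 + alpha) / (n2 + alpha)
```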
  • A good distortion metric should offer a delineation between clear and distorted images. For the distorted images we employ Gaussian blur, white noise, and JPEG images from the LIVE image database, which correspond to the blurriness, noisiness, and blockiness metrics, respectively. The metric values are calculated for both clear and distorted images and displayed in FIG. 8.
  • In FIG. 8A, the dashed line is the density of the blurriness metric for 85 clear images and the solid line is the density for the same number of Gaussian-blurred images. These two densities are estimated with a Gaussian kernel from the histograms of the clear and Gaussian-blurred images, respectively. The overlap of the two densities is relatively small, so our blurriness metric is a good candidate for classifying between clear and blurry images; the sketch below reproduces this density comparison.
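  • The density comparison of FIG. 8 can be reproduced with a Gaussian kernel density estimate; this sketch (our own) returns the overlap area of the two densities, where a small overlap indicates good separation:

```python
import numpy as np
from scipy.stats import gaussian_kde

def density_overlap(clear_vals: np.ndarray, distorted_vals: np.ndarray,
                    grid_points: int = 256) -> float:
    """Overlap area of Gaussian-kernel density estimates for two metric samples.

    A small overlap indicates the metric separates clear from distorted images well."""
    lo = min(clear_vals.min(), distorted_vals.min())
    hi = max(clear_vals.max(), distorted_vals.max())
    xs = np.linspace(lo, hi, grid_points)
    p = gaussian_kde(clear_vals)(xs)      # density of the metric over clear images
    q = gaussian_kde(distorted_vals)(xs)  # density over distorted images
    return float(np.trapz(np.minimum(p, q), xs))
```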
  • Table 4: Comparison of SROCC for three metrics. In Table 4, we calculate the metric values for all blurry, noisy, and blocky images from the LIVE image database and obtain their Spearman rank-order correlation coefficient (SROCC). Mylene [4] measured the same distortions (noisiness, blockiness, and blurriness), so we compare the SROCC of our metrics with those of Mylene's model. The comparison reveals that our metrics are more correlated with human perception.
  • Codebook Construction: One element of the codebook is a vector of four values, the blurriness, noisiness, and blockiness metrics plus a perceptual value, denoted (C_i,1, C_i,2, C_i,3, C_i,4) for entry i.
  • In the codebook model there are four important parameters: p, q, r, and k. The first three parameters represent the distance weights for the three artifact types, and the fourth is the number of nearest neighbors used for prediction.
  • FIG. 13 shows, schematically, an illustrative computer 5000 on which any aspect of the present disclosure may be implemented.
  • the computer 5000 includes a processing unit 5001 having one or more processors and a non-transitory computer-readable storage medium 5002 that may include, for example, volatile and/or non-volatile memory.
  • the memory 5002 may store one or more instructions to program the processing unit 5001 to perform any of the functions described herein.
  • the computer 5000 may also include other types of non-transitory computer-readable media, such as storage 5005 (e.g., one or more disk drives), in addition to the system memory 5002.
  • storage 5005 may also store one or more application programs and/or external components used by application programs (e.g., software libraries), which may be loaded into the memory 5002.
  • the computer 5000 may have one or more input devices and/or output devices, such as devices 5006 and 5007 illustrated in FIG. 13. These devices can be used, among other things, to present a user interface. Examples of output devices that can be used to provide a user interface include printers or display screens for visual presentation of output and speakers or other sound generating devices for audible presentation of output. Examples of input devices that can be used for a user interface include keyboards and pointing devices, such as mice, touch pads, and digitizing tablets. As another example, the input devices 5007 may include a microphone for capturing audio signals, and the output devices 5006 may include a display screen for visually rendering, and/or a speaker for audibly rendering, recognized text.
  • the computer 5000 may also comprise one or more network interfaces (e.g., the network interface 5010) to enable communication via various networks (e.g., the network 5020).
  • networks include a local area network or a wide area network, such as an enterprise network or the Internet.
  • Such networks may be based on any suitable technology and may operate according to any suitable protocol and may include wireless networks, wired networks or fiber optic networks.
  • the quality score is computed for an image to be displayed.
  • a collection of similar images may be displayed on the same device, such as may occur, for example, when a stream of images is displayed as a video.
  • the quality score may be computed for one or more images in the collection.
  • Those quality scores may be used to select parameters of image processing or display, which may be applied to the collection of images.
  • processors may be implemented as integrated circuits, with one or more processors in an integrated circuit component, including commercially available integrated circuit components known in the art by names such as CPU chips, GPU chips, microprocessor, microcontroller, or co-processor.
  • a processor may be implemented in custom circuitry, such as an ASIC, or semicustom circuitry resulting from configuring a programmable logic device.
  • a processor may be a portion of a larger circuit or semiconductor device, whether commercially available, semi-custom or custom.
  • some commercially available microprocessors have multiple cores such that one or a subset of those cores may constitute a processor.
  • a processor may be implemented using circuitry in any suitable format.
  • a computer may be embodied in any of a number of forms, such as a rack-mounted computer, a desktop computer, a laptop computer, or a tablet computer. Additionally, a computer may be embedded in a device not generally regarded as a computer but with suitable processing capabilities, including a Personal Digital Assistant (PDA), a smart phone or any other suitable portable or fixed electronic device.
  • PDA Personal Digital Assistant
  • a computer may have one or more input and output devices. These devices can be used, among other things, to present a user interface. Examples of output devices that can be used to provide a user interface include printers or display screens for visual presentation of output and speakers or other sound generating devices for audible presentation of output. Examples of input devices that can be used for a user interface include keyboards, and pointing devices, such as mice, touch pads, and digitizing tablets. As another example, a computer may receive input information through speech recognition or in other audible format. In the embodiment illustrated, the input/output devices are illustrated as physically separate from the computing device. In some embodiments, however, the input and/or output devices may be physically integrated into the same unit as the processor or other elements of the computing device. For example, a keyboard might be implemented as a soft keyboard on a touch screen. Alternatively, the input/output devices may be entirely disconnected from the computing device, and functionally integrated through a wireless connection.
  • Such computers may be interconnected by one or more networks in any suitable form, including as a local area network or a wide area network, such as an enterprise network or the Internet.
  • networks may be based on any suitable technology and may operate according to any suitable protocol and may include wireless networks, wired networks or fiber optic networks.
  • the various methods or processes outlined herein may be coded as software that is executable on one or more processors that employ any one of a variety of operating systems or platforms. Additionally, such software may be written using any of a number of suitable programming languages and/or programming or scripting tools, and also may be compiled as executable machine language code or intermediate code that is executed on a framework or virtual machine.
  • the invention may be embodied as a computer readable storage medium (or multiple computer readable media) (e.g., a computer memory, one or more floppy discs, compact discs (CD), optical discs, digital video disks (DVD), magnetic tapes, flash memories, circuit configurations in Field Programmable Gate Arrays or other semiconductor devices, or other tangible computer storage medium) encoded with one or more programs that, when executed on one or more computers or other processors, perform methods that implement the various embodiments of the invention discussed above.
  • a computer readable storage medium may retain information for a sufficient time to provide computer-executable instructions in a non-transitory form.
  • Such a computer readable storage medium or media can be transportable, such that the program or programs stored thereon can be loaded onto one or more different computers or other processors to implement various aspects of the present invention as discussed above.
  • the term "computer-readable storage medium" encompasses only a computer-readable medium that can be considered to be a manufacture (i.e., article of manufacture) or a machine.
  • the invention may be embodied as a computer readable medium other than a computer-readable storage medium, such as a propagating signal.
  • code means any type of computer code or set of computer-executable instructions that can be employed to program a computer or other processor to implement various aspects of the present invention as discussed above. Additionally, it should be appreciated that according to one aspect of this embodiment, one or more computer programs that when executed perform methods of the present invention need not reside on a single computer or processor, but may be distributed in a modular fashion amongst a number of different computers or processors to implement various aspects of the present invention.
  • Computer-executable instructions may be in many forms, such as program modules, executed by one or more computers or other devices.
  • program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
  • functionality of the program modules may be combined or distributed as desired in various embodiments.
  • data structures may be stored in computer-readable media in any suitable form.
  • data structures may be shown to have fields that are related through location in the data structure. Such relationships may likewise be achieved by assigning storage for the fields with locations in a computer-readable medium that conveys relationship between the fields.
  • any suitable mechanism may be used to establish a relationship between information in fields of a data structure, including through the use of pointers, tags or other mechanisms that establish relationship between data elements.
  • the invention may be embodied as a method, of which an example has been provided.
  • the acts performed as part of the method may be ordered in any suitable way. Accordingly, embodiments may be constructed in which acts are performed in an order different than illustrated, which may include performing some acts simultaneously, even though shown as sequential acts in illustrative embodiments.
  • a reference to "A and/or B", when used in conjunction with open-ended language such as “comprising” can refer, in one embodiment, to A only (optionally including elements other than B); in another embodiment, to B only (optionally including elements other than A); in yet another embodiment, to both A and B (optionally including other elements); etc.
  • the phrase "at least one,” in reference to a list of one or more elements, should be understood to mean at least one element selected from any one or more of the elements in the list of elements, but not necessarily including at least one of each and every element specifically listed within the list of elements and not excluding any combinations of elements in the list of elements.
  • At least one of A and B can refer, in one embodiment, to at least one, optionally including more than one, A, with no B present (and optionally including elements other than B); in another embodiment, to at least one, optionally including more than one, B, with no A present (and optionally including elements other than A); in yet another embodiment, to at least one, optionally including more than one, A, and at least one, optionally including more than one, B (and optionally including other elements); etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Analysis (AREA)

Abstract

Aspects of the present disclosure are related to methods for automated image quality assessment without the use of a reference image. In some embodiments, quantitative measurements for specific artifacts that impact image quality such as blurriness, noisiness and blockiness (BNB) metrics are used to form a vector to represent the quality of a given electronic image. Based on this vector, entries in a data structure are selected. In some embodiments, a k-Nearest Neighbors algorithm (k-NN) is used to map the vector of BNB metrics of the electronic image to a human perception score based on vector differences between the quantitative measurements for the electronic image and similar quantitative measurements for known images to which human perception scores have been assigned. The human perception scores for the selected entries in the data set may then be combined to yield a quality score for the electronic image, emulating a quality score that would be assigned by human image evaluators.

Description

METHOD FOR NO-REFERENCE IMAGE QUALITY ASSESSMENT
RELATED APPLICATION
[0001] This application claims priority to and the benefit of U.S. Provisional Patent Application Number 62/399,985, filed September 26, 2016, the entire contents of which are incorporated herein by reference.
BACKGROUND
[0002] The present disclosure is generally related to the field of image quality assessment for electronic images processed in computing devices.
[0003] In recent years, both video and image media comprising electronic images have become a widely popular form of internet traffic and are displayed on a variety of electronic devices to media consumers. A media consumer's experience during the consumption of video/image media can be negatively impacted by distortions induced by compression and/or transmission losses over the internet or other computer networks. The media consumer's experience of viewing video/image media may also be impacted by characteristics of the electronic device displaying the media to the consumer, such as display pixel resolution and range of color gamut.
[0004] If a source image, or another image, is available as a reference for what the electronic image is supposed to look like without the effects of compression, transmission, or other processes that impact image quality, an automated assessment of image quality for the electronic image might be made by comparing the electronic image with the reference image. When there is no access to a source image to act as a reference, a no-reference image quality assessment may be performed to assess the quality of the electronic image. Heretofore, such no-reference image quality assessment techniques have not adequately reflected the image quality as perceived by a user.
SUMMARY
[0005] According to some embodiments, a computer-implemented method for assessing quality of an electronic image is disclosed. The method comprises computing values for a plurality of attributes of the electronic image; selecting a plurality of entries from a computer data structure. Each entry of the data structure comprises values of the plurality of attributes and a score. The selecting comprises selecting based on a vector difference of the values of the plurality of attributes of the electronic image relative to the values of the plurality of attributes of entries in the data structure. The method further comprises computing a quality score for the electronic image as a combination of the scores of the selected entries from the data structure.
[0006] According to some embodiments, a non-transitory computer readable medium is provided, comprising computer readable instructions that, when executed by a processor, cause the processor to perform a method. The method comprises the acts of computing values for a plurality of attributes of the electronic image and selecting a plurality of entries from a computer data structure. Each entry of the data structure comprises values of the plurality of attributes and a score. The selecting comprises selecting based on a vector difference of the values of the plurality of attributes of the electronic image relative to the values of the plurality of attributes of entries in the data structure. The method further comprises computing a quality score for the electronic image as a combination of the scores of the selected entries from the data structure.
BRIEF DESCRIPTION OF DRAWINGS
[0007] The accompanying drawings are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in various figures is represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing. In the drawings:
[0008] FIG. 1A is an image;
[0009] FIG. 1B is a data plot illustrating the Laplace distribution of the image of FIG. 1A;
[0010] FIG. 2A is an image;
[0011] FIGs. 2B, 2C are data plots illustrating two sub-image Laplace distributions of the image of FIG. 2A;
[0012] FIG. 3A is an image;
[0013] FIG. 3B shows data plots illustrating the relationship between pixel distance and the variance of the Laplace distribution for the image of FIG. 3A;
[0014] FIG. 4A shows an illustrative image with a 3x3 window with variable names denoting the image intensity at the corresponding positions;
[0015] FIG. 4B shows a data plot illustrating a Laplace distribution with a threshold and the probability of being over the threshold;
[0016] FIGs. 5A-C show data plots illustrating the relationship between blurriness and three features;
[0017] FIG. 6 shows a schematic diagram illustrating the relationship between frame sampling structure for vertical direction;
[0018] FIG. 7 shows a data plot illustrating the relationship between blockiness feature value and blockiness percentage;
[0019] FIGs. 8A-C are data plots illustrating the comparison between clear and distorted images;
[0020] FIGs. 9A-D are exemplary images with Gaussian blur;
[0021] FIGs. 10A-D are exemplary images with white noise;
[0022] FIGs. 11A-D are exemplary images with block artifacts;
[0023] FIG. 12 shows a data plot illustrating BNB performance for the LIVE image database;
[0024] FIG. 13 is a schematic diagram of an illustrative computer 5000 on which any aspect of the present disclosure may be implemented.
DETAILED DESCRIPTION
[0025] The inventors have recognized and appreciated techniques to improve the automated computation of image quality scores without a reference image. Quality scores as computed herein may accurately reflect human perception of the quality of an electronic image. As these techniques can be performed without a reference image, they may be applied in settings in which arbitrary images are processed, including many settings in which images are transmitted for storage or display in a computer network, such as the Internet. The image quality scores may be used, for example, to automatically select parameters of image processing or display in a way that reduces computer resources without unacceptably degrading image quality as perceived by a human viewer of the images.
[0026] In accordance with some embodiments, a plurality of image quality attributes for a given image can be quantified to create a vector of metrics for the attributes. The vector of metrics may be used to select entries from a data structure storing multiple examples of those metrics linked to human perception scores. The selected entries may be used to compute a quality score representing a human perception of quality of the given image. The entries in the data set may be precomputed based on a set of images, such as a known library of images used in image processing research, but do not have to represent the same scene as the electronic image being processed. Accordingly, the quality score may be computed without the use of a reference image for the electronic image being processed. Moreover, as the image quality attributes may be computed without human input, the process of deriving an image quality score may be applied in settings in which automated processing is desirable.
[0027] The data structure may act as a "codebook" for processing other images, which are potentially unrelated to the images used to create the codebook. According to some aspects of the present application, to accurately reflect human perceptions of image quality, the codebook may be constructed using a library of images with known human perception scores. The codebook may be a data structure with a plurality of entries, each entry comprising a vector of metrics for a plurality of image quality attributes as well as a human perception score assigned to a sample image. For example, the codebook may be constructed using a library of popular images as sample images. Each entry in the codebook comprises a vector of metrics for a sample image in the library and a human perception score obtained by rating the sample image with real humans. However, it should be appreciated that each entry in the codebook need not correspond to any single sample image. In some embodiments, for example, each entry in the codebook may reflect an average of multiple sample images that were distorted to produce the same metrics of image quality. In other embodiments, for example, the codebook may be created heuristically based on observations of human sensitivity to variations in the metrics.
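As a minimal sketch of one possible realization of such a codebook (the entry layout and helper names below are our own, not taken from the disclosure):

```python
from dataclasses import dataclass
from typing import List, Sequence

@dataclass
class CodebookEntry:
    """One codebook entry: a metric vector plus a human perception score."""
    metrics: List[float]  # e.g. [blurriness, noisiness, blockiness]
    score: float          # human perception score for the sample image

def build_codebook(metric_vectors: Sequence[Sequence[float]],
                   human_scores: Sequence[float]) -> List[CodebookEntry]:
    """Pair each sample image's metric vector with its human perception score."""
    return [CodebookEntry(list(m), s) for m, s in zip(metric_vectors, human_scores)]
```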
[0028] In some embodiments, the codebook may be constructed to reflect conditions under which the image is to be presented to a human viewer. For example, the human perception score for each sample image in the library may correspond substantially to the sample image displayed on a particular class of electronic devices. For example, the codebook may be constructed by rating a library of images on smartphone screens, high resolution flat-panel TVs, or computer monitors to reflect human perception scores when viewed on the respective class of devices, and different codebooks may be used in different automated systems, depending on the intended display format or, in some embodiments, a codebook may be selected automatically based on hardware or other configuration information obtained from a device to display an image.
[0029] According to some aspects of the present application, to assess the quality of an electronic image, a vector of metrics for a plurality of attributes is computed for the electronic image and compared with the vectors for entries in the codebook. Entries from the codebook may be selected for further processing based on similarity between the vector for the electronic image and the vectors for entries in the codebook. In some embodiments, a vector distance is computed between the vector of the electronic image and the vectors of entries in the codebook. A number of "nearest neighbor" entries from the codebook with vector distances smaller than a certain threshold may be selected, as they represent the group of sample images most similar to the electronic image. The human perception scores of the selected codebook entries may then be used to synthesize a human perception score for the electronic image that best represents how a human user would perceive its quality.
[0030] In some embodiments, the human perception score for the electronic image may be computed by a weighted average of human perception scores of the selected codebook entries. In one example, a higher weight is assigned to a codebook entry with a smaller vector distance to the electronic image, such that more similar sample images are given more weight in the comparison.
[0031] The vector distance may be computed in any suitable fashion, such as a Euclidean distance. In one example, the vector distance is a weighted Euclidean distance. A sketch of this nearest-neighbor lookup is given below.
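A minimal Python sketch of the lookup described above. It assumes inverse-distance weighting of the selected scores; the disclosure describes weighting closer entries more heavily but does not commit to this exact form:

```python
import math
from typing import Sequence, Tuple

def quality_score(image_metrics: Sequence[float],
                  codebook: Sequence[Tuple[Sequence[float], float]],
                  weights: Sequence[float],
                  k: int = 5, eps: float = 1e-9) -> float:
    """Predict a perception score for an image from its metric vector.

    codebook: (metric_vector, human_score) pairs; weights: per-attribute
    distance weights (e.g. p, q, r for the three BNB metrics)."""
    def dist(a: Sequence[float], b: Sequence[float]) -> float:
        return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

    # Select the k entries nearest to the image in metric space.
    nearest = sorted(codebook, key=lambda e: dist(image_metrics, e[0]))[:k]

    # Inverse-distance weighting: closer entries contribute more to the score.
    inv = [1.0 / (dist(image_metrics, m) + eps) for m, _ in nearest]
    return sum(w * s for w, (_, s) in zip(inv, nearest)) / sum(inv)
```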
[0032] The inventors have recognized and appreciated that any image quality attribute may be used in assessing image quality according to aspects of the present application. In some embodiments, quantitative measurements for specific artifacts that affect image quality may be used. In one non-limiting example, the values for each of blurriness, noisiness and blockiness (hereinafter also referred to as "BNB") measurements may be used to form the vector of metrics to assess the quality of a given electronic image using a codebook of vectors of BNB metrics and human perception scores of known sample images, as discussed in the sections above. The BNB metrics quantify the blurriness, noisiness and blockiness of a given image, which are considered three critical factors affecting users' quality of experience (QoE).
[0033] In some embodiments, to construct a metric for each BNB artifact, features for each type of artifact are first extracted from the changing Laplace distribution, and then the quantitative relationship between the feature value and the variation of the artifact is identified. This method is rooted in the observation that, for any image, the difference between any two adjacent pixel values follows a generalized Laplace distribution with zero mean. This Laplace distribution changes differently when the image experiences various types of artifacts such as BNB.
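This underlying Laplace observation can be illustrated with a short numpy sketch (our own, not from the disclosure). For a zero-mean Laplace distribution, the maximum-likelihood scale is simply the mean absolute deviation, and the variance is twice the squared scale:

```python
import numpy as np

def adjacent_difference_laplace(image: np.ndarray) -> tuple:
    """Fit a zero-mean Laplace distribution to horizontal adjacent-pixel differences.

    Returns (scale b, variance); for a Laplace distribution the variance is
    2 * b**2, and the ML estimate of b is the mean absolute deviation."""
    gray = image.astype(np.float64)
    diffs = (gray[:, 1:] - gray[:, :-1]).ravel()  # horizontal neighbor differences
    b = float(np.mean(np.abs(diffs)))             # ML scale estimate under zero mean
    return b, 2.0 * b * b
```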
[0034] As a specific example, starting with a data structure storing human perception scores of a popular image database with corresponding metrics for images in that database, a k-Nearest Neighbors algorithm (k-NN) is used to map a vector of three BNB metrics of an electronic image to a human perception score. The computation of BNB metrics and the k-NN approach require less computation than more complex no-reference image quality assessment methods and may lead to cost savings by reducing the processor computing resources required.
[0035] The human perception score may be used to indicate the quality of a given image displayed on an electronic device as perceived by an end user of the electronic device. In some applications, the given image may be part of media transmitted via the internet for consumption by the end user, such as a still image of a video. The no-reference human perception score assessed from the image may be used in real time by the media supplier to gauge the quality of media delivered to the end user. For example, a media supplier may adjust encoding parameters and/or increase the transmission data rate if image quality is poor. In another example, a media supplier may determine that the image quality as perceived by the end user on a portable electronic device with a low-resolution display is sufficiently high and proceed to reduce the media transmission data rate, e.g., by down-sampling to a lower resolution, to save related bandwidth and storage costs without affecting the user's experience. A schematic example of such a control decision follows.
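By way of illustration only, a sketch of such a decision; the threshold values and scaling factors are hypothetical, as the disclosure does not prescribe them:

```python
def adapt_bitrate(perception_score: float, bitrate_kbps: int,
                  low: float = 40.0, high: float = 80.0) -> int:
    """Adjust transmission bitrate from a no-reference quality score.

    low/high are hypothetical thresholds on a 0-100 perception scale."""
    if perception_score < low:    # quality poor: spend more bits
        return int(bitrate_kbps * 1.25)
    if perception_score > high:   # quality ample: save bandwidth and storage
        return int(bitrate_kbps * 0.8)
    return bitrate_kbps           # quality acceptable: hold steady
```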
[0036] Accordingly, quality scores as described herein may be computed by a sending computing device preparing information for transmission to a receiving device. The sending device may use a codebook selected based on information about the display conditions sent by the receiving device to compute the quality scores. Alternatively or additionally, an image quality score may be computed by the receiving device, and transmitted to the sending device for use by the sending device in selecting parameters of the images to be sent. In some embodiments, the receiving device may select the image parameters from the quality score and send these parameters instead of or in addition to the quality score or may use the quality scores to otherwise control the display of images. However, it should be appreciated that the techniques introduced above and discussed in greater detail below may be implemented in any of numerous ways, as the techniques are not limited to any particular manner of implementation. Examples of details of implementation are provided herein solely for illustrative purposes. Furthermore, the techniques disclosed herein may be used individually or in any suitable combination, as aspects of the present disclosure are not limited to the use of any particular technique or combination of techniques.
[0037] 1 Introduction
[0038] The two kinds of image quality assessment methods are subjective and objective. For subjective methods, image quality is evaluated by human observers. While this approach can offer accurate scoring, the scores are noisy, expensive, and time-consuming to acquire. These drawbacks make subjective assessment impractical for real-time applications. For objective methods, the goal is to provide computational models which automatically predict perceptual image quality. Typically, there are two main approaches to conducting objective assessments: content inspection and network-based measurement [22]. The former operates on the decoded content and can be used to design metrics ranging from simple pixel-to-pixel comparisons to sophisticated Human Visual System (HVS) frame-level artifact analysis. The latter aims to predict the multimedia quality level based on information gathered from network conditions and packets, without accessing the decoded video.
[0039] Decoded frame assessment offers three different methods to judge image quality based on the availability of the original image: full-reference (FR), reduced-reference (RR), and no-reference (NR). FR methods have access to the original image, which may provide a means to offer certain connections to human visual perception using mean squared error (MSE), peak signal to noise ratio (PSNR), or the structural similarity index (SSIM) [17]. Reduced reference entropic differencing (RRED) [20] is one example of an RR assessment scheme, which only has access to partial information from the original image [1]. Since reference image information is often not available, both FR and RR methods have a limited application range. NR methods handle the instances where information regarding the original image is unavailable. With the rise of scenarios that do not offer a mechanism for access to information from the original video/image, no-reference image quality assessment is both an essential and urgently needed technique.
[0040] No-reference techniques have been researched in the literature. These methods can be classified into two categories: 1) Artifact-Specific (AS) methods that measure the effect of specific artifacts such as blockiness [21], blurriness [10], noisiness [15], or ringing [25] on image quality, and 2) Non-Artifact-Specific (NAS) methods that do not measure the effect of specific artifacts on image quality. NAS methods are based on the idea of Natural Scene Statistics (NSS), which assumes that natural (undistorted) images occupy a small subspace of the space of all possible images. Using NSS, the quality of a test image can be represented by modeling its distance to the subspace of natural images.
[0041] AS methods are based on the assumption that distortions of an image are caused by specific artifacts; hence it is straightforward to take a divide-and-conquer approach, i.e., modeling the effect of each individual artifact on image quality and combining the effects of individual artifacts into a single image quality score (the key idea behind all AS methods). The advantage of AS methods therefore lies in directly identifying and quantifying the physical causes of image distortion. However, existing AS methods are not able to characterize the complicated interactions among multiple artifacts. These algorithms may perform well if a test image experiences only one type of artifact, but in reality an image may experience a mixture of multiple artifacts. In the Human Visual System, image quality perception is affected by nonlinear interactions among multiple artifacts. Since existing AS methods use a linear weighted sum to combine multiple artifact metrics into one single quality score [3], their performance in characterizing these nonlinear relationships is not satisfactory.
[0042] In contrast, NAS methods are inherently independent of specific types of artifacts, since they derive features from various transformed domains, such as Wavelet [26], DCT [16], Spatial [11], Curvelet [7], and Gradient [24], which are all non-artifact-specific. In most cases the features are entropies or statistical parameters of the transformed coefficients. After feature extraction, NAS methods utilize more complex projection techniques when transforming from feature vectors to quality scores, such as Support Vector Regression (SVR) and Neural Network Regression (NNR) [9], as opposed to linear weighted-sum methods. Recently, NAS methods have excelled in image quality assessment (IQA) due to their superior performance on several popular image databases. However, it remains unknown whether these NAS techniques would work well in other instances, since there is no evidence that the extracted features describe the image space completely. Without a complete description of the total image space, NSS-based methods still perform with uncertainty.
[0043] NAS methods have recently moved to the forefront of image quality assessment. In [13], the BIQI method was proposed, which is a two-step no-reference image quality assessment framework. Given a distorted image, the first step performs the wavelet transform and extracts features for estimating the presence of a set of distortions, including those introduced by JPEG, JPEG2000, white noise, Gaussian blur, and fast fading. The probability of each distortion in the image is then estimated. This first step is considered a classification step. The second step evaluates the quality of the image across each of these distortions by applying support vector regression on the wavelet coefficients. Although BIQI considers image distortion, the features it uses are derived from NSS. SSEQ [8], CurveletQA [7], and DIIVINE [14] utilize the same type of two-step framework as BIQI; however, their features come from the spectral entropy and local spatial domain, the curvelet domain, and the wavelet domain, respectively. A method proposed in [11], BRISQUE, derives features from the empirical distribution of locally normalized luminance values and their products under a spatial natural scene statistic model. These features are then used in support vector regression to map image features to an image quality score. Unlike the two-step framework of BIQI, BRISQUE belongs to a one-step framework which does not require distortion classification. The methods in [16] and [24] are similar to BRISQUE in that regard. The major difference among these three one-step methods is the feature space. In [16], the authors extract features from the Discrete Cosine Transform (DCT) domain, whereas in [24] the authors utilize the joint statistics of two types of commonly used local contrast features: 1) the gradient magnitude (GM) map and 2) the Laplacian of Gaussian (LOG) response. In [6], image patches are taken as input and a Convolutional Neural Network (CNN) model is designed to predict the image quality score. This technique works directly in the spatial domain of the input without the need to hand-craft features, as is the case with most existing methods. A blind image quality assessment model was derived in [12] that only makes use of measurable deviations from statistical regularities observed in natural images, without training on human-rated distorted images or having any exposure to distorted images in general. In summary, the various NR methods have different advantages and disadvantages.
[0044] In the present disclosure, an NR scheme is provided which combines the best features of both the AS and NAS methods. This is accomplished by developing three artifact-specific metrics and nonlinearly combining them. First, we employ the Natural Scene Statistics image property that the difference of two adjacent pixel values in an image follows a generalized Laplace distribution with zero mean and variance σ [5]. We observe that although different images may have different values of σ, the effect of a single type of artifact (i.e., blockiness, blurriness, or noisiness) on the Laplace distribution is similar and independent of image content. We leverage this image-content-invariant property to design metrics for three types of artifacts: blockiness, noisiness, and blurriness (hereinafter referred to as BNB metrics), which are considered the three most important types of artifacts induced by image compression and transmission. Second, when combining these three metrics, we abandon the usual inductive curve-fitting approaches, since we do not possess the information required to determine the exact relationship between these three artifacts and the human visual system (HVS) in general. Instead, we apply the transductive k-Nearest Neighbor algorithm to map the three BNB metrics of an image to a human perception score. We apply our scheme to the LIVE image quality assessment database [18, 19]. Our experimental results reveal a high correlation between the quality score obtained by our scheme and the provided subjective quality score.
[0045] Section 2 of this disclosure explores several key properties of the Laplace distribution for natural scene images, which are the basis for the design of our method. In that section, we use experimental results to show the relationships between the variance of the Laplace distribution and pixel distance, low-pass filtering, and high-pass filtering, which explain why we use this natural scene property to assess image quality. In Section 3, we describe the three BNB metrics. In Section 4, we verify our metrics using two aspects of experimentation and provide detailed results. The algorithm developed and used to perform supervised learning to predict the perceptual value is discussed in Section 5. We compare our results on the LIVE database with existing models in Section 6. Finally, Section 7 concludes this disclosure with final remarks.
[0046] 2 Laplace distribution characterization
[0047] For any pixel p(i, j) from a natural grayscale image I of size m × n, where i < m and j < n, we construct two values sh = p(i, j) − p(i, j + 1) and sv = p(i, j) − p(i + 1, j). A difference set, D, is constructed as the set of all such values s.
We gather salient features from this difference set towards developing useful metrics. We observe that, across different combinations of i and j, two such values s are independent and follow the same Laplace distribution with zero mean and variance σ, so long as they are not taken at the same pixel location. Separate images can present different statistical properties (σ in particular). Extensive experimentation supports these observations, with FIG. 1B showing one such verification result. There are many properties related to Laplace distribution characterization which are important and useful in assessing image quality. We explore four properties in particular, which motivate the design of our metrics, and include representative experimental results.
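As a purely illustrative aid, the construction of the difference set D and a quick check of its Laplace character may be sketched in NumPy as follows; the array img is assumed to be a two-dimensional grayscale image loaded elsewhere, and the ratio test relies on the standard fact that a Laplace(0, b) variable has variance 2b² and mean absolute value b.

```python
import numpy as np

def difference_set(img):
    """All horizontal and vertical adjacent-pixel differences (the set D)."""
    img = img.astype(np.float64)
    s_h = (img[:, :-1] - img[:, 1:]).ravel()  # p(i, j) - p(i, j + 1)
    s_v = (img[:-1, :] - img[1:, :]).ravel()  # p(i, j) - p(i + 1, j)
    return np.concatenate([s_h, s_v])

# For a Laplace(0, b) variable, var = 2*b^2 and E|x| = b, so the ratio
# var / (2 * E|x|^2) should be close to 1 for natural images.
# d = difference_set(img)
# print(d.mean(), d.var(), d.var() / (2.0 * np.mean(np.abs(d)) ** 2))
```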
[0048] Property 1: For the difference set, D, of any natural scene image, any two equally down-sampled sub-sets, D1 and D2, have the same statistical properties.
[0049] In order to support this property, the following experiment was designed:
[0050] I0(i) = I(3i), i = 0, 1, ..., ⌊m/3⌋ − 1 (1)
[0051] I1(i) = I(3i + 1), i = 0, 1, ..., ⌊m/3⌋ − 1 (2)
[0052] I2(i) = I(3i + 2), i = 0, 1, ..., ⌊m/3⌋ − 1 (3)
[0053] For a given image, I, we down-sample horizontally into 3 sub-images, I0, I1, and I2, obtained by taking every third row of I using (1), (2), and (3). With these sub-images, we create two difference sets D1 = I1 − I0 and D2 = I2 − I1. The statistical properties of these two sets are approximately the same. FIG. 2 displays an experimental result, showing the original image in FIG. 2A and the Laplace distributions of sets D1 and D2, having variances 130.56 and 131.05 respectively, in FIG. 2B.
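A minimal sketch of this experiment, under the row-interleaving reading of (1) through (3) above, is as follows; img is again assumed to be a grayscale array loaded elsewhere.

```python
import numpy as np

def property1_check(img):
    """Split the rows of img into three interleaved sub-images (eqs. (1)-(3))
    and compare the two cross-sub-image difference sets D1 and D2."""
    img = img.astype(np.float64)
    n_rows = (img.shape[0] // 3) * 3        # trim so the sub-images align
    i0, i1, i2 = img[0:n_rows:3], img[1:n_rows:3], img[2:n_rows:3]
    d1 = (i1 - i0).ravel()                  # D1 = I1 - I0
    d2 = (i2 - i1).ravel()                  # D2 = I2 - I1
    return d1.var(), d2.var()               # should be approximately equal
```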
[0054] Property 2: For any two pixels of a natural scene image, the difference in pixel values follows a Laplace distribution that is related to the spatial distance between the pixels; an increased distance corresponds to a larger variance in the Laplace distribution.
[0055] The distance between any two separate pixels p(i, j) and p(k, l) is defined as the Euclidean distance of the two pixel positions as shown in (4).
[0056] d(p(i, j), p(k, l)) = √((k − i)² + (l − j)²) (4)
[0057] In the last experiment, results were shown for pixel differences with d = 1, i.e., adjacent pixels. Our next experiment further exhibits this property by showing the effect on the variance as the distance increases. FIG. 3B visualizes the Laplace distributions for several such increases in the spatial distance d between pixels.
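For illustration, the distance experiment can be sketched with horizontal differences only; the returned variances are expected to grow with d, per Property 2.

```python
import numpy as np

def variance_vs_distance(img, distances=(1, 2, 4, 8)):
    """Variance of horizontal pixel differences as the spatial distance grows."""
    img = img.astype(np.float64)
    # For each d, difference every pixel with the pixel d columns to its right.
    return {d: (img[:, :-d] - img[:, d:]).var() for d in distances}
```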
[0058] Property 3: After convolving a natural scene image with a low-pass filter, f1, the difference of values of the same pixel in the original image and the processed image will also follow a Laplace distribution.
[0059] f1 = (1/9) [1 1 1; 1 1 1; 1 1 1] (5)
[0060] In (5), we introduce f1, a simple 3 × 3 averaging low-pass filter which can both lessen Gaussian noise and blur an image. Processing an image with this filter causes the variance of the difference of two adjacent pixel values to decrease. As shown in FIG. 4B, if we set a threshold λ in the distribution and define P as the probability that the difference of adjacent pixel values is larger than the threshold, P becomes smaller when the image is convolved with f1. For a given pixel x0 of an image I with pixel difference variance σ, processing I with f1 yields x0'. Since x0 − x0' equals one ninth of the sum of the eight neighbor differences, each with variance close to σ, the difference between x0 and x0' follows a new Laplace distribution with zero mean and variance of approximately 8σ/81.
[0061] Property 4: By processing a natural scene image with a high-pass filter f2, any pixel value from the processed image will also follow a Laplace distribution.
[0062] f2 = (1/4) [0 −1 0; −1 4 −1; 0 −1 0] (6)
[0063] In (6), f2 is another very important tool, which can be used to find the high-frequency content of images. For the structure shown in FIG. 4A, the pixel x0 is labeled x0' after filtering the image I with f2. Since x0' equals one quarter of the sum of the four adjacent-pixel differences at x0, each with variance close to σ, the value of x0' will follow another Laplace distribution with zero mean and variance of approximately σ/4.
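A short sketch of Properties 3 and 4 together is given below. The 1/4 normalization of f2 follows the reconstruction of (6) above; it cancels in the ratio-based metrics of Section 3, so the qualitative behavior is the same either way. As before, img is assumed to be a grayscale array.

```python
import numpy as np
from scipy.ndimage import convolve

F1 = np.full((3, 3), 1.0 / 9.0)                       # low-pass 3x3 average, eq. (5)
F2 = np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]],  # high-pass Laplacian, eq. (6)
              dtype=np.float64) / 4.0                 # 1/4 normalization assumed

def properties_3_and_4(img):
    """Return Var(img - lowpass(img)) and Var(highpass(img))."""
    img = img.astype(np.float64)
    diff_var = (img - convolve(img, F1)).var()  # Property 3: small Laplace variance
    hp_var = convolve(img, F2).var()            # Property 4: the V reused by the metrics
    return diff_var, hp_var
```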
[0064] 3 NR Artifact Metrics
[0065] When testing image quality, we must consider the effects of image content. Many existing methods do not consider these effects when building metrics, which leads to instability and low accuracy. Although some methods try to mitigate this concern, they do so by simply reducing the content using 1-D filters. The use of 1-D filters causes two issues: (i) they do not remove image content well, (ii) they cause a loss of potentially valuable image information for quality assessment. To avoid these issues, we aim to be mindful of the content information by extracting features that are independent of the image content and related to image artifacts.
[0066] 3.1 Blurriness Metric
[0067] Many existing blurriness metrics are based on the idea that blurring reduces the sharpness of image edges. These metrics usually first find all edge points and their related edge directions, then calculate their average width [4]. Such metrics can be problematic. First, defining an edge is challenging: different definitions can change the calculated edge extent, and thereby the average width value, and certain definitions may not even yield edges in an image. Additionally, blurriness affects more than the high-frequency components of pixels in an image; for example, a clear image with fewer high-frequency components may have a larger average edge width than a blurred image containing more high-frequency components.
[0068] With our method, we propose a framework utilizing all pixels rather than only the edge points within a frame. As previously mentioned in Property 4, for any clear or blurred image filtered using f2, the pixel values of the filtered image can be approximately modeled by a Laplace distribution. We now use V to denote the variance of this Laplace distribution. Convolving an image with f1 blurs the image, decreasing the value of V. The more times an image is processed using f1, the blurrier the image becomes and the smaller the value of V. In FIG. 5, different curves are shown for different images with various image content. The horizontal axis represents the blurriness (the number of times the filter f1 was applied to the image), and the vertical axis signifies the variance V.
[0069] As can be observed in FIG. 5A, although different image content produces various specific relationships between V and blurriness, we notice a regularity: increasing blurriness is accompanied by a decreasing V. Having identified this regularity, we could use V as a feature to model the blurriness of images containing the same content, but it does not necessarily differentiate blurriness well for images with different content. To provide a more robust feature for handling different image content, we adopt an alternative feature: V − V1. Given any image I, we blur it using f1 to obtain the blurred image I1. We form V1 by subsequently filtering I1 with f2 and calculating the variance of the resulting pixel values.
[0070] As shown in FIG. 5B, V − V1 is a better blurriness feature than V, since it reduces the scale along the feature axis; however, it is still not robust to image content. Normalizing V − V1 by V, we obtain our desired blurriness feature, γ1, as defined in (7).
[0071] γ1 = (V − V1) / V (7)
[0072] A visualization of this feature is shown in FIG. 5C. The curves representing different image content exhibit a more regular relationship between blurriness and our blurriness feature. This tighter coupling signifies that the feature can help alleviate the image content issue discussed earlier.
[0073] In Section 4, we use the LIVE image database to show that our feature provides a better characterization of blurriness. Independent of the content of an image, we observe a decrease in our blurriness feature, γ1, as the blurriness of an image increases.
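Put together, the blurriness feature of (7) reduces to a few lines; the sketch below is illustrative and again assumes the 1/4-normalized f2 reconstructed in (6), which cancels in the ratio.

```python
import numpy as np
from scipy.ndimage import convolve

F1 = np.full((3, 3), 1.0 / 9.0)
F2 = np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], dtype=np.float64) / 4.0

def blurriness_feature(img):
    """gamma1 = (V - V1) / V per (7); decreases as the image gets blurrier."""
    img = img.astype(np.float64)
    v = convolve(img, F2).var()                  # V: variance of high-pass response
    v1 = convolve(convolve(img, F1), F2).var()   # V1: same, after one blurring pass
    return (v - v1) / v
```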
[0074] 3.2 Noisiness Metric
[0075] We use and improve upon the following intuitive idea to design our noisiness metric. Suppose an image-denoising method is used to remove the noise from two separately distorted versions, with differing noise levels, of the same original frame; the version with the larger difference between itself and its denoised result is the noisier of the two.
[0076] To utilize and analyze this idea, we again assume that the difference between any two adjacent pixels follows a Laplace distribution and that pixel values contain additive, independent Gaussian noise. This assumption follows from the natural scene properties discussed in Section 2, and for generality and ease of computation we retain the average filter f1, which can remove part of the Gaussian noise.
[0077] As with the blurriness metric, we obtain the variance of the coefficients, V, by processing a noisy image I with the filter f2. Further processing I with the filter f1 yields I1, a denoised version of I. Again, V1 is calculated as the variance of the pixel values after processing I1 with f2. Since I1 has less noise than image I, V will be larger than V1. We employ V − V1 as a noise feature, which maintains a good response to noise for images with the same content. Experimentation verifies that the larger the value of V − V1, the noisier the image. For images containing different content, the value of V − V1 will not always be the same, even when the images have the same noisiness or level of human perception. To address this problem, we apply normalization to obtain (8).
[0078] γ2 = (V − V1) / V1 (8)
[0079] After normalization, the value of γ2 increases as the level of noise increases, or alternatively, as the level of human perception decreases, for any type of image content. We verify this property of γ2 in Section 4 using noisy images from the LIVE image database. Although the blurriness and noisiness metrics are defined in the same fashion, they have different influences on image quality assessment. The distinction is further elaborated using experimental results in Section 4.
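A companion sketch for the noisiness feature follows; under the reconstruction of (8) above, the only change from the blurriness sketch is the normalization denominator.

```python
import numpy as np
from scipy.ndimage import convolve

F1 = np.full((3, 3), 1.0 / 9.0)        # average filter, used here as a denoiser
F2 = np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], dtype=np.float64) / 4.0

def noisiness_feature(img):
    """gamma2 = (V - V1) / V1 per (8) as reconstructed; grows with noise level."""
    img = img.astype(np.float64)
    v = convolve(img, F2).var()                  # response of the noisy image
    v1 = convolve(convolve(img, F1), F2).var()   # response after denoising with F1
    return (v - v1) / v1
```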
[0080] 3.3 Blockiness Metric
[0081] Blockiness appears at block boundaries as a byproduct of encoding, decoding, or transmission. If block-like artifacts are present in a frame, the statistical relationship between two adjacent pixels in the same block will differ from that of two adjacent pixels from different blocks, in contrast to what Property 1 predicts for an undistorted image. To make use of this statistical property, the image is partitioned into bs × bs blocks and sampled in the horizontal and vertical directions as shown in (9) and (10). Methods for constructing these two types of down-sampling are shown below.
[0082] 1. Horizontally down-sample the frame to get sub-sampled images Ih,k, where row k is taken from each bs × bs block:
[0083] Ih,k(i, j) = I(bs·i + k, j), i = 0, 1, ..., ⌊(m − k)/bs⌋ (9)
[0084] 2. Vertically down-sample the frame to get sub-sampled images Iv,k, where column k is taken from each bs × bs block:
[0085] Iv,k(i, j) = I(i, bs·j + k), j = 0, 1, ..., ⌊(n − k)/bs⌋ (10)
[0086] The size of a block can be adjusted according to application requirements; in our work, we use bs = 8. An example of the vertical sampling structure is shown in FIG. 6. The dark symbols inside a grid correspond to pixels in the resulting sampled sub-images, with different symbols corresponding to different sub-images. Two further data sets, D1 and D2, can be obtained by taking the difference of sub-images s7 and s6 and the difference of sub-images s0 and s7, respectively. If blockiness is not present in the image, the pixel values of the data sets D1 and D2 should follow a similar Laplace distribution. If we set the same threshold in the two Laplace distributions, βi(v) represents the number of pixels in Di that are larger than the threshold. It is apparent that the values of β1(v) and β2(v) should be close for a non-blocky image. However, if blockiness is present in the frame, the boundary pixel values are similar to the contiguous pixels within the same block but different from the contiguous pixels in other blocks; it is then straightforward to observe that β1(v) will decrease while β2(v) increases. This causes the ratio of β2(v) to β1(v) to become larger. We can construct the analogous ratio for the horizontal case and observe that it also increases. We combine these two directional ratios to obtain the following expression for blockiness:
[Equation (11), rendered as an image in the source, defines f_blockiness by combining the vertical ratio β2(v)/β1(v) and the horizontal ratio β2(h)/β1(h) through a tuning parameter ζ.]
[0087] In (11), ζ is introduced as a tuning parameter that allows f_blockiness to be tailored to a variety of situations; it is chosen to be 1 in our experimentation. For a frame without blockiness, the value of f_blockiness should be close to one, and introducing blockiness into a frame will increase the value of f_blockiness. In FIG. 7, the five curves represent five different images. For each image, we randomly add different percentages of blockiness, which we call γ3, while measuring the value of f_blockiness. We observe that f_blockiness increases with respect to increases in γ3. The relationship between f_blockiness and γ3 is modeled as a quadratic function, as shown in (12). The noise about the quadratic curve stems from our method of randomly inserting block artifacts: as we increase the probability of insertion, we inadvertently increase the probability of spatially coincident block artifacts. The curves appear to converge to a value of 1 as γ3 approaches 0, which verifies our assumption. Because of the different statistical properties of separate images, each image will have a different quadratic function relating its response to blockiness, which should not be ignored when performing quality assessment. For any image I we can quickly calculate f_blockiness, which we will now refer to as f. We randomly add 5% and 10% blockiness into the original frame separately to obtain two new images and calculate their values of f as f5% and f10%, respectively. The values of the parameters a and b of the quadratic function and of γ3 for I can be obtained by solving (12) through (14), where f, f5%, and f10% are known.
[0088] f_blockiness = f = a·γ3² + b·γ3 + 1 (12)
[0089] f5% = a·(γ3 + 5)² + b·(γ3 + 5) + 1 (13)
[0090] f10% = a·(γ3 + 10)² + b·(γ3 + 10) + 1 (14)
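The core of the metric, the boundary-versus-interior exceedance ratio β2(v)/β1(v), can be sketched for the vertical direction as follows. The threshold value and the use of absolute differences are illustrative assumptions; the horizontal ratio and the quadratic solve of (12) through (14) would be built on top of this in the same way.

```python
import numpy as np

def blockiness_ratio_vertical(img, bs=8, thresh=10.0):
    """Vertical-direction ratio beta2 / beta1 for bs x bs block boundaries.

    D1 holds differences between the last two columns inside each block;
    D2 holds differences across the block boundary. A blocky image pushes
    the ratio above 1.
    """
    img = img.astype(np.float64)
    n = (img.shape[1] // bs) * bs
    s6 = img[:, bs - 2:n:bs]                 # second-to-last column of each block
    s7 = img[:, bs - 1:n:bs]                 # last column of each block
    s0 = img[:, bs:n:bs]                     # first column of the next block
    m = s0.shape[1]                          # trim to the number of boundaries
    d1 = np.abs(s7[:, :m] - s6[:, :m]).ravel()   # within-block differences
    d2 = np.abs(s0 - s7[:, :m]).ravel()          # cross-boundary differences
    beta1 = np.count_nonzero(d1 > thresh)
    beta2 = np.count_nonzero(d2 > thresh)
    return beta2 / max(beta1, 1)             # guard against an empty count
```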
[0091] 4 Metric Verification
[0092] In the previous section, we proposed three distortion metrics. We now validate their feasibility through extensive experimentation.
[0093] 4.1 Classification between clear and distorted images
[0094] A good distortion metric should offer a delineation between clear and distorted images. For this verification, we collect 85 distortion-free images from several image databases. For the distorted images, we employ the Gaussian blur, white noise, and JPEG images from the LIVE image database, which correspond to the blurriness (γ1), noisiness (γ2), and blockiness (γ3) metrics, respectively. The metric values are calculated for both clear and distorted images and displayed in FIG. 8.
[0095] In FIG. 8A, the dashed line is the density of γ1 for the 85 clear images, and the solid line is the density of γ1 for the same number of Gaussian blur images. These two densities are estimated with a Gaussian kernel using the γ1 histograms of the clear and Gaussian blur images, respectively. As can be observed in FIG. 8A, the overlap of the two densities is relatively small, so our blurriness metric is a good candidate for classifying between clear and blurry images. We obtain similar results for our noisiness (γ2) and blockiness (γ3) metrics, shown in FIG. 8B and FIG. 8C respectively, using dashed lines for the densities of clear images and solid lines for the densities of noisy and blocky images.
[0096] 4.2 Relationship between Blurriness and Noisiness
[0097] Previously, we introduced our blurriness and noisiness metrics, which are defined in an equivalent way. The noisiness metric, however, provides a smaller value for a better quality image, whereas the blurriness metric offers a larger value for a better quality image. The noisiness (blurriness) metric value of a clear image lies between the metric values of a completely blurred image and a completely noisy image. For better use of these two metrics, we first estimate the range R of the noisiness (blurriness) metric value for clear images through extensive experimentation and then refine the two metrics into (15) and (16) below.
[0098] γ1 = (V − V1)/V, if (V − V1)/V ≤ max(R); γ1 = 0, otherwise (15)
[0099] γ2 = (V − V1)/V1, if (V − V1)/V1 ≥ min(R); γ2 = 0, otherwise (16)
[0100] 4.3 Correlation between metric value and human perception
[0101] For a distortion metric, it is important to have agreement between the metric value obtained from an image and the corresponding human perception score. As a simple verification, we list 4 images for each distortion and give their related metric values and human-perceived scores, as displayed in Tables 1, 2, and 3. The human-perceived scores of an image are in the range [0, 100], with a larger score signifying a poorer quality image.
[0102] Table 1: Blurriness and human perception for blurry images
[0106] Table 4: Comparison of Srocc for three metrics
[0108] In Table 4, we calculate the metric values for all blurry, noisy, and blocky images from the LIVE image database and obtain their Spearman rank-order correlation coefficients (Srocc). The model of Farias and Mitra [4] measures the same distortions (noisiness, blockiness, and blurriness), so we compare the Srocc of the metrics from our model with those of that model. The comparison reveals that our metrics are more strongly correlated with human perception.
[0109] 5 Overall Perceptual Value Estimation
[0110] If an image is distorted by one or more artifacts, the overall distortion can be measured as a combination of the distortions due to the individual artifacts. In general, there are many ways to combine features into a quality assessment metric. The weighted Minkowski metric used by [3] is shown in (17); when p = 1, it reduces to a linear combination:
Q = (α·Blockiness^p + β·Blurriness^p + γ·Noisiness^p)^(1/p) (17)
[0111] Another metric, (18), is given by [23] as:
Q = α + β·Blockiness^γ1 · Blurriness^γ2 · Noisiness^γ3 (18)
[0112] The parameters of these two models can be estimated using curve fitting and subjective data. One critical problem with these models is that the best curve-fitting function is not known a priori. Because of the difficulty of determining the interplay between the three artifacts and how they influence human perception, reasonable functions can only be found greedily (one by one), and the resulting accuracies are limited; we were not able to find a parametric method that forms a model with universally valid results. For this reason, we propose the use of non-parametric methods to predict the human perceptual score. We employ the codebook method, with the following algorithm.
[0113] 1. Codebook Construction: One element of the codebook is a vector including four values: γ1, γ2, γ3, and the perceptual value, denoted (Ci,1, Ci,2, Ci,3, Ci,4).
Given a training image and its matching human perceptual score, we map the score to one element of the codebook by using our proposed three artifact metrics to extract the related feature values. In our experiments, we employ the LIVE image database. Because we focus on significant blockiness, blurriness, and noisiness artifacts, we utilize the JPEG, Gaussian blur, and white noise portions of the database to build our codebook.
[0114] 2. Neighborhood Construction: For any test image I, the artifact metric values (Iγ1, Iγ2, Iγ3) are calculated and used to find the k nearest neighbors in the codebook. We define the distance between a test image and the images in the codebook as a weighted Euclidean distance, assigning different weights to different artifacts. This distance between the test image I and the codebook element Ci, where p, q, and r are the weights for the three artifact metric values, is shown in (19):
d(I, Ci) = √(Mi,1 + Mi,2 + Mi,3) (19), where Mi,1 = p·(Iγ1 − Ci,1)², Mi,2 = q·(Iγ2 − Ci,2)², and Mi,3 = r·(Iγ3 − Ci,3)².
[0115] 3. Perceptual Score Prediction: After finding the k nearest neighbors, we use the k perceptual values provided by the codebook to predict the test image's perceptual value. Because the distances of the k nearest neighbors differ, it is not necessarily reasonable to simply use the mean of their values as the prediction. With this consideration, we assign weights related to the distances. Supposing di is the distance between the test image I and the neighboring codebook element Ci, the weight wi of Ci is defined in (20). With these weights, we predict the test image perceptual value Pt using (21).
[0116] wi = (1/di) / Σj=1..k (1/dj) (20)
[0117] Pt = Σi=1..k wi · Ci,4 (21)
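The three steps above fit in a short function; this is a minimal sketch of (19) through (21), in which the small eps term is a practical guard added here to avoid division by zero when a test image coincides with a codebook entry.

```python
import numpy as np

def predict_score(test_feats, codebook, weights, k=5, eps=1e-12):
    """k-NN perceptual score prediction per eqs. (19)-(21).

    codebook:   (N, 4) array of rows (gamma1, gamma2, gamma3, score);
    weights:    (p, q, r) per-artifact distance weights;
    test_feats: (gamma1, gamma2, gamma3) for the test image.
    """
    feats, scores = codebook[:, :3], codebook[:, 3]
    w = np.asarray(weights, dtype=np.float64)
    dists = np.sqrt(((feats - np.asarray(test_feats)) ** 2 * w).sum(axis=1))  # (19)
    nn = np.argsort(dists)[:k]                    # indices of the k nearest
    inv = 1.0 / (dists[nn] + eps)                 # inverse-distance weights, (20)
    wi = inv / inv.sum()
    return float((wi * scores[nn]).sum())         # weighted prediction, (21)
```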
[0118] 6 Experimental Results
[0119] We use the images from the LIVE database to verify our model. Some examples of images used from this database are shown in FIGs. 9, 10, and 11. The experimental details and results are given below.
[0120] 6.1 Experiment Description
[0121] Because our focus is on the three significant artifact types, only 581 images from this dataset are available to test our model. We were unable to divide these images into separate training and testing sets, since building a codebook requires a large number of training images. Because of this limitation, we apply a one-vs-all procedure: select one held-out image as test data, construct the codebook and train the model parameters using the rest of the images, predict the perceptual value of the test image using the codebook and the trained model, and repeat this process until every image has been used for prediction once. Using this method, we ultimately obtain 581 test images, which is much more satisfactory. In our codebook model, there are four important parameters: p, q, r, and k. The first three represent the distance weights for the three artifact types, and the fourth is the number of nearest neighbors used for prediction. We utilize genetic algorithms to find the most suitable parameters in this large parameter space.
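The one-vs-all loop can be sketched as below, reusing the predict_score sketch from Section 5; the genetic-algorithm search over (p, q, r, k) is not shown and would simply call this function as its fitness measure.

```python
import numpy as np
from scipy.stats import spearmanr

def one_vs_all_srocc(codebook, weights, k):
    """Leave-one-out evaluation over the whole codebook, as described above."""
    preds = []
    for i in range(len(codebook)):
        rest = np.delete(codebook, i, axis=0)          # train on all other images
        preds.append(predict_score(codebook[i, :3], rest, weights, k))
    rho, _ = spearmanr(preds, codebook[:, 3])          # rank correlation with truth
    return rho
```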
[0122] 6.2 Experiment Results
[0123] The predicted results for the 581 images are presented in FIG. 12. This representation allows for the visualization of outliers in the results. We believe the outliers originate from three causes:
[0124] 1. Only a small number of images in the database have perceptual values in the range [0, 20].
[0125] 2. The database is not sufficiently large to construct a well-represented codebook.
[0126] 3. Some perceptual values provided by the database seem unreliable; for example, the JPEG images "img28" and "img30" are nearly indistinguishable, yet the difference between their perceptual values is 23.
[0127] We expect better prediction results from the use of a larger data set, which would help mitigate these three outlier causes.
[0128] 6.3 Comparison Study
[0129] Ensuring the same computational environment and datasets, we compare the results of our proposed model with the best existing no-reference image quality assessment methods: BLIINDS-II [16], NIQE [12], BRISQUE [11], and BIQI [13]. In Table 5, Srocc is the Spearman rank-order correlation coefficient and Pear is the Pearson correlation coefficient; for both, the larger the coefficient value, the stronger the correlation. Time is the number of seconds of computation needed to process an image, from feature extraction to quality score prediction. The comparison results in Table 5 show that our BNB method achieves better correlation with human perception than the existing methods. To predict the quality of an image, our BNB method also requires the least computation time, better allowing for real-time applications. It should be appreciated that the algorithm's computation time can be decreased further: our method uses statistical information from a rather large pixel space (about 400,000 pixels for one experimental image), and using fewer pixels could still provide statistically relevant information without a loss in performance. In other words, we could use far fewer representative blocks to process the assessment. Accordingly, when we discuss processing of an image, it should be understood that, unless otherwise indicated by context, the processing may be on the entire image or a relevant part of the image.
[0130] Table 5: Comparison on the LIVE image database
[0131] As discussed above, image quality scores as described herein - though they represent human perception of an image - may be computed without human input about image quality once the data structure acting as the codebook has been created. Accordingly, these techniques may be executed by hardware and/or software components in one or more computing devices that process images for display, including computing devices that transmit images for display over a network or that receive images from another device for display. FIG. 13 shows, schematically, an illustrative computer 5000 on which any aspect of the present disclosure may be implemented. In the embodiment shown in FIG. 13, the computer 5000 includes a processing unit 5001 having one or more processors and a non-transitory computer-readable storage medium 5002 that may include, for example, volatile and/or non-volatile memory. The memory 5002 may store one or more instructions to program the processing unit 5001 to perform any of the functions described herein. The computer 5000 may also include other types of non-transitory computer-readable medium, such as storage 5005 (e.g., one or more disk drives) in addition to the system memory 5002. The storage 5005 may also store one or more application programs and/or external components used by application programs (e.g., software libraries), which may be loaded into the memory 5002.
[0132] The computer 5000 may have one or more input devices and/or output devices, such as devices 5006 and 5007 illustrated in FIG. 13. These devices can be used, among other things, to present a user interface. Examples of output devices that can be used to provide a user interface include printers or display screens for visual presentation of output and speakers or other sound generating devices for audible presentation of output. Examples of input devices that can be used for a user interface include keyboards and pointing devices, such as mice, touch pads, and digitizing tablets. As another example, the input devices 5007 may include a microphone for capturing audio signals, and the output devices 5006 may include a display screen for visually rendering, and/or a speaker for audibly rendering, recognized text.
[0133] As shown in FIG. 13, the computer 5000 may also comprise one or more network interfaces (e.g., the network interface 5010) to enable communication via various networks (e.g., the network 5020). Examples of networks include a local area network or a wide area network, such as an enterprise network or the Internet. Such networks may be based on any suitable technology and may operate according to any suitable protocol and may include wireless networks, wired networks or fiber optic networks.
[0134] Having thus described several aspects of at least one embodiment of this invention, it is to be appreciated that various alterations, modifications, and improvements will readily occur to those skilled in the art.
[0135] For example, it was described that the quality score is computed for an image to be displayed. However, it should be appreciated that in many settings, a collection of similar images may be displayed on the same device, such as may occur, for example, when a stream of images is displayed as a video. In some embodiments, the quality score may be computed for one or more images in the collection. Those quality scores may be used to select parameters of image processing or display, which may be applied to the collection of images.
[0136] Such alterations, modifications, and improvements are intended to be part of this disclosure, and are intended to be within the spirit and scope of the invention. Further, though advantages of the present invention are indicated, it should be appreciated that not every embodiment of the invention will include every described advantage. Some embodiments may not implement any features described as advantageous herein. Accordingly, the foregoing description and drawings are by way of example only.
[0137] The above-described embodiments of the present invention can be implemented in any of numerous ways. When implemented in software, the software code can be executed on any suitable processor or collection of processors, whether provided in a single computer or distributed among multiple computers. Such processors may be implemented as integrated circuits, with one or more processors in an integrated circuit component, including commercially available integrated circuit components known in the art by names such as CPU chips, GPU chips, microprocessor, microcontroller, or co-processor. Alternatively, a processor may be implemented in custom circuitry, such as an ASIC, or semicustom circuitry resulting from configuring a programmable logic device. As yet a further alternative, a processor may be a portion of a larger circuit or semiconductor device, whether commercially available, semi-custom or custom. As a specific example, some commercially available microprocessors have multiple cores such that one or a subset of those cores may constitute a processor. Though, a processor may be implemented using circuitry in any suitable format.
[0138] Further, it should be appreciated that a computer may be embodied in any of a number of forms, such as a rack-mounted computer, a desktop computer, a laptop computer, or a tablet computer. Additionally, a computer may be embedded in a device not generally regarded as a computer but with suitable processing capabilities, including a Personal Digital Assistant (PDA), a smart phone or any other suitable portable or fixed electronic device.
[0139] Also, a computer may have one or more input and output devices. These devices can be used, among other things, to present a user interface. Examples of output devices that can be used to provide a user interface include printers or display screens for visual presentation of output and speakers or other sound generating devices for audible presentation of output. Examples of input devices that can be used for a user interface include keyboards, and pointing devices, such as mice, touch pads, and digitizing tablets. As another example, a computer may receive input information through speech recognition or in other audible format. In the embodiment illustrated, the input/output devices are illustrated as physically separate from the computing device. In some embodiments, however, the input and/or output devices may be physically integrated into the same unit as the processor or other elements of the computing device. For example, a keyboard might be implemented as a soft keyboard on a touch screen. Alternatively, the input/output devices may be entirely disconnected from the computing device, and functionally integrated through a wireless connection.
[0140] Such computers may be interconnected by one or more networks in any suitable form, including as a local area network or a wide area network, such as an enterprise network or the Internet. Such networks may be based on any suitable technology and may operate according to any suitable protocol and may include wireless networks, wired networks or fiber optic networks.
[0141] Also, the various methods or processes outlined herein may be coded as software that is executable on one or more processors that employ any one of a variety of operating systems or platforms. Additionally, such software may be written using any of a number of suitable programming languages and/or programming or scripting tools, and also may be compiled as executable machine language code or intermediate code that is executed on a framework or virtual machine.
[0142] In this respect, the invention may be embodied as a computer readable storage medium (or multiple computer readable media) (e.g., a computer memory, one or more floppy discs, compact discs (CD), optical discs, digital video disks (DVD), magnetic tapes, flash memories, circuit configurations in Field Programmable Gate Arrays or other semiconductor devices, or other tangible computer storage medium) encoded with one or more programs that, when executed on one or more computers or other processors, perform methods that implement the various embodiments of the invention discussed above. As is apparent from the foregoing examples, a computer readable storage medium may retain information for a sufficient time to provide computer-executable instructions in a non-transitory form. Such a computer readable storage medium or media can be transportable, such that the program or programs stored thereon can be loaded onto one or more different computers or other processors to implement various aspects of the present invention as discussed above. As used herein, the term "computer-readable storage medium" encompasses only a computer-readable medium that can be considered to be a manufacture (i.e., article of manufacture) or a machine. Alternatively or additionally, the invention may be embodied as a computer readable medium other than a computer-readable storage medium, such as a propagating signal.
[0143] The terms "code", "program" or "software" are used herein in a generic sense to refer to any type of computer code or set of computer-executable instructions that can be employed to program a computer or other processor to implement various aspects of the present invention as discussed above. Additionally, it should be appreciated that according to one aspect of this embodiment, one or more computer programs that when executed perform methods of the present invention need not reside on a single computer or processor, but may be distributed in a modular fashion amongst a number of different computers or processors to implement various aspects of the present invention.
[0144] Computer-executable instructions may be in many forms, such as program modules, executed by one or more computers or other devices. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Typically the functionality of the program modules may be combined or distributed as desired in various embodiments.
[0145] Also, data structures may be stored in computer-readable media in any suitable form. For simplicity of illustration, data structures may be shown to have fields that are related through location in the data structure. Such relationships may likewise be achieved by assigning storage for the fields with locations in a computer-readable medium that conveys relationship between the fields. However, any suitable mechanism may be used to establish a relationship between information in fields of a data structure, including through the use of pointers, tags or other mechanisms that establish relationship between data elements.
[0146] Various aspects of the present invention may be used alone, in combination, or in a variety of arrangements not specifically discussed in the embodiments described in the foregoing and is therefore not limited in its application to the details and arrangement of components set forth in the foregoing description or illustrated in the drawings. For example, aspects described in one embodiment may be combined in any manner with aspects described in other embodiments.
[0147] Also, the invention may be embodied as a method, of which an example has been provided. The acts performed as part of the method may be ordered in any suitable way. Accordingly, embodiments may be constructed in which acts are performed in an order different than illustrated, which may include performing some acts simultaneously, even though shown as sequential acts in illustrative embodiments.
[0148] The indefinite articles "a" and "an," as used herein in the specification and in the claims, unless clearly indicated to the contrary, should be understood to mean "at least one."
[0149] The phrase "and/or," as used herein in the specification and in the claims, should be understood to mean "either or both" of the elements so conjoined, i.e., elements that are conjunctively present in some cases and disjunctively present in other cases. Multiple elements listed with "and/or" should be construed in the same fashion, i.e., "one or more" of the elements so conjoined. Other elements may optionally be present other than the elements specifically identified by the "and/or" clause, whether related or unrelated to those elements specifically identified. Thus, as a non-limiting example, a reference to "A and/or B", when used in conjunction with open-ended language such as "comprising" can refer, in one embodiment, to A only (optionally including elements other than B); in another embodiment, to B only (optionally including elements other than A); in yet another embodiment, to both A and B (optionally including other elements); etc.
[0150] As used herein in the specification and in the claims, the phrase "at least one," in reference to a list of one or more elements, should be understood to mean at least one element selected from any one or more of the elements in the list of elements, but not necessarily including at least one of each and every element specifically listed within the list of elements and not excluding any combinations of elements in the list of elements.
This definition also allows that elements may optionally be present other than the elements specifically identified within the list of elements to which the phrase "at least one" refers, whether related or unrelated to those elements specifically identified. Thus, as a non-limiting example, "at least one of A and B" (or, equivalently, "at least one of A or B," or, equivalently "at least one of A and/or B") can refer, in one embodiment, to at least one, optionally including more than one, A, with no B present (and optionally including elements other than B); in another embodiment, to at least one, optionally including more than one, B, with no A present (and optionally including elements other than A); in yet another embodiment, to at least one, optionally including more than one, A, and at least one, optionally including more than one, B (and optionally including other elements); etc.
[0151] Use of ordinal terms such as "first," "second," "third," etc., in the claims to modify a claim element does not by itself connote any priority, precedence, or order of one claim element over another or the temporal order in which acts of a method are performed, but are used merely as labels to distinguish one claim element having a certain name from another element having a same name (but for use of the ordinal term) to distinguish the claim elements.
[0152] Also, the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. The use of "including," "comprising," or "having," "containing," "involving," and variations thereof herein, is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.
[0153] LIST OF REFERENCES
[0154] The following references are hereby incorporated by reference in their entireties for all they teach.
[0155] [1] Guangquan Cheng, Jincai Huang, Zhong Liu, and Cheng Lizhi. Image quality assessment using natural image statistics in gradient domain. AEU - International Journal of Electronics and Communications, 65(5):392 - 397, 2011.
[0156] [2] Ruigang Fang and Dapeng Wu. No-reference image quality assessment based on bnb measurement. In Proceedings of IEEE China Summit & International Conference on Signal and Information Processing (ChinaSIP), pages 528-532, 2013.
[0157] [3] M.C.Q. Farias, S.K. Mitra, and J.M. Foley. Perceptual contributions of blocky, blurry and noisy artifacts to overall annoyance. IEEE International Conference on Multimedia and Expo, 1:I-529-32, 2003.
[0158] [4] M.C.Q. Farias and S.K. Mitra. No-reference video quality metric based on artifact measurements. IEEE International Conference on Image Processing, 3:III-141-4, Sept. 2005. [0159] [5] Jinggang Huang and D. Mumford. Statistics of natural images and models. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1:541-547, 1999.
[0160] [6] Le Kang, Peng Ye, Yi Li, and D. Doermann. Convolutional neural networks for no-reference image quality assessment. In 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1733-1740, June 2014.
[0161] [7] Lixiong Liu, Hongping Dong, Hua Huang, and A.C. Bovik. No-reference image quality assessment in curvelet domain. Signal Processing: Image Communication, 29(4):494 -505, 2014.
[0162] [8] Lixiong Liu, Bao Liu, Hua Huang, and A.C. Bovik. No-reference image quality assessment based on spatial and spectral entropies. Signal Processing: Image Communication, 29(8):856 -863, 2014.
[0163] [9] Chaofeng Li, A.C. Bovik, and Xiaojun Wu. Blind image quality assessment using a general regression neural network. IEEE Transactions on Neural Networks, 22(5):793-799, May 2011.
[0164] [10] P. Marziliano, F. Dufaux, S. Winkler, and T. Ebrahimi. A no-reference perceptual blur metric. IEEE International Conference on Image Processing, 3:III-57-III-60, 2002.
[0165] [11] A. Mittal, A.K. Moorthy, and A.C. Bovik. No-reference image quality assessment in the spatial domain. IEEE Transactions on Image Processing, 21(12):4695 -4708, Dec 2012.
[0166] [12] A. Mittal, R. Soundararajan, and A.C. Bovik. Making a completely blind image quality analyzer. IEEE Signal Processing Letters, 20(3):209-212, Mar. 2013.
[0167] [13] A.K. Moorthy and A.C. Bovik. A two-step framework for constructing blind image quality indices. IEEE Signal Processing Letters, 17(5):513 -516, May 2010.
[0168] [14] A.K. Moorthy and A.C. Bovik. Blind image quality assessment: From natural scene statistics to perceptual quality. IEEE Transactions on Image Processing, 20(12):3350-3364, Dec 2011.
[0169] [15] K. Rank, M. Lendl, and R. Unbehauen. Estimation of image noise variance. IEE Proceedings on Vision, Image and Signal Processing, 146(2):80-84, Aug. 1999. [0170] [16] M.A. Saad, A.C. Bovik, and C. Charrier. Blind image quality assessment: A natural scene statistics approach in the DCT domain. IEEE Transactions on Image Processing, 21(8):3339-3352, Aug. 2012.
[0171] [17] R. Serral-Gracia, E. Cerqueira, M. Curado, M. Yannuzzi, E. Monteiro, and X. Masip-Bruin. An overview of quality of experience measurement challenges for video applications in IP networks. Wired/Wireless Internet Communications, 6074:252-263, April 2010.
[0172] [18] H.R. Sheikh, M.F. Sabir, and A.C. Bovik. A statistical evaluation of recent full reference image quality assessment algorithms. IEEE Transactions on Image Processing, 15(11):3440 -3451, nov. 2006.
[0173] [19] H.R. Sheikh, Z. Wang, L. Cormack, and A.C. Bovik. LIVE image quality assessment database release 2. http://live.ece.utexas.edu/research/quality, 2006.
[0174] [20] R. Soundararajan and A.C. Bovik. RRED indices: Reduced reference entropic differencing for image quality assessment. IEEE Transactions on Image Processing, 21(2):517-526, Feb. 2012.
[0175] [21] Zhou Wang, A.C. Bovik, and B.L. Evans. Blind measurement of blocking artifacts in images. IEEE International Conference on Image Processing, 3:981-984, 2000.
[0176] [22] Zhou Wang, A.C. Bovik, H.R. Sheikh, and E.P. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13:600 -612, 2004.
[0177] [23] Zhou Wang, H.R. Sheikh, and A.C. Bovik. No-reference perceptual quality assessment of JPEG compressed images. IEEE International Conference on Image Processing, 1:I-477-I-480, 2002.
[0178] [24] W. Xue, X. Mou, L. Zhang, A.C. Bovik, and X. Feng. Blind image quality assessment using joint statistics of gradient magnitude and laplacian features. IEEE Transactions on Image Processing, 23(11), Nov 2014.
[0179] [25] X. Feng and J.P. Allebach. Measurement of ringing artifacts in JPEG images. Proc. SPIE, 6076:74-83, January 2006.
[0180] [26] Peng Ye and D. Doermann. No-reference image quality assessment based on visual codebook. IEEE International Conference on Image Processing (ICIP), pages 3089 -3092, sept. 2011.

Claims

CLAIMS What is claimed is:
1. A computer-implemented method for assessing quality of an electronic image, the method comprising:
computing values for a plurality of attributes of the electronic image;
selecting a plurality of entries from a computer data structure, wherein:
each entry of the data structure comprises values of the plurality of attributes and a score; and
selecting comprises selecting based on a vector difference of the values of the plurality of attributes of the electronic image relative to the values of the plurality of attributes of entries in the data structure; and
computing a quality score for the electronic image as a combination of the scores of the selected entries from the data structure.
2. The method of claim 1, wherein:
computing the quality score comprises computing a weighted average of the selected entries.
3. The method of claim 2, wherein:
a weight for computing the weighted average of the selected entries is based on the vector difference between the values of the plurality of attributes of the electronic image relative to the values of the plurality of attributes of an entry in the data structure.
4. The method of claim 1, wherein:
the vector difference is a weighted Euclidean distance.
5. The method of claim 1, wherein:
the score in each entry of the data structure is indicative of human perception of image quality of one or more sample images.
6. The method of claim 5, wherein:
the electronic image is a compressed image transmitted over an internet computer network for displaying on a user device, and an amount of compression of the compressed image is automatically selected based in part on the quality score.
7. The method of claim 6, wherein:
the electronic image is a still image of a video.
8. The method of claim 6, wherein:
scores in entries of the data structure are indicative of human perception of image quality of sample images when displayed on a type of electronic device similar to the user device.
9. The method of claim 1, wherein:
the plurality of attributes comprise blurriness, noisiness and blockiness.
10. A non-transitory computer readable medium comprising computer readable instructions that when executed by a processor, cause the processor to perform a method comprising the acts of:
(A) computing values for a plurality of attributes of the electronic image;
(B) selecting a plurality of entries from a computer data structure, wherein:
each entry of the data structure comprises values of the plurality of attributes and a score; and
selecting comprises selecting based on a vector difference of the values of the plurality of attributes of the electronic image relative to the values of the plurality of attributes of entries in the data structure; and
(C) computing a quality score for the electronic image as a combination of the scores of the selected entries from the data structure.
11. The non-transitory computer readable medium of claim 10, wherein:
computing the quality score comprises computing a weighted average of the selected entries.
12. The non-transitory computer readable medium of claim 10, wherein:
a weight for computing the weighted average of the selected entries is based on the vector difference between the values of the plurality of attributes of the electronic image relative to the values of the plurality of attributes of an entry in the data structure.
13. The non-transitory computer readable medium of claim 10, wherein:
the vector difference is a weighted Euclidean distance.
14. The non-transitory computer readable medium of claim 10, wherein:
the score in each entry of the data structure is indicative of human perception of image quality of one or more sample images.
15. The non-transitory computer readable medium of claim 14, wherein:
the electronic image is a compressed image transmitted over a computer network for displaying on a user device, and an amount of compression of the compressed image is automatically selected based in part on the quality score.
16. The non-transitory computer readable medium of claim 15, wherein:
the electronic image is a still image of a video.
17. The non-transitory computer readable medium of claim 15, wherein:
scores in entries of the data structure are indicative of human perception of image quality of sample images when displayed on a type of electronic device similar to the user device.
18. The non-transitory computer readable medium of claim 10, wherein:
the plurality of attributes comprise blurriness, noisiness and blockiness.
19. The non-transitory computer readable medium of claim 18, wherein act (A) comprises computing a value for blurriness of the electronic image and wherein computing a value for blurriness comprises: (D) determining a variance V of a Laplace distribution of differences between intensity values of all adjacent pairs of pixels in the electronic image;
(E) applying a low pass filter to the electronic image;
(F) applying a high pass filter to the result of act (E);
(G) determining a variance V1 of a Laplace distribution of differences between intensity values of all adjacent pairs of pixels in the result of act (F);
(H) computing the value for blurriness based on the result of act (G).
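As a rough illustration of acts (D) through (H), the sketch below estimates each variance by fitting a zero-mean Laplace distribution to pooled adjacent-pixel differences, uses a 3x3 mean filter for the low-pass stage, forms the high-pass output as a low-pass residual, and maps V1 and V to a blurriness value via their ratio. The kernels and the final ratio are assumptions made only for illustration; the claim fixes just the order of operations.

import numpy as np
from scipy.ndimage import convolve

def laplace_variance(img):
    # Pool horizontal and vertical neighbour differences; for a zero-mean
    # Laplace distribution with scale b (estimated here by the mean absolute
    # difference), the variance is 2 * b**2.
    d = np.concatenate([np.diff(img, axis=0).ravel(),
                        np.diff(img, axis=1).ravel()])
    b = np.abs(d).mean()
    return 2.0 * b ** 2

MEAN3 = np.full((3, 3), 1.0 / 9)  # assumed 3x3 averaging kernel

def blurriness(img):
    img = img.astype(np.float64)
    v = laplace_variance(img)        # act (D): variance V of the raw image
    lp = convolve(img, MEAN3)        # act (E): low-pass filter
    hp = lp - convolve(lp, MEAN3)    # act (F): high-pass filter applied to the low-pass result
    v1 = laplace_variance(hp)        # act (G): variance V1 of the filtered image
    return v1 / (v + 1e-12)          # act (H): one plausible mapping of V1 (and V) to a blurriness value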
20. The non-transitory computer readable medium of claim 19, wherein act (A) comprises computing a value for noisiness of the electronic image and wherein computing a value for noisiness of the electronic image comprises:
(I) determining a variance V of a Laplace distribution of differences between intensity values of all adjacent pairs of pixels in the electronic image;
(J) applying a high pass filter to the electronic image;
(K) applying a low pass filter to the result of act (J);
(L) determining a variance V1 of a Laplace distribution of differences between intensity values of all adjacent pairs of pixels in the result of act (K);
(M) computing the value for noisiness based on the result of act (L).
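Acts (I) through (M) mirror the blurriness pipeline with the filter order reversed, so a sketch can reuse laplace_variance, MEAN3, and convolve from the previous example; again the kernels and the closing ratio are assumptions rather than claimed specifics.

def noisiness(img):
    img = img.astype(np.float64)
    v = laplace_variance(img)          # act (I): variance V of the raw image
    hp = img - convolve(img, MEAN3)    # act (J): high-pass filter first
    lp = convolve(hp, MEAN3)           # act (K): then low-pass the high-pass result
    v1 = laplace_variance(lp)          # act (L): variance V1 of the filtered image
    return v1 / (v + 1e-12)            # act (M): one plausible mapping of V1 (and V) to a noisiness value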
21. The non-transitory computer readable medium of claim 20, wherein act (A) comprises computing a value for blockiness of the electronic image and wherein computing a value for blockiness of the electronic image comprises:
(N) determining a number of pixels β1 with intensity above a threshold in a first sub-image of the electronic image;
(O) determining a number of pixels β2 with intensity above the threshold in a second sub-image of the electronic image;
(P) computing the value for blockiness based at least in part on a ratio between β1 and β2.
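Claim 21 fixes only the counting and the ratio; it does not say here how the two sub-images are formed. One plausible reading, sketched below, takes them from the horizontal difference image of a picture assumed to be coded on an 8x8 block grid, so that "intensity" becomes difference magnitude: pixels on the grid form the first sub-image and all remaining pixels the second. The grid period and the threshold are assumptions.

import numpy as np

def blockiness(img, threshold=10.0, block=8):
    g = np.abs(np.diff(img.astype(np.float64), axis=1))  # horizontal difference magnitudes
    on_grid = g[:, block - 1::block]           # first sub-image: columns on the assumed block boundaries
    mask = np.ones(g.shape[1], dtype=bool)
    mask[block - 1::block] = False
    off_grid = g[:, mask]                      # second sub-image: all remaining columns
    beta1 = int((on_grid > threshold).sum())   # act (N): pixels above threshold in sub-image 1
    beta2 = int((off_grid > threshold).sum())  # act (O): pixels above threshold in sub-image 2
    return beta1 / max(beta2, 1)               # act (P): blockiness from the ratio between beta1 and beta2

A strongly blocky image concentrates large differences on the coding grid, pushing the ratio above what an undistorted image would produce.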

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201662399985P 2016-09-26 2016-09-26
US62/399,985 2016-09-26

Publications (1)

Publication Number Publication Date
WO2018058090A1 (en) 2018-03-29

Family

ID=61691105

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2017/053393 WO2018058090A1 (en) 2016-09-26 2017-09-26 Method for no-reference image quality assessment

Country Status (1)

Country Link
WO (1) WO2018058090A1 (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100040152A1 (en) * 2004-07-15 2010-02-18 At&T Intellectual Property I.L.P. Human Factors Based Video Compression
US20120155765A1 (en) * 2010-12-21 2012-06-21 Microsoft Corporation Image quality assessment
US20130293725A1 (en) * 2012-05-07 2013-11-07 Futurewei Technologies, Inc. No-Reference Video/Image Quality Measurement with Compressed Domain Features
US20150288874A1 (en) * 2012-10-23 2015-10-08 Ishay Sivan Real time assessment of picture quality
US20140270464A1 (en) * 2013-03-15 2014-09-18 Mitek Systems, Inc. Systems and methods for assessing standards for mobile image quality
WO2016146038A1 (en) * 2015-03-13 2016-09-22 Shenzhen University System and method for blind image quality assessment

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020014862A1 (en) * 2018-07-17 2020-01-23 深圳大学 No-reference image quality evaluation system and method
US20200118029A1 (en) * 2018-10-14 2020-04-16 Troy DeBraal General Content Perception and Selection System.
US10730293B1 (en) 2019-02-27 2020-08-04 Ricoh Company, Ltd. Medium classification mechanism
CN113519165A (en) * 2019-03-01 2021-10-19 皇家飞利浦有限公司 Apparatus and method for generating image signal
CN110070539A (en) * 2019-04-28 2019-07-30 重庆大学 Image quality evaluating method based on comentropy
CN111179245A (en) * 2019-12-27 2020-05-19 成都中科创达软件有限公司 Image quality detection method, device, electronic equipment and storage medium
CN111179245B (en) * 2019-12-27 2023-04-21 成都中科创达软件有限公司 Image quality detection method, device, electronic equipment and storage medium
CN111489333B (en) * 2020-03-31 2022-06-03 天津大学 No-reference night natural image quality evaluation method
CN111489333A (en) * 2020-03-31 2020-08-04 天津大学 No-reference night natural image quality evaluation method
CN111583213A (en) * 2020-04-29 2020-08-25 西安交通大学 Image generation method based on deep learning and no-reference quality evaluation
CN111583213B (en) * 2020-04-29 2022-06-07 西安交通大学 Image generation method based on deep learning and no-reference quality evaluation
CN111507426A (en) * 2020-04-30 2020-08-07 中国电子科技集团公司第三十八研究所 No-reference image quality grading evaluation method and device based on visual fusion characteristics
CN111507426B (en) * 2020-04-30 2023-06-02 中国电子科技集团公司第三十八研究所 Non-reference image quality grading evaluation method and device based on visual fusion characteristics
CN111652854A (en) * 2020-05-13 2020-09-11 中山大学 No-reference image quality evaluation method based on image high-frequency information
CN111862000A (en) * 2020-06-24 2020-10-30 天津大学 Image quality evaluation method based on local average characteristic value
CN111862000B (en) * 2020-06-24 2022-03-15 天津大学 Image quality evaluation method based on local average characteristic value
CN113450319A (en) * 2021-06-15 2021-09-28 宁波大学 KLT (karhunen-Loeve transform) technology-based super-resolution reconstruction image quality evaluation method
CN113784113A (en) * 2021-08-27 2021-12-10 中国传媒大学 No-reference video quality evaluation method based on short-term and long-term time-space fusion network and long-term sequence fusion network
CN114445386A (en) * 2022-01-29 2022-05-06 泗阳三江橡塑有限公司 PVC pipe quality detection and evaluation method and system based on artificial intelligence
CN114445386B (en) * 2022-01-29 2023-02-24 泗阳三江橡塑有限公司 PVC pipe quality detection and evaluation method and system based on artificial intelligence
CN114782422B (en) * 2022-06-17 2022-10-14 电子科技大学 SVR feature fusion non-reference JPEG image quality evaluation method
CN114782422A (en) * 2022-06-17 2022-07-22 电子科技大学 SVR feature fusion non-reference JPEG image quality evaluation method

Similar Documents

Publication Publication Date Title
WO2018058090A1 (en) Method for no-reference image quality assessment
Li et al. Which has better visual quality: The clear blue sky or a blurry animal?
Zhang et al. C-DIIVINE: No-reference image quality assessment based on local magnitude and phase statistics of natural scenes
Chandler Seven challenges in image quality assessment: past, present, and future research
Shen et al. Hybrid no-reference natural image quality assessment of noisy, blurry, JPEG2000, and JPEG images
Bovik Automatic prediction of perceptual image and video quality
US8660364B2 (en) Method and system for determining a quality measure for an image using multi-level decomposition of images
Ciancio et al. No-reference blur assessment of digital pictures based on multifeature classifiers
Vu et al. ViS 3: An algorithm for video quality assessment via analysis of spatial and spatiotemporal slices
Li et al. Content-partitioned structural similarity index for image quality assessment
Lin et al. Perceptual visual quality metrics: A survey
Liang et al. No-reference perceptual image quality metric using gradient profiles for JPEG2000
Kim et al. Gradient information-based image quality metric
WO2018186991A1 (en) Assessing quality of images or videos using a two-stage quality assessment
Wang et al. Quaternion representation based visual saliency for stereoscopic image quality assessment
US11310475B2 (en) Video quality determination system and method
Fang et al. BNB method for no-reference image quality assessment
WO2011097696A1 (en) Method and system for determining a quality measure for an image using a variable number of multi-level decompositions
WO2014070489A1 (en) Recursive conditional means image denoising
Wang et al. Gradient-based no-reference image blur assessment using extreme learning machine
US11960996B2 (en) Video quality assessment method and apparatus
Jin et al. Full RGB just noticeable difference (JND) modelling
Soundararajan et al. Machine vision quality assessment for robust face detection
Banitalebi-Dehkordi et al. An image quality assessment algorithm based on saliency and sparsity
Mittal et al. No-reference approaches to image and video quality assessment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
Ref document number: 17854111
Country of ref document: EP
Kind code of ref document: A1
NENP Non-entry into the national phase
Ref country code: DE
122 Ep: pct application non-entry in european phase
Ref document number: 17854111
Country of ref document: EP
Kind code of ref document: A1