CN116721304B - Image quality perception method, system and equipment based on distorted image restoration guidance - Google Patents
- Publication number: CN116721304B (application CN202311000119.4A)
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; no legal analysis has been performed)
Classifications
- G06V10/766 — Image or video recognition using regression, e.g. by projecting features on hyperplanes
- G06N3/045 — Combinations of networks
- G06N3/0464 — Convolutional networks [CNN, ConvNet]
- G06N3/0475 — Generative networks
- G06N3/084 — Backpropagation, e.g. using gradient descent
- G06V10/40 — Extraction of image or video features
- G06V10/776 — Validation; performance evaluation
- G06V10/806 — Fusion of extracted features
- G06V10/82 — Recognition using neural networks
- Y02T10/40 — Engine management systems
Abstract
The invention discloses an image quality perception method, system and device based on distorted image restoration guidance. First, an image I to be detected is input to a feature extraction network to obtain the depth features DF(I) of the image. Then, under the guidance of prior knowledge based on an image restoration prior model, a distorted image restoration network uses the depth features DF(I) to restore the distorted image, yielding the restoration features IR(I) of the distorted image. Finally, a feature fusion network re-weights and fuses the restoration features IR(I) with the depth features DF(I), the re-weighted and fused image quality feature vector is regressed, and the quality score of the image is output. The invention can automatically perceive the quality of natural images and provides a means of selecting high-quality image samples for training deep learning models.
Description
Technical Field
The invention belongs to the technical field of image quality assessment, relates to an image quality perception method, system and device based on deep learning, and in particular relates to an image quality perception method, system and device based on distorted image restoration guidance.
Background
In computer vision tasks, high-quality images help train computer vision models and achieve better results. However, image quality is susceptible to factors such as brightness, sharpness, contrast, and noise, so Image Quality Assessment (IQA) is critical. IQA can be divided into subjective IQA and objective IQA. Subjective IQA uses human perception to evaluate image quality, but it requires many observers, involves a slow and expensive experimental process, and is susceptible to the evaluators' subjective opinions. As a general alternative, objective IQA learns image features to mimic the viewing behavior of the Human Visual System (HVS) and designs algorithms that automatically predict image quality. Based on the availability of the original reference image, objective IQA models can be divided into full-reference (FR-IQA), reduced-reference (RR-IQA) and no-reference (NR-IQA) models. NR-IQA is the most widely applied, because reference images are generally unavailable in practical IQA tasks (e.g., image super-resolution, image deraining, face recognition). However, it is also more challenging to implement: lacking the assistance of reference information, it is less accurate than FR-IQA.
NR-IQA methods can be broadly divided into two categories according to the type of features extracted: methods based on hand-crafted natural scene statistics (NSS) and methods based on feature learning. NSS-based methods assume that original natural images obey inherent statistical regularities, which are disturbed when the image is distorted. Existing NSS-based methods mainly model NSS with quantities such as locally normalized luminance coefficients and wavelet coefficients, from which quality-aware features are extracted. However, NSS-based NR-IQA relies heavily on domain knowledge of NSS modeling and is less effective for images with unknown distortion types or compound distortions that mix several distortion types. With the development of deep learning, NR-IQA increasingly extracts quality features directly from distorted images with neural networks and performs end-to-end optimization to overcome the limitations of NSS-based methods. Some current feature-learning NR-IQA methods, inspired by the free-energy principle, use a pre-trained restoration network to restore images and use the restored images for quality prediction. But because image restoration and quality prediction are two independent tasks, the accuracy of quality prediction depends heavily on the effectiveness of distorted image restoration. When faced with severely distorted images that are difficult to restore, the restoration stage may artificially introduce additional distortion (due to erroneous restoration), causing a sharp decrease in the accuracy of the subsequent quality prediction.
Therefore, if an image restoration model can automatically generate accurate restoration features for a distorted image, it provides effective auxiliary reference information for the original distorted image, thereby indirectly promoting the efficiency of image quality assessment.
Disclosure of Invention
In order to solve the technical problems, the invention provides an image quality perception method, an image quality perception system and image quality perception equipment based on distorted image restoration guidance.
The technical scheme adopted by the method is as follows: an image quality perception method based on distorted image restoration guidance comprises the following steps:
step 1: extracting depth features DF (I) of an image I to be detected;
step 2: performing distorted image restoration by using the depth feature DF (I) to obtain restoration features IR (I) of the distorted image;
step 3: re-weighting and fusing the restoration feature IR(I) and the depth feature DF(I), and regressing the re-weighted and fused image quality feature vector to output the quality score of the image.
Preferably, in step 1, a quality feature extraction network is adopted to extract the depth features DF(I) of the image I to be detected;
the quality feature extraction network consists of a first convolution layer, a second convolution layer, eight improved Swin Transformer blocks and a residual connection layer; the image I to be detected sequentially passes through the first convolution layer, the eight improved Swin Transformer blocks and the second convolution layer, and the result passes through the residual connection layer together with the input of the first convolution layer before being output;
the improved Swin Transformer block consists of eight Swin Transformer layers, a third convolution layer and a residual connection layer; the output obtained after the input sequentially passes through the eight Swin Transformer layers and the third convolution layer passes through the residual connection layer together with the input before being output.
Preferably, in step 2, distorted image restoration is performed with a distorted image restoration network under the guidance of prior knowledge based on an image restoration prior model;
the image restoration prior model and the distorted image restoration network each consist of four convolution layers connected in sequence and a ReLU activation function layer, the ReLU activation function layer being arranged between the first convolution layer and the second convolution layer.
Preferably, the distorted image restoration network is a trained network; during training, the average Euclidean distance is adopted as the loss function to minimize the error between the restored image and the pseudo ground truth generated by the image restoration prior model. The loss function is defined as:

$$L(\theta_0) = \frac{1}{N}\sum_{i=1}^{N}\left\| f(I_i;\theta_0) - y_i \right\|_2 ,$$

where, for an input distorted image I, feeding I into the distorted image restoration network yields the corresponding restored image $f(I;\theta_0)$; $\theta_0$ denotes the initial parameters of the image restoration network; and y denotes the pseudo ground-truth restored image generated for image I by the teacher model (the image restoration prior model);
the distorted image restoration network is optimized by the gradient descent method, and its parameters are updated with the Adam optimizer.
Preferably, in step 3, a feature fusion network is adopted to re-weight and fuse the restoration feature IR(I) and the depth feature DF(I);
the feature fusion network consists of a weight generation network, a concatenation operation layer, a pixel-by-pixel multiplication operation layer, a pixel-by-pixel subtraction operation layer and a global average pooling operation layer; the restoration feature IR(I) and the depth feature DF(I) pass through the concatenation operation layer and the weight generation network to output a feature A; the feature A is multiplied pixel-by-pixel with IR(I) and with DF(I) respectively, and the results then pass through the pixel-by-pixel subtraction operation layer and the global average pooling operation layer before being output.
Preferably, in step 3, a multi-layer perceptron (MLP) with a three-layer structure is adopted to regress the re-weighted and fused image quality feature vector and output the quality score of the image;
the multi-layer perceptron MLP consists of three fully connected layers FC1, FC2 and FC3; FC1 is a fully connected layer with 1024 nodes, FC2 is a fully connected layer with 256 nodes, and FC3 is a fully connected layer with 64 nodes.
The technical scheme adopted by the system of the invention is as follows: an image quality perception system based on distorted image restoration guidance, comprising the following modules:
the depth feature extraction module is used for extracting depth features DF (I) of the image I to be detected;
the distorted image recovery module is used for recovering the distorted image by using the depth feature DF (I) to obtain a recovery feature IR (I) of the distorted image;
and the image quality perception module is used for re-weighting and fusing the restoration feature IR(I) and the depth feature DF(I), regressing the re-weighted and fused image quality feature vector, and outputting the quality score of the image.
The technical scheme adopted by the equipment is as follows: an image quality perception device based on distorted image restoration guidance, comprising:
one or more processors;
and a storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the image quality perception method based on distorted image restoration guidance.
The invention can automatically perceive image quality without manual intervention or reference ground truth, which better matches real-world image quality assessment scenarios. In addition, the invention is novel in the following specific respects:
1) The invention tightly couples the image restoration task and the quality assessment task, and uses the image restoration result to guide distorted image quality assessment, providing a feasible approach to no-reference image quality assessment.
2) The invention places the image restoration task and the quality assessment task under a unified framework, so that the task objectives are jointly optimized end to end and the restored features better match the expectations of quality assessment.
Drawings
The following drawings are used, together with the specific embodiments, to further illustrate the technical solutions herein. For a person skilled in the art, other figures and the intent of the present invention can be derived from these figures without inventive effort.
Fig. 1: a method schematic diagram of an embodiment of the present invention;
fig. 2: the quality feature extraction network structure diagram of the embodiment of the invention;
fig. 3: experimental results of the examples of the present invention.
Detailed Description
In order to facilitate understanding and practice of the invention by those of ordinary skill in the art, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the embodiments described herein are for illustration and explanation only and are not intended to limit the invention.
Referring to fig. 1, the image quality perception method based on distorted image restoration guidance provided by the invention comprises the following steps:
step 1: extracting an image I to be detected, and inputting the image I to be detected into a feature extraction network to obtain depth features DF (I) of the image;
in one embodiment, the specific structure of the quality feature extraction network is shown in fig. 2, and consists of a first convolution layer, a second convolution layer, eight improved Swim transducer blocks and a residual connection layer; the image I to be detected sequentially passes through a first convolution layer, eight improved Swim converter blocks and a second convolution layer, and is output after passing through a residual connection layer with the input of the first convolution layer;
the improved Swim transducer block consists of eight Swim transducer layers and a third convolution layer, a residual connection layer; the output of the input after passing through eight Swim transducer layers and the third convolution layer in sequence is output after passing through the residual connection layer with the input. The shallow and deep characteristic information extracted by the network can be effectively integrated by adding residual connection operation to the original Swim converter technology, so that the method is beneficial to providing more information for the subsequent network.
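The residual wiring described above can be sketched as follows. This is a minimal NumPy shape-level sketch, not the patented implementation: the Swin Transformer layers and convolution layers are stood in for by hypothetical placeholder functions, and only the residual-connection structure of the improved block and the outer network is shown.

```python
import numpy as np

def swin_layer(x):
    # Placeholder for one Swin Transformer layer (shape-preserving transform).
    return np.tanh(x)

def conv_layer(x):
    # Placeholder for a shape-preserving convolution layer.
    return 0.5 * x

def improved_swin_block(x, n_layers=8):
    """Improved Swin Transformer block: eight Swin layers, a third
    convolution layer, then a residual connection back to the input."""
    out = x
    for _ in range(n_layers):
        out = swin_layer(out)
    out = conv_layer(out)
    return out + x  # residual connection

def quality_feature_network(image, n_blocks=8):
    """First conv -> eight improved Swin blocks -> second conv,
    plus a residual connection from the input of the first conv."""
    out = conv_layer(image)            # first convolution layer
    for _ in range(n_blocks):
        out = improved_swin_block(out)
    out = conv_layer(out)              # second convolution layer
    return out + image                 # residual connection with the input

feat = quality_feature_network(np.random.randn(1, 64, 32, 32))
print(feat.shape)
```

Because every stand-in layer preserves the feature shape, the depth features DF(I) retain the spatial dimensions of the input, which is what lets them be fused pixel-by-pixel with the restoration features later.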
Step 2: under the guidance of prior knowledge based on an image restoration prior model, a distorted image restoration network uses the depth features DF(I) of the image to restore the distorted image, obtaining the restoration features IR(I) of the distorted image;
The distorted image restoration network in step 2 mainly addresses the fact that real distorted images have no reference image, so reliable reference information is difficult to use to improve the performance of an image quality assessment model. Reference image information can be obtained from the restoration process of the distorted image, so the invention adopts a learning strategy guided by the image restoration prior model to learn the natural distorted image restoration task and obtain prior knowledge of image restoration. The image restoration prior model is adopted to guide the learning strategy because it can be jointly optimized with the quality assessment model, forming a more accurate mapping from image quality features to quality scores.
The image restoration prior model network structure used in fig. 1 is consistent with the distorted image restoration network structure in the present invention: it consists of four convolution layers connected in sequence and a ReLU activation function layer, the ReLU activation function layer being arranged between the first convolution layer and the second convolution layer.
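The four-convolution restoration network can be sketched at the tensor level as below. This is a hedged simplification: each convolution is stood in for by a 1×1 convolution (a per-pixel linear map over channels) with random weights, which keeps the layer ordering — four convolutions with a ReLU between the first and second — without reproducing kernel sizes, which the text does not specify.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, w):
    # 1x1 convolution: per-pixel linear map over the channel axis.
    # x: (C_in, H, W), w: (C_out, C_in)
    return np.einsum('oc,chw->ohw', w, x)

def relu(x):
    return np.maximum(x, 0.0)

def restoration_network(x, weights):
    """Four convolution layers with a ReLU activation placed between
    the first and second convolution, mirroring the described structure."""
    w1, w2, w3, w4 = weights
    x = conv1x1(x, w1)
    x = relu(x)          # ReLU between the first and second conv layers
    x = conv1x1(x, w2)
    x = conv1x1(x, w3)
    x = conv1x1(x, w4)
    return x

c = 8  # assumed channel count for the sketch
weights = [rng.standard_normal((c, c)) * 0.1 for _ in range(4)]
restored = restoration_network(rng.standard_normal((c, 16, 16)), weights)
print(restored.shape)
```

The same stack serves as both the prior (teacher) model and the distorted image restoration network in the text; only the learned parameters differ.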
The natural image evaluation task data (support set and query set) adopted for training the image restoration prior model is the large-scale natural image quality assessment dataset KADIS-700k, which contains many types of distortion; this dataset serves as the training task set of the image restoration prior model for learning restoration knowledge of various types of distorted images. The network structure adopted by the invention is a multilayer convolutional neural network plus activation layers, which acts similarly to a decoder in order to restore the distorted image. For an input distorted image I, feeding I into the image restoration network yields the restored image Ir corresponding to the distorted image:
$$I_r = f(I; \theta_0)$$
where $\theta_0$ represents the initial parameters of the image restoration network. The average Euclidean distance is used as the loss function to minimize the error between the restored image and the pseudo ground truth generated by the image restoration prior model:

$$L(\theta_0) = \frac{1}{N}\sum_{i=1}^{N}\left\| f(I_i;\theta_0) - y_i \right\|_2 ,$$

where y represents the pseudo ground-truth restored image generated by the teacher model for image I. To better learn generalization across different tasks, the invention optimizes the distorted image restoration network by the gradient descent method commonly used in the image restoration field, and updates its parameters with the Adam optimizer. The pseudo ground-truth restored images generated by the image restoration prior model effectively compensate for the fact that real distorted images have no reference image, which makes it difficult to train an image restoration network from scratch. Under the guidance of the image restoration prior model, the restoration network achieves a good restoration effect with less training and generates accurate image restoration features.
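The objective and update rule above can be illustrated with a toy example. As a hedged sketch, the restoration network is reduced to a single scalar parameter, the loss is the average Euclidean distance between the restored output and the pseudo ground truth, and one hand-coded Adam step (standard hyperparameters) updates the parameter; none of these specific values come from the patent.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "restoration network": restored = theta * I (element-wise scaling).
theta = np.array(0.5)
I = rng.standard_normal((4, 16))   # batch of 4 flattened distorted images
y = I * 2.0                        # pseudo ground truth from the prior model

def loss(theta, I, y):
    # Average Euclidean distance between restored images and pseudo truth.
    return np.mean(np.linalg.norm(theta * I - y, axis=1))

def grad(theta, I, y):
    # Gradient of the average L2-norm loss w.r.t. theta (chain rule).
    r = theta * I - y
    norms = np.linalg.norm(r, axis=1, keepdims=True)
    return np.mean(np.sum(r / norms * I, axis=1))

# One Adam update with standard hyperparameters.
m, v, t = 0.0, 0.0, 1
lr, b1, b2, eps = 1e-2, 0.9, 0.999, 1e-8
g = grad(theta, I, y)
m = b1 * m + (1 - b1) * g
v = b2 * v + (1 - b2) * g * g
theta_new = theta - lr * (m / (1 - b1**t)) / (np.sqrt(v / (1 - b2**t)) + eps)

print(loss(theta, I, y), loss(theta_new, I, y))
```

A single step moves theta toward the target scaling and the loss decreases, which is all the sketch is meant to show: the restoration network is pulled toward the teacher's pseudo ground truth by the average Euclidean distance objective.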
Step 3: the restoration feature IR(I) and the depth feature DF(I) are re-weighted and fused through a feature fusion network, and the re-weighted and fused image quality feature vector is regressed to output the quality score of the image.
In one embodiment, the specific structure of the feature fusion network is shown in fig. 1; it consists of a weight generation network, a concatenation operation layer, a pixel-by-pixel multiplication operation layer, a pixel-by-pixel subtraction operation layer and a global average pooling (GAP) layer. The restoration feature IR(I) and the depth feature DF(I) pass through the concatenation operation layer and the weight generation network to output a feature A; the feature A is multiplied pixel-by-pixel with IR(I) and with DF(I) respectively, and the results then pass through the pixel-by-pixel subtraction operation layer and the global average pooling layer to produce the output.
In one embodiment, a multi-layer perceptron (MLP) with a three-layer structure regresses the re-weighted and fused image quality feature vector and outputs the quality score of the image. The specific structure of the MLP is shown in fig. 1: it consists of three fully connected layers FC1, FC2 and FC3, where FC1 has 1024 nodes, FC2 has 256 nodes, and FC3 has 64 nodes.
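The regressor can be sketched in NumPy as below. The 1024/256/64 node counts come from the text; the input dimension of the fused feature vector and the final scalar readout mapping the 64-dimensional output to a single score are assumptions added to make the sketch complete, as are the ReLU activations between layers.

```python
import numpy as np

rng = np.random.default_rng(2)

def relu(x):
    return np.maximum(x, 0.0)

def mlp_quality_score(feat, params):
    """Three-layer MLP regressor: FC1 (1024 nodes) -> FC2 (256 nodes)
    -> FC3 (64 nodes), followed by a hypothetical scalar readout
    (the text does not state how the 64-dim output becomes a score)."""
    (W1, b1), (W2, b2), (W3, b3), (w_out, b_out) = params
    h = relu(feat @ W1 + b1)         # FC1: 1024 nodes
    h = relu(h @ W2 + b2)            # FC2: 256 nodes
    h = relu(h @ W3 + b3)            # FC3: 64 nodes
    return float(h @ w_out + b_out)  # assumed scalar quality score

d = 512  # assumed dimension of the fused quality feature vector
sizes = [(d, 1024), (1024, 256), (256, 64)]
params = [(rng.standard_normal(s) * 0.01, np.zeros(s[1])) for s in sizes]
params.append((rng.standard_normal(64) * 0.01, 0.0))

score = mlp_quality_score(rng.standard_normal(d), params)
print(score)
```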
The specific implementation of the step 3 comprises the following sub-steps:
step 3.1: this process can be expressed as a re-weighted fusion by inputting the restoration feature IR (I) and depth feature DF (I) into the feature fusion network:
;
wherein H is WGU The feature fusion network is represented by a combination of two convolutional layers and one Relu active layer,Concatrepresenting the recovery characteristics IR (I) and distortionTandem operation of image features DF (I);
step 3.2: and re-weighting the characteristic diagram through a weight matrix Wr. Final feature map F final Is calculated as:
;
wherein the method comprises the steps ofIs representing the product operation between matrices, +.>Representing the subtraction operation between matrices;
step 3.3: and (3) carrying out regression on the image quality characteristic vector subjected to re-weighted fusion by adopting a multi-layer perceptron (MLP) with a three-layer structure, and outputting the quality score of the image.
Some experimental results are given in this embodiment. As shown in fig. 3, the quality scores produced by the method of the present invention for images of different quality levels are highly consistent with subjective quality, confirming the effectiveness of the method.
The invention comprises two parts: (1) In many actual scenes only the original distorted image can be acquired, and the corresponding original undistorted image cannot be acquired. The invention therefore uses a learning strategy guided by the image restoration prior model to learn prior knowledge of distorted image restoration from natural distorted image restoration tasks; the pseudo ground-truth restored images generated by the prior model guide the training of the image restoration model, so that a trusted image restoration model is quickly obtained and can be jointly trained with the quality assessment model. (2) The invention further provides a data-driven image quality assessment model, which performs corrective fusion of the distorted image restoration features obtained in the first part with the distorted image quality features, and maps the fused features to image quality scores.
The invention learns the human visual system's knowledge of distorted image restoration by simulating the human brain's process of restoring distorted images, and constructs an image quality perception model from this knowledge. The invention adopts a unified learning framework to jointly optimize distorted image restoration and image quality prediction. Rich labels, including restored images and quality scores, enable the quality perception system to learn more discriminative features and build a more accurate mapping from the feature representation to the quality score. To avoid unreliable restoration information affecting the quality prediction, the invention designs a feature fusion network that re-weights and fuses the restoration features with the original features, promoting the perceptual consistency of the fused features. The invention can automatically perceive the quality of natural images and provides a means of selecting high-quality image samples for training deep learning models.
It should be understood that the foregoing description of preferred embodiments is illustrative; the scope of protection of the invention is defined by the appended claims, and those skilled in the art can make substitutions or modifications without departing from the scope of the invention as set forth in the appended claims.
Claims (4)
1. An image quality perception method based on distorted image restoration guidance is characterized by comprising the following steps:
step 1: extracting depth features DF (I) of an image I to be detected;
the depth features DF(I) of the image I to be detected are extracted by a quality feature extraction network;
the quality feature extraction network consists of a first convolution layer, a second convolution layer, eight improved Swin Transformer blocks and a residual connection layer; the image I to be detected sequentially passes through the first convolution layer, the eight improved Swin Transformer blocks and the second convolution layer, and the result passes through the residual connection layer together with the input of the first convolution layer before being output;
the improved Swin Transformer block consists of eight Swin Transformer layers, a third convolution layer and a residual connection layer; the output obtained after the input sequentially passes through the eight Swin Transformer layers and the third convolution layer passes through the residual connection layer together with the input before being output;
step 2: performing distorted image restoration by using the depth feature DF (I) to obtain restoration features IR (I) of the distorted image;
distorted image restoration is carried out by a distorted image restoration network under the guidance of prior knowledge from an image restoration prior model;
the image restoration prior model and the distorted image restoration network each consist of four sequentially connected convolution layers and a ReLU activation layer, the ReLU activation layer being arranged between the first convolution layer and the second convolution layer;
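A minimal sketch of this four-layer restoration network, assuming 3x3 kernels and an output that maps the depth feature DF(I) to a 3-channel restored image (kernel sizes and channel widths are not specified in the claim):

```python
import torch
import torch.nn as nn

def make_restoration_net(feat_ch=64, out_ch=3):
    """Four sequentially connected convolution layers with a single ReLU
    between the first and the second, as described in the claim; the same
    structure is shared by the image restoration prior model (teacher) and
    the distorted image restoration network."""
    return nn.Sequential(
        nn.Conv2d(feat_ch, feat_ch, 3, padding=1),  # first convolution layer
        nn.ReLU(inplace=True),                      # ReLU between conv 1 and conv 2
        nn.Conv2d(feat_ch, feat_ch, 3, padding=1),  # second convolution layer
        nn.Conv2d(feat_ch, feat_ch, 3, padding=1),  # third convolution layer
        nn.Conv2d(feat_ch, out_ch, 3, padding=1),   # fourth convolution layer
    )
```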
step 3: re-weighting and fusing the recovery feature IR (I) and the depth feature DF (I), and regressing the re-weighted and fused image quality feature vector to output the quality score of the image;
the recovery feature IR (I) and the depth feature DF (I) are re-weighted and fused by a feature fusion network;
the feature fusion network consists of a weight generation network, a concatenation layer, a pixel-wise multiplication layer, a pixel-wise subtraction layer and a global average pooling layer; the restoration feature IR (I) and the depth feature DF (I) are concatenated and passed through the weight generation network to produce a feature A; the feature A is multiplied pixel-wise with the restoration feature IR (I) and with the depth feature DF (I) respectively, and the two products pass through the pixel-wise subtraction layer and the global average pooling layer before being output;
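The fusion wiring above can be sketched as follows; the internal layout of the weight generation network and the operand order of the pixel-wise subtraction are assumptions, since the claim fixes only the operations involved:

```python
import torch
import torch.nn as nn

class FeatureFusion(nn.Module):
    """Re-weighted fusion of the restoration feature IR(I) and depth feature DF(I).
    The weight map A is generated from concat(IR, DF); the operand order of the
    subtraction is not fixed by the claim text and is an assumption here."""
    def __init__(self, dim=64):
        super().__init__()
        self.weight_gen = nn.Sequential(               # weight generation network (sketch)
            nn.Conv2d(2 * dim, dim, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(dim, dim, 3, padding=1),
            nn.Sigmoid(),                              # per-pixel weights in [0, 1]
        )

    def forward(self, ir, df):
        a = self.weight_gen(torch.cat([ir, df], dim=1))  # feature A
        diff = a * df - a * ir                           # pixel-wise multiply, then subtract
        return diff.mean(dim=(2, 3))                     # global average pooling -> vector
```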
regression is carried out on the image quality feature vector after re-weighted fusion by adopting a multi-layer perceptron MLP with a three-layer structure, and the quality score of the image is output;
the multi-layer perceptron MLP consists of three fully connected layers FC1, FC2 and FC3, where FC1 is a fully connected layer with 1024 nodes, FC2 is a fully connected layer with 256 nodes, and FC3 is a fully connected layer with 64 nodes.
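A sketch of this regressor, assuming the fused feature vector has 64 dimensions and that a final scalar head follows FC3; the claim fixes only the three widths 1024/256/64, so the input width and the score head are assumptions:

```python
import torch
import torch.nn as nn

def make_quality_mlp(in_dim=64):
    """FC1 (1024 nodes) -> FC2 (256 nodes) -> FC3 (64 nodes) -> scalar score."""
    return nn.Sequential(
        nn.Linear(in_dim, 1024), nn.ReLU(inplace=True),  # FC1
        nn.Linear(1024, 256), nn.ReLU(inplace=True),     # FC2
        nn.Linear(256, 64), nn.ReLU(inplace=True),       # FC3
        nn.Linear(64, 1),                                # quality score (assumed head)
    )
```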
2. The distorted image restoration guidance-based image quality perception method according to claim 1, wherein: the distorted image recovery network is a trained network; during training, the average Euclidean distance is adopted as the loss function to minimize the error between the restored image and the pseudo ground truth generated by the image restoration prior model, the loss function being defined as:
$$\mathcal{L}(\theta_0)=\frac{1}{N}\sum_{i=1}^{N}\left\|R(I_i;\theta_0)-y_i\right\|_2$$
where, for an input distorted image $I_i$, the distorted image restoration network produces the corresponding restored image $R(I_i;\theta_0)$, $\theta_0$ denotes the initial parameters of the image restoration network, $y_i$ denotes the pseudo ground-truth restored image generated by the teacher model for the image $I_i$, and $N$ is the number of training images;
and optimizing the distorted image restoration network by adopting a gradient descent method, and updating parameters of the distorted image restoration network by using an Adam optimizer.
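The training step of this claim can be sketched as below, with the teacher (image restoration prior model) producing the pseudo ground truth and Adam updating the student network; the batch handling and learning rate are assumptions:

```python
import torch

def train_step(student, teacher, optimizer, batch):
    """One gradient-descent step minimising the average Euclidean distance
    between the student's restored images and the teacher's pseudo ground truth."""
    with torch.no_grad():
        y = teacher(batch)                                   # pseudo ground truth y
    y_hat = student(batch)                                   # restored image R(I; theta)
    loss = torch.norm((y_hat - y).flatten(1), dim=1).mean()  # mean L2 distance
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                                         # Adam parameter update
    return loss.item()
```

In practice `student` would be the distorted image restoration network and `optimizer` a `torch.optim.Adam` instance over its parameters.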
3. An image quality perception system based on distorted image restoration guidance, comprising:
the depth feature extraction module is used for extracting depth features DF (I) of the image I to be detected;
extracting depth features DF (I) of an image I to be detected by adopting a quality feature extraction network;
the quality feature extraction network consists of a first convolution layer, a second convolution layer, eight improved Swin Transformer blocks and a residual connection layer; the image I to be detected sequentially passes through the first convolution layer, the eight improved Swin Transformer blocks and the second convolution layer, and the result is combined with the input of the first convolution layer through the residual connection layer before being output;
the improved Swin Transformer block consists of eight Swin Transformer layers, a third convolution layer and a residual connection layer; the block input sequentially passes through the eight Swin Transformer layers and the third convolution layer, and the result is combined with the block input through the residual connection layer before being output;
the distorted image recovery module is used for recovering the distorted image by using the depth feature DF (I) to obtain a recovery feature IR (I) of the distorted image;
distorted image restoration is carried out by a distorted image restoration network under the guidance of prior knowledge from an image restoration prior model;
the image restoration prior model and the distorted image restoration network each consist of four sequentially connected convolution layers and a ReLU activation layer, the ReLU activation layer being arranged between the first convolution layer and the second convolution layer;
the image quality perception module is used for performing re-weighted fusion of the recovery feature IR (I) and the depth feature DF (I), regressing the re-weighted fused image quality feature vector, and outputting the quality score of the image;
the recovery feature IR (I) and the depth feature DF (I) are re-weighted and fused by a feature fusion network;
the feature fusion network consists of a weight generation network, a concatenation layer, a pixel-wise multiplication layer, a pixel-wise subtraction layer and a global average pooling layer; the restoration feature IR (I) and the depth feature DF (I) are concatenated and passed through the weight generation network to produce a feature A; the feature A is multiplied pixel-wise with the restoration feature IR (I) and with the depth feature DF (I) respectively, and the two products pass through the pixel-wise subtraction layer and the global average pooling layer before being output;
regression is carried out on the image quality feature vector after re-weighted fusion by adopting a multi-layer perceptron MLP with a three-layer structure, and the quality score of the image is output;
the multi-layer perceptron MLP consists of three fully connected layers FC1, FC2 and FC3, where FC1 is a fully connected layer with 1024 nodes, FC2 is a fully connected layer with 256 nodes, and FC3 is a fully connected layer with 64 nodes.
4. An image quality perception device based on distorted image restoration guidance, comprising:
one or more processors;
storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the image quality perception method based on distorted image restoration guidance according to claim 1 or 2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311000119.4A CN116721304B (en) | 2023-08-10 | 2023-08-10 | Image quality perception method, system and equipment based on distorted image restoration guidance |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116721304A CN116721304A (en) | 2023-09-08 |
CN116721304B true CN116721304B (en) | 2023-10-20 |
Family
ID=87866497
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112287770A (en) * | 2020-10-10 | 2021-01-29 | 武汉大学 | Face quality sensing method and system for identity recognition |
CN112419242A (en) * | 2020-11-10 | 2021-02-26 | 西北大学 | No-reference image quality evaluation method based on self-attention mechanism GAN network |
CN112543936A (en) * | 2020-10-29 | 2021-03-23 | 香港应用科技研究院有限公司 | Motion structure self-attention-seeking convolutional network for motion recognition |
CN113284100A (en) * | 2021-05-12 | 2021-08-20 | 西安理工大学 | Image quality evaluation method based on recovery image to mixed domain attention mechanism |
CN114841887A (en) * | 2022-05-12 | 2022-08-02 | 重庆邮电大学 | Image restoration quality evaluation method based on multi-level difference learning |
CN115205532A (en) * | 2022-07-29 | 2022-10-18 | 南昌航空大学 | Image semantic segmentation method based on feature extraction and RFB context information optimization |
CN116109499A (en) * | 2022-12-09 | 2023-05-12 | 华东师范大学 | Single picture rain removing method based on transducer and oriented to memory |
WO2023092813A1 (en) * | 2021-11-25 | 2023-06-01 | 苏州大学 | Swin-transformer image denoising method and system based on channel attention |
CN116485741A (en) * | 2023-04-11 | 2023-07-25 | 深圳大学 | No-reference image quality evaluation method, system, electronic equipment and storage medium |
CN116563167A (en) * | 2023-05-26 | 2023-08-08 | 西安交通大学 | Face image reconstruction method, system, device and medium based on self-adaptive texture and frequency domain perception |
Non-Patent Citations (4)
Title |
---|
Jifan Yang et al., "Continuous Learning for Blind Image Quality Assessment with Contrastive Transformer", ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1-5 *
Jingyun Liang et al., "SwinIR: Image Restoration Using Swin Transformer", IEEE Xplore, pp. 1833-1844 *
Zhaoqing Pan et al., "VCRNet: Visual Compensation Restoration Network for No-Reference Image Quality Assessment", IEEE Transactions on Image Processing, pp. 1613-1627 *
Ma Huanhuan, "Blind Image Quality Assessment Combining Attention Mechanism and Multi-level Features" (in Chinese), CNKI database, full text *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||