CN113658072A - Underwater image enhancement method based on progressive feedback network - Google Patents
Underwater image enhancement method based on progressive feedback network
- Publication number: CN113658072A (application CN202110935907.7A)
- Authority: CN (China)
- Prior art keywords: network, image, stage, underwater, attention
- Legal status: Granted (the legal status is an assumption, not a legal conclusion; Google has not performed a legal analysis and makes no representation as to its accuracy)
Classifications
- G06T5/00: Image enhancement or restoration
- G06T5/90: Dynamic range modification of images or parts thereof
- G06N3/045: Combinations of networks
- G06N3/048: Activation functions
- G06N3/08: Learning methods
- G06T2207/10004: Still image; photographic image
- G06T2207/10024: Color image
- G06T2207/20064: Wavelet transform [DWT]
- G06T2207/20081: Training; learning
- G06T2207/20084: Artificial neural networks [ANN]
- Y02A90/30: Assessment of water resources
Abstract
The invention provides an underwater image enhancement method based on a progressive feedback network, comprising the following steps. Step S1: pair the underwater image data used for training, then apply data enhancement and normalization to obtain paired images to be trained. Step S2: feed the paired images into a multi-stage progressive image enhancement network that enhances the image at each stage by combining a discrete wavelet transform with an attention feedback mechanism, train an image enhancement model capable of enhancing underwater images, and correct the image at each stage of the network with a supervised attention module. Step S3: set the target loss function of the image enhancement network. Step S4: train the image enhancement network with the paired images until it converges to a Nash equilibrium. Step S5: normalize the underwater image to be enhanced, input it into the trained image enhancement model, and output the enhanced image. The invention helps improve underwater image quality.
Description
Technical Field
The invention relates to the technical field of image processing and computer vision, in particular to an underwater image enhancement method based on a progressive feedback network.
Background
Underwater image enhancement techniques are gaining attention due to their importance in ocean engineering and underwater robotics. The quality of underwater imaging strongly affects underwater operations: vision-dependent tasks such as seabed exploration and underwater target detection demand high-quality underwater images, and low-quality images severely reduce their efficiency and accuracy. Enhancing underwater images is challenging because of the complexity of the underwater environment and its lighting conditions. In general, underwater images suffer wavelength-dependent absorption and scattering, the latter comprising forward scattering and backscattering. Forward scattering occurs when light reflected from objects in the water is deflected by a small angle on its way to the camera, blurring image details. Backscattering occurs when the illuminating light is scattered by impurities in the water and received directly by the camera, lowering image contrast. Furthermore, plankton, plants, silt, and other suspended particles drifting in the ocean introduce noise and increase scattering. These adverse effects reduce visibility and contrast and even introduce color casts, severely degrading underwater image quality and hindering marine engineering and underwater operations.
Existing underwater image enhancement methods fall mainly into two categories. The first is based on deep learning: it treats the conversion from an underwater image to a normal image as a mapping and uses the fitting capability of a network to learn that mapping. Such methods, however, place high demands on the size and diversity of the training data, and the encoding and decoding operations applied to the image during learning may lose image information, so the enhanced image often shows blurred local details. The second uses a physical model: a mathematical model of the underwater image degradation process is established first, and a clear underwater image is recovered by inverting that process, which requires estimating the model parameters. Because the underwater environment is complex and conditions such as illumination are variable, parameter estimation is difficult and imprecise, so the quality of the enhanced image is generally low; moreover, different environments call for different models depending on the factors considered, so these methods have significant limitations.
Existing methods usually lose information during image enhancement, so the enhanced image is prone to blurred details. The present method divides the enhancement process into several stages, enhances the image independently at each stage, applies feedback correction between each stage's output and the label image, and feeds the corrected result into the next stage, achieving progressive optimization and markedly improving underwater image quality.
Disclosure of Invention
The invention provides an underwater image enhancement method based on a progressive feedback network, which is beneficial to improving the quality of underwater images.
The invention adopts the following technical scheme.
An underwater image enhancement method based on a progressive feedback network comprises the following steps:
step S1: pairing underwater image data for training, and then performing data enhancement and normalization processing on the underwater image data to obtain paired images to be trained;
step S2: inputting paired images to be trained into a multi-stage progressive image enhancement network capable of enhancing images at each stage by combining discrete wavelet transformation and an attention feedback mechanism, training an image enhancement model capable of enhancing underwater images, and correcting the images at each stage of the network by using a supervision attention module;
step S3: setting a target loss function of the image enhancement network;
step S4: training the image enhancement network with the paired training images until it converges to a Nash equilibrium;
step S5: and carrying out normalization processing on the underwater image to be enhanced, then inputting the trained image enhancement model, and outputting the enhanced image.
The step S1 includes the steps of:
step S11: matching the underwater image for training with the corresponding label image;
step S12: applying the same random flipping operation to both images of every training pair, for data enhancement;
step S13: normalizing all images to be trained; given an image I(i, j), the normalized image is Î(i, j), and the normalized value at pixel position (i, j) is calculated as follows:
wherein, (i, j) represents the position of the pixel, and the normalized paired image is used as the input image and label image pair of the subsequent step.
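The preprocessing of steps S11 to S13 can be sketched as below. The exact normalization formula is not reproduced in the text, so scaling 8-bit pixel values by 1/255 into [0, 1] is assumed here, and the random flip is applied identically to the image and its label so the pair stays aligned.

```python
import numpy as np

def normalize(img):
    # Assumed normalization: scale 8-bit pixel values to [0, 1].
    return img.astype(np.float32) / 255.0

def paired_random_flip(img, label, rng):
    # Step S12: apply the SAME random horizontal flip to the underwater
    # image and its label image so the training pair stays aligned.
    if rng.random() < 0.5:
        return img[:, ::-1].copy(), label[:, ::-1].copy()
    return img, label

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)
label = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)
img_n, label_n = normalize(img), normalize(label)          # step S13
img_a, label_a = paired_random_flip(img_n, label_n, rng)   # step S12
```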
The step S2 includes step S21, step S22, step S23;
the step S21 specifically includes: designing a multi-stage progressive image enhancement network whose input is the normalized underwater image Î and whose output is the enhanced underwater image; the network is executed progressively in three stages, each stage enhancing the image by combining a discrete wavelet transform with an attention feedback mechanism; the three stage networks share the same structure, and a supervised attention module is used between stages to supervise the features of that stage, i.e. after the first stage and the second stage, the supervised attention module supervises the image enhanced at that stage.
In the three progressively executed stages, the input of the first stage is the underwater image feature F1_in obtained after a convolutional layer with a 3x3 kernel and stride 1, and the output of the first stage is the image feature F1_out enhanced at the current stage; this feature, together with the normalized underwater image Î, is input to the supervised attention module, whose output is the corrected feature F2_in.

The input of the second stage is the corrected feature F2_in, and the output of the second stage is the image feature F2_out enhanced at the current stage; as in the previous stage, the output features are corrected by the supervised attention module to obtain the image feature F3_in, which serves as the input of the third stage.

The output of the third stage is the image feature F3_out enhanced at the current stage; after a convolutional layer with a 3x3 kernel and stride 1, the final enhanced image R, i.e. the output of the image enhancement network, is obtained. The calculation formulas are as follows:

Fi_out = Net_i(Fi_in), i = 1, 2, 3 (formula six);

F(i+1)_in = SAM(Fi_out, Î), i = 1, 2;

where Î represents the normalized underwater input image, SAM(·) represents the supervised attention module network, and Net_i(·), i = 1, 2, 3, denote the three stage networks of the multi-stage progressive image enhancement network.
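The three-stage progression around formula six can be traced with identity stand-ins for the stage networks and the supervised attention module; only the data flow described above is illustrated, not the real CNN layers.

```python
import numpy as np

def conv3x3(x):
    # Placeholder for a 3x3, stride-1 convolutional layer.
    return x

def stage_net(x):
    # Placeholder for a stage network Net_i, i = 1, 2, 3.
    return x

def sam(f_out, i_hat):
    # Placeholder for the supervised attention module: in the real network
    # it returns the corrected features for the next stage.
    return f_out

i_hat = np.zeros((1, 3, 8, 8), dtype=np.float32)  # normalized input image
f1_in = conv3x3(i_hat)       # shallow features entering stage 1
f1_out = stage_net(f1_in)    # stage 1
f2_in = sam(f1_out, i_hat)   # correction between stages 1 and 2
f2_out = stage_net(f2_in)    # stage 2
f3_in = sam(f2_out, i_hat)   # correction between stages 2 and 3
f3_out = stage_net(f3_in)    # stage 3
r = conv3x3(f3_out)          # final enhanced image R
```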
the step S22 specifically includes: designing the stage network of step S21, all three stage networks having the same structure; each stage network is divided into three levels from top to bottom, and each level consists of a wavelet pooling layer, a residual attention module, a wavelet inverse pooling layer, and an attention feedback module.
The wavelet pooling layer decomposes features using the discrete Haar wavelet, with four decomposition kernels LL^T, LH^T, HL^T, HH^T, where the low-frequency filter L and the high-frequency filter H are, respectively,

L = (1/√2) [1, 1]^T,  H = (1/√2) [-1, 1]^T;
The wavelet inverse pooling layer uses the discrete Haar wavelet to jointly reconstruct the low-frequency and high-frequency components; its inverse pooling kernel parameters are the same as those of the wavelet pooling layer;
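A minimal single-channel sketch of the wavelet pooling and inverse pooling layers, assuming the standard orthonormal Haar filters L = (1/√2)[1, 1] and H = (1/√2)[-1, 1]; because the four kernels form an orthonormal basis of each 2x2 patch, inverse pooling with the same kernel parameters reconstructs the input exactly.

```python
import numpy as np

# Assumed standard 1-D Haar low-pass (L) and high-pass (H) filters.
L = np.array([1.0, 1.0]) / np.sqrt(2.0)
H = np.array([-1.0, 1.0]) / np.sqrt(2.0)

# The four decomposition kernels LL, LH, HL, HH as 2x2 outer products.
KERNELS = {"ll": np.outer(L, L), "lh": np.outer(L, H),
           "hl": np.outer(H, L), "hh": np.outer(H, H)}

def wavelet_pool(x):
    # One level of 2-D Haar decomposition with stride 2: each 2x2 patch is
    # projected onto the four kernels, giving half-resolution bands.
    h, w = x.shape
    patches = x.reshape(h // 2, 2, w // 2, 2).transpose(0, 2, 1, 3)
    return {name: np.einsum("ijkl,kl->ij", patches, k)
            for name, k in KERNELS.items()}

def wavelet_unpool(bands):
    # Joint reconstruction with the same kernel parameters as the pooling
    # layer; orthonormality of the Haar basis makes this exact.
    h, w = bands["ll"].shape
    x = np.zeros((h, 2, w, 2))
    for name, k in KERNELS.items():
        x += bands[name][:, None, :, None] * k[None, :, None, :]
    return x.reshape(h * 2, w * 2)
```

Perfect reconstruction, i.e. wavelet_unpool(wavelet_pool(x)) equal to x, is what lets features move across resolutions without the information loss the background section attributes to ordinary encoding and decoding operations.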
the residual attention module consists of two convolutional layers with 3x3 kernels and stride 1 plus a channel attention network; the calculation formula is as follows:

X_out = ECA_Net(ADD[X_in, Relu(Conv(Relu(Conv(X_in))))]) (formula nine);

where X_in denotes the input features, X_out the output features, Relu the activation function, ADD the feature addition operation, and ECA_Net(·) the channel attention network;
the attention feedback module consists of a channel attention network and a sigmoid function; the calculation formula is as follows:

W = Sigmoid(ECA_Net(X_in)) (formula ten);

where X_in denotes the input features, W the output feature weights, and ECA_Net(·) the channel attention network.
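Formulas nine and ten can be sketched as follows. The ECA channel attention is reduced to global average pooling plus a sigmoid gate (the real ECA_Net applies a 1-D convolution over the pooled vector), and the 3x3 convolutions are identity placeholders, so only the structure of the two modules is shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(x):
    return np.maximum(x, 0.0)

def conv3x3(x):
    # Placeholder for a 3x3, stride-1 convolution.
    return x

def eca_net(x):
    # Simplified channel attention on a (C, H, W) feature map: per-channel
    # global average pooling feeds a sigmoid gate that rescales channels.
    gate = sigmoid(x.mean(axis=(1, 2), keepdims=True))
    return x * gate

def residual_attention(x_in):
    # Formula nine: X_out = ECA_Net(ADD[X_in, Relu(Conv(Relu(Conv(X_in))))]).
    return eca_net(x_in + relu(conv3x3(relu(conv3x3(x_in)))))

def attention_feedback(x_in):
    # Formula ten: W = Sigmoid(ECA_Net(X_in)), a weight map in (0, 1).
    return sigmoid(eca_net(x_in))

x = np.random.default_rng(1).normal(size=(8, 4, 4))
y = residual_attention(x)
w = attention_feedback(x)
```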
The network structure of each stage of the multi-stage progressive image enhancement network is the same; the input of the first level of each stage network is the input of the current stage network, i.e. the image feature F1_in for the first stage.
When the first stage of the multi-stage progressive image enhancement network works, the method comprises the following steps:
Step A1: perform forward wavelet decomposition from the first level to the third level. The first level decomposes the input into a low-frequency component ll1 and high-frequency components lh1, hl1, hh1. ll1 is taken as the input of the second level and, through a wavelet pooling layer identical to that of the first level, decomposed into a low-frequency component ll2 and high-frequency components lh2, hl2, hh2. ll2 is taken as the input of the third level and, through the same wavelet pooling layer as the first two levels, decomposed into a low-frequency component ll3 and high-frequency components lh3, hl3, hh3.
Step A2: enhance the features in the reverse direction through the residual attention module, the wavelet inverse pooling layer, and the attention feedback module. First, the third-level low-frequency component ll3 and high-frequency components lh3, hl3, hh3 each pass through a residual attention module to obtain the transformed low-frequency component ll4 and high-frequency components lh4, hl4, hh4. The transformed components are input to a wavelet inverse pooling layer for joint reconstruction; the output features of the inverse pooling layer serve as the input of an attention feedback module, whose output is the feedback weight w3 from the third level to the second-level component ll2. The second-level low-frequency component ll2 is multiplied by the feedback weight w3 for correction, yielding the corrected second-level low-frequency component ll2′.
Step A3: the corrected component ll2′ and the second-level high-frequency components lh2, hl2, hh2 are input to a residual attention module to obtain the transformed low-frequency component ll5 and high-frequency components lh5, hl5, hh5. The transformed components are input to a wavelet inverse pooling layer for joint reconstruction; the output features serve as the input of an attention feedback module, whose output is the feedback weight w2 from the second level to the first-level component ll1. The first-level low-frequency component ll1 is multiplied by the feedback weight w2 for correction, yielding the corrected first-level low-frequency component ll1′.
Step A4: the corrected component ll1′ and the first-level high-frequency components lh1, hl1, hh1 are input to a residual attention module to obtain the transformed low-frequency component ll6 and high-frequency components lh6, hl6, hh6. The transformed components are input to a wavelet inverse pooling layer for joint reconstruction; the output features serve as the input of an attention feedback module, whose output is the feedback weight w1 from the first level to the initial input features of the network. The initial input features F1_in are multiplied by the weight w1 to obtain the image feature F1_out enhanced at the current stage, i.e. the output of the first-stage network.
In the same way, the output F2_out of the second-stage network and the output F3_out of the third-stage network are obtained.
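Steps A1 to A4 amount to a three-level top-down decomposition followed by bottom-up correction with feedback weights. The sketch below keeps only that control flow: average pooling stands in for the low-frequency band of the wavelet pooling layer, and a single sigmoid-of-upsampled-features function stands in for the residual attention, inverse pooling, and attention feedback chain.

```python
import numpy as np

def pool(x):
    # Stand-in for the wavelet pooling layer: 2x2 average pooling plays the
    # role of the low-frequency band ll (high-frequency bands omitted).
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def feedback_weight(x):
    # Stand-in for residual attention + wavelet inverse pooling + attention
    # feedback: upsample to the finer level and squash into (0, 1).
    up = np.kron(x, np.ones((2, 2)))
    return 1.0 / (1.0 + np.exp(-up))

f1_in = np.random.default_rng(2).normal(size=(8, 8))

# A1: top-down decomposition through three levels.
ll1 = pool(f1_in)   # level 1
ll2 = pool(ll1)     # level 2
ll3 = pool(ll2)     # level 3

# A2: level 3 yields feedback weight w3, which corrects ll2.
w3 = feedback_weight(ll3)
ll2_c = ll2 * w3

# A3: corrected level 2 yields w2, which corrects ll1.
w2 = feedback_weight(ll2_c)
ll1_c = ll1 * w2

# A4: corrected level 1 yields w1, which corrects the stage input and
# produces the stage output F1_out.
w1 = feedback_weight(ll1_c)
f1_out = f1_in * w1
```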
Step S23 specifically includes: designing the supervised attention module network of step S21. The network input is the enhanced image feature output by the first- or second-stage network, F1_out or F2_out, together with the normalized underwater image Î. Taking the first supervised attention module as an example: the input image feature F1_out first passes through a convolution with a 3x3 kernel and stride 1 to obtain a residual image, which is added pixel by pixel to the underwater image to obtain the stage underwater enhanced image R1; the loss between R1 and the normalized label image G is computed according to step S3. Then R1 is input to a convolutional layer with a 3x3 kernel and stride 1, and a feature weight w4 is obtained through a sigmoid function; this feature weight serves as the guidance weight for the image features input to the supervised attention module. That is, the originally input enhanced image feature F1_out is multiplied by the weight w4 to obtain the corrected enhanced image feature F2_in, which is the output of the supervised attention module network. The calculation formulas are as follows:

R1 = Conv(F1_out) + Î;

F2_in = F1_out ⊗ Sigmoid(Conv(R1));

where Conv(·) denotes a convolution with a 3x3 kernel and stride 1 (the two convolutions have separate parameters) and ⊗ denotes element-wise multiplication.
The enhanced image features F1_out and F2_out output by the first- and second-stage networks pass through the supervised attention module network to obtain the corrected features F2_in and F3_in respectively, and to generate the stage underwater enhanced images R1 and R2.
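The supervised attention module of step S23 can be sketched as below; the 3x3 convolutions are identity placeholders, and for simplicity the stage features are given the same shape as the image (in the real network a convolution maps between channel counts).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv3x3(x):
    # Placeholder for a 3x3, stride-1 convolution.
    return x

def supervised_attention(f_out, i_hat):
    # Stage features become a residual image, added pixel by pixel to the
    # normalized input to give the stage enhanced image R_i; R_i then
    # produces a sigmoid gate w4 that corrects the original features.
    residual = conv3x3(f_out)
    r_stage = residual + i_hat          # stage underwater enhanced image
    w4 = sigmoid(conv3x3(r_stage))      # guidance weight from R_i
    f_next_in = f_out * w4              # corrected features for next stage
    return f_next_in, r_stage

i_hat = np.random.default_rng(4).random(size=(3, 8, 8))
f1_out = np.random.default_rng(5).normal(size=(3, 8, 8))
f2_in, r1 = supervised_attention(f1_out, i_hat)
```

Here r1 is the stage image whose loss against the label G supervises the module during training (step S3), while f2_in is what actually flows into the next stage.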
The step S3 includes the following steps;
step S31: designing a network target loss function, wherein the network total target loss function is as follows:
l = λ1·l1 + λ2·ls (formula thirteen);

where l1 and ls are the L1 loss and the style loss computed from Gram matrices, respectively, λ1 and λ2 are the balance coefficients of the two losses, and · denotes real-number multiplication; the specific calculation formula of each loss is as follows:
l1 = ||R - G||1 + ||R1 - G||1 + ||R2 - G||1 (formula fourteen);

where R is the final output of the designed underwater image enhancement network, R1 and R2 are the stage underwater enhanced images generated in the two supervised attention modules, G is the normalized label image, and ||·||1 is the L1 norm;
ls = Σ_j ||G_j^φ(R) - G_j^φ(G)||_F^2;

where ||·||_F^2 denotes the squared Frobenius norm, j denotes the j-th layer, and φ denotes the feature extraction network, here a pre-trained VGG16 network; G_j^φ is the Gram matrix of the j-th-layer activation features extracted by the network φ, defined as follows:

G_j^φ(x)_{c,c′} = (1 / (C_j·H_j·W_j)) · Σ_{h,w} φ_j(x)_{h,w,c} · φ_j(x)_{h,w,c′};

where x denotes the input image, φ_j(x)_{h,w,c} denotes the j-th-layer activation feature of x obtained in the feature extraction network φ, c and c′ index the feature channels, h indexes the feature height and w the feature width, and C_j × H_j × W_j is the size of the j-th-layer activation features.
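A numerical sketch of formula thirteen, the L1 term of formula fourteen, and the Gram-based style term, with raw feature maps passed in place of real VGG16 activations; the values of λ1 and λ2 are illustrative defaults, not the patent's.

```python
import numpy as np

def l1_loss(r, g):
    # One term of formula fourteen: mean absolute difference to the label.
    return float(np.abs(r - g).mean())

def gram(feat):
    # Gram matrix of a (C, H, W) activation map, normalized by C*H*W.
    c, h, w = feat.shape
    f = feat.reshape(c, h * w)
    return f @ f.T / (c * h * w)

def style_loss(feat_r, feat_g):
    # Squared Frobenius norm of the Gram-matrix difference for one layer.
    d = gram(feat_r) - gram(feat_g)
    return float((d ** 2).sum())

def total_loss(r, g, feats, lam1=1.0, lam2=0.1):
    # Formula thirteen: l = lam1 * l1 + lam2 * ls, summing the style term
    # over the supplied (enhanced, label) feature pairs per layer.
    ls = sum(style_loss(fr, fg) for fr, fg in feats)
    return lam1 * l1_loss(r, g) + lam2 * ls
```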
The step S4 includes the following steps;
step S41: randomly dividing the matched underwater images and the label images into a plurality of batches, wherein each batch comprises N pairs of images;
step S42: inputting the underwater image into the image enhancement network in the step S2 to obtain a final enhanced image and a stage underwater enhanced image;
step S43: calculating the gradient of each parameter in the image enhancement network with the back-propagation method according to the total target loss function, and updating the network parameters with the stochastic gradient descent method;
step S44: repeating steps S41 to S43, training the image enhancement network batch by batch until its target loss function value converges to a Nash equilibrium, then saving the network parameters to complete the training process of the image enhancement network.
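The training loop of steps S41 to S44, sketched with a toy linear model in place of the enhancement network so that batching, gradient computation, and the stochastic gradient descent update are the focus; convergence of the loss plays the role of the Nash-equilibrium stopping criterion.

```python
import numpy as np

rng = np.random.default_rng(3)
# Paired data: each "underwater image" x has label 2*x, so the toy network
# r = w * x should learn w = 2 in every component.
pairs = [(x, 2.0 * x) for x in rng.normal(size=(32, 4))]

w = np.zeros(4)            # toy network parameters
lr, batch_size = 0.1, 8    # S41: N = 8 pairs per batch

for epoch in range(50):
    idx = rng.permutation(len(pairs))          # S41: random batch division
    for b in range(0, len(pairs), batch_size):
        batch = [pairs[i] for i in idx[b:b + batch_size]]
        grad = np.zeros_like(w)
        for x, g_label in batch:
            r = w * x                          # S42: forward pass
            grad += 2.0 * (r - g_label) * x    # S43: back-propagated gradient
        w -= lr * grad / len(batch)            # S43: SGD parameter update
# S44: repeat until the loss converges, then save the parameters.
```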
Compared with the prior art, in which images enhanced by existing underwater image enhancement methods often show blurred details, the invention provides a progressive-feedback underwater image enhancement network that effectively reduces information loss as the image passes through the network, retains image detail information, avoids detail blurring, and is applicable to most complex scenes.
The invention has the following beneficial effects: the method is suitable for enhancing underwater images in various complex environments, and effectively restores distorted colors, removes image blur, and improves image contrast and brightness. The enhanced image conforms to human subjective visual perception.
Drawings
The invention is described in further detail below with reference to the following figures and detailed description:
FIG. 1 is a schematic flow chart of an implementation of the method of the present invention;
FIG. 2 is a schematic diagram of a network model architecture in an embodiment of the invention;
FIG. 3 is a schematic diagram of a phase network model structure according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating a supervised attention network model architecture in an embodiment of the present invention;
FIG. 5 is a block diagram of a residual attention module in a phase network according to an embodiment of the present invention.
Detailed Description
As shown in the figure, the underwater image enhancement method based on the progressive feedback network comprises the following steps:
step S1: pairing underwater image data for training, and then performing data enhancement and normalization processing on the underwater image data to obtain paired images to be trained;
step S2: inputting paired images to be trained into a multi-stage progressive image enhancement network capable of enhancing images at each stage by combining discrete wavelet transformation and an attention feedback mechanism, training an image enhancement model capable of enhancing underwater images, and correcting the images at each stage of the network by using a supervision attention module;
step S3: setting a target loss function of the image enhancement network;
step S4: using the paired training image to enhance network convergence to Nash balance;
step S5: and carrying out normalization processing on the underwater image to be enhanced, then inputting the trained image enhancement model, and outputting the enhanced image.
The step S1 includes the steps of:
step S11: matching the underwater image for training with the corresponding label image;
step S12: carrying out uniform random turning operation on all paired images to be trained, and enhancing data;
step S13: all images to be trained are normalized, an image I (I, j) is given, and the normalized image isAt pixel position (i, j), a normalized value is calculatedThe formula of (1) is as follows:
wherein, (i, j) represents the position of the pixel, and the normalized paired image is used as the input image and label image pair of the subsequent step.
The step S2 includes step S21, step S22, step S23;
the step S21 specifically includes: designing a multi-stage progressive image enhancement network, wherein the input of the network is a normalized underwater imageOutputting the underwater image after being enhanced; the network is divided into three stages to be progressively executed, each stage is combined with a discrete wavelet transform and an attention feedback mechanism to enhance the image, the network structures of the three stages are the same, and a supervision attention module is used between the stages to supervise the characteristics of the stage, namely the supervision attention module is used to supervise the image enhanced at the stage after the first stage and the second stage.
In the three stages of the progressive execution, the first stage inputs underwater image characteristics F1 after passing through a convolutional layer with a convolutional kernel of 3x3 and a step length of 1inThe output of the first stage is the image feature F1 enhanced by the current stageoutThe feature is compared with the normalized underwater imageThe output of the supervision attention module is the corrected feature F2 as an input to the supervision attention modulein;
The second stage input is the corrected feature F2inThe output of the second stage is the image feature F2 enhanced by the current stageoutThe output features are corrected by the supervision attention module to obtain image features F3 in the same way as in the previous stageinTaking the corrected features as input of the third stage;
the output of the third stage is the image feature F3 enhanced by the current stageoutObtaining a final enhanced image R after a convolution layer with a convolution kernel of 3x3 and a step length of 1, namely the output of the image enhancement network; the calculation formula is as follows:
Fiout=Neti(Fiin) I is 1, 2, 3 formula six;
wherein Representing the normalized underwater input image, SAM (X) representing the network of supervising attention modules, Neti(1), i ═ 1, 2, 3 denote three stage networks in a multi-stage progressive image enhancement network;
the step S22 specifically includes: designing the three-stage network in the step S21, wherein the three-stage network has the same structure; each stage network can be divided into three layers from top to bottom, and each layer consists of a wavelet pooling layer, a residual attention module, a wavelet anti-pooling layer and an attention feedback module.
The wavelet pooling layer decomposes features using discrete Haar wavelets, using four decomposition kernels LLT,LHT,HLT,HHTWherein the low-frequency and high-frequency filters are respectively
The wavelet anti-pooling layer uses discrete Haar wavelets to carry out combined reconstruction on the low-frequency component and the high-frequency component, and the used anti-pooling nuclear parameters are the same as those of the wavelet anti-pooling layer;
the residual attention module consists of convolution with two layers of convolution kernels of 3x3 and step length of 1 and a channel attention network, and the calculation formula is as follows:
Xout=ECA_Net(ADD[Xin,Relu(Conv(Relu(Conv(Xin))))]) A formula of nine;
wherein XinIndicating input features, XoutRepresenting output characteristics, Relu is an activation function, AD represents characteristic addition operation, and ECA _ Net (x) represents a channel attention network;
the attention feedback module consists of a channel attention network and a sigmoid function; the calculation formula is as follows:
W = Sigmoid(ECA_Net(X_in)) (formula ten);
wherein X_in denotes the input features, W denotes the output feature weights, and ECA_Net(·) denotes the channel attention network.
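Formula ten reduces to a few lines. One plausible reading, sketched here as an assumption, is a sigmoid over the ECA channel descriptor, yielding one feedback weight per channel; the identity 1-D kernel again stands in for the learned ECA weights:

```python
import numpy as np

def attention_feedback(x, k=3):
    """Formula ten: W = Sigmoid(ECA_Net(X_in)). Returns one weight per channel,
    in (0, 1), used to rescale the coarser level's low-frequency component."""
    gap = x.mean(axis=(1, 2))            # global average pooling over H, W
    kernel = np.zeros(k)
    kernel[k // 2] = 1.0                 # identity placeholder for learned weights
    return 1.0 / (1.0 + np.exp(-np.convolve(gap, kernel, mode="same")))
```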
The network structure of each stage of the multi-stage progressive image enhancement network is the same; the input of the first level of each stage network is the input of the current stage network, namely the image feature F1_in.
When the first stage of the multi-stage progressive image enhancement network works, the method comprises the following steps:
Step A1: perform forward wavelet decomposition from the first level to the third level. The first level yields the low-frequency component ll_1 and the high-frequency components lh_1, hl_1, hh_1. ll_1 is taken as the input of the second level and, through a wavelet pooling layer identical to that of the first level, is decomposed into the low-frequency component ll_2 and the high-frequency components lh_2, hl_2, hh_2; ll_2 is taken as the input of the third level and, through the same wavelet pooling layer as the first two levels, is decomposed into the low-frequency component ll_3 and the high-frequency components lh_3, hl_3, hh_3.
Step A2: enhance the features in the reverse direction through the residual attention modules, the wavelet unpooling layers and the attention feedback modules. First, the third-level low-frequency component ll_3 and high-frequency components lh_3, hl_3, hh_3 each pass through a residual attention module to obtain the transformed low-frequency component ll_4 and high-frequency components lh_4, hl_4, hh_4. The transformed components are input to a wavelet unpooling layer for joint reconstruction; the output features of the wavelet unpooling layer serve as the input of an attention feedback module, whose output is the feedback weight w3 from the third level to the second-level component ll_2. The second-level low-frequency component ll_2 is multiplied by the feedback weight w3 for correction, yielding the corrected second-level low-frequency component ll_2'.
Step A3: the corrected component ll_2' and the second-level high-frequency components lh_2, hl_2, hh_2 are input to the residual attention module to obtain the transformed low-frequency component ll_5 and high-frequency components lh_5, hl_5, hh_5. The transformed components are input to a wavelet unpooling layer for joint reconstruction; the output features serve as the input of an attention feedback module, whose output is the feedback weight w2 from the second level to the first-level component ll_1. The first-level low-frequency component ll_1 is multiplied by the feedback weight w2 for correction, yielding the corrected first-level low-frequency component ll_1'.
Step A4: the corrected component ll_1' and the first-level high-frequency components lh_1, hl_1, hh_1 are input to the residual attention module to obtain the transformed low-frequency component ll_6 and high-frequency components lh_6, hl_6, hh_6. The transformed components are input to a wavelet unpooling layer for joint reconstruction; the output features serve as the input of an attention feedback module, whose output is the feedback weight w1 applied by the first level to the initial network input features. The initial network input features F1_in are multiplied by the weight w1 to obtain the image feature F1_out enhanced by the current stage, namely the output of the first-stage network.
In the same way, the output F2_out of the second-stage network and the output F3_out of the third-stage network are obtained.
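The control flow of steps A1 to A4 for one stage can be sketched as follows (single-channel NumPy, with an identity transform standing in for the learned residual attention modules and a scalar feedback weight standing in for the per-channel weight vector; both are hypothetical simplifications):

```python
import numpy as np

LOW = np.array([1.0, 1.0]) / np.sqrt(2.0)
HIGH = np.array([1.0, -1.0]) / np.sqrt(2.0)
KERNELS = {"ll": np.outer(LOW, LOW), "lh": np.outer(LOW, HIGH),
           "hl": np.outer(HIGH, LOW), "hh": np.outer(HIGH, HIGH)}

def pool(x):
    """One level of Haar wavelet pooling; x: (H, W) with even H, W."""
    h, w = x.shape
    b = x.reshape(h // 2, 2, w // 2, 2).transpose(0, 2, 1, 3)
    return {n: np.einsum("ijkl,kl->ij", b, k) for n, k in KERNELS.items()}

def unpool(sub):
    """Joint reconstruction from the four subbands (wavelet unpooling)."""
    h, w = sub["ll"].shape
    b = sum(np.einsum("ij,kl->ijkl", sub[n], k) for n, k in KERNELS.items())
    return b.transpose(0, 2, 1, 3).reshape(2 * h, 2 * w)

def ram(sub):
    # Identity placeholder for the learned residual attention modules.
    return sub

def afm(feat):
    # Attention feedback placeholder: a single sigmoid weight from the
    # mean response (the patent uses a per-channel weight vector).
    return 1.0 / (1.0 + np.exp(-feat.mean()))

def stage(f1_in):
    """One stage of the progressive network, steps A1-A4, single channel."""
    # A1: top-down Haar decomposition over three levels.
    s1 = pool(f1_in)
    s2 = pool(s1["ll"])
    s3 = pool(s2["ll"])
    # A2: enhance level 3, reconstruct, feed weight w3 back to level-2 ll.
    w3 = afm(unpool(ram(s3)))
    s2["ll"] = s2["ll"] * w3
    # A3: enhance corrected level 2, feed weight w2 back to level-1 ll.
    w2 = afm(unpool(ram(s2)))
    s1["ll"] = s1["ll"] * w2
    # A4: enhance corrected level 1, then weight the stage input itself.
    w1 = afm(unpool(ram(s1)))
    return f1_in * w1
```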
Step S23 specifically includes: designing the supervised attention module network described in step S21. The network input is the enhanced image features F1_out or F2_out output by the first- or second-stage network, together with the normalized underwater image. Taking the first supervised attention module as an example, the input image features F1_out first pass through a convolution with a 3x3 kernel and stride 1 to obtain a residual image, and the residual image is added pixel-wise to the underwater image to obtain the stage underwater enhanced image R1; the loss between the stage underwater enhanced image R1 and the normalized label image G is calculated according to step S3. Then, the stage underwater enhanced image R1 is input to a convolution layer with a 3x3 kernel and stride 1, and a sigmoid function produces the feature weight w4, which serves as the guidance weight for the input image features of the supervised attention module: the originally input enhanced image features F1_out are multiplied by the weight w4 to obtain the corrected enhanced image features F2_in, namely the output of the supervised attention module network.
The enhanced image features F1_out and F2_out output by the first- and second-stage networks thus pass through the supervised attention module network to obtain the corrected features F2_in and F3_in respectively, and to generate the stage underwater enhanced images R1 and R2.
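A sketch of the supervised attention module's data flow, with hypothetical zero-initialized weights standing in for the learned convolutions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv3x3(x, w):
    """3x3, stride-1, zero-padded convolution. x: (C_in, H, W), w: (C_out, C_in, 3, 3)."""
    c_out = w.shape[0]
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((c_out, h, wd))
    for i in range(3):
        for j in range(3):
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], xp[:, i:i + h, j:j + wd])
    return out

def supervised_attention(f_out, img, w_res, w_att):
    """Data flow of the supervised attention module: features -> residual
    image -> stage image R1 -> sigmoid guidance weight w4, which rescales
    the original features into the corrected ones (F2_in)."""
    residual = conv3x3(f_out, w_res)      # residual image, 3 channels
    r1 = img + residual                   # stage underwater enhanced image R1
    w4 = sigmoid(conv3x3(r1, w_att))      # per-pixel, per-channel guidance weight
    return f_out * w4, r1
```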
The step S3 includes the following steps;
step S31: designing a network target loss function, wherein the network total target loss function is as follows:
l = λ1·l1 + λ2·ls (formula thirteen);
wherein l1 and ls are the L1 loss and the style (Gram) loss respectively, λ1 and λ2 are the balance coefficients of the two losses, and · denotes multiplication of real numbers; the specific calculation formula of each loss is as follows:
l1 = ||R − G||_1 + ||R1 − G||_1 + ||R2 − G||_1 (formula fourteen);
wherein R is the final output of the designed underwater image enhancement network, R1 and R2 are the stage underwater enhanced images generated in the two supervised attention modules, G is the normalized label image, and ||·||_1 denotes the absolute-value (L1) norm;
the style loss is ls = Σ_j ||G_j^φ(R) − G_j^φ(G)||_F^2, wherein ||·||_F^2 denotes the squared Frobenius norm, j denotes the j-th layer, φ denotes the feature extraction network (a pre-trained VGG16 network is used here), and G_j^φ denotes the Gram matrix of the j-th layer activation features extracted by the network φ, defined as follows:
G_j^φ(x)_{c,c'} = (1 / (C_j·H_j·W_j)) · Σ_{h,w} φ_j(x)_{h,w,c} · φ_j(x)_{h,w,c'}; where x denotes the input image, φ denotes the feature extraction network (a pre-trained VGG16 network), φ_j(x)_{h,w,c} denotes the j-th layer activation features of the input image x in the feature extraction network φ, φ_j(x)_{h,w,c'} denotes the transpose of those activation features along the channel dimension, c and c' index the feature channels, h denotes the feature height, w denotes the feature width, and C_j×H_j×W_j is the size of the j-th layer activation features.
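The two loss terms can be written directly from the definitions above (a sketch; `feats_r`/`feats_g` would be VGG16 activations in practice, here arbitrary arrays, and the λ values are left as arbitrary defaults):

```python
import numpy as np

def l1_loss(outputs, target):
    """Formula fourteen: sum of L1 distances of R, R1, R2 to the label G."""
    return sum(np.abs(o - target).sum() for o in outputs)

def gram(feat):
    """Gram matrix of a (C, H, W) activation, normalised by C*H*W."""
    c, h, w = feat.shape
    f = feat.reshape(c, h * w)
    return f @ f.T / (c * h * w)

def style_loss(feats_r, feats_g):
    """Squared Frobenius distance between Gram matrices, summed over layers."""
    return sum(np.sum((gram(fr) - gram(fg)) ** 2)
               for fr, fg in zip(feats_r, feats_g))

def total_loss(outputs, feats_r, feats_g, target, lam1=1.0, lam2=1.0):
    """Formula thirteen: l = λ1·l1 + λ2·ls."""
    return lam1 * l1_loss(outputs, target) + lam2 * style_loss(feats_r, feats_g)
```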
The step S4 includes the following steps;
step S41: randomly dividing the matched underwater images and the label images into a plurality of batches, wherein each batch comprises N pairs of images;
step S42: inputting the underwater image into the image enhancement network in the step S2 to obtain a final enhanced image and a stage underwater enhanced image;
step S43: calculating the gradient of each parameter in the image enhancement network by a back-propagation method according to the total target loss function of the image enhancement network, and updating the parameters of the image enhancement network by the stochastic gradient descent method;
step S44: repeating steps S41 to S43 in units of batches until the target loss function value of the image enhancement network converges to Nash equilibrium, and saving the network parameters to complete the training process of the image enhancement network.
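Steps S41 to S44 amount to a standard mini-batch SGD loop. The sketch below replaces the enhancement network with a one-parameter toy model so the gradient has a closed form; it illustrates only the batching/backprop/update cycle, not the patent's network or losses:

```python
import numpy as np

# Toy stand-in for steps S41-S44: random batches, forward pass, gradient,
# SGD update, repeated until the loss settles. The one-parameter "network"
# pred = w * x is a deliberate simplification.
rng = np.random.default_rng(0)
inputs = rng.random((8, 4))       # 8 paired "underwater" samples
labels = 2.0 * inputs             # toy labels: the target map is y = 2x
w = 0.0                           # single network parameter
lr, batch_size = 0.1, 4

for epoch in range(200):                              # S44: repeat until convergence
    order = rng.permutation(len(inputs))              # S41: random batching
    for start in range(0, len(inputs), batch_size):
        idx = order[start:start + batch_size]
        x, y = inputs[idx], labels[idx]
        pred = w * x                                  # S42: forward pass
        grad = 2.0 * np.mean((pred - y) * x)          # S43: d(MSE)/dw by backprop
        w = w - lr * grad                             # S43: SGD parameter update
```

After training, `w` recovers the target map's slope of 2 to high precision, the toy analogue of the loss converging in step S44.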
The above are preferred embodiments of the present invention; all changes that produce equivalent functional effects according to the technical scheme of the present invention, without exceeding the scope of the technical scheme, belong to the protection scope of the present invention.
Claims (10)
1. An underwater image enhancement method based on a progressive feedback network, characterized in that the method comprises the following steps:
step S1: pairing underwater image data for training, and then performing data enhancement and normalization processing on the underwater image data to obtain paired images to be trained;
step S2: inputting paired images to be trained into a multi-stage progressive image enhancement network capable of enhancing images at each stage by combining discrete wavelet transformation and an attention feedback mechanism, training an image enhancement model capable of enhancing underwater images, and correcting the images at each stage of the network by using a supervision attention module;
step S3: setting a target loss function of the image enhancement network;
step S4: training the image enhancement network with the paired images to be trained until it converges to Nash equilibrium;
step S5: and carrying out normalization processing on the underwater image to be enhanced, then inputting the trained image enhancement model, and outputting the enhanced image.
2. The underwater image enhancement method based on the progressive feedback network as claimed in claim 1, wherein: the step S1 includes the steps of:
step S11: matching the underwater image for training with the corresponding label image;
step S12: carrying out uniform random turning operation on all paired images to be trained, and enhancing data;
step S13: normalizing all images to be trained; given an image I(i, j), a normalized value is computed at each pixel position (i, j) to obtain the normalized image, wherein (i, j) denotes the pixel position; the normalized paired images serve as the input image and label image pairs for the subsequent steps.
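The normalization formula itself is not recoverable from this text; per-image min-max scaling to [0, 1] is a common choice and is used below purely as an assumption:

```python
import numpy as np

def normalize(img):
    """Per-image min-max scaling to [0, 1] (an assumed normalization; the
    patent's exact formula is lost in this extraction)."""
    img = img.astype(float)
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros_like(img)    # constant image: no contrast to rescale
    return (img - lo) / (hi - lo)
```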
3. The underwater image enhancement method based on the progressive feedback network as claimed in claim 1, wherein: the step S2 includes step S21, step S22, step S23;
the step S21 specifically includes: designing a multi-stage progressive image enhancement network whose input is the normalized underwater image and whose output is the enhanced underwater image; the network is executed progressively in three stages, each stage combining a discrete wavelet transform and an attention feedback mechanism to enhance the image; the three stage networks have the same structure, and a supervised attention module is used between stages to supervise the stage features, i.e. the supervised attention module supervises the image enhanced at the stage after the first stage and after the second stage.
4. The underwater image enhancement method based on the progressive feedback network as claimed in claim 3, wherein: in the three progressively executed stages, the input of the first stage is the underwater image features F1_in obtained after a convolution layer with a 3x3 kernel and stride 1, and the output of the first stage is the image feature F1_out enhanced by the current stage; this feature and the normalized underwater image are input to the supervised attention module, whose output is the corrected feature F2_in;
the input of the second stage is the corrected feature F2_in, and the output of the second stage is the image feature F2_out enhanced by the current stage; the output features are corrected by the supervised attention module, in the same way as in the previous stage, to obtain the image features F3_in, and the corrected features are taken as the input of the third stage;
the output of the third stage is the image feature F3_out enhanced by the current stage, which passes through a convolution layer with a 3x3 kernel and stride 1 to obtain the final enhanced image R, namely the output of the image enhancement network; the calculation formula is as follows:
Fi_out = Net_i(Fi_in), i = 1, 2, 3 (formula six).
5. The underwater image enhancement method based on the progressive feedback network as claimed in claim 3, wherein: the step S22 specifically includes: designing the three stage networks described in step S21, which share the same structure; each stage network is divided into three levels from top to bottom, and each level consists of a wavelet pooling layer, a residual attention module, a wavelet unpooling layer and an attention feedback module.
6. The underwater image enhancement method based on the progressive feedback network as claimed in claim 5, wherein: the wavelet pooling layer decomposes features using the discrete Haar wavelet, with four decomposition kernels LL^T, LH^T, HL^T and HH^T, where the low-frequency filter L and the high-frequency filter H are the standard Haar pair L^T = (1/√2)[1, 1] and H^T = (1/√2)[1, −1];
the wavelet unpooling layer uses the discrete Haar wavelet to jointly reconstruct the low-frequency and high-frequency components, and its unpooling kernel parameters are the same as those of the wavelet pooling layer;
the residual attention module consists of two convolution layers with 3x3 kernels and stride 1 and a channel attention network; the calculation formula is as follows:
X_out = ECA_Net(ADD[X_in, Relu(Conv(Relu(Conv(X_in))))]) (formula nine);
wherein X_in denotes the input features, X_out denotes the output features, Relu is the activation function, ADD denotes the feature addition operation, and ECA_Net(·) denotes the channel attention network;
the attention feedback module consists of a channel attention network and a sigmoid function; the calculation formula is as follows:
W = Sigmoid(ECA_Net(X_in)) (formula ten);
wherein X_in denotes the input features, W denotes the output feature weights, and ECA_Net(·) denotes the channel attention network.
7. The underwater image enhancement method based on the progressive feedback network as claimed in claim 6, wherein: the network structure of each stage of the multi-stage progressive image enhancement network is the same; the input of the first level of each stage network is the input of the current stage network, namely the image feature F1_in.
When the first stage of the multi-stage progressive image enhancement network works, the method comprises the following steps:
Step A1: perform forward wavelet decomposition from the first level to the third level. The first level yields the low-frequency component ll_1 and the high-frequency components lh_1, hl_1, hh_1. ll_1 is taken as the input of the second level and, through a wavelet pooling layer identical to that of the first level, is decomposed into the low-frequency component ll_2 and the high-frequency components lh_2, hl_2, hh_2; ll_2 is taken as the input of the third level and, through the same wavelet pooling layer as the first two levels, is decomposed into the low-frequency component ll_3 and the high-frequency components lh_3, hl_3, hh_3.
Step A2: enhance the features in the reverse direction through the residual attention modules, the wavelet unpooling layers and the attention feedback modules. First, the third-level low-frequency component ll_3 and high-frequency components lh_3, hl_3, hh_3 each pass through a residual attention module to obtain the transformed low-frequency component ll_4 and high-frequency components lh_4, hl_4, hh_4. The transformed components are input to a wavelet unpooling layer for joint reconstruction; the output features of the wavelet unpooling layer serve as the input of an attention feedback module, whose output is the feedback weight w3 from the third level to the second-level component ll_2. The second-level low-frequency component ll_2 is multiplied by the feedback weight w3 for correction, yielding the corrected second-level low-frequency component ll_2'.
Step A3: the corrected component ll_2' and the second-level high-frequency components lh_2, hl_2, hh_2 are input to the residual attention module to obtain the transformed low-frequency component ll_5 and high-frequency components lh_5, hl_5, hh_5. The transformed components are input to a wavelet unpooling layer for joint reconstruction; the output features serve as the input of an attention feedback module, whose output is the feedback weight w2 from the second level to the first-level component ll_1. The first-level low-frequency component ll_1 is multiplied by the feedback weight w2 for correction, yielding the corrected first-level low-frequency component ll_1'.
Step A4: the corrected component ll_1' and the first-level high-frequency components lh_1, hl_1, hh_1 are input to the residual attention module to obtain the transformed low-frequency component ll_6 and high-frequency components lh_6, hl_6, hh_6. The transformed components are input to a wavelet unpooling layer for joint reconstruction; the output features serve as the input of an attention feedback module, whose output is the feedback weight w1 applied by the first level to the initial network input features. The initial network input features F1_in are multiplied by the weight w1 to obtain the image feature F1_out enhanced by the current stage, namely the output of the first-stage network.
In the same way, the output F2_out of the second-stage network and the output F3_out of the third-stage network are obtained.
8. The underwater image enhancement method based on the progressive feedback network as claimed in claim 3, wherein: the step S23 specifically includes: designing the supervised attention module network described in step S21. The network input is the enhanced image features F1_out or F2_out output by the first- or second-stage network, together with the normalized underwater image. Taking the first supervised attention module as an example, the input image features F1_out first pass through a convolution with a 3x3 kernel and stride 1 to obtain a residual image, and the residual image is added pixel-wise to the underwater image to obtain the stage underwater enhanced image R1; the loss between the stage underwater enhanced image R1 and the normalized label image G is calculated according to step S3. Then, the stage underwater enhanced image R1 is input to a convolution layer with a 3x3 kernel and stride 1, and a sigmoid function produces the feature weight w4, which serves as the guidance weight for the input image features of the supervised attention module: the originally input enhanced image features F1_out are multiplied by the weight w4 to obtain the corrected enhanced image features F2_in, namely the output of the supervised attention module network.
The enhanced image features F1_out and F2_out output by the first- and second-stage networks thus pass through the supervised attention module network to obtain the corrected features F2_in and F3_in respectively, and to generate the stage underwater enhanced images R1 and R2.
9. The underwater image enhancement method based on the progressive feedback network as claimed in claim 1, wherein: the step S3 includes the following steps;
step S31: designing a network target loss function, wherein the network total target loss function is as follows:
l = λ1·l1 + λ2·ls (formula thirteen);
wherein l1 and ls are the L1 loss and the style (Gram) loss respectively, λ1 and λ2 are the balance coefficients of the two losses, and · denotes multiplication of real numbers; the specific calculation formula of each loss is as follows:
l1 = ||R − G||_1 + ||R1 − G||_1 + ||R2 − G||_1 (formula fourteen);
wherein R is the final output of the designed underwater image enhancement network, R1 and R2 are the stage underwater enhanced images generated in the two supervised attention modules, G is the normalized label image, and ||·||_1 denotes the absolute-value (L1) norm;
the style loss is ls = Σ_j ||G_j^φ(R) − G_j^φ(G)||_F^2, wherein ||·||_F^2 denotes the squared Frobenius norm, j denotes the j-th layer, φ denotes the feature extraction network (a pre-trained VGG16 network is used here), and G_j^φ denotes the Gram matrix of the j-th layer activation features extracted by the network φ, defined as follows:
G_j^φ(x)_{c,c'} = (1 / (C_j·H_j·W_j)) · Σ_{h,w} φ_j(x)_{h,w,c} · φ_j(x)_{h,w,c'}; where x denotes the input image, φ denotes the feature extraction network (a pre-trained VGG16 network), φ_j(x)_{h,w,c} denotes the j-th layer activation features of the input image x in the feature extraction network φ, φ_j(x)_{h,w,c'} denotes the transpose of those activation features along the channel dimension, c and c' index the feature channels, h denotes the feature height, w denotes the feature width, and C_j×H_j×W_j is the size of the j-th layer activation features.
10. The underwater image enhancement method based on the progressive feedback network as claimed in claim 1, wherein: the step S4 includes the following steps;
step S41: randomly dividing the matched underwater images and the label images into a plurality of batches, wherein each batch comprises N pairs of images;
step S42: inputting the underwater image into the image enhancement network in the step S2 to obtain a final enhanced image and a stage underwater enhanced image;
step S43: calculating the gradient of each parameter in the image enhancement network by a back-propagation method according to the total target loss function of the image enhancement network, and updating the parameters of the image enhancement network by the stochastic gradient descent method;
step S44: repeating steps S41 to S43 in units of batches until the target loss function value of the image enhancement network converges to Nash equilibrium, and saving the network parameters to complete the training process of the image enhancement network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110935907.7A CN113658072B (en) | 2021-08-16 | 2021-08-16 | Underwater image enhancement method based on progressive feedback network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113658072A true CN113658072A (en) | 2021-11-16 |
CN113658072B CN113658072B (en) | 2023-08-08 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018227105A1 (en) * | 2017-06-08 | 2018-12-13 | The United States Of America, As Represented By The Secretary, Department Of Health And Human Services | Progressive and multi-path holistically nested networks for segmentation |
CN112288658A (en) * | 2020-11-23 | 2021-01-29 | 杭州师范大学 | Underwater image enhancement method based on multi-residual joint learning |
CN112581373A (en) * | 2020-12-14 | 2021-03-30 | 北京理工大学 | Image color correction method based on deep learning |
CN112837232A (en) * | 2021-01-13 | 2021-05-25 | 山东省科学院海洋仪器仪表研究所 | Underwater image enhancement and detail recovery method |
Non-Patent Citations (2)
Title |
---|
张清博; 张晓晖; 韩宏伟: "A method for optimizing underwater photoelectric image quality based on a deep convolutional neural network", Acta Optica Sinica (光学学报), no. 11 *
林森; 刘世本; 唐延东: "Underwater image enhancement by a multi-input fusion adversarial network", Infrared and Laser Engineering (红外与激光工程), no. 05 *
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |