CN113160198A - Image quality enhancement method based on channel attention mechanism - Google Patents


Info

Publication number
CN113160198A
CN113160198A
Authority
CN
China
Prior art keywords
image
quality
channels
convolution
low
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202110474102.7A
Other languages
Chinese (zh)
Inventor
颜成钢 (Yan Chenggang)
肇恒润 (Zhao Hengrun)
孙垚棋 (Sun Yaoqi)
张继勇 (Zhang Jiyong)
李宗鹏 (Li Zongpeng)
张勇东 (Zhang Yongdong)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN202110474102.7A priority Critical patent/CN113160198A/en
Publication of CN113160198A publication Critical patent/CN113160198A/en
Withdrawn legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/90Dynamic range modification of images or parts thereof
    • G06T5/92Dynamic range modification of images or parts thereof based on global image properties
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]


Abstract

The invention discloses an image quality enhancement method based on a channel attention mechanism. High-quality original images are first degraded into corresponding low-quality images to form low-quality/high-quality image pairs; an image enhancement network model is then built and trained on the resulting low-quality images; finally, a low-quality image is input into the trained model to obtain a high-quality image. The method uses a neural network with a residual network module and a channel attention module as the enhancement model; through the cooperation of the residual network and the channel attention module, it produces high-quality images with richer detail from low-quality inputs.

Description

Image quality enhancement method based on channel attention mechanism
Technical Field
The invention relates to the field of digital image processing and computer vision, in particular to an image quality enhancement method based on a channel attention mechanism.
Background
Compared with text, images convey information more vividly, intuitively and artistically, and are an important medium through which people share and exchange information. Image enhancement techniques turn low-quality, detail-deficient images into high-quality, detail-rich ones.
Currently, image enhancement techniques fall into three categories: spatial-domain, frequency-domain and learning-based methods. Spatial-domain methods operate directly on the image pixels. Frequency-domain methods modify the transform coefficients of the image in some transform domain and then apply the inverse transform to obtain the enhanced image. With the rapid development of deep learning, learning-based image enhancement algorithms have become the focus of research in recent years. These methods train a model on a large number of high-quality images, introduce the prior knowledge the model has learned when restoring low-quality images, and train a neural network to learn the correspondence between low-quality images and their high-quality counterparts, thereby recovering richer detail and achieving a satisfactory enhancement effect. Deep-learning-based methods include LL-NET, MBLLEN, LightenNet and others. These models have a certain image enhancement capability, but because of their limited network depth the amount of detail they can extract is limited, so the generated images are not sharp enough and lack detail.
Disclosure of Invention
Based on these problems in the prior art, the invention aims to provide an image quality enhancement method based on a channel attention mechanism, which overcomes the insufficient sharpness and lack of detail of existing image enhancement methods.
The technical scheme adopted by the invention to solve these problems is as follows:
an image quality enhancement method based on a channel attention mechanism comprises the following steps:
step one, processing the high-quality original image into a corresponding low-quality image to obtain a low-quality-high-quality image contrast group.
And step two, building an image enhancement network model.
And step three, training an image enhancement network model through the low-quality image obtained in the step one.
And step four, inputting the low-quality image into the trained image enhancement network model to obtain a high-quality image.
As can be seen from the technical scheme above, the image enhancement method provided by the embodiments of the invention has the following beneficial effect:
A neural network with a residual network module and a channel attention module is used as the image enhancement model; working together, the residual network and the channel attention module produce high-quality images with richer detail from low-quality inputs.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention;
FIG. 2 is a schematic diagram of a structure of an image enhancement network model according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating an original low quality image according to an embodiment of the present invention;
FIG. 4 is the image output after the original image is enhanced by the image enhancement method provided by the embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the specific contents of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention. Details which are not described in detail in the embodiments of the invention belong to the prior art which is known to the person skilled in the art.
As shown in fig. 1, an embodiment of the present invention provides an image enhancement method for enhancing a low-quality image by using a neural network model having a residual network module and a channel attention module, including:
step one, processing a high-quality original image into a corresponding low-quality image to obtain a low-quality-high-quality image comparison group, wherein the specific method comprises the following steps:
reducing the quality of the image by JPEG compressing (QF <30) or changing the exposure (increasing or decreasing the exposure within +/-3 EV) to obtain a low-quality-high-quality image picture group; the problem that in practice, low-quality pictures with only single controllable degradation factors cannot be collected at the same time, and a large amount of data cannot be provided for deep learning can be solved.
In order to enlarge the data set and give the network better generalization, the same random flip and rotation operations are applied to each paired low-quality/high-quality group; to speed up training, the pairs are cropped to 128 × 128 images at randomly selected positions.
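The paired augmentation above can be sketched as follows; the key point is that the randomly chosen flip, rotation and 128 × 128 crop are applied identically to both images of a pair (all names here are illustrative, not from the patent):

```python
import numpy as np

def augment_pair(low, high, rng):
    """Apply the SAME random flip, rotation and 128x128 crop to both
    images of a low/high pair (HxWx3 arrays). Illustrative sketch."""
    if rng.random() < 0.5:                        # random horizontal flip
        low, high = low[:, ::-1], high[:, ::-1]
    k = int(rng.integers(0, 4))                   # rotate by k * 90 degrees
    low, high = np.rot90(low, k), np.rot90(high, k)
    h, w = low.shape[:2]
    y = int(rng.integers(0, h - 128 + 1))         # shared random crop origin
    x = int(rng.integers(0, w - 128 + 1))
    return low[y:y + 128, x:x + 128], high[y:y + 128, x:x + 128]

rng = np.random.default_rng(0)
base = rng.random((160, 160, 3))                  # stand-in image
low_crop, high_crop = augment_pair(base, base, np.random.default_rng(1))
```

Because both images receive the same operations, identical inputs stay identical after augmentation, which keeps the pair pixel-aligned.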
step two, building an image enhancement network model;
the image enhancement network model comprises an image enhancement network model and is composed of a down-sampling part, a residual error network part, a channel attention module and an up-sampling part. The down-sampling part comprises two stages of Pixel unscuffle and two-dimensional convolution with convolution kernel size of 3 x 3, the residual network part comprises a residual block part consisting of N residual blocks with the same structure and two-dimensional convolution with convolution kernel size of 3 x 3, and the up-sampling part comprises two stages of Pixel Shuffle and two-dimensional convolution with convolution kernel size of 3 x 3.
The input image is a low-quality image processed in the first step.
The structure of the image enhancement network model is shown in fig. 2, in which:
a Down-sampling part (Down Scale) performs four-time Down-sampling processing on an image by adopting a two-stage Pixel unschuffle method, inputs the image with RGB three channels, firstly reduces the length and width of the image into 1/2 of the input image by using a Pixel unschuffle method with a scaling multiple of 2, the number of channels is changed into 12, then converts the image from 12 channels into 64 channels by using two-dimensional convolution with a convolution kernel size of 3 x 3, then reduces the length and width of the image into 1/4 of the input image by using a Pixel unschuffle method with a scaling multiple of 2, the number of channels is changed into 256, and finally converts the image from 256 channels into 64 channels by using two-dimensional convolution with a convolution kernel size of 3 x 3.
The residual block part consists of N residual blocks (Residual Block) with the same structure; each residual block is composed of a 3 × 3 two-dimensional convolution, a linear rectification unit (ReLU) and another 3 × 3 convolution, and the number N is 16. Building a deep network from many residual blocks lets shallow information propagate to the deep network layers, avoids problems such as vanishing gradients and network degradation, and greatly strengthens the network's feature extraction and mapping capability. The residual block used here contains only a 3 × 3 convolution, a linear rectification unit and a 3 × 3 convolution; compared with conventional ResNet, the batch normalization layers are removed, a modification that markedly improves performance and effectively reduces GPU memory usage.
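A minimal sketch of this batch-norm-free residual block (conv 3 × 3 → ReLU → conv 3 × 3, plus an identity skip) is given below; the naive convolution routine and the weight shapes are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def conv3x3(x, weight):
    """Naive 'same'-padded 3x3 convolution; x: (C_in, H, W), weight: (C_out, C_in, 3, 3)."""
    c_in, h, w = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((weight.shape[0], h, w))
    for i in range(3):                      # accumulate the 9 filter taps
        for j in range(3):
            out += np.tensordot(weight[:, :, i, j], xp[:, i:i + h, j:j + w], axes=1)
    return out

def residual_block(x, w1, w2):
    """conv3x3 -> ReLU -> conv3x3, plus the identity skip; no batch normalization."""
    return x + conv3x3(np.maximum(conv3x3(x, w1), 0.0), w2)

rng = np.random.default_rng(0)
feat = rng.random((4, 6, 6))                # small 4-channel feature map
out = residual_block(feat,
                     rng.standard_normal((4, 4, 3, 3)) * 0.1,
                     rng.standard_normal((4, 4, 3, 3)) * 0.1)
```

With zero weights the block reduces to the identity, which is exactly what lets shallow information pass unchanged to deeper layers.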
Conv3 × 3 is a two-dimensional convolution with a convolution kernel size of 3 × 3.
The channel attention module (SeBlock) is composed of a 3 × 3 two-dimensional convolution with stride 2, a global pooling layer, a fully connected layer, a linear rectification unit, another fully connected layer and a Sigmoid activation function. SeBlock adopts a feature-recalibration strategy that explicitly models the interdependencies between feature channels. It first applies a Squeeze operation to the convolved feature map to obtain channel-level global features, then applies an Excitation operation to those global features to learn the relationships among channels and obtain per-channel weights, and finally multiplies the weights with the original feature map to produce the final features. This attention mechanism lets the model focus on the most informative channel features while suppressing unimportant ones, so that the features are better utilized.
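The Squeeze/Excitation/recalibration steps can be sketched as follows; the initial stride-2 convolution is omitted and the fully connected weights are random placeholders, so this is only an illustration of the channel gating, not the patent's exact module:

```python
import numpy as np

def se_block(x, w1, w2):
    """Squeeze-and-Excitation gating on a (C, H, W) feature map.

    w1: (C//r, C) and w2: (C, C//r) are the two fully connected layers
    (reduction ratio r); all weights here are illustrative placeholders."""
    s = x.mean(axis=(1, 2))                  # Squeeze: global average pooling -> (C,)
    z = np.maximum(w1 @ s, 0.0)              # FC + linear rectification unit
    gate = 1.0 / (1.0 + np.exp(-(w2 @ z)))   # FC + Sigmoid -> per-channel weights in (0, 1)
    return x * gate[:, None, None]           # recalibrate: reweight each channel

c, r = 64, 16
rng = np.random.default_rng(0)
feat = rng.random((c, 8, 8))
out = se_block(feat,
               rng.standard_normal((c // r, c)),
               rng.standard_normal((c, c // r)))
```

Since each gate value lies in (0, 1), a channel is never amplified, only kept or suppressed in proportion to its learned importance.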
The up-sampling part (Up Scale) up-samples the image by a factor of four using a two-stage Pixel Shuffle method. First, a 3 × 3 two-dimensional convolution changes the number of channels of the feature map from 64 to 256, and a Pixel Shuffle with scaling factor 2 doubles the length and width, leaving 64 channels. A second 3 × 3 two-dimensional convolution again changes the number of channels from 64 to 256, and a second Pixel Shuffle with scaling factor 2 brings the length and width to four times the input feature map, again leaving 64 channels. Finally, a 3 × 3 two-dimensional convolution converts the 64-channel feature map into the network-enhanced RGB three-channel high-quality output image.
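Pixel Shuffle is the depth-to-space inverse of the Pixel Unshuffle used in the down-sampling part; the generic sketch below (not the patent's code) shows one stage rearranging 256 channels into 64 while doubling the spatial size, and verifies the inverse relationship:

```python
import numpy as np

def pixel_unshuffle(x, r):
    """Space-to-depth: (C, H, W) -> (C*r*r, H//r, W//r)."""
    c, h, w = x.shape
    return (x.reshape(c, h // r, r, w // r, r)
             .transpose(0, 2, 4, 1, 3)
             .reshape(c * r * r, h // r, w // r))

def pixel_shuffle(x, r):
    """Depth-to-space: (C*r*r, H, W) -> (C, H*r, W*r), the inverse of Pixel Unshuffle."""
    c2, h, w = x.shape
    c = c2 // (r * r)
    return (x.reshape(c, r, r, h, w)
             .transpose(0, 3, 1, 4, 2)
             .reshape(c, h * r, w * r))

feat = np.arange(256 * 4 * 4, dtype=np.float64).reshape(256, 4, 4)
up = pixel_shuffle(feat, 2)    # one stage: 256 channels -> 64, 4x4 -> 8x8
```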
Step three, training an image enhancement network model;
the training uses Adam optimizer as optimizer and charbonier Loss as Loss function. In a graph enhancement task, direct corresponding relations between low-quality images and high-quality images are not in one-to-one correspondence and are completely determined, the information content in the low-quality images is far less than that of the high-quality images, one low-quality image can have multiple corresponding high-quality images, the images generated by training of L1 and L2 norms commonly used in deep learning cannot well capture comprehensive characteristics of potential high-quality images, and the enhanced images are often too smooth, so that the Charbonnier Loss with stronger noise resistance is used as a Loss function, a network can have higher convergence speed, and the generated images have sharper details.
During training, the initial learning rate is set to 1e-4 and the learning rate is halved every n epochs. The higher learning rate in the early stage lets the network converge quickly, while the lower learning rate in the later stage allows the model to be fine-tuned for a better result. The number of epochs n is determined by the size of the training data set and the training effect, and training ends when the learning rate falls below 1e-6.
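The halving schedule and the 1e-6 stopping rule can be sketched directly; the value n = 10 below is only an assumed example, since the patent leaves n to be tuned:

```python
def learning_rate(epoch, n, lr0=1e-4):
    """Initial rate lr0, halved once every n epochs."""
    return lr0 * 0.5 ** (epoch // n)

# Assumed example with n = 10: run until the rate drops below the
# 1e-6 stopping threshold described in the text.
schedule = []
epoch = 0
while learning_rate(epoch, 10) >= 1e-6:
    schedule.append(learning_rate(epoch, 10))
    epoch += 1
```

With n = 10 the rate reaches 1e-4 · 0.5^6 ≈ 1.56e-6 during epochs 60–69; at epoch 70 it would fall to ≈ 7.8e-7, below the threshold, so training stops.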
Step four, inputting the low-quality image into the image enhancement network model to obtain a high-quality image; the specific processing flow is as follows:
the input low quality images are input into a trained image enhancement network model, as shown in fig. 3. The image is changed into a feature map of 64 channels through Down sampling (Down Scale), and then feature extraction and feature mapping are performed through N Residual blocks (Residual blocks), wherein the larger the number N of the Residual blocks is, the deeper the network depth is, the stronger the feature extraction and mapping capability is, but more computing resources are consumed at the same time. And after convolution of 3 × 3, the feature graph output by the Residual Block is added with the 64-channel feature graph before entering the Residual network part, the output feature graph is input into the SeBlock, and different channels of the feature graph are distributed with different weights by applying an attention mechanism. Then, 4 times of upsampling operation is performed to restore the feature map to the size of the input image, and finally, a convolution of 3 × 3 is performed to convert the 64-channel feature map into a 3-channel RGB image for final output, as shown in fig. 4.
Further, the number N of the residual blocks is preferably 16.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (6)

1. An image quality enhancement method based on a channel attention mechanism is characterized by comprising the following steps:
processing a high-quality original image into a corresponding low-quality image to obtain a low-quality-high-quality image contrast group;
step two, building an image enhancement network model;
step three, training an image enhancement network model through the low-quality image obtained in the step one;
and step four, inputting the low-quality image into the trained image enhancement network model to obtain a high-quality image.
2. The image quality enhancement method based on the channel attention mechanism as claimed in claim 1, wherein the step one is as follows:
reducing the quality of the image by JPEG compression (QF < 30) or by changing the exposure (increasing or decreasing it within ±3 EV) to obtain low-quality/high-quality image pairs;
and carrying out the same random inversion and rotation operation on the paired low-quality-high-quality image contrast groups, and carrying out cutting operation of randomly selecting cutting positions on the low-quality-high-quality image contrast groups to cut the low-quality-high-quality image contrast groups into 128-by-128 images in order to increase the training speed.
3. The image quality enhancement method based on the channel attention mechanism, characterized in that the specific method of step two is as follows:
the image enhancement network model is composed of a down-sampling part, a residual network part, a channel attention module and an up-sampling part; the down-sampling part comprises two stages of Pixel Unshuffle and two-dimensional convolutions with 3 × 3 kernels, the residual network part comprises N identically structured residual blocks and a two-dimensional convolution with a 3 × 3 kernel, and the up-sampling part comprises two stages of Pixel Shuffle and two-dimensional convolutions with 3 × 3 kernels;
the input image is a low-quality image processed in the first step;
the structure of the image enhancement network model is as follows:
the down-sampling part (Down Scale) down-samples the image by a factor of four using a two-stage Pixel Unshuffle method: the input is an RGB three-channel image; first a Pixel Unshuffle with scaling factor 2 halves the length and width of the image, giving 12 channels, and a 3 × 3 two-dimensional convolution converts the image from 12 channels to 64; then a second Pixel Unshuffle with scaling factor 2 reduces the length and width to 1/4 of the input image, giving 256 channels, and finally a 3 × 3 two-dimensional convolution converts the image from 256 channels to 64;
the residual block part consists of N residual blocks (Residual Block) with the same structure; each residual block is composed of a 3 × 3 two-dimensional convolution, a linear rectification unit and another 3 × 3 convolution, and the number N is 16;
conv3 × 3 is a two-dimensional convolution with a convolution kernel size of 3 × 3;
the channel attention module (SeBlock) is composed of a 3 × 3 two-dimensional convolution with stride 2, a global pooling layer, a fully connected layer, a linear rectification unit, another fully connected layer and a Sigmoid activation function; SeBlock first applies a Squeeze operation to the feature map obtained by convolution to obtain channel-level global features, then applies an Excitation operation to the global features to learn the relationships among channels and obtain the weights of different channels, and finally multiplies the weights with the original feature map to obtain the final features; this attention mechanism makes the model focus on the most informative channel features while suppressing unimportant ones, so that the features are better utilized;
the up-sampling part (Up Scale) up-samples the image by a factor of four using a two-stage Pixel Shuffle method: first a 3 × 3 two-dimensional convolution changes the number of channels of the feature map from 64 to 256, then a Pixel Shuffle with scaling factor 2 doubles the length and width of the image, leaving 64 channels; a second 3 × 3 two-dimensional convolution again changes the number of channels from 64 to 256, and a second Pixel Shuffle with scaling factor 2 brings the length and width to four times the input feature map, again leaving 64 channels; finally, a 3 × 3 two-dimensional convolution converts the 64-channel feature map into the network-enhanced RGB three-channel high-quality image.
4. The image quality enhancement method based on the channel attention mechanism, characterized in that the specific method of step three is as follows:
training by adopting the Adam optimizer as the optimizer and Charbonnier Loss as the loss function;
during training, the initial learning rate is set to 1e-4 and the learning rate is halved every n epochs; the higher learning rate in the early stage lets the network converge quickly, while the lower learning rate in the later stage allows the model to be fine-tuned for a better result; the number of epochs n is determined by the size of the training data set and the training effect, and training ends when the learning rate falls below 1e-6.
5. The image quality enhancement method based on the channel attention mechanism as claimed in claim 4, wherein the specific processing flow of the step four is as follows:
inputting the low-quality image into the trained image enhancement network model; down-sampling (Down Scale) turns the image into a 64-channel feature map, which then passes through N residual blocks (Residual Block) for feature extraction and feature mapping, where a larger number N of residual blocks gives a deeper network with stronger feature extraction and mapping capability but consumes more computing resources; after a 3 × 3 convolution, the feature map output by the residual blocks is added to the 64-channel feature map from before the residual network part, the result is input into the SeBlock, and the attention mechanism assigns different weights to different channels of the feature map; a 4× up-sampling operation then restores the feature map to the size of the input image, and finally a 3 × 3 convolution converts the 64-channel feature map into the 3-channel RGB image for final output.
6. The method of claim 5, wherein the number N of the residual blocks is preferably 16.
CN202110474102.7A 2021-04-29 2021-04-29 Image quality enhancement method based on channel attention mechanism Withdrawn CN113160198A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110474102.7A CN113160198A (en) 2021-04-29 2021-04-29 Image quality enhancement method based on channel attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110474102.7A CN113160198A (en) 2021-04-29 2021-04-29 Image quality enhancement method based on channel attention mechanism

Publications (1)

Publication Number Publication Date
CN113160198A true CN113160198A (en) 2021-07-23

Family

ID=76872400

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110474102.7A Withdrawn CN113160198A (en) 2021-04-29 2021-04-29 Image quality enhancement method based on channel attention mechanism

Country Status (1)

Country Link
CN (1) CN113160198A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117893413A (en) * 2024-03-15 2024-04-16 博创联动科技股份有限公司 Vehicle-mounted terminal man-machine interaction method based on image enhancement



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20210723