CN111429433A - Multi-exposure image fusion method based on attention generative adversarial network - Google Patents

Multi-exposure image fusion method based on attention generative adversarial network

Info

Publication number
CN111429433A
CN111429433A (application CN202010219045.3A)
Authority
CN
China
Prior art keywords
network
attention
channel
feature
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010219045.3A
Other languages
Chinese (zh)
Inventor
李晓光
吴超玮
黄江鲁
卓力
李嘉锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202010219045.3A priority Critical patent/CN111429433A/en
Publication of CN111429433A publication Critical patent/CN111429433A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a multi-exposure image fusion method based on an attention generative adversarial network. The idea of the attention mechanism closely matches the detail-weighting problem in multi-exposure fusion: channel attention can adaptively select the weights of the individual input images, and spatial attention can adaptively select the weights of different spatial positions. The technology has broad application prospects in many multimedia vision fields. The algorithm designs a new attention generative adversarial network for the multi-exposure image fusion task; by introducing a visual attention mechanism into the generation network, it helps the network adaptively learn the weights of different input images and different spatial positions so as to achieve a better fusion effect.

Description

Multi-exposure image fusion method based on attention generative adversarial network
Technical Field
The invention belongs to the field of digital image/video signal processing, and particularly relates to a multi-exposure image fusion method based on an attention generative adversarial network.
Background
With the development of computer and multimedia technologies, various multimedia applications have created a broad demand for high-quality images. High-quality images provide rich information and a realistic visual experience. However, during image acquisition, factors such as the acquisition equipment, the acquisition environment and noise often mean that the image presented on the display terminal is of low quality. How to reconstruct a high-quality image from a low-quality one has therefore long been a challenge in the field of image processing.
From bright sunlight to dim starlight, the illumination intensity of natural scenes spans a very large dynamic range, with brightness contrasts that can exceed 14 orders of magnitude. A common digital camera, however, captures only 8 bits per color channel. This brightness resolution does not match the dynamic range of natural scene brightness: the limited dynamic range restricts a digital image's ability to render high-contrast natural scenes, producing over-exposed bright areas or under-exposed dark areas. Enhancing the dynamic range of image brightness can therefore effectively improve an image's ability to express high-contrast scenes and improve its visual quality.
Various multimedia applications place extensive demands on high dynamic range images and video: online video companies want to improve the subjective quality of their content by increasing its dynamic range, and mobile phone manufacturers promote the high dynamic range shooting capability of their cameras as a selling point. High dynamic range imaging therefore has wide application demand and important commercial value in the field of visual media.
For the problem of enhancing image dynamic range, multi-exposure fusion methods generate a detail-enhanced high dynamic range image by fusing the detail information of differently exposed images. How to select detail information from differently exposed images is itself a challenging problem.
By rapidly scanning the global image, the human visual system locates a target area that needs to be focused on (often referred to as the focus) and then devotes more attention to that area to obtain more detailed information about it. This is the human ability to quickly screen out high-value information from a large amount of information using limited resources, and it greatly improves the efficiency and accuracy of visual information processing.
Inspired by human visual attention, the concept of the attention mechanism has been introduced into deep learning. In recent years, deep-learning-based methods have had great success in many computer vision tasks and in some low-level image processing problems, and attention mechanisms in particular have proven effective in a variety of applications. We observe that the attention mechanism is well suited to the multi-exposure fusion problem, since it can be used to adaptively select weights.
The invention provides a multi-exposure image fusion method based on an attention generative adversarial network. The idea of the attention mechanism closely matches the detail-weighting problem in multi-exposure fusion: channel attention can adaptively select the weights of the individual input images, and spatial attention can adaptively select the weights of different spatial positions. The technology has broad application prospects in many multimedia vision fields.
Disclosure of Invention
The invention aims to overcome a defect of traditional multi-exposure image fusion methods, namely their reliance on hand-crafted calculation rules for defining the fusion weights. Addressing the problem of enhancing image dynamic range through multi-exposure fusion, it provides a multi-exposure fusion method based on an attention generative adversarial network.
The invention is realized by adopting the following technical means:
A multi-exposure image fusion method based on an attention generative adversarial network. First, several images with different exposures are fed into a generation network equipped with an attention mechanism to obtain a multi-exposure fused image; then the fused image and the target ground-truth image are sent to a discrimination network for judgment, and in the mutual game between the generation network and the discrimination network, a multi-exposure fusion generation network with enhanced detail and dynamic range is obtained by training. The overall network of the method, shown in FIG. 1, is divided into two parts: the generation network and the discrimination network; the structure of the generation network is shown in FIG. 2.
The invention introduces a visual attention mechanism into the designed generation network to adaptively select weights, and extracts image detail information using residual blocks.
The invention is realized by adopting the following technical means: a multi-exposure image fusion method based on an attention generative adversarial network, comprising three parts, namely construction of the attention-based generative adversarial network structure, adversarial training of the multi-exposure image fusion generation network and the discrimination network, and multi-exposure image fusion testing.
The first part builds the attention-based generative adversarial network: the overall network is composed of a generation network and a discrimination network, and the attention mechanism is introduced into the generation network. The network construction specifically comprises the following steps:
1) Generation network construction
The generation network structure, shown in FIG. 2, is formed by combining a feature extraction mechanism with an attention mechanism. The feature extraction part consists of a 3 × 3 convolution with 32 output channels and a PReLU activation operation, followed by 5 residual block modules with 32 input and output channels, followed by another 3 × 3 convolution with 32 output channels and a PReLU activation operation; the resulting feature maps are added to the corresponding positions of the feature maps produced by the first convolution-and-activation layer. This completes the feature extraction operation for one image and yields its 32 feature maps. The same feature extraction is performed on each of the N images in a training pair, giving 32 feature maps per image, and these are concatenated to obtain N × 32 feature maps.
Each residual block operation comprises, in order, one 3 × 3 convolution layer, a batch normalization operation and a PReLU activation, followed by another 3 × 3 convolution layer and batch normalization operation; finally, the resulting feature map is added to the corresponding positions of the block's input feature map to obtain the residual block output.
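For illustration only, a minimal PyTorch sketch of this feature extraction stage follows; it is not the patented implementation, and details the text leaves open (padding, chosen here to preserve spatial size) are assumptions:

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    # 3x3 conv -> batch norm -> PReLU -> 3x3 conv -> batch norm, then addition
    # of the block input at corresponding positions.
    def __init__(self, channels: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.PReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x) + x

class FeatureExtractor(nn.Module):
    # 3x3 conv (32 output channels) + PReLU, 5 residual blocks, another
    # 3x3 conv + PReLU, then addition of the first-layer feature maps.
    def __init__(self, in_channels: int = 3, features: int = 32):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(in_channels, features, kernel_size=3, padding=1),
            nn.PReLU(),
        )
        self.blocks = nn.Sequential(*[ResidualBlock(features) for _ in range(5)])
        self.tail = nn.Sequential(
            nn.Conv2d(features, features, kernel_size=3, padding=1),
            nn.PReLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h0 = self.head(x)  # 32 feature maps from the first conv + PReLU
        return self.tail(self.blocks(h0)) + h0

Running this extractor on each of the N input images and concatenating the outputs along the channel axis (torch.cat) gives the N × 32 feature maps described above.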
The attention module in the invention is designed as a cascaded hybrid attention module: a channel attention operation is first performed on the input feature maps, and each channel's feature map is multiplied channel by channel by its channel attention weight to complete the channel attention operation; a spatial attention operation is then performed on the channel-adjusted feature maps, computing a weight for each spatial position and multiplying it element by element with the feature maps to complete the spatial attention operation. Performing channel attention and spatial attention in sequence completes one hybrid attention operation.
The channel attention operation extracts the attention parameters through two pooling operations over the channel planes. The global average and the global maximum of each channel of the input feature maps are computed, giving two feature vectors whose length equals the number of channels; the two vectors are each passed through a weight-shared multilayer perceptron, summed, and then passed through a sigmoid activation to obtain the channel attention result, i.e., a weight for each feature map. Multiplying each channel by its channel attention weight yields the feature maps after channel attention adjustment.
The spatial attention operation applies Average pooling and Max pooling to the feature maps of all channels at each spatial position and concatenates the results along the channel dimension, giving 2 weight matrices of the same spatial size as the input feature maps. A 7 × 7 convolution is then applied to obtain a spatial attention weight matrix of the same spatial size as the input feature maps, i.e., a weight for each spatial position. Element-wise multiplication of the channel-adjusted feature maps with the spatial attention weights completes the hybrid attention operation.
After the attention operation, a convolution operation of 3 × 3 is performed, and the output fusion result is obtained through the tanh activation function.
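A hedged PyTorch sketch of this cascaded hybrid attention module follows; the reduction ratio of the shared multilayer perceptron and the convolution paddings are assumptions, since the text does not specify them:

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    # Global average and max pooling per channel, a shared-weight multilayer
    # perceptron applied to both pooled vectors, summation, then sigmoid,
    # giving one weight per channel (reduction ratio 16 is an assumption).
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))) + self.mlp(x.amax(dim=(2, 3))))
        return x * w.view(b, c, 1, 1)  # channel-by-channel multiplication

class SpatialAttention(nn.Module):
    # Per-position average and max over the channel dimension, concatenation,
    # a 7x7 convolution, then sigmoid, giving one weight per spatial position.
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(pooled))  # element-by-element multiplication

class MixedAttention(nn.Module):
    # Cascaded hybrid attention: channel attention first, then spatial attention.
    def __init__(self, channels: int):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.sa(self.ca(x))

The fusion head that follows the attention operation would then be the 3 × 3 convolution and tanh described above, e.g. nn.Sequential(nn.Conv2d(3 * 32, 3, 3, padding=1), nn.Tanh()).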
2) Discrimination network construction
The discrimination network follows the generation network, receiving the output of the generation network and the ground truth corresponding to the network input image as its two inputs, and judges which of the two is real. Its structural parameters are listed in Table 1. The discrimination network comprises 10 convolution layers, each with filters of size 3 × 3; the number of filters grows continuously from 64 to 1024, doubling every 2 layers. In convolution layers 2 to 8, each layer comprises 1 convolution operation, 1 batch normalization and 1 LeakyReLU activation; only the 1st convolution layer has no batch normalization. The resulting 512 feature maps then pass, in order, through an average pooling operation, a convolution operation, a LeakyReLU activation and another convolution operation, and the discrimination result is finally output through a sigmoid activation.
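The same structure could be sketched in PyTorch as follows; the stride pattern (alternating 1 and 2, inferred from the detailed description below), the LeakyReLU slope, the use of global average pooling and the 1 × 1 kernels of the last two convolutions are assumptions rather than values fixed by the text:

import torch.nn as nn

def build_discriminator(in_channels: int = 3) -> nn.Sequential:
    layers = []
    channels = [64, 64, 128, 128, 256, 256, 512, 512]   # doubles every 2 layers
    prev = in_channels
    for i, ch in enumerate(channels):
        stride = 1 if i % 2 == 0 else 2                 # assumed alternation
        layers.append(nn.Conv2d(prev, ch, kernel_size=3, stride=stride, padding=1))
        if i > 0:                                       # no BN in the 1st conv layer
            layers.append(nn.BatchNorm2d(ch))
        layers.append(nn.LeakyReLU(0.2, inplace=True))  # slope 0.2 assumed
        prev = ch
    layers += [
        nn.AdaptiveAvgPool2d(1),                        # average pooling
        nn.Conv2d(512, 1024, kernel_size=1),            # 9th convolution
        nn.LeakyReLU(0.2, inplace=True),
        nn.Conv2d(1024, 1, kernel_size=1),              # 10th convolution
        nn.Sigmoid(),                                   # real/fake score
    ]
    return nn.Sequential(*layers)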
The second part is the adversarial training of the multi-exposure image fusion generation network and the discrimination network.
Training data is prepared first by processing a public multi-exposure image data set. The original data set contains 589 samples; each sample comprises 2-7 low dynamic range images with different exposures and a corresponding ground-truth image, with image sizes around 3000 × 5000 pixels. From this data set, 440 image pairs are selected as the training set: each sample contributes 3 low dynamic range images with different exposures and the corresponding ground-truth image as a training sample pair. The images are downsampled and then segmented, each pair of images being divided into 6 blocks, finally yielding 2640 pairs of image blocks as the training data set.
The specific adversarial training method alternates between the generation network and the discrimination network: the generation network is first trained for one step using the generation loss, with back-propagation; then the discrimination network is trained for one step using the discrimination loss, with back-propagation; and the two keep alternating in this way. The overall loss function is shown in equation (1):
$\min_G \max_D f(G, D)$,    (1)
so as to achieve Nash equilibrium and complete training.
The loss function of the generation network designed by the invention consists of four parts: image loss ($l_{mse}$), perceptual loss ($l_{pe}$), adversarial loss ($l_{ad}$) and TV loss ($l_{tv}$).
Adding these 4 losses in certain proportions gives the generation network loss; the specific loss function is shown in equation (2):
$l_{mef} = \alpha l_{mse} + \beta l_{pe} + \gamma l_{ad} + \lambda l_{tv}$,    (2)

Finally, there is the multi-exposure fusion testing part.
The test data set consists of the 97 pairs left over after the training portion is selected; they are downsampled, not cropped, and fed into the test program to generate multi-exposure fused images. The test program applies the result of the adversarial training of the multi-exposure fusion generation network and the discrimination network: the parameters obtained from adversarial training are loaded into the test program, multi-exposure image fusion is performed to generate the fused images, and evaluation uses the subjective visual effect together with the objective peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) indices.
The algorithm designs a new attention generative adversarial network for the multi-exposure image fusion task; by introducing a visual attention mechanism into the generation network, it helps the network adaptively learn the weights of different input images and different spatial positions so as to achieve a better fusion effect.
description of the drawings:
FIG. 1, a network architecture diagram;
FIG. 2, generation network structure diagram;
FIG. 3, attention module structure diagram;
FIG. 4, a comparison of subjective results between the method of the present invention and existing methods, wherein the top row shows, from left to right, the low-, medium- and high-exposure inputs and the label image, and the bottom row shows, from left to right, the results of Li, Ma, Kou and of the method of the present invention
The specific implementation is as follows:
The embodiments of the invention are described below in conjunction with the accompanying drawings:
First, an attention-based generative adversarial network is built, comprising a generation network and a discrimination network, with the attention mechanism introduced into the generation network. Second, the multi-exposure fusion generation network and the discrimination network undergo adversarial training: the two networks are trained alternately against each other on the training sample set to obtain the network parameters of the generation network. Finally, in the testing stage, i.e. the multi-exposure fusion stage, 3 differently exposed images are used as input, and multi-exposure image fusion is realized through the trained generation network. The specific process is described below.
(1) Network construction
a) Generation network
The generation network is divided into 2 stages: the first half is a feature extraction and connection network, and the second half is an attention operation and fusion network.
In the feature extraction stage, a 3 × 3 convolution is applied first, followed by a PReLU activation operation and then 5 residual block operations. Each residual block operation comprises 1 3 × 3 convolution, 1 batch normalization operation, 1 PReLU activation operation, 1 further 3 × 3 convolution and 1 further batch normalization operation; the feature maps initially input to the residual block are then added to the feature maps of the last batch normalization operation to give the residual block output. After the 5 residual block operations, the resulting 32 feature maps pass through another 3 × 3 convolution and 1 PReLU activation operation and are added to the feature maps obtained from the first convolution operation. This completes feature extraction for each input image, yielding 32 feature maps.
In the connection stage, the feature maps obtained in the feature extraction stage from the 3 differently exposed input images are concatenated, giving 3 × 32 feature maps.
The attention operation stage is a hybrid attention operation, i.e. a channel attention operation followed by a spatial attention operation. The channel attention operation pools the input feature maps channel by channel with Average pooling and Max pooling operations, passes the 2 resulting feature vectors through a multilayer perceptron and adds them, and finally obtains the channel attention result through a sigmoid operation; multiplying this result with the input feature maps completes the channel attention operation and gives 3 × 32 feature maps. The feature maps after channel attention are then pooled spatially: Average pooling and Max pooling compress the feature maps along the channel dimension, and a 7 × 7 convolution then yields a spatial attention weight for each spatial position; multiplying this result with the input feature maps completes the spatial attention operation. Completing these two operations completes the attention operation, giving 3 × 32 feature maps.
In the fusion stage, the 3 × 32 feature maps that have undergone the attention operations pass through another 3 × 3 convolution, and the output is finally activated with the tanh function.
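Under the assumptions already noted, the whole generation network could be assembled from the FeatureExtractor and MixedAttention sketches given earlier; whether the extraction weights are shared across the 3 inputs is not stated in the text, and sharing is assumed here:

import torch
import torch.nn as nn

class MEFGenerator(nn.Module):
    # Feature extraction per exposure, channel-wise concatenation, hybrid
    # attention, then the 3x3-conv + tanh fusion head.
    def __init__(self, n_inputs: int = 3, features: int = 32):
        super().__init__()
        self.extract = FeatureExtractor(in_channels=3, features=features)
        self.attend = MixedAttention(n_inputs * features)
        self.fuse = nn.Sequential(
            nn.Conv2d(n_inputs * features, 3, kernel_size=3, padding=1),
            nn.Tanh(),
        )

    def forward(self, exposures):  # list of N tensors, each of shape Bx3xHxW
        feats = torch.cat([self.extract(x) for x in exposures], dim=1)  # Bx(N*32)xHxW
        return self.fuse(self.attend(feats))  # Bx3xHxW fused image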
b) Discrimination network
The discrimination network is connected behind the generation network and receives as its two inputs the multi-exposure fusion result of the generation network and the ground truth corresponding to the generation network's input, each of size 400 × 400 × 3 pixels; it is used to judge whether the generated image and the ground truth belong to the same category.
the method comprises the steps of judging that a network sequentially conducts convolution with convolution kernel of 3 × 3 step size 1, then L eakyRe L U operation to obtain 64 feature maps, then completing 1 convolution with convolution kernel of 3L step size 2, 1 batch normalization operation, 1L eakyRe L U operation to obtain 64 feature maps, then completing 1 convolution with convolution kernel of 3L step size 1, 1 batch normalization operation, 1L eakyRe L U operation, obtaining 128 feature maps, completing 1 convolution with convolution kernel of 3L step size 2, 1 batch normalization operation, 1L eakyRe L U operation, obtaining 128 feature maps, completing 1 convolution with convolution kernel of 3L step size 1, 1 batch normalization operation, 1 367 eakyRe L U operation, L operation of L step size 1, L normalization operation, L operation of L is achieved, L, the initial convolution with convolution operation of 3 step size 2, L is achieved by L, the initial convolution operation of L operation, L operation of L is achieved by L, and the initial convolution with a L operation of L is achieved by L operation, and the initial normalization operation of L is achieved by L a specific convolution with a step size 1 operation of L, and L a L is achieved by L a normalization operation of L operation.
(2) Adversarial training
a) Training data set preparation
A public multi-exposure image data set is adopted and preprocessed to obtain the training data set and the test data set. The original data set contains 589 samples; each sample comprises 2-7 low dynamic range images with different exposures and a corresponding ground-truth image, with image sizes around 3000 × 5000 pixels. Based on this data set, 440 sample pairs are selected to form the training data set: each sample contributes 3 low dynamic range images with different exposures and the corresponding ground-truth image as a training sample pair. The spatial resolution of the samples is then uniformly reduced to 1200 × 800 pixels, which retains more detail and preserves the contrast of the input images to the greatest extent, and each image is correspondingly divided into image blocks of 400 × 400 pixels, so that the training data set contains 2640 pairs of image blocks.
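A minimal preparation sketch under stated assumptions (bicubic resampling; non-overlapping tiling, which splits a 1200 × 800 image into exactly six 400 × 400 blocks):

from PIL import Image

def prepare_sample(exposure_paths, gt_path, block=400):
    # Downsample every image of the sample pair to 1200x800, then tile each
    # into six aligned 400x400 blocks (3 columns by 2 rows).
    images = [Image.open(p).resize((1200, 800), Image.BICUBIC)
              for p in list(exposure_paths) + [gt_path]]
    blocks = []
    for top in range(0, 800, block):
        for left in range(0, 1200, block):
            box = (left, top, left + block, top + block)
            blocks.append([im.crop(box) for im in images])  # 3 exposures + ground truth
    return blocks  # 6 aligned block pairs per sample

At 440 samples and 6 blocks each, this reproduces the 2640 training pairs stated above.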
b) Loss function
During the training process of the network, the total loss function is shown in formula (1):
$\min_G \max_D f(G, D)$,    (1)

The generation network and the discrimination network are trained alternately so as to reach Nash equilibrium.
The definition of the loss function is crucial to the generation network. The loss function of the generation network designed by the invention consists of four parts: image loss ($l_{mse}$), perceptual loss ($l_{pe}$), adversarial loss ($l_{ad}$) and TV loss ($l_{tv}$).
Adding the 4 losses according to a certain proportion generates the network loss, and the specific loss function is shown as formula (2):
$l_{mef} = \alpha l_{mse} + \beta l_{pe} + \gamma l_{ad} + \lambda l_{tv}$,    (2)

where $\alpha = 1$, $\beta = 6 \times 10^{-3}$, $\gamma = 10^{-3}$ and $\lambda = 2 \times 10^{-8}$.
Specifically, $l_{mse}$ computes the mean square loss between the fusion result generated by the generation network and the ground truth, while $l_{pe}$ is a perceptual loss that computes the mean square loss between the feature maps obtained by passing the generation network's fusion result and the ground truth through a pre-trained VGG network, as shown in equations (3) and (4):
$l_{mse} = \frac{1}{WH} \sum_{x=1}^{W} \sum_{y=1}^{H} \left( GT_{x,y} - F_{i,x,y} \right)^2$,    (3)

$l_{pe} = \frac{1}{WH} \sum_{x=1}^{W} \sum_{y=1}^{H} \left( Vgg(GT)_{x,y} - Vgg(F_i)_{x,y} \right)^2$,    (4)

where W and H denote the width and height of the input image respectively, $F_i$ refers to the fusion result generated by the generation network, GT refers to the ground truth corresponding to the input, and Vgg corresponds to the operation of the pre-trained VGG network; the invention uses the output of the first 30 layers of the pre-trained VGG network for this computation.
$l_{tv}$ usually takes a very small weight (on the order of $10^{-8}$ or less) and is used together with the other losses to suppress noise in the generation process; $l_{tv}$ is shown in equation (5):
$l_{tv} = \int_{D_u} \left| \nabla u \right| \, dx \, dy$,    (5)

wherein

$\left| \nabla u \right| = \sqrt{u_x^2 + u_y^2}$,

u refers to the image being computed and $D_u$ refers to the support domain of the image.
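For illustration, the combined generation loss could be sketched as follows; the choice of VGG-19 (the text says only "VGG"), the non-saturating form of the adversarial term and the discrete approximation of the TV integral are assumptions:

import torch
import torch.nn.functional as F
from torchvision.models import vgg19

# First 30 layers of a pretrained VGG, frozen (VGG-19 is an assumption).
vgg_features = vgg19(weights="IMAGENET1K_V1").features[:30].eval()
for p in vgg_features.parameters():
    p.requires_grad_(False)

def tv_loss(u: torch.Tensor) -> torch.Tensor:
    # Discrete total variation: summed gradient magnitudes over the image domain.
    dh = (u[..., 1:, :] - u[..., :-1, :]).abs().sum()
    dw = (u[..., :, 1:] - u[..., :, :-1]).abs().sum()
    return dh + dw

def generator_loss(fused, gt, d_fused,
                   alpha=1.0, beta=6e-3, gamma=1e-3, lam=2e-8):
    l_mse = F.mse_loss(fused, gt)                             # eq. (3)
    l_pe = F.mse_loss(vgg_features(fused), vgg_features(gt))  # eq. (4)
    l_ad = -torch.log(d_fused + 1e-8).mean()                  # adversarial term (assumed form)
    return alpha * l_mse + beta * l_pe + gamma * l_ad + lam * tv_loss(fused)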
c) Adversarial training process
The specific training process alternates the generation network and the discrimination network: the generation network is trained for 1 step and back-propagated, then the discrimination network is updated for 1 step and back-propagated. During training, the batch size is set to 1, and convergence is reached after about 100 rounds of training.
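A sketch of this alternating loop, reusing generator_loss from the sketch above; the Adam optimizer and the binary cross-entropy discriminator loss are assumptions, since the text specifies only the alternation, the batch size of 1 and roughly 100 rounds:

import torch
import torch.nn as nn

def adversarial_train(generator, discriminator, loader, epochs=100, device="cpu"):
    g_opt = torch.optim.Adam(generator.parameters())
    d_opt = torch.optim.Adam(discriminator.parameters())
    bce = nn.BCELoss()
    generator.to(device)
    discriminator.to(device)
    for _ in range(epochs):                      # roughly 100 rounds to converge
        for exposures, gt in loader:             # batch size 1
            exposures = [x.to(device) for x in exposures]
            gt = gt.to(device)
            # 1 generator update with the generation loss, then back-propagation.
            fused = generator(exposures)
            g_loss = generator_loss(fused, gt, discriminator(fused))
            g_opt.zero_grad(); g_loss.backward(); g_opt.step()
            # 1 discriminator update with the discrimination loss, then back-propagation.
            d_real = discriminator(gt)
            d_fake = discriminator(fused.detach())
            d_loss = bce(d_real, torch.ones_like(d_real)) + \
                     bce(d_fake, torch.zeros_like(d_fake))
            d_opt.zero_grad(); d_loss.backward(); d_opt.step()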
(3) Multi-exposure fusion test
The test data set applied by the test program consists of the 97 pairs of data remaining in the original data set after the training data set is removed, downsampled to size 400 × 400 × 3 without cropping; each pair contains 3 differently exposed images as input and 1 labelled ground-truth image for comparative evaluation against the generated multi-exposure fused image.
The multi-exposure fusion test extracts the generation network part of the generative adversarial network, loads into it the network parameters obtained in the last round of the adversarial training stage, and feeds the test data set into the network to obtain the multi-exposure fused images. Specifically, 3 differently exposed images of size 400 × 400 × 3 are input into the generation network, and through feature extraction, connection, attention operations and fusion, 1 fused image of size 400 × 400 × 3 with sharpened detail and enhanced contrast is obtained.
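A test-time sketch; the checkpoint filename and the packing of the three exposures into a list are illustrative assumptions:

import torch

def fuse_exposures(generator, low, mid, high, weights="mef_generator.pth"):
    # Load the generation network parameters saved after the last round of
    # adversarial training (the path is hypothetical), then fuse the exposures.
    generator.load_state_dict(torch.load(weights, map_location="cpu"))
    generator.eval()
    with torch.no_grad():
        # low/mid/high: 1x3x400x400 tensors; returns a 1x3x400x400 fused image.
        return generator([low, mid, high])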
To verify the effectiveness of the invention, the subjective visual effect and objective numerical indices are used to evaluate the fusion results. The subjective visual comparison between the method of the invention and other existing methods is shown in FIG. 4, while the objective evaluation adopts two common image quality indices, peak signal-to-noise ratio (PSNR) and structural similarity (SSIM); the results are shown in Table 2. Subjectively, the results of the method show clearer detail and stronger contrast; objectively, its index values are higher; the method therefore outperforms the existing methods both subjectively and objectively.
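The two objective indices can be computed, for example, with scikit-image (a sketch, assuming fused and gt are H × W × 3 arrays scaled to [0, 1]):

from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(fused, gt):
    psnr = peak_signal_noise_ratio(gt, fused, data_range=1.0)   # in dB
    ssim = structural_similarity(gt, fused, data_range=1.0, channel_axis=-1)
    return psnr, ssim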
Table 1 Network parameters of the discrimination network
[The parameter table is provided as an image in the original publication; the layer-by-layer structure is described in the discrimination network sections above.]
TABLE 2 Objective comparison between the method of the present invention and existing methods

Method           Average PSNR (dB)   Average SSIM
Li               15.9827             0.5234
Ma               15.9915             0.5350
Kou              16.4513             0.5469
The invention    17.6994             0.5489

Claims (1)

1. A multi-exposure image fusion method based on an attention generative adversarial network, characterized in that: the method comprises three parts, namely construction of a generative adversarial network structure based on an attention mechanism, adversarial training of the multi-exposure image fusion generation network and the discrimination network, and multi-exposure image fusion testing;
firstly, the first part builds the attention-based generative adversarial network: the overall network is composed of a generation network and a discrimination network, and the attention mechanism is introduced into the generation network; the network construction specifically comprises the following steps:
1) Generation network construction
the generation network structure is formed by combining a feature extraction mechanism with an attention mechanism; the feature extraction part consists of a 3 × 3 convolution with 32 output channels and a PReLU activation operation, followed by 5 residual block modules with 32 input and output channels, followed by another 3 × 3 convolution with 32 output channels and a PReLU activation operation; the resulting feature maps are added to the corresponding positions of the feature maps produced by the first convolution-and-activation layer, i.e. the feature extraction operation is completed for one image, giving its 32 feature maps; simultaneously, the same feature extraction is performed on each of the N input images in a training pair, giving 32 feature maps per image, and these are concatenated to obtain N × 32 feature maps;
each residual block operation comprises, in order, one 3 × 3 convolution layer, a batch normalization operation and a PReLU activation, followed by another 3 × 3 convolution layer and batch normalization operation; finally, the resulting feature map is added to the corresponding positions of the input feature map to obtain the result of one residual block;
the attention module is designed as a cascaded hybrid attention module: a channel attention operation is first performed on the input feature maps, and each channel's feature map is multiplied channel by channel by its channel attention weight to complete the channel attention operation; a spatial attention operation is then performed on the channel-adjusted feature maps, computing a weight for each spatial position and multiplying it element by element with the feature maps to complete the spatial attention operation; performing channel attention and spatial attention in sequence completes the hybrid attention operation;
wherein the channel attention operation extracts the attention parameters by performing two pooling operations over the channel planes: the global average and the global maximum of each channel of the input feature maps are computed, giving two feature vectors whose length equals the number of channels; the two vectors are each passed through a weight-shared multilayer perceptron, summed, and then passed through a sigmoid activation to obtain the channel attention result, i.e. a weight for each feature map; multiplying each channel by its channel attention weight gives the feature maps after channel attention adjustment;
the spatial attention operation computes the average and the maximum of all channel feature maps at each spatial position and concatenates them along the channel dimension, giving 2 weight matrices consistent in size with the input feature maps; a 7 × 7 convolution operation is then performed to obtain a spatial attention weight matrix consistent in size with the input feature maps, i.e. a weight for each spatial position;
after the attention operation, a 3 × 3 convolution operation is performed, and the output fusion result is obtained through a tanh activation function;
2) Discrimination network construction
the discrimination network is connected to the generation network, receives the result of the generation network and the ground truth corresponding to the network input image, and is used to judge which of the two input images is real; the discrimination network comprises 10 convolution layers, each with filters of size 3 × 3; the number of filters grows continuously from 64 to 1024, doubling every 2 layers;
the second part is the adversarial training of the multi-exposure image fusion generation network and the discrimination network;
firstly, training data is prepared: the images are downsampled and then segmented, each pair of images being divided into 6 blocks;
the adversarial training method alternately trains the generation network and the discrimination network: the generation network is first trained for one step using the generation loss, with back-propagation; then the discrimination network is trained for one step using the discrimination loss, with back-propagation; the two keep alternating in this way; the overall loss function is shown in equation (1):
$\min_G \max_D f(G, D)$,    (1)
so as to achieve Nash equilibrium and complete training;
the loss function of the designed generation network consists of four parts, respectively image loss (l)mse) Loss of perception (l)pe) To combat the loss (l)ad) And TV loss (l)tv);
Adding the 4 losses according to a certain proportion generates the network loss, and the specific loss function is shown as formula (2):
lmef=αlmse+βlpe+γlad+ltv, (2)
the tested data set is selected by using the data left after the training part is selected, down-sampling processing is carried out, cutting is not carried out, a test program is input, and a multi-exposure fusion image is generated; and the test program generates the result of the network and the discrimination network countertraining by fusing the second part of the multi-exposure images, and inputs the parameters of the generated network obtained by the countertraining into the test program for multi-exposure image fusion to generate the multi-exposure fused image.
CN202010219045.3A 2020-03-25 2020-03-25 Multi-exposure image fusion method based on attention generative adversarial network Pending CN111429433A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010219045.3A CN111429433A (en) 2020-03-25 2020-03-25 Multi-exposure image fusion method based on attention generation countermeasure network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010219045.3A CN111429433A (en) 2020-03-25 2020-03-25 Multi-exposure image fusion method based on attention generation countermeasure network

Publications (1)

Publication Number Publication Date
CN111429433A true CN111429433A (en) 2020-07-17

Family

ID=71548624

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010219045.3A Pending CN111429433A (en) 2020-03-25 2020-03-25 Multi-exposure image fusion method based on attention generation countermeasure network

Country Status (1)

Country Link
CN (1) CN111429433A (en)



Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104899845A (en) * 2015-05-10 2015-09-09 北京工业大学 Method for fusing multiple exposure images based on 1 alphabeta space scene migration
CN110225260A (en) * 2019-05-24 2019-09-10 宁波大学 A kind of three-dimensional high dynamic range imaging method based on generation confrontation network
CN110555458A (en) * 2019-07-24 2019-12-10 中北大学 Multi-band image feature level fusion method for generating countermeasure network based on attention mechanism

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
吴超玮 (Wu Chaowei): "Research on image/video dynamic range enhancement technology based on attention mechanism", Wanfang master's degree thesis, 31 May 2021 (2021-05-31) *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112132790A (en) * 2020-09-02 2020-12-25 西安国际医学中心有限公司 DAC-GAN model construction method and application in mammary gland MR image
CN112132790B (en) * 2020-09-02 2024-05-14 西安国际医学中心有限公司 DAC-GAN model construction method and application thereof in mammary gland MR image
CN112488971A (en) * 2020-11-23 2021-03-12 石家庄铁路职业技术学院 Medical image fusion method for generating countermeasure network based on spatial attention mechanism and depth convolution
CN112580782B (en) * 2020-12-14 2024-02-09 华东理工大学 Channel-enhanced dual-attention generation countermeasure network and image generation method
CN112580782A (en) * 2020-12-14 2021-03-30 华东理工大学 Channel enhancement-based double-attention generation countermeasure network and image generation method
CN112766279A (en) * 2020-12-31 2021-05-07 中国船舶重工集团公司第七0九研究所 Image feature extraction method based on combined attention mechanism
CN112766087A (en) * 2021-01-04 2021-05-07 武汉大学 Optical remote sensing image ship detection method based on knowledge distillation
CN112927172A (en) * 2021-05-10 2021-06-08 北京市商汤科技开发有限公司 Training method and device of image processing network, electronic equipment and storage medium
CN113379655A (en) * 2021-05-18 2021-09-10 电子科技大学 Image synthesis method for generating antagonistic network based on dynamic self-attention
CN113379655B (en) * 2021-05-18 2022-07-29 电子科技大学 Image synthesis method for generating antagonistic network based on dynamic self-attention
CN113642452B (en) * 2021-08-10 2023-11-21 汇纳科技股份有限公司 Human body image quality evaluation method, device, system and storage medium
CN113642452A (en) * 2021-08-10 2021-11-12 汇纳科技股份有限公司 Human body image quality evaluation method, device, system and storage medium
CN113888443A (en) * 2021-10-21 2022-01-04 福州大学 Sing concert shooting method based on adaptive layer instance normalization GAN
CN113888443B (en) * 2021-10-21 2024-08-02 福州大学 Concert shooting method based on adaptive layer instance normalization GAN
CN114900619A (en) * 2022-05-06 2022-08-12 北京航空航天大学 Self-adaptive exposure driving camera shooting underwater image processing system

Similar Documents

Publication Publication Date Title
CN111429433A (en) Multi-exposure image fusion method based on attention generative adversarial network
Liang et al. Cameranet: A two-stage framework for effective camera isp learning
CN110210608B (en) Low-illumination image enhancement method based on attention mechanism and multi-level feature fusion
CN109447907B (en) Single image enhancement method based on full convolution neural network
CN112233038A (en) True image denoising method based on multi-scale fusion and edge enhancement
CN110458765B (en) Image quality enhancement method based on perception preserving convolution network
CN111292264A (en) Image high dynamic range reconstruction method based on deep learning
CN110225260B (en) Three-dimensional high dynamic range imaging method based on generation countermeasure network
CN110148088B (en) Image processing method, image rain removing method, device, terminal and medium
CN111986084A (en) Multi-camera low-illumination image quality enhancement method based on multi-task fusion
CN115223004A (en) Method for generating confrontation network image enhancement based on improved multi-scale fusion
CN113284061B (en) Underwater image enhancement method based on gradient network
CN112508812A (en) Image color cast correction method, model training method, device and equipment
CN113284070A (en) Non-uniform fog image defogging algorithm based on attention transfer mechanism
CN114648508A (en) Multi-exposure image fusion method based on multi-dimensional collaborative refined network
Ouyang et al. Neural camera simulators
Ye et al. Progressive and selective fusion network for high dynamic range imaging
CN116468625A (en) Single image defogging method and system based on pyramid efficient channel attention mechanism
Hu et al. Hierarchical discrepancy learning for image restoration quality assessment
Sun et al. Mipi 2023 challenge on rgbw remosaic: Methods and results
Liang et al. Method for reconstructing a high dynamic range image based on a single-shot filtered low dynamic range image
CN116245968A (en) Method for generating HDR image based on LDR image of transducer
CN116433516A (en) Low-illumination image denoising and enhancing method based on attention mechanism
CN115661012A (en) Multi-exposure image fusion system based on global-local aggregation learning
CN115601792A (en) Cow face image enhancement method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination