CN115063304A - End-to-end multi-size fusion-based pyramid neural network image defogging method and system - Google Patents


Info

Publication number
CN115063304A
CN115063304A (application CN202210557615.9A)
Authority
CN
China
Prior art keywords
feature
image
network
fusion
groups
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210557615.9A
Other languages
Chinese (zh)
Other versions
CN115063304B (en)
Inventor
王胜春
陈培棋
蔡荣辉
叶成志
刘炼烨
黄金贵
田斌
葛晶晶
罗颖光
计君伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Normal University
Original Assignee
Hunan Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Normal University
Priority to CN202210557615.9A
Publication of CN115063304A
Application granted
Publication of CN115063304B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/73 Deblurring; Sharpening
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses an end-to-end multi-size fusion-based pyramid neural network image defogging method and system. A backbone network in the image defogging model extracts five groups of feature maps of the foggy image at different sizes and sub-regions; a feature pyramid network structure in the model performs feature enhancement on the five groups of feature maps to obtain five feature-enhanced groups; the five groups are fused by a spatial multi-size feature superposition fusion method to obtain fused features; a decoder in the model further fuses and decodes the fused features to obtain the intermediate estimation parameter of the network; and a physical recovery module reconstructs the intermediate estimation parameter together with the original foggy image input to the network to obtain a fogless image. The method fuses local and global features, combines low-level and high-level semantic features, avoids the loss of features during downsampling, makes full use of the feature information of the foggy image during convolution, and achieves a better defogging effect.

Description

End-to-end multi-size fusion-based pyramid neural network image defogging method and system
Technical Field
The invention relates to the technical field of image processing, in particular to an end-to-end multi-size fusion-based pyramid neural network image defogging method and system.
Background
With the development of technology, computer vision tasks such as object detection, object tracking, behavior analysis and face recognition have made great breakthroughs. However, high-level vision tasks such as detection and tracking rely on clear video and image data, and their performance often degrades substantially in real scenes such as heavy fog or heavy rain. As a preprocessing step for such high-level vision tasks, image defogging has received attention from many researchers in recent years.
Earlier single-image defogging methods that did not use deep learning tend to have significant drawbacks. These non-deep-learning methods are prior-based: some rely on a physical scattering model and remove fog by estimating the atmospheric light and the transmission map, but estimating these quantities from a single foggy image is an ill-posed problem, and the estimates are often inaccurate. Other methods remove fog by exploiting statistical properties of images; although they have some effect, they do not apply to many real-world scenes. In addition, existing data-driven deep learning methods also have notable disadvantages: they learn a one-to-one mapping between a given foggy image and the corresponding clear image, which conflicts with the ill-posed nature of the defogging problem, and they produce only a single fixed output for a given foggy image, lacking diversity.
Therefore, there is a strong need in the field for a defogging method that can effectively defog a foggy image and obtain a clear image.
Disclosure of Invention
The invention provides an end-to-end multi-size fusion-based pyramid neural network image defogging method and system, which solve the technical problem of effectively defogging an actually captured foggy image.
To solve this technical problem, the invention provides the following technical scheme: an end-to-end multi-size fusion-based pyramid neural network image defogging method, comprising the following steps:
extracting five groups of feature maps of the foggy image at different sizes and sub-regions by using a backbone network in the image defogging model;
performing feature enhancement on the five groups of feature maps by using a feature pyramid network structure in the image defogging model to obtain five groups of feature-enhanced feature maps;
fusing the five groups of feature-enhanced feature maps by a spatial multi-size feature superposition fusion method to obtain fused features;
further fusing and decoding the fused features by using a decoder in the image defogging model to obtain the intermediate estimation parameter of the network;
reconstructing the intermediate estimation parameter of the network and the original foggy image input to the network by using a physical recovery module to obtain a fogless image;
the image defogging model comprises the backbone network, the feature pyramid network structure, the decoder and the physical recovery module.
Preferably, constructing the image defogging model comprises:
acquiring a training set, wherein the training set comprises foggy images and the clear images corresponding to them;
initializing the image defogging model;
inputting the foggy image into the backbone network and outputting five groups of feature maps of the foggy image at different sizes and sub-regions;
performing feature enhancement on the five groups of feature maps through the feature pyramid network structure to obtain five groups of feature-enhanced feature maps;
fusing the five groups of feature-enhanced feature maps by a spatial multi-size feature superposition fusion method to obtain fused features;
inputting the fused features into the decoder to obtain the intermediate estimation parameter of the network;
reconstructing the intermediate estimation parameter of the network and the original foggy image input to the network by using the physical recovery module to obtain a fogless image;
and training with the mean squared error between the fogless image reconstructed from the foggy image and the clear image corresponding to the foggy image as the loss function, to obtain a converged image defogging model.
Preferably, the loss function satisfies the following equation:
$$L_{mse}=\frac{1}{N}\sum_{i=1}^{N}\left(Y_{i}-X_{i}\right)^{2}$$
where L_mse is the network loss, N is the number of pixels of the foggy images used in building the image defogging model, Y is the fogless image reconstructed from the foggy image, and X is the clear image corresponding to the foggy image.
Preferably, the backbone network consists of eight convolution modules. The first convolution module consists of a 3 × 3 convolution layer and a batch normalization layer; the second, third, fifth and eighth convolution modules are mobile inverted bottleneck convolution (MBConv) blocks with a 3 × 3 convolution kernel; the fourth, sixth and seventh convolution modules are MBConv blocks with a 5 × 5 convolution kernel. The 2nd to 8th convolution modules all use a residual network structure, and the numbers of network layers in the 2nd to 8th convolution modules are 1, 2, 3, 4 and 1, respectively.
Preferably, the decoder consists of three decoding modules, each composed of a 3 × 3 convolution layer and an upsampling layer; the fused features pass through the three decoding modules to produce the intermediate estimation parameter of the network.
Preferably, performing feature enhancement on the five groups of feature maps with the feature pyramid network structure in the image defogging model to obtain five groups of feature-enhanced feature maps comprises: using a multilayer feature pyramid network structure to enhance the five groups of feature maps containing different sizes and sub-regions.
Preferably, the five groups of feature-enhanced feature maps are fused by a spatial multi-size feature superposition fusion method: taking the size of the largest feature map among the five feature-enhanced groups as the reference, the other feature maps are resized to match the largest one using a mixed interpolation mode, and the five groups of feature-enhanced feature maps are spatially superposed and fused at multiple sizes to obtain the fused features.
Preferably, the physical recovery module reconstructs the intermediate estimation parameter of the network and the original foggy image input to the network to obtain a fogless image satisfying an atmospheric scattering model, where the atmospheric scattering model is written as:
I(x)=J(x)t(x)+A(1-t(x))
where I(x) is the foggy image, J(x) is the fogless image, A is the global atmospheric light value, and t(x) is the transmittance;
combining the global atmospheric light value A and the transmittance t(x) in the atmospheric scattering model yields the physical recovery model on which the physical recovery module depends, as shown in the following equation:
J(x)=k(x)I(x)-k(x)+b
where

$$k(x)=\frac{\frac{1}{t(x)}\left(I(x)-A\right)+(A-b)}{I(x)-1}$$

b is the constant 1, and k(x) is the intermediate estimation parameter of the network;
and the physical recovery module reconstructs the intermediate estimation parameter of the network and the foggy image input to the network to obtain the fogless image.
An embodiment of the invention also provides an end-to-end multi-size fusion-based pyramid neural network image defogging system, comprising a memory, a processor, and a computer program stored on the memory and runnable on the processor, wherein the processor implements the steps of any one of the above methods when executing the computer program.
The invention has the following beneficial effects:
the method comprises the steps of extracting five groups of characteristic diagrams of the foggy image under different sizes and subregions by utilizing a main network in an image defogging model; carrying out feature enhancement on the five groups of feature graphs by using a feature pyramid network structure in the image defogging model to obtain five groups of feature graphs after feature enhancement; fusing the five groups of feature maps by a space multi-size feature superposition fusion method to obtain fusion features; further fusing and decoding the fusion characteristics by using a decoder in the image defogging model to obtain an intermediate estimation parameter of the network; and reconstructing the network intermediate estimation parameters and the original foggy image input by the network by using a physical recovery module to obtain a fogless image. The feature pyramid network structure can further integrate local features and global features in the foggy image, and combines low-level semantic features and high-level semantic features, so that the problem of feature loss caused in the feature extraction process is avoided, and feature information obtained in the convolution process of the foggy image is fully utilized. The image defogging model of the invention reduces the requirement on hardware and reduces the time required in the defogging process while achieving better defogging effect. In addition to the objects, features and advantages described above, other objects, features and advantages of the present invention are also provided. Compared with other defogging methods, the fog-free image obtained by the invention has better performance on the specific details of the image, clearer and more complete image texture details and more gorgeous and more real image color. 
Compared with other deep-learning-based defogging methods, the method places no specific requirement on the resolution of the input image: it can accept an input image of any resolution and produce a fogless image at the same resolution. During defogging, the global and local fog density are considered together, which avoids over-defogging or under-defogging in the resulting fogless image and gives it better fidelity. The present invention is described in further detail below with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the invention and, together with the description, serve to explain the invention and not to limit the invention. In the drawings:
FIG. 1 is a schematic diagram of a preferred embodiment of an end-to-end multi-size fusion-based pyramid neural network image defogging method according to the invention;
fig. 2 is a schematic structural diagram of an image defogging model according to a preferred embodiment of the present invention.
Detailed Description
Embodiments of the invention will be described in detail below with reference to the drawings, but the invention can be implemented in many different ways as defined and covered by the claims.
Example 1:
Referring to fig. 1 and 2, an end-to-end multi-size fusion-based pyramid neural network image defogging method includes:
S1, extracting five groups of feature maps of the foggy image at different sizes and sub-regions by using a backbone network in the image defogging model;
S2, performing feature enhancement on the five groups of feature maps by using a feature pyramid network structure in the image defogging model to obtain five groups of feature-enhanced feature maps;
S3, fusing the five groups of feature-enhanced feature maps by a spatial multi-size feature superposition fusion method to obtain fused features;
S4, further fusing and decoding the fused features by using a decoder in the image defogging model to obtain the intermediate estimation parameter of the network;
S5, reconstructing the intermediate estimation parameter of the network and the original foggy image input to the network by using a physical recovery module to obtain a fogless image;
the image defogging model comprises a backbone network, a characteristic pyramid network structure, a decoder and a physical recovery module.
Optionally, constructing the image defogging model includes:
acquiring a training set, wherein the training set comprises foggy images and the clear images corresponding to them;
initializing the image defogging model;
inputting the foggy image into the backbone network and outputting five groups of feature maps of the foggy image at different sizes and sub-regions; performing feature enhancement on the five groups of feature maps through the feature pyramid network structure to obtain five groups of feature-enhanced feature maps; fusing the five groups of feature-enhanced feature maps by a spatial multi-size feature superposition fusion method to obtain fused features; inputting the fused features into the decoder and further fusing and decoding to obtain the intermediate estimation parameter of the network;
reconstructing the intermediate estimation parameter of the network and the original foggy image input to the network by using the physical recovery module to obtain a fogless image;
and training with the mean squared error between the fogless image reconstructed from the foggy image and the corresponding clear image as the loss function, to obtain a converged image defogging model.
In this alternative embodiment, the loss function satisfies the following equation:
$$L_{mse}=\frac{1}{N}\sum_{i=1}^{N}\left(Y_{i}-X_{i}\right)^{2}$$
where L_mse is the network loss, N is the number of pixels of the foggy images used in building the image defogging model, Y is the fogless image reconstructed from the foggy image, and X is the clear image corresponding to the foggy image.
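As a minimal sketch of the loss above (using NumPy rather than any particular deep learning framework, which the text does not name), the per-pixel mean squared error between the reconstructed fogless image Y and the clear image X can be computed as:

```python
import numpy as np

def mse_loss(restored, clear):
    """Mean squared error between the reconstructed fogless image Y and
    the ground-truth clear image X, averaged over all N pixels."""
    restored = np.asarray(restored, dtype=np.float64)
    clear = np.asarray(clear, dtype=np.float64)
    return float(np.mean((restored - clear) ** 2))
```

In practice the same quantity would be computed on batched tensors inside the training framework; the sketch only fixes the definition.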
In an optional implementation, to reduce the visual difference between the defogged image produced by the image defogging model and the corresponding clear image, the mean squared error between the fogless image reconstructed from the foggy image and its corresponding clear image is used as the loss function. The loss function is iteratively optimized through backpropagation with an optimization algorithm, and training stops when the loss no longer decreases within the set number of iteration rounds, yielding a converged image defogging model.
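The stopping rule described above (stop once the loss no longer decreases within the set number of iteration rounds) can be sketched as a simple patience check; the `patience` and `min_delta` parameters are illustrative assumptions, not values given in the text:

```python
def should_stop(loss_history, patience=10, min_delta=0.0):
    """Return True when the most recent `patience` epochs have not
    improved on the best loss seen before them by at least `min_delta`."""
    if len(loss_history) <= patience:
        return False
    best_before = min(loss_history[:-patience])
    recent = loss_history[-patience:]
    return min(recent) >= best_before - min_delta
```

The training loop would append each epoch's loss to `loss_history` and break once `should_stop` returns True.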
Optionally, the backbone network is composed of eight convolution modules. The first convolution module is composed of a 3 × 3 convolution layer and a batch normalization layer; the second, third, fifth and eighth convolution modules are mobile inverted bottleneck convolution (MBConv) blocks with a 3 × 3 convolution kernel; the fourth, sixth and seventh convolution modules are MBConv blocks with a 5 × 5 convolution kernel. The 2nd to 8th convolution modules all use a residual network structure, and the numbers of network layers in the 2nd to 8th convolution modules are 1, 2, 3, 4 and 1, respectively.
In this optional embodiment, EfficientNet is used as the network structure of the backbone network. The feature maps produced by its stages are 1/2, 1/4, 1/8, 1/16, 1/32, 1/64 and 1/128 of the foggy image size, with 16, 24, 40, 80, 112, 192 and 320 channels, respectively. Five of these groups of feature maps are taken, their channel counts are uniformly changed to 64 using convolution layers while the original spatial sizes are kept unchanged, yielding five groups of feature maps of the foggy image at different sizes and sub-regions.
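The scale and channel bookkeeping in this paragraph can be made concrete with a small sketch. Which five of the seven EfficientNet stages are taken is not stated in the text, so the `picked` indices below are an assumption for illustration:

```python
# Stage outputs of an EfficientNet-style backbone: spatial size is the
# input size divided by SCALES[i], with CHANNELS[i] output channels.
SCALES = [2, 4, 8, 16, 32, 64, 128]
CHANNELS = [16, 24, 40, 80, 112, 192, 320]

def feature_shapes(h, w, picked=(1, 2, 3, 4, 5)):
    """Return (channels, height, width) for each selected feature group
    after the channel counts are uniformly projected to 64 while the
    spatial size is kept unchanged."""
    return [(64, h // SCALES[i], w // SCALES[i]) for i in picked]
```

For a 256 × 256 input, the five selected groups would all carry 64 channels at sizes 64, 32, 16, 8 and 4 under this assumed selection.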
Optionally, the decoder includes three decoding modules, each composed of a 3 × 3 convolution layer and an upsampling layer; the fused features pass through the three decoding modules to produce the intermediate estimation parameter of the network.
Optionally, performing feature enhancement on the five groups of feature maps with the feature pyramid network structure in the image defogging model to obtain five groups of feature-enhanced feature maps includes using a multilayer feature pyramid network structure to enhance the five groups of feature maps containing different sizes and sub-regions.
It should be noted that, to better capture both the local and the global features of the image, a multilayer feature pyramid network structure is used to enhance the five groups of feature maps containing different sizes and sub-regions. To better fuse local and global features, the fusion process uses adaptive weighting, so the image defogging model can better obtain effective features.
Optionally, the five groups of feature-enhanced feature maps are fused by a spatial multi-size feature superposition fusion method: taking the size of the largest feature map among the five feature-enhanced groups as the reference, the other feature maps are resized to match the largest one using a mixed interpolation mode, and the five groups of feature-enhanced feature maps are spatially superposed and fused at multiple sizes to obtain the fused features.
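A minimal sketch of the spatial multi-size superposition fusion described above: every feature-enhanced map is resized to the size of the largest map, then the maps are summed. Plain nearest-neighbour resizing stands in for the "mixed interpolation mode" mentioned in the text, whose exact form is not specified:

```python
import numpy as np

def fuse_multiscale(features):
    """Resize each (C, H, W) feature map to the largest spatial size
    among the inputs and superpose (sum) them into one fused map."""
    target_h = max(f.shape[-2] for f in features)
    target_w = max(f.shape[-1] for f in features)
    fused = np.zeros(features[0].shape[:-2] + (target_h, target_w))
    for f in features:
        rh = target_h // f.shape[-2]
        rw = target_w // f.shape[-1]
        # nearest-neighbour upsampling by integer factors
        fused += np.repeat(np.repeat(f, rh, axis=-2), rw, axis=-1)
    return fused
```

This sketch assumes the largest map's size is an integer multiple of each smaller map's size, which holds for the power-of-two pyramid of scales described earlier.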
Preferably, the physical recovery module reconstructs the intermediate estimation parameter of the network and the original foggy image input to the network to obtain a fogless image satisfying an atmospheric scattering model, where the atmospheric scattering model is written as:
I(x)=J(x)t(x)+A(1-t(x))
where I(x) is the foggy image, J(x) is the fogless image, A is the global atmospheric light value, and t(x) is the transmittance;
and combining the global atmospheric light value A and the transmittance t(x) in the atmospheric scattering model yields the physical recovery model on which the physical recovery module depends, as shown in the following equation:
J(x)=k(x)I(x)-k(x)+b
where

$$k(x)=\frac{\frac{1}{t(x)}\left(I(x)-A\right)+(A-b)}{I(x)-1}$$

b is the constant 1, and k(x) is the intermediate estimation parameter of the network;
the physical recovery module then reconstructs the intermediate estimation parameter of the network and the foggy image input to the network to obtain the fogless image.
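The reconstruction step amounts to a single elementwise expression. A sketch, assuming image intensities normalized to [0, 1] (the clipping range is an assumption, not stated in the text):

```python
import numpy as np

def physical_recovery(hazy, k, b=1.0):
    """Reconstruct the fogless image J(x) = k(x) * I(x) - k(x) + b from
    the foggy input I(x) and the network's intermediate estimate k(x),
    clipping to the valid intensity range."""
    hazy = np.asarray(hazy, dtype=np.float64)
    k = np.asarray(k, dtype=np.float64)
    return np.clip(k * hazy - k + b, 0.0, 1.0)
```

With k(x) = 1 everywhere the module returns the input unchanged, which is a convenient sanity check.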
Example 2:
An end-to-end multi-size fusion-based pyramid neural network image defogging system comprises a memory, a processor, and a computer program stored on the memory and runnable on the processor, wherein the processor implements the steps of the method of Embodiment 1 when executing the computer program.
In conclusion, a backbone network in the image defogging model extracts five groups of feature maps of the foggy image at different sizes and sub-regions; a feature pyramid network structure in the model performs feature enhancement on the five groups of feature maps; the five feature-enhanced groups are fused by a spatial multi-size feature superposition fusion method to obtain fused features; a decoder in the model further fuses and decodes the fused features to obtain the intermediate estimation parameter of the network; and a physical recovery module reconstructs the intermediate estimation parameter together with the original foggy image input to the network to obtain a fogless image. The feature pyramid network structure further fuses local and global features, combines low-level and high-level semantic features, avoids the loss of features during downsampling, and makes full use of the feature information of the foggy image during convolution; while achieving a better defogging effect, the method reduces the hardware requirement and the time required for defogging.
In the description of the embodiments of the present invention, "multilayer" means two or more layers unless specifically defined otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
It should be understood that portions of embodiments of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made in the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (9)

1. An end-to-end multi-size fusion-based pyramid neural network image defogging method, characterized by comprising the following steps:
extracting five groups of feature maps of the foggy image at different sizes and sub-regions by using a backbone network in the image defogging model;
performing feature enhancement on the five groups of feature maps by using a feature pyramid network structure in the image defogging model to obtain five groups of feature-enhanced feature maps;
fusing the five groups of feature-enhanced feature maps by a spatial multi-size feature superposition fusion method to obtain fused features;
further fusing and decoding the fused features by using a decoder in the image defogging model to obtain the intermediate estimation parameter of the network;
reconstructing the intermediate estimation parameter of the network and the original foggy image input to the network by using a physical recovery module to obtain a fogless image;
the image defogging model comprises the backbone network, the feature pyramid network structure, the decoder and the physical recovery module.
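For orientation, the five claimed stages compose end to end as in the toy sketch below. Every function here is a hypothetical numpy stand-in chosen only to make the data flow concrete (strided sub-sampling for the backbone, identity pass-through for the pyramid, nearest-neighbor superposition for fusion, a max-normalizing placeholder decoder); none of them is the patented network.

```python
import numpy as np

def backbone(image):
    # five groups of feature maps at different sizes (strided sub-sampling as a proxy)
    return [image[::2 ** i, ::2 ** i].copy() for i in range(5)]

def feature_pyramid(groups):
    # placeholder "feature enhancement": identity pass-through per group
    return groups

def superpose(groups):
    # bring every group to the largest size (nearest neighbor) and sum them
    th, tw = groups[0].shape
    fused = np.zeros((th, tw))
    for g in groups:
        r = np.arange(th) * g.shape[0] // th
        c = np.arange(tw) * g.shape[1] // tw
        fused += g[r][:, c]
    return fused

def decode(fused):
    # placeholder decoder: squash the fused features into k(x) in (0, 1]
    return fused / (np.abs(fused).max() + 1e-8)

def dehaze(image, b=1.0):
    k = decode(superpose(feature_pyramid(backbone(image))))
    return k * image - k + b  # physical recovery: J(x) = k(x)I(x) - k(x) + b
```

Only the composition order and the final physical-recovery formula come from the claim; the stage internals are assumptions.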
2. The end-to-end multi-size fusion-based pyramid neural network image defogging method according to claim 1, wherein building the image defogging model comprises:
acquiring a training set, wherein the training set comprises foggy images and the clear images corresponding to the foggy images;
initializing the image defogging model;
inputting the foggy image into the backbone network, and outputting five groups of feature maps of the foggy image at different sizes and sub-regions;
performing feature enhancement on the five groups of feature maps through the feature pyramid network structure to obtain five groups of feature-enhanced feature maps;
fusing the five groups of feature-enhanced feature maps by the spatial multi-size feature superposition fusion method to obtain a fused feature;
inputting the fused feature into the decoder for further fusion and decoding to obtain an intermediate estimation parameter of the network;
reconstructing a fog-free image from the intermediate estimation parameter of the network and the original foggy image input to the network by the physical recovery module;
and training with the mean square error between the fog-free image reconstructed from the foggy image and the clear image corresponding to the foggy image as the loss function, to obtain a converged image defogging model.
3. The end-to-end multi-size fusion-based pyramid neural network image defogging method according to claim 2, wherein said loss function satisfies the following formula:
L_mse = (1/N) Σ_{i=1}^{N} (Y_i - X_i)^2
wherein L_mse is the network loss, N is the number of pixels of the foggy images participating in building the image defogging model, Y is the fog-free image reconstructed from the foggy image, and X is the clear image corresponding to the foggy image.
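The per-pixel mean square error of claim 3 can be computed directly; `mse_loss` is an illustrative name, and the sketch assumes Y and X are arrays of equal shape.

```python
import numpy as np

def mse_loss(Y, X):
    # L_mse = (1/N) * sum over all N pixels of (Y - X)^2
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)
    return float(np.mean((Y - X) ** 2))
```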
4. The end-to-end multi-size fusion-based pyramid neural network image defogging method according to claim 1, wherein the backbone network is composed of eight convolution modules, wherein the first convolution module is composed of a 3 x 3 convolution layer and a batch normalization layer; the second, third, fifth and eighth convolution modules are mobile inverted convolution blocks with a convolution kernel size of 3 x 3; the fourth, sixth and seventh convolution modules are mobile inverted convolution blocks with a convolution kernel size of 5 x 5; the 2nd to 8th convolution modules all use a residual network structure, and the numbers of network layers in the 2nd to 8th convolution modules are 1, 2, 3, 4 and 1 respectively.
5. The end-to-end multi-size fusion-based pyramid neural network image defogging method according to claim 4, wherein the decoder is composed of three decoding modules, each composed of a 3 x 3 convolution layer and an upsampling layer, and the fused feature passes through the three decoding modules to obtain the intermediate estimation parameter of the network.
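A minimal single-channel sketch of a three-module decoder of the shape claim 5 describes (3 x 3 convolution followed by upsampling). The function names are illustrative assumptions, and nearest-neighbor 2x upsampling stands in for whichever upsampling layer the patent actually uses.

```python
import numpy as np

def conv3x3(x, kernel):
    # 'same' 3 x 3 convolution of a single-channel map via zero padding
    h, w = x.shape
    p = np.pad(x, 1)
    out = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * p[dy:dy + h, dx:dx + w]
    return out

def upsample2x(x):
    # nearest-neighbor 2x upsampling layer
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def decoder(feat, kernels):
    # three decoding modules, each a 3 x 3 convolution followed by upsampling
    for k in kernels:
        feat = upsample2x(conv3x3(feat, k))
    return feat
```

Three such modules turn a small fused map into an estimate at 8x the input resolution, which is consistent with decoding a downsampled fused feature back toward image size.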
6. The end-to-end multi-size fusion-based pyramid neural network image defogging method according to claim 1, wherein performing feature enhancement on the five groups of feature maps by the feature pyramid network structure in the image defogging model to obtain the five groups of feature-enhanced feature maps comprises performing feature enhancement on the five groups of feature maps containing different sizes and sub-regions by a multi-layer feature pyramid network structure.
7. The end-to-end multi-size fusion-based pyramid neural network image defogging method according to claim 1, wherein fusing the five groups of feature-enhanced feature maps to obtain the fused feature uses the spatial multi-size feature superposition fusion method: taking the size of the largest feature map among the five groups of feature-enhanced feature maps as the reference, the other feature maps are brought to the size of the largest feature map by a mixed interpolation mode, and the five groups of feature-enhanced feature maps are spatially superposed and fused at multiple sizes to obtain the fused feature.
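The superposition fusion of claim 7 can be sketched as follows. The claim does not define the "mixed interpolation mode", so an average of nearest-neighbor and bilinear resampling is assumed here purely for illustration; all function names are hypothetical.

```python
import numpy as np

def nearest(f, th, tw):
    # nearest-neighbor resampling of a 2-D map to (th, tw)
    r = np.arange(th) * f.shape[0] // th
    c = np.arange(tw) * f.shape[1] // tw
    return f[r][:, c]

def bilinear(f, th, tw):
    # bilinear resampling with pixel-center alignment
    h, w = f.shape
    ys = np.clip((np.arange(th) + 0.5) * h / th - 0.5, 0, h - 1)
    xs = np.clip((np.arange(tw) + 0.5) * w / tw - 0.5, 0, w - 1)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]; wx = (xs - x0)[None, :]
    top = f[y0][:, x0] * (1 - wx) + f[y0][:, x1] * wx
    bot = f[y1][:, x0] * (1 - wx) + f[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

def mixed_resize(f, th, tw):
    # assumed "mixed interpolation": average of nearest and bilinear resampling
    return 0.5 * nearest(f, th, tw) + 0.5 * bilinear(f, th, tw)

def superpose_fuse(maps):
    # resize every group to the size of the largest map, then add them up
    th, tw = max((m.shape for m in maps), key=lambda s: s[0] * s[1])
    return sum(mixed_resize(m, th, tw) for m in maps)
```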
8. The end-to-end multi-size fusion-based pyramid neural network image defogging method according to claim 1, wherein the fog-free image obtained by the physical recovery module from the intermediate estimation parameter of the network and the original foggy image input to the network satisfies an atmospheric scattering model, and the atmospheric scattering model is rewritten; the atmospheric scattering model is given by the following formula:
I(x)=J(x)t(x)+A(1-t(x))
wherein I(x) is the foggy image, J(x) is the fog-free image, A is the global atmospheric light value, and t(x) is the transmittance;
and combining the global atmospheric light value A and the transmittance t(x) in the atmospheric scattering model yields the physical recovery model relied on by the physical recovery module, as shown in the following formula:
J(x)=k(x)I(x)-k(x)+b
wherein
k(x) = ((I(x) - A)/t(x) + (A - b))/(I(x) - 1)
b is a constant equal to 1, and k(x) is the intermediate estimation parameter of the network;
and reconstructing the fog-free image from the intermediate estimation parameter of the network and the foggy image input to the network by the physical recovery module.
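The physical recovery of claim 8 reduces to two one-line formulas, sketched below with illustrative function names. Note that recovering with k(x) reproduces the direct inversion of the atmospheric scattering model, J(x) = (I(x) - A)/t(x) + A, since k(x)(I(x) - 1) = (I(x) - A)/t(x) + A - b.

```python
def physical_recover(I, k, b=1.0):
    # J(x) = k(x) * I(x) - k(x) + b
    return k * I - k + b

def k_from_physics(I, A, t, b=1.0):
    # k(x) = ((I(x) - A) / t(x) + (A - b)) / (I(x) - 1)
    return ((I - A) / t + (A - b)) / (I - 1.0)
```

Both functions work elementwise, so they apply unchanged to scalars or numpy arrays of pixel intensities (assumed normalized so that I(x) != 1).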
9. An end-to-end multi-size fusion-based pyramid neural network image defogging system, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 8.
CN202210557615.9A 2022-05-19 2022-05-19 Multi-size fused pyramid neural network image defogging method and system Active CN115063304B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210557615.9A CN115063304B (en) 2022-05-19 2022-05-19 Multi-size fused pyramid neural network image defogging method and system

Publications (2)

Publication Number Publication Date
CN115063304A true CN115063304A (en) 2022-09-16
CN115063304B CN115063304B (en) 2023-08-25

Family

ID=83197927

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20180050832A (en) * 2016-11-07 2018-05-16 한국과학기술원 Method and system for dehazing image using convolutional neural network
CN110097519A (en) * 2019-04-28 2019-08-06 暨南大学 Double supervision image defogging methods, system, medium and equipment based on deep learning
CN110210354A (en) * 2019-05-23 2019-09-06 南京邮电大学 A kind of detection of haze weather traffic mark with know method for distinguishing
CN110930320A (en) * 2019-11-06 2020-03-27 南京邮电大学 Image defogging method based on lightweight convolutional neural network
CN111192219A (en) * 2020-01-02 2020-05-22 南京邮电大学 Image defogging method based on improved inverse atmospheric scattering model convolution network
CN111461291A (en) * 2020-03-13 2020-07-28 西安科技大学 Long-distance pipeline inspection method based on YO L Ov3 pruning network and deep learning defogging model
CN112381723A (en) * 2020-09-21 2021-02-19 清华大学 Light-weight and high-efficiency single image smog removing method
CN112767283A (en) * 2021-02-03 2021-05-07 西安理工大学 Non-uniform image defogging method based on multi-image block division
CN113344806A (en) * 2021-07-23 2021-09-03 中山大学 Image defogging method and system based on global feature fusion attention network
CN113673534A (en) * 2021-04-22 2021-11-19 江苏大学 RGB-D image fruit detection method based on fast RCNN
CN114155572A (en) * 2021-11-04 2022-03-08 华中师范大学 Facial expression recognition method and system

Non-Patent Citations (3)

Title
Boyi Li et al., "An All-in-One Network for Dehazing and Beyond", arXiv *
Lü Jianwei et al., "Image dehazing combining sky segmentation and haze density estimation", Optics and Precision Engineering, vol. 30, no. 4 *
Li Yongfu et al., "An aerial image dehazing algorithm based on improved AOD-Net", Acta Automatica Sinica, vol. 48, no. 6 *

Similar Documents

Publication Publication Date Title
CN106910175B (en) Single image defogging algorithm based on deep learning
CN109859120B (en) Image defogging method based on multi-scale residual error network
CN110349087B (en) RGB-D image high-quality grid generation method based on adaptive convolution
CN112001914A (en) Depth image completion method and device
CN107103285B (en) Face depth prediction method based on convolutional neural network
CN112184585B (en) Image completion method and system based on semantic edge fusion
CN109509156B (en) Image defogging processing method based on generation countermeasure model
CN110148088B (en) Image processing method, image rain removing method, device, terminal and medium
WO2021258959A1 (en) Image restoration method and apparatus, and electronic device
CN112508960A (en) Low-precision image semantic segmentation method based on improved attention mechanism
CN112241939B (en) Multi-scale and non-local-based light rain removal method
CA3137297A1 (en) Adaptive convolutions in neural networks
CN112184573A (en) Context aggregation residual single image rain removing method based on convolutional neural network
CN114140346A (en) Image processing method and device
CN116777764A (en) Diffusion model-based cloud and mist removing method and system for optical remote sensing image
CN113160286A (en) Near-infrared and visible light image fusion method based on convolutional neural network
CN116665156A (en) Multi-scale attention-fused traffic helmet small target detection system and method
CN116977531A (en) Three-dimensional texture image generation method, three-dimensional texture image generation device, computer equipment and storage medium
CN116863194A (en) Foot ulcer image classification method, system, equipment and medium
CN112669431B (en) Image processing method, apparatus, device, storage medium, and program product
CN115810112A (en) Image processing method, image processing device, storage medium and electronic equipment
CN115760641B (en) Remote sensing image cloud and fog removing method and equipment based on multiscale characteristic attention network
CN115063304B (en) Multi-size fused pyramid neural network image defogging method and system
CN115631108A (en) RGBD-based image defogging method and related equipment
CN113450267B (en) Transfer learning method capable of rapidly acquiring multiple natural degradation image restoration models

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant