CN112767269B - Panoramic image defogging method and device - Google Patents


Info

Publication number
CN112767269B
CN112767269B
Authority
CN
China
Prior art keywords
feature
feature map
panoramic image
convolution
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110061876.7A
Other languages
Chinese (zh)
Other versions
CN112767269A (en)
Inventor
李甲
赵栋
李红雨
赵沁平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN202110061876.7A priority Critical patent/CN112767269B/en
Publication of CN112767269A publication Critical patent/CN112767269A/en
Application granted granted Critical
Publication of CN112767269B publication Critical patent/CN112767269B/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G06T5/73 - Deblurring; Sharpening
    • G06T7/00 - Image analysis
    • G06T7/50 - Depth or shape recovery
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10028 - Range image; Depth image; 3D point clouds
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20081 - Training; Learning
    • G06T2207/20084 - Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Processing (AREA)
  • Studio Devices (AREA)

Abstract

The embodiments of the disclosure provide a panoramic image defogging method and device. One embodiment of the method comprises: given a panoramic image whose light intensity is smaller than a preset threshold, performing convolution on the panoramic image with a strip-sensitive convolution method to generate a first feature map sequence set; adding the feature maps with the same sequence number in the first feature map sequence set to generate a second feature map sequence, and obtaining a third feature vector corresponding to each feature map sequence in the first feature map sequence set; weighting and summing the first feature map sequence set based on the third feature vector set to generate a third feature map sequence; inputting the third feature map sequence into a depth estimation module to obtain a depth map; and inputting the panoramic image, the depth map and the first feature map sequence set into a defogging module to obtain a defogged panoramic image. This embodiment defogs the panoramic image, effectively improving the accuracy of the defogging result and producing more accurate outputs.

Description

Panoramic image defogging method and device
Technical Field
The embodiment of the disclosure relates to the technical field of computers, in particular to a method and a device for defogging a panoramic image.
Background
Panoramic image defogging takes a fogged panoramic input image, removes the fog, and restores a fog-free version of the image; the task can be regarded as a sub-field of image enhancement. Unlike traditional planar images, the images targeted here are panoramic, so the defogging algorithm of this disclosure is tailored to panoramic images rather than being a generic defogging algorithm. Defogging panoramic images matters for many downstream vision tasks, such as object detection and semantic segmentation in foggy weather, and is also significant for scenarios such as autonomous driving and everyday photography.
For the image defogging task, existing defogging algorithms are almost all designed around planar images. Many deep-learning-based methods achieve good results on traditional planar images, but their defogging performance on panoramic images remains unsatisfactory.
However, applying the above approaches to defogging of panoramic images often runs into the following technical problems:
Previous convolution methods have obvious shortcomings for panoramic image processing: existing convolution schemes are inflexible and rely too heavily on hand-crafted prior knowledge, and compared with the method proposed here they require more computation and are less efficient. One existing convolution method for panoramic images can dynamically select the receptive field of the convolution kernel, but it ultimately fuses features at the channel level; this overly restricts the flexibility of feature selection, and accuracy drops because features can only be fused at the channel level.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Some embodiments of the present disclosure propose a panoramic image defogging method and apparatus to solve one or more of the technical problems mentioned in the background section above.
In a first aspect, some embodiments of the present disclosure provide a panoramic image defogging method, the method including: giving a panoramic image with light intensity smaller than a preset threshold, and performing convolution processing on the panoramic image through a strip-sensitive convolution method to generate a first feature map sequence set; adding the feature maps with the same sequence number in the first feature map sequence set to generate a second feature map sequence, performing global average pooling on the second feature map sequence to generate a first feature vector, and inputting the first feature vector into at least one fully connected layer to obtain a third feature vector corresponding to each feature map sequence in the first feature map sequence set; weighting and summing the first feature map sequence set based on the third feature vector set to generate a third feature map sequence; inputting the third feature map sequence into a depth estimation module to obtain a depth map; and inputting the panoramic image, the depth map and the first feature map sequence set into a defogging module to obtain a defogged panoramic image.
In a second aspect, some embodiments of the present disclosure provide a panoramic image defogging device including: a convolution processing unit configured to give a panoramic image with light intensity smaller than a preset threshold and perform convolution processing on the panoramic image through a strip-sensitive convolution method to generate a first feature map sequence set; a first input unit configured to add the feature maps with the same sequence number in the first feature map sequence set to generate a second feature map sequence, perform global average pooling on the second feature map sequence to generate a first feature vector, and input the first feature vector into at least one fully connected layer to obtain a third feature vector corresponding to each feature map sequence in the first feature map sequence set; a summation processing unit configured to weight and sum the first feature map sequence set based on the third feature vector set to generate a third feature map sequence; a second input unit configured to input the third feature map sequence into the depth estimation module to obtain a depth map; and a third input unit configured to input the panoramic image, the depth map and the first feature map sequence set into the defogging module to obtain a defogged panoramic image.
The above embodiments of the present disclosure have the following beneficial effects. Compared with traditional networks for image defogging and panoramic image processing, the panoramic image defogging method and device of this disclosure have three beneficial characteristics: 1) features are fused at the strip level, which adds more semantic information and effectively improves the accuracy of the defogging result; 2) whereas prior work can only defog planar images, this method defogs panoramic images, producing more accurate results and improving the precision of downstream tasks; 3) because the improvements concern the convolution kernel and the feature fusion scheme, the method can in principle be applied to other panoramic image processing tasks to further improve their performance.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and components are not necessarily drawn to scale.
Fig. 1 is a schematic diagram of one application scenario of a panoramic image defogging method according to some embodiments of the present disclosure;
fig. 2 is a schematic diagram of yet another application scenario of a panoramic image defogging method according to some embodiments of the present disclosure;
fig. 3 is a flow diagram of some embodiments of a panoramic image defogging method according to the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings. The embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict.
It is noted that the modifiers "a", "an", and "the" in this disclosure are illustrative rather than limiting, and those skilled in the art will understand that they mean "one or more" unless the context clearly indicates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Referring to fig. 1, there is shown a schematic diagram of one application scenario of a panoramic image defogging method according to some embodiments of the present disclosure.
Fig. 1 is a flow chart of the overall network and describes the complete process of obtaining a clear panoramic image from a fogged panoramic image. First, the fogged panoramic image passes through the strip-sensitive convolution module to generate a group of feature maps. This group of feature maps is fed into the depth estimation module, which produces an estimate of the depth map through several encoding blocks, a Residual in Residual Block module, and several decoding blocks; the estimate is constrained by two loss functions. The depth estimate and the group of feature maps are then fed into the defogging module; after several encoding blocks, a Residual in Residual Block module, and several decoding blocks, the result is added to the input image, passed through pyramid pooling, and output as the final fog-free clear image, and the generated result is constrained by four loss functions. The defogging module can be a neural network that fuses the various features to defog the fogged panoramic image and generate a clear image corresponding to the input image. The Residual in Residual Block is a manually designed group of neural network layers that includes residual modules, long-skip connections, and the like. Strip-sensitive convolution applies convolution kernels of different sizes to the panoramic image so as to account for the different degrees of distortion at different image positions; the specific flow is given in the claims. An encoding block can be a multilayer neural network that encodes the input features into features of a certain dimension for subsequent computation. A minimal sketch of this overall data flow is given below.
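The following is a minimal, hypothetical PyTorch sketch of the data flow just described; the module name PanoramicDehazeNet and the interfaces of strip_conv, depth_estimator and dehaze_generator are illustrative assumptions, not the patented implementation.

```python
# Hypothetical sketch of the Fig. 1 data flow; names and interfaces are assumed.
import torch.nn as nn

class PanoramicDehazeNet(nn.Module):
    def __init__(self, strip_conv, depth_estimator, dehaze_generator):
        super().__init__()
        self.strip_conv = strip_conv              # strip-sensitive convolution module
        self.depth_estimator = depth_estimator    # U-Net style generator for the depth map
        self.dehaze_generator = dehaze_generator  # U-Net style generator for the clear image

    def forward(self, hazy_pano):
        # 1) strip-sensitive convolution produces a group of feature maps
        feats = self.strip_conv(hazy_pano)
        # 2) depth estimation from those features (two losses constrain it during training)
        depth = self.depth_estimator(feats)
        # 3) defogging conditioned on the hazy image, the depth map and the feature maps
        clear = self.dehaze_generator(hazy_pano, depth, feats)
        return clear, depth
```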
With continued reference to fig. 2, a schematic illustration of yet another application scenario of a panoramic image defogging method is shown in accordance with some embodiments of the present disclosure.
Fig. 2 describes the detailed process of the strip-sensitive convolution. First, rectangular convolution kernels of different sizes are applied to the input feature map (or the input image) to generate K feature maps. The K feature maps are summed to obtain a second feature map, and global average pooling of the second feature map yields a first feature vector. The first feature vector is passed through a 1×1 convolution and a PReLU activation function, and the result is then passed through K separate 1×1 convolutions to generate K third feature vectors. After normalization and bilinear interpolation of these weight vectors, they are used to weight the feature map group obtained at the beginning; during the weighting operation, the weight vectors act on the feature strips of the feature maps, finally producing the third feature map sequence. A feature strip is a horizontal band of a feature map, that is, a horizontal slice taken from each feature map.
With continued reference to fig. 3, a flow of some embodiments of a panoramic image defogging method according to the present disclosure is illustrated. The defogging method for the panoramic image comprises the following steps:
step S100, a panoramic image with light intensity smaller than a preset threshold value is given, and the panoramic image is subjected to convolution processing through a strip sensitive convolution method to generate a first feature map sequence set.
In some embodiments, the execution subject of the panoramic image defogging method may obtain, through a wired or wireless connection, a panoramic image whose light intensity is smaller than a certain preset threshold. The panoramic image can be captured by a shooting device such as a panoramic camera: it is a spherical image that is unwrapped into a planar image by an algorithm before defogging is performed. The strip-sensitive convolution method first divides the image horizontally into several strips during the convolution operation and then applies rectangular convolutions of different sizes to the strips, so that the panoramic image is convolved by convolutions of different sizes. A predetermined number of convolutions of different sizes may be used; "different sizes" refers to the size of the convolution kernels of the convolutional neural network, and a plurality of kernels having the same width but different lengths are applied to one input to generate the feature maps corresponding to each kernel. A feature map may be a three-dimensional tensor; it is the result of the convolution operation and an intermediate result of the network's computation, obtained by convolving the panoramic image with kernels of different sizes, and is essentially intermediate data produced while the code runs. The first feature map sequence set includes a predetermined number of first feature map sequences, and a first feature map may be the output of one convolution filter. A panoramic image whose light intensity is smaller than a certain preset threshold may be a fogged panoramic image. The shooting device may also be a mobile phone.
As an example, in order to adapt to the stretching of the panoramic image at different latitudes, four convolution kernels of different sizes (1×1, 1×3, 1×5 and 1×7) are used, and 4 corresponding feature maps of size H × W × C are generated for each input image for subsequent calculation; a minimal sketch of this step follows.
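The sketch below shows, under the assumption of a PyTorch implementation, how the K = 4 first feature maps could be generated with strip-shaped kernels of widths 1, 3, 5 and 7; the output channel count, the padding scheme, and the class name StripKernels are illustrative assumptions.

```python
# Sketch of generating the K first feature maps with strip-shaped kernels.
import torch
import torch.nn as nn

class StripKernels(nn.Module):
    def __init__(self, in_ch=3, out_ch=64, widths=(1, 3, 5, 7)):
        super().__init__()
        # one 1 x w branch per kernel width; the padding keeps the H x W resolution
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, kernel_size=(1, w), padding=(0, w // 2))
            for w in widths
        )

    def forward(self, x):                               # x: (N, in_ch, H, W)
        return [branch(x) for branch in self.branches]  # K maps, each (N, out_ch, H, W)

feature_maps = StripKernels()(torch.randn(1, 3, 256, 512))  # K = 4 feature maps
```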
And S200, adding feature graphs with the same sequence number in the first feature graph sequence set to generate a second feature graph sequence, performing global average pooling on the second feature graph sequence to generate a first feature vector, and inputting the first feature vector into at least one full-connection layer to obtain a third feature vector corresponding to each feature graph sequence in the first feature graph sequence set.
In some embodiments, the execution body may add the feature maps with the same sequence number in the first feature map sequence set to generate the second feature map sequence. Global average pooling of the second feature map sequence yields the first feature vector. The first feature vector is then convolved with a 1×1 convolution to obtain an intermediate vector, and a predetermined number of 1×1 convolutions are applied to the intermediate vector to generate, for each of the predetermined number of convolution kernels, a second vector in the corresponding convolution process; the second vectors serve as weights. Global average pooling averages all pixel values of a feature map to obtain a single value, so that the feature map is summarized by that value. The fully connected layer performs classification based on the features: each of its nodes is connected to all nodes of the previous layer in order to integrate the features extracted by the previous layer, and it maps the learned "distributed feature representation" to the sample label space. A fully connected layer whose previous layer is also fully connected can be converted into a convolution with a 1×1 kernel, while a fully connected layer whose previous layer is a convolutional layer can be converted into a global convolution whose kernel size equals the height H and width W of the previous layer's convolution output. The second feature map sequence includes a predetermined number of second feature maps.
In an optional implementation manner of some embodiments, the executing body may add feature maps with the same sequence number in the first feature map sequence set to generate a second feature map sequence, perform global average pooling on the second feature map sequence to generate a first feature vector, input the first feature vector into at least one full-connection layer, and obtain a third feature vector corresponding to each feature map sequence in the first feature map sequence set, and may include the following steps:
In a first step, a convolution is performed on the given panoramic image to generate an initial feature map $F_0$. K-1 rectangular convolution kernels and one square convolution kernel are applied to the initial feature map, respectively, to generate K first feature map sequences; the dimension of a first feature map is H × W × C. The feature maps with the same sequence number in the first feature map sequence set are summed to obtain the second feature map sequence. For each second feature map $F_{add}$ in the second feature map sequence, global average pooling over the W × C dimensions yields the first feature vector $s_{fv}$. The specific operations are:

$$F_{add} = \sum_{k=1}^{K} F_k, \qquad s_{fv}(h) = \frac{1}{W \cdot C}\sum_{w=1}^{W}\sum_{c=1}^{C} F_{add}(h, w, c)$$

where $F_{add}$ denotes a second feature map, k denotes a sequence number, K denotes the number of convolution kernels, F denotes a feature map, and $F_k$ denotes the first feature map in the convolution process corresponding to the k-th convolution kernel, with $F_k, F_{add} \in \mathbb{R}^{H \times W \times C}$. $s_{fv} \in \mathbb{R}^{H \times 1}$ denotes the first feature vector, w and c denote sequence numbers, W denotes the panoramic image width value, C denotes the number of channels of the first feature map group, and H denotes the panoramic image height value. The number of channels of the first feature map group is used to characterize the color channels of the panoramic image.
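A minimal sketch of this summation and per-row pooling, assuming an (N, C, H, W) tensor layout:

```python
# Sum the K feature maps element-wise, then average over the channel and width
# dimensions to obtain one value per image row (s_fv of length H).
import torch

def first_feature_vector(feature_maps):                   # list of K tensors (N, C, H, W)
    f_add = torch.stack(feature_maps, dim=0).sum(dim=0)   # F_add: (N, C, H, W)
    s_fv = f_add.mean(dim=(1, 3))                         # pool over C and W -> (N, H)
    return f_add, s_fv
```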
Secondly, global average pooling yields an overall feature representation, and K feature vectors are then generated through fully connected layers and used to weight different positions of each feature map. The specific operations are:

$$s_{fd}^{k} = W_{ex2}^{k}\,\delta\!\left(W_{ex1}\, s_{fv}\right), \qquad s_{at}^{k}(q) = \frac{\exp\!\left(s_{fd}^{k}(q)\right)}{\sum_{k'=1}^{K} \exp\!\left(s_{fd}^{k'}(q)\right)}$$

where $s_{fd}$ denotes the second feature vector, k denotes a sequence number, and $s_{fd}^{k} \in \mathbb{R}^{\frac{H}{r_d r_e} \times 1}$ denotes the second feature vector in the convolution process corresponding to the k-th convolution kernel. $s_{fv} \in \mathbb{R}^{H \times 1}$ denotes the first feature vector. $W_{ex1} \in \mathbb{R}^{\frac{H}{r_d} \times H}$ denotes the operation of the first 1×1 convolution, $\delta[\cdot]$ denotes the sigmoid function, and $W_{ex2} \in \mathbb{R}^{\frac{H}{r_d r_e} \times \frac{H}{r_d}}$ denotes the operation of the second 1×1 convolution. H denotes the panoramic image height value, $r_d$ denotes a first parameter and $r_e$ denotes a second parameter; the two parameters control the lengths of the vectors. $s_{at}$ denotes the third feature vector, $s_{at}^{k}$ denotes the third feature vector in the convolution process corresponding to the k-th convolution kernel, q denotes the corresponding dimension element of the vector, and K denotes the number of convolution kernels. The operation of the first convolution is applied to the first feature vector, and the operation of the second convolution is applied to the result of the sigmoid function.
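A minimal sketch of this weight-vector branch, in which nn.Linear layers stand in for the 1×1 convolutions applied to a vector; the class name StripAttention and the default values of r_d and r_e are assumptions (the figure description above mentions a PReLU activation, while the formula uses the sigmoid δ, which is what the sketch follows).

```python
# Shared reduction, K separate expansions, softmax across the K branches.
import torch
import torch.nn as nn

class StripAttention(nn.Module):
    def __init__(self, height, K=4, r_d=4, r_e=2):
        super().__init__()
        mid, low = height // r_d, height // (r_d * r_e)
        self.reduce = nn.Linear(height, mid)                                # W_ex1
        self.expand = nn.ModuleList(nn.Linear(mid, low) for _ in range(K))  # W_ex2^k

    def forward(self, s_fv):                                 # s_fv: (N, H)
        hidden = torch.sigmoid(self.reduce(s_fv))            # delta(W_ex1 s_fv)
        s_fd = torch.stack([fc(hidden) for fc in self.expand], dim=1)  # (N, K, H/(r_d*r_e))
        return torch.softmax(s_fd, dim=1)                    # s_at: normalised over K
```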
And step S300, based on the third feature vector set, weighting and summing the first feature map sequence set to generate a third feature map sequence.
In some embodiments, the executing subject may perform weighting and summing processing on the first feature map sequence set to generate the third feature map sequence, and may include the following steps:
and performing weighted calculation based on the first feature map sequence set and the third feature vector corresponding to each first feature map in the first feature map sequence set to obtain a weighted feature map sequence set.
And adding the weighted feature maps with the same sequence number in the weighted feature map sequence set to obtain a third feature map sequence.
In some optional implementations of some embodiments, the executing subject may perform weighting and summing processing on the first feature map sequence set to generate a third feature map sequence, and may include the following steps:
the third feature vector generated in step S200
Figure BDA0002902665260000089
Using a bilinear interpolation method
Figure BDA00029026652600000810
Dimension extension to H dimension, followed by use of a third feature vector
Figure BDA00029026652600000811
For feature map FkAre weighted and summed to form the final third feature map FsccThe method comprises the following specific operations:
Figure BDA00029026652600000812
wherein, FsccA third characteristic diagram is shown. k represents a serial number. K represents the number of convolution kernels. FkAnd showing a first characteristic diagram in the convolution process corresponding to the kth convolution kernel. g () represents a bilinear interpolation algorithm. satRepresenting a third feature vector.
Figure BDA0002902665260000091
And representing a third feature vector in the convolution process corresponding to the kth convolution kernel.
Figure BDA0002902665260000092
Figure BDA0002902665260000093
The dimension of expression is
Figure BDA0002902665260000094
A set of tensors of (a). r isdRepresenting a first parameter. r iseRepresenting the second parameter. H denotes a panoramic image height value.
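A minimal sketch of this fusion step, assuming the (N, C, H, W) layout and one-dimensional linear interpolation as the resampling g(·); the function name fuse_strips is an illustrative assumption.

```python
# Stretch each weight vector back to length H and weight the rows (feature
# strips) of the corresponding feature map before summing over the K branches.
import torch
import torch.nn.functional as F

def fuse_strips(feature_maps, s_at):
    # feature_maps: list of K tensors (N, C, H, W); s_at: (N, K, H')
    H = feature_maps[0].shape[2]
    weights = F.interpolate(s_at, size=H, mode="linear", align_corners=False)  # (N, K, H)
    return sum(
        fmap * weights[:, k].view(-1, 1, H, 1)   # one weight per image row
        for k, fmap in enumerate(feature_maps)
    )                                            # third feature map F_scc
```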
And step S400, inputting the third feature map sequence into a depth estimation module to obtain a depth map.
In some embodiments, the executing entity may input the third feature map sequence into the depth estimation module to obtain the depth map. The third feature map sequence may include a predetermined number of third feature maps. The depth estimation module may be structured as a GAN (Generative Adversarial Network), and its generator may use a U-Net network structure. U-Net is a fully convolutional network architecture for semantic segmentation; the U-Net structure segments the picture as a whole.
In the encoder part, each encoding block downsamples the features of the previous layer to half their original size and doubles the number of channels, and each encoding block is followed by a ResNet bottleneck block. The ResNet bottleneck block is a neural network module composed of convolutional layers and a long-skip connection. Similarly, each decoding block in the decoder part upsamples the features of the previous layer by a factor of 2 and halves the number of channels, and is likewise followed by a ResNet bottleneck block. The encoder and decoder are connected by a Residual in Residual Block, which contains several basic residual blocks and a long-skip connection; the Residual in Residual Block is a manually designed group of neural network layers that includes residual modules, long-skip connections, and the like. The generator receives the feature maps generated by the strip convolution and produces the depth map estimation result. The depth map records the distance of points in the scene relative to the camera; that is, each pixel value in the depth map may represent the distance between a point in the scene and the camera. Techniques by which a machine vision system acquires a scene depth map fall into two categories: passive ranging sensing and active depth sensing.
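A minimal sketch of one such encoding block and decoding block, assuming a stride-2 convolution for downsampling, a transposed convolution for upsampling, and a standard 1×1/3×3/1×1 bottleneck; these layer choices and the squeeze ratio are assumptions.

```python
# Encoding block (halve H and W, double channels), decoding block (double H and W,
# halve channels), each followed by a ResNet bottleneck with a skip connection.
import torch.nn as nn

class Bottleneck(nn.Module):
    def __init__(self, ch, squeeze=4):
        super().__init__()
        mid = max(ch // squeeze, 1)
        self.body = nn.Sequential(
            nn.Conv2d(ch, mid, 1), nn.PReLU(),
            nn.Conv2d(mid, mid, 3, padding=1), nn.PReLU(),
            nn.Conv2d(mid, ch, 1),
        )

    def forward(self, x):
        return x + self.body(x)                  # residual (skip) connection

def encode_block(ch):
    return nn.Sequential(
        nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1),            # H, W -> H/2, W/2
        nn.PReLU(),
        Bottleneck(ch * 2),
    )

def decode_block(ch):
    return nn.Sequential(
        nn.ConvTranspose2d(ch, ch // 2, 4, stride=2, padding=1),  # H, W -> 2H, 2W
        nn.PReLU(),
        Bottleneck(ch // 2),
    )
```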
And S500, inputting the panoramic image, the depth map and the first feature map sequence set into a defogging module to obtain a defogged panoramic image.
In some embodiments, the execution subject may input the panoramic image, the depth map, and the first feature map sequence set into the defogging module to obtain the defogged panoramic image. The defogging module may also be structured as a GAN, with a U-Net generator. Its network structure is almost the same as that of the depth estimation module in step S400; the difference is that the depth map estimation result of step S400 is added as an additional feature at each layer of the generator's encoder and decoder.
As an example, the execution body described above may be trained under the constraint of five loss functions: the GAN generation result loss $L_{gan}$, the feature-consistency constraint loss $L_{fm}$, the perceptual loss $L_{vgg}$, the depth estimation result loss $L_{l2}$, and the depth estimation multi-scale smoothing loss $L_{edge}$. The five loss functions are summed, and joint training with the Adam optimization method yields the final trained model. Adam is an extension of stochastic gradient descent that is widely used in deep learning for computer vision and natural language processing. Given a single input image, the trained model outputs the defogged image. A minimal sketch of one joint training step is given below.
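The sketch below sums the five losses and performs one Adam update; the individual loss implementations, their relative weights (taken as 1 here), and the learning rate are assumptions not specified in the text above.

```python
# One joint training step with the five losses summed and optimised with Adam.
import torch

def train_step(model, optimizer, hazy, clear_gt, depth_gt, losses):
    # losses: dict of callables keyed "gan", "fm", "vgg", "l2", "edge"
    clear_pred, depth_pred = model(hazy)
    total = (losses["gan"](clear_pred, clear_gt)
             + losses["fm"](clear_pred, clear_gt)
             + losses["vgg"](clear_pred, clear_gt)
             + losses["l2"](depth_pred, depth_gt)
             + losses["edge"](depth_pred, depth_gt))
    optimizer.zero_grad()
    total.backward()
    optimizer.step()                             # Adam update
    return total.item()

# optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)  # learning rate is an assumption
```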
The foregoing description covers only the preferred embodiments of the disclosure and illustrates the technical principles employed. Those skilled in the art will appreciate that the scope of the invention in the embodiments of the present disclosure is not limited to the specific combination of the features described above, and also covers other technical solutions formed by any combination of those features or their equivalents without departing from the inventive concept, for example solutions in which the above features are replaced by (but not limited to) technical features with similar functions disclosed in the embodiments of the present disclosure.

Claims (4)

1. A panoramic image defogging method comprises the following steps:
step S100, a panoramic image with light intensity smaller than a preset threshold value is given, and the given panoramic image is convolved to generate an initial feature map $F_0$; K-1 rectangular convolution kernels and one square convolution kernel are respectively applied to the initial feature map to generate K first feature map sequences as a first feature map sequence set;
step S200, adding feature maps with the same sequence number in the first feature map sequence set to generate a second feature map sequence, performing global average pooling on the second feature map sequence to generate a first feature vector, and inputting the first feature vector to at least one full-connection layer to obtain a third feature vector corresponding to each feature map sequence in the first feature map sequence set;
step S300, based on the third feature vector set, weighting and summing the first feature map sequence set to generate a third feature map sequence;
step S400, inputting the third feature map sequence into a depth estimation module to obtain a depth map;
and S500, inputting the panoramic image, the depth map and the first feature map sequence set into a defogging module to obtain a defogged panoramic image.
2. The method according to claim 1, wherein the adding the feature maps with the same sequence number in the first feature map sequence set to generate a second feature map sequence, performing global average pooling on the second feature map sequence to generate a first feature vector, and inputting the first feature vector into at least one fully connected layer to obtain a third feature vector corresponding to each feature map sequence in the first feature map sequence set comprises:
summing the feature maps with the same sequence number in the first feature map sequence set to obtain a second feature map sequence, wherein the dimension of a first feature map is H × W × C;
for each second feature map $F_{add}$ in the second feature map sequence, performing global average pooling over the W × C dimensions to obtain a first feature vector $s_{fv}$, the specific operations being

$$F_{add} = \sum_{k=1}^{K} F_k, \qquad s_{fv}(h) = \frac{1}{W \cdot C}\sum_{w=1}^{W}\sum_{c=1}^{C} F_{add}(h, w, c)$$

wherein $F_{add}$ represents a second feature map, k represents a sequence number, K represents the number of convolution kernels, F represents a feature map, $F_k$ represents the first feature map in the convolution process corresponding to the k-th convolution kernel, $F_k, F_{add} \in \mathbb{R}^{H \times W \times C}$, $s_{fv} \in \mathbb{R}^{H \times 1}$ represents the first feature vector, w represents a sequence number, c represents a sequence number, W represents the panoramic image width value, C represents the number of channels of the first feature map group, and H represents the panoramic image height value;
obtaining an overall feature representation through global average pooling, and then generating K feature vectors through fully connected layers for weighting different positions of each feature map, the specific operations being

$$s_{fd}^{k} = W_{ex2}^{k}\,\delta\!\left(W_{ex1}\, s_{fv}\right), \qquad s_{at}^{k}(q) = \frac{\exp\!\left(s_{fd}^{k}(q)\right)}{\sum_{k'=1}^{K} \exp\!\left(s_{fd}^{k'}(q)\right)}$$

wherein $s_{fd}$ represents a second feature vector, k represents a sequence number, $s_{fd}^{k} \in \mathbb{R}^{\frac{H}{r_d r_e} \times 1}$ represents the second feature vector in the convolution process corresponding to the k-th convolution kernel, $s_{fv} \in \mathbb{R}^{H \times 1}$ represents the first feature vector, $W_{ex1} \in \mathbb{R}^{\frac{H}{r_d} \times H}$ denotes the operation of the first 1×1 convolution, $\delta[\cdot]$ denotes the sigmoid function, $W_{ex2} \in \mathbb{R}^{\frac{H}{r_d r_e} \times \frac{H}{r_d}}$ denotes the operation of the second 1×1 convolution, H represents the panoramic image height value, $r_d$ represents a first parameter, $r_e$ represents a second parameter, $s_{at}$ represents the third feature vector, $s_{at}^{k}$ represents the third feature vector in the convolution process corresponding to the k-th convolution kernel, q represents the corresponding dimension element of the vector, and K represents the number of convolution kernels.
3. The method of claim 2, wherein the weighting and summing the first feature map sequence set based on the third feature vector set to generate a third feature map sequence comprises:
extending the third feature vector $s_{at}^{k}$ generated in step S200 to dimension H using a bilinear interpolation method, and then using the third feature vector $s_{at}^{k}$ to weight and sum the feature maps $F_k$ to form the final third feature map $F_{scc}$, the specific operation being

$$F_{scc} = \sum_{k=1}^{K} g\!\left(s_{at}^{k}\right) \odot F_{k}$$

wherein $F_{scc}$ represents the third feature map, k represents a sequence number, K represents the number of convolution kernels, $F_k$ represents the first feature map in the convolution process corresponding to the k-th convolution kernel, g(·) represents a bilinear interpolation algorithm, $s_{at}$ represents the third feature vector, $s_{at}^{k} \in \mathbb{R}^{\frac{H}{r_d r_e} \times 1}$ represents the third feature vector in the convolution process corresponding to the k-th convolution kernel, $r_d$ represents a first parameter, $r_e$ represents a second parameter, and H represents the panoramic image height value.
4. A panoramic image defogging device comprising:
a convolution processing unit configured to give a panoramic image with light intensity smaller than a preset threshold, convolve the given panoramic image to generate an initial feature map $F_0$, and respectively apply K-1 rectangular convolution kernels and one square convolution kernel to the initial feature map to generate K first feature map sequences as a first feature map sequence set;
the first input unit is configured to add feature maps with the same sequence number in the first feature map sequence set to generate a second feature map sequence, perform global average pooling on the second feature map sequence to generate a first feature vector, and input the first feature vector to at least one full-connection layer to obtain a third feature vector corresponding to each feature map sequence in the first feature map sequence set;
a summation processing unit configured to weight and sum the first feature map sequence set based on the third feature vector set to generate a third feature map sequence;
the second input unit is configured to input the third feature map sequence to the depth estimation module to obtain a depth map;
and the third input unit is configured to input the panoramic image, the depth map and the first feature map sequence set into the defogging module to obtain a defogged panoramic image.
CN202110061876.7A 2021-01-18 2021-01-18 Panoramic image defogging method and device Active CN112767269B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110061876.7A CN112767269B (en) 2021-01-18 2021-01-18 Panoramic image defogging method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110061876.7A CN112767269B (en) 2021-01-18 2021-01-18 Panoramic image defogging method and device

Publications (2)

Publication Number Publication Date
CN112767269A (en) 2021-05-07
CN112767269B (en) 2022-11-01

Family

ID=75702733

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110061876.7A Active CN112767269B (en) 2021-01-18 2021-01-18 Panoramic image defogging method and device

Country Status (1)

Country Link
CN (1) CN112767269B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113781363B (en) * 2021-09-29 2024-03-05 Beihang University Image enhancement method with adjustable defogging effect

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108830199A (en) * 2018-05-31 2018-11-16 京东方科技集团股份有限公司 Identify method, apparatus, readable medium and the electronic equipment of traffic light signals
CN109584188A (en) * 2019-01-15 2019-04-05 东北大学 A kind of image defogging method based on convolutional neural networks
CN109918951A (en) * 2019-03-12 2019-06-21 中国科学院信息工程研究所 A kind of artificial intelligence process device side channel system of defense based on interlayer fusion
CN112001923A (en) * 2020-11-02 2020-11-27 中国人民解放军国防科技大学 Retina image segmentation method and device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108830199A (en) * 2018-05-31 2018-11-16 京东方科技集团股份有限公司 Identify method, apparatus, readable medium and the electronic equipment of traffic light signals
CN109584188A (en) * 2019-01-15 2019-04-05 东北大学 A kind of image defogging method based on convolutional neural networks
CN109918951A (en) * 2019-03-12 2019-06-21 中国科学院信息工程研究所 A kind of artificial intelligence process device side channel system of defense based on interlayer fusion
CN112001923A (en) * 2020-11-02 2020-11-27 中国人民解放军国防科技大学 Retina image segmentation method and device

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
A Novel Residual Dense Pyramid Network for Image Dehazing; Shibai Yin et al.; Entropy; 2019-11-15; full text *
Pyramid Global Context Network for Image Dehazing; Dong Zhao et al.; IEEE Transactions on Circuits and Systems for Video Technology; 2020-11-09; full text *
Convolutional neural network image dehazing algorithm based on multi-feature fusion; Xu Yan et al.; Laser & Optoelectronics Progress; 2018-03-31; full text *
Multi-scale feature fusion network based on feature pyramid; Guo Qifan et al.; Chinese Journal of Engineering Mathematics; 2020-10-15 (No. 05); full text *

Also Published As

Publication number Publication date
CN112767269A (en) 2021-05-07

Similar Documents

Publication Publication Date Title
CN109360171B (en) Real-time deblurring method for video image based on neural network
CN108537746B (en) Fuzzy variable image blind restoration method based on deep convolutional network
Cao et al. Underwater image restoration using deep networks to estimate background light and scene depth
EP1026631A2 (en) Method for inferring scenes from test images and training data using probability propagation in a markov network
CN112541877B (en) Defuzzification method, system, equipment and medium for generating countermeasure network based on condition
CN111179196B (en) Multi-resolution depth network image highlight removing method based on divide-and-conquer
CN112149563A (en) Method and system for estimating postures of key points of attention mechanism human body image
CN113781659A (en) Three-dimensional reconstruction method and device, electronic equipment and readable storage medium
CN112767279A (en) Underwater image enhancement method for generating countermeasure network based on discrete wavelet integration
CN112767269B (en) Panoramic image defogging method and device
CN114419392A (en) Hyperspectral snapshot image recovery method, device, equipment and medium
CN115546505A (en) Unsupervised monocular image depth estimation method based on deep learning
CN110942484A (en) Camera self-motion estimation method based on occlusion perception and feature pyramid matching
CN106709862B (en) A kind of image processing method and device
CN112085674B (en) Aerial image deblurring algorithm based on neural network
CN111445465B (en) Method and equipment for detecting and removing snow or rain belt of light field image based on deep learning
CN114078149A (en) Image estimation method, electronic equipment and storage medium
CN114565953A (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
CN116645300A (en) Simple lens point spread function estimation method
CN109064430B (en) Cloud removing method and system for aerial region cloud-containing image
CN115937048A (en) Illumination controllable defogging method based on non-supervision layer embedding and vision conversion model
CN112767264B (en) Image deblurring method and system based on graph convolution neural network
CN114494065A (en) Image deblurring method, device and equipment and readable storage medium
CN113902933A (en) Ground segmentation network model training method, device, equipment and medium
CN114782980A (en) Light-weight pedestrian detection method based on attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant