CN112767269B - Panoramic image defogging method and device - Google Patents
- Publication number
- CN112767269B (application CN202110061876.7A)
- Authority
- CN
- China
- Prior art keywords
- feature
- feature map
- panoramic image
- convolution
- sequence
- Prior art date
- Legal status: Active (the legal status is an assumption and is not a legal conclusion)
Classifications
- G06T5/73—Deblurring; Sharpening (G06T—Image data processing or generation, in general; G06T5/00—Image enhancement or restoration)
- G06T7/50—Depth or shape recovery (G06T7/00—Image analysis)
- G06T2207/10028—Range image; Depth image; 3D point clouds (G06T2207/10—Image acquisition modality)
- G06T2207/20081—Training; Learning (G06T2207/20—Special algorithmic details)
- G06T2207/20084—Artificial neural networks [ANN] (G06T2207/20—Special algorithmic details)
Abstract
Embodiments of the present disclosure disclose a panoramic image defogging method and device. One embodiment of the method comprises: given a panoramic image whose light intensity is smaller than a preset threshold value, performing convolution processing on the panoramic image through a strip-sensitive convolution method to generate a first feature map sequence set; adding feature maps with the same sequence number in the first feature map sequence set to generate a second feature map sequence, and obtaining a third feature vector corresponding to each feature map sequence in the first feature map sequence set; based on the third feature vector set, weighting and summing the first feature map sequence set to generate a third feature map sequence; inputting the third feature map sequence into a depth estimation module to obtain a depth map; and inputting the panoramic image, the depth map and the first feature map sequence set into a defogging module to obtain a defogged panoramic image. This embodiment effectively improves the accuracy of the defogging result: the panoramic image is defogged and a more accurate result is generated.
Description
Technical Field
The embodiment of the disclosure relates to the technical field of computers, in particular to a method and a device for defogging a panoramic image.
Background
Given a foggy panoramic input image, the object of the panoramic image defogging method and device is to process the image, remove the fog, and restore the fog-free state of the image; the task can be regarded as a sub-field of image enhancement. The images targeted here are panoramic images, and unlike for conventional planar images, the defogging algorithm of this invention is specifically tailored to panoramas. Panoramic image defogging is significant for many downstream visual tasks, such as object detection and semantic segmentation in foggy weather, as well as for scenarios such as autonomous driving and everyday photography.
For the image defogging task, existing defogging algorithms almost all work on planar images. Many deep-learning-based methods achieve good results on the defogging of conventional planar images, but their defogging performance on panoramic images remains unsatisfactory.
However, when the above-described manner is adopted for defogging of the panoramic image, there are often technical problems as follows:
for panoramic image processing, previous convolution methods have obvious drawbacks: existing convolution methods are inflexible and overly constrained in how artificial prior knowledge is introduced, and by comparison require a large amount of computation with low efficiency. One existing method applied to panoramic image convolution can dynamically select the receptive field of the convolution kernel, but it ultimately fuses features at the channel level; this overly restricts the flexibility of feature selection, and because features can only be fused at the channel level, accuracy is reduced.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Some embodiments of the present disclosure propose a panoramic image defogging method and apparatus to solve one or more of the technical problems mentioned in the background section above.
In a first aspect, some embodiments of the present disclosure provide a method of defogging a panoramic image, the method including: giving a panoramic image with light intensity smaller than a preset threshold value, and performing convolution processing on the panoramic image through a strip sensitive convolution method to generate a first feature map sequence set; adding feature graphs with the same sequence number in the first feature graph sequence set to generate a second feature graph sequence, performing global average pooling on the second feature graph sequence to generate a first feature vector, and inputting the first feature vector into at least one full-connection layer to obtain a third feature vector corresponding to each feature graph sequence in the first feature graph sequence set; based on the third feature vector set, weighting and summing the first feature map sequence set to generate a third feature map sequence; inputting the third feature map sequence into a depth estimation module to obtain a depth map; and inputting the panoramic image, the depth map and the first feature map sequence set into a defogging module to obtain a defogged panoramic image.
In a second aspect, some embodiments of the present disclosure provide a panoramic image defogging device including: the convolution processing unit is configured to give a panoramic image with light intensity smaller than a preset threshold value, and perform convolution processing on the panoramic image through a strip sensitive convolution method to generate a first feature map sequence set; the first input unit is configured to add feature maps with the same sequence number in the first feature map sequence set to generate a second feature map sequence, perform global average pooling on the second feature map sequence to generate a first feature vector, and input the first feature vector to at least one full-connection layer to obtain a third feature vector corresponding to each feature map sequence in the first feature map sequence set; a summation processing unit configured to weight and sum the first feature map sequence set based on the third feature vector set to generate a third feature map sequence; the second input unit is configured to input the third feature map sequence to the depth estimation module to obtain a depth map; and the third input unit is configured to input the panoramic image, the depth map and the first feature map sequence set into the defogging module to obtain a defogged panoramic image.
The above embodiments of the present disclosure have the following beneficial effects. Compared with conventional networks for image defogging and panoramic image processing, the panoramic image defogging method and device of the present disclosure have three beneficial characteristics: 1) features are fused at the strip level, which adds more semantic information and effectively improves the accuracy of the defogging result; 2) unlike prior work that can only defog planar images, the method can defog panoramic images, generating more accurate results and improving the precision of downstream tasks; 3) the method is an improvement of the convolution kernel and the feature fusion mode, and can in principle be applied to other panoramic image processing tasks to further improve their performance.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and components are not necessarily drawn to scale.
Fig. 1 is a schematic diagram of one application scenario of a panoramic image defogging method according to some embodiments of the present disclosure;
fig. 2 is a schematic diagram of yet another application scenario of a panoramic image defogging method according to some embodiments of the present disclosure;
fig. 3 is a flow diagram of some embodiments of a panoramic image defogging method according to the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings. The embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict.
It is noted that references to "a", "an", and "the" modifications in this disclosure are intended to be illustrative rather than limiting, and that those skilled in the art will recognize that "one or more" may be used unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Referring to fig. 1, there is shown a schematic diagram of one application scenario of a panoramic image defogging method according to some embodiments of the panoramic image defogging method of the present disclosure.
Fig. 1 is a flow chart of the overall network, describing the overall process of obtaining a corresponding clear panoramic image from a fogged panoramic image. First, the fogged panoramic image passes through the strip-sensitive convolution module to generate a group of feature maps. This group of feature maps is input into the depth estimation module, where it passes through several coding blocks, a Residual-in-Residual Block module and several decoding blocks to generate a depth map estimate, constrained by two loss functions. The depth estimate and the group of feature maps are then input into the defogging module; after several coding blocks, a Residual-in-Residual Block module and several decoding blocks, the result is added to the input image, pyramid pooling is applied, and the final fog-free clear image is output, with the generated result constrained by four loss functions. The defogging module may be a neural network that fuses various features to defog the fogged panoramic image and generate a clear image corresponding to the input. The Residual-in-Residual Block is a manually designed group of neural network layers comprising residual modules, long skip connections and the like. Strip-sensitive convolution uses convolution kernels of different sizes to convolve the panoramic image so as to perceive the different degrees of distortion at different image positions; the specific flow is given in the claims. A coding block may be a multilayer neural network that encodes the input features into features of a certain dimension for subsequent computation.
With continued reference to fig. 2, a schematic illustration of yet another application scenario of a panoramic image defogging method is shown in accordance with some embodiments of the panoramic image defogging method of the present disclosure.
Fig. 2 describes the detailed process of the strip-sensitive convolution. First, rectangular convolution kernels of different sizes are applied to an input feature map (or input image) to generate K feature maps. The K feature maps are summed to obtain a second feature map, and global average pooling on the second feature map yields a first feature vector. The first feature vector passes through a 1 × 1 convolution and a PReLU activation function, and K further 1 × 1 convolutions are applied to the result to generate K third feature vectors. After normalization and bilinear interpolation, these weight vectors are used to weight the feature map group obtained at the beginning; during the weighting operation, the weight vectors act on the feature strips of the feature maps, finally yielding the third feature map sequence. A feature strip is a horizontal strip taken from a feature map.
With continued reference to fig. 3, a flow of some embodiments of a panoramic image defogging method according to the present disclosure is illustrated. The defogging method for the panoramic image comprises the following steps:
step S100, a panoramic image with light intensity smaller than a preset threshold value is given, and the panoramic image is subjected to convolution processing through a strip sensitive convolution method to generate a first feature map sequence set.
In some embodiments, the execution subject of the panoramic image defogging method may obtain, by wired or wireless connection, a panoramic image whose light intensity is smaller than a preset threshold value, i.e., a fogged panoramic image. The panoramic image may be captured by a shooting device, such as a mobile phone or a panoramic camera; it is a spherical image that is unfolded into a planar image by an algorithm before defogging. In the strip-sensitive convolution method, the image is first divided horizontally into several strips during the convolution operation, and rectangular convolutions of different sizes are then applied to the strips. The panoramic image is convolved with a predetermined number of kernels of different sizes, where "size" refers to the size of a convolution kernel of the convolutional neural network: several kernels with the same width but different lengths are applied to one input, generating several feature maps, one per kernel. A feature map may be a three-dimensional tensor; it is the result of the convolution operation and an intermediate result of the neural network's computation, i.e., intermediate data during the code run. The feature maps are output by convolving the panoramic image with the convolution kernels of different sizes. The first feature map sequence set includes a predetermined number of first feature map sequences, and a first feature map may be the output of a convolution filter.
As an example, in order to adapt to the image stretching effect of the panoramic image at different latitude positions, four convolution kernels of different sizes, 1 × 1, 1 × 3, 1 × 5 and 1 × 7, are used, and 4 corresponding feature maps are generated for each input image for subsequent calculation, wherein the feature maps have the size of H × W × C.
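As an illustration of this multi-width horizontal-kernel scheme, the following NumPy sketch applies box (mean) kernels of widths 1, 3, 5 and 7 as stand-ins for the learned convolution weights; the function name and the use of box filters are assumptions for illustration, not the patent's learned network:

```python
import numpy as np

def strip_convolve(image, kernel_widths=(1, 3, 5, 7)):
    """Convolve a single-channel panorama with horizontal 1 x k kernels.

    A box (mean) kernel stands in for the learned weights; each kernel
    width yields one feature map of the same H x W shape as the input
    (zero padding along the width axis).
    """
    H, W = image.shape
    feature_maps = []
    for k in kernel_widths:
        pad = k // 2
        padded = np.pad(image, ((0, 0), (pad, pad)))
        fmap = np.empty_like(image, dtype=float)
        for w in range(W):
            # wider kernels average over a longer horizontal strip,
            # matching the stronger stretching near the panorama's poles
            fmap[:, w] = padded[:, w:w + k].mean(axis=1)
        feature_maps.append(fmap)
    return feature_maps

maps = strip_convolve(np.arange(12.0).reshape(3, 4))
print(len(maps), maps[0].shape)  # 4 (3, 4)
```

With the width-1 kernel the map equals the input, while wider kernels blend progressively longer horizontal neighborhoods, which is the effect the four kernel sizes above are meant to produce.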
And S200, adding feature graphs with the same sequence number in the first feature graph sequence set to generate a second feature graph sequence, performing global average pooling on the second feature graph sequence to generate a first feature vector, and inputting the first feature vector into at least one full-connection layer to obtain a third feature vector corresponding to each feature graph sequence in the first feature graph sequence set.
In some embodiments, the execution body may generate the second feature map sequence by adding the feature maps with the same sequence number in the first feature map sequence set. Global average pooling of the second feature map sequence yields a first feature vector. The first feature vector is convolved with a 1 × 1 convolution to obtain an intermediate vector, and a predetermined number of 1 × 1 convolutions applied to the intermediate vector generate the second vectors, one per convolution kernel; a second vector serves as a set of weights. Global average pooling averages all pixel values of a feature map to obtain one value, so that the corresponding feature map is represented by that value. The fully connected layer is used for classification based on features: each node in the fully connected layer may be connected to all nodes in the previous layer to integrate the features extracted by that layer, mapping the learned "distributed feature representation" to the sample label space. A fully connected layer whose previous layer is also fully connected can be converted into a convolution with a 1 × 1 kernel, while a fully connected layer whose previous layer is a convolutional layer can be converted into a global convolution whose kernel size is H × W, where H and W are the height and width of the previous layer's convolution result. The second feature map sequence includes a predetermined number of second feature maps.
In an optional implementation manner of some embodiments, the executing body may add feature maps with the same sequence number in the first feature map sequence set to generate a second feature map sequence, perform global average pooling on the second feature map sequence to generate a first feature vector, input the first feature vector into at least one full-connection layer, and obtain a third feature vector corresponding to each feature map sequence in the first feature map sequence set, and may include the following steps:
in a first step, a convolution is performed on the given panoramic image to generate an initial feature map F_0. K-1 rectangular convolution kernels and one square convolution kernel are applied to the initial feature map, respectively, to generate K first feature map sequences, each first feature map having dimension H × W × C. The feature maps with the same sequence number in the first feature map sequence set are summed to obtain the second feature map sequence. For each second feature map F_add in the sequence, global average pooling over the W × C dimensions yields the first feature vector s_fv. The specific operations are:

F_{add} = \sum_{k=1}^{K} F_k, \qquad s_{fv}(h) = \frac{1}{W \cdot C} \sum_{w=1}^{W} \sum_{c=1}^{C} F_{add}(h, w, c)

where F_add denotes a second feature map; k is a serial number and K is the number of convolution kernels; F_k denotes the first feature map in the convolution process corresponding to the k-th convolution kernel, a tensor of dimension H × W × C; s_fv denotes the first feature vector, a tensor of dimension H × 1; w and c are serial numbers; W is the panoramic image width value; C is the number of channels of the first feature map set, which characterizes the color channels of the panoramic image; and H is the panoramic image height value.
Secondly, the global average pooling provides an overall feature representation; K feature vectors are then generated through the fully connected layers and used to weight different positions of each feature map. The specific operations are:

s_{fd}^{k} = W_{ex2}^{k}\big(\delta(W_{ex1}(s_{fv}))\big), \qquad s_{at}^{k}(q) = \frac{\exp\big(s_{fd}^{k}(q)\big)}{\sum_{k'=1}^{K} \exp\big(s_{fd}^{k'}(q)\big)}

where s_fd denotes the second feature vector; k is a serial number; s_fd^k denotes the second feature vector in the convolution process corresponding to the k-th convolution kernel; s_fv denotes the first feature vector, a tensor of dimension H × 1; W_ex1 denotes the operation of the first 1 × 1 convolution and W_ex2 the operation of the second 1 × 1 convolution; δ[·] denotes the sigmoid function; H is the panoramic image height value; r_d and r_e denote the first and second parameters, which control the lengths of the intermediate and output vectors; s_at denotes the third feature vector and s_at^k the third feature vector in the convolution process corresponding to the k-th convolution kernel; q denotes the corresponding dimension element of the vector; and K is the number of convolution kernels. The first convolution operates on the first feature vector, and the second convolution operates on the result of the sigmoid function.
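A minimal NumPy sketch of this pooling-and-weighting step follows. The random matrices stand in for the learned 1 × 1 convolutions W_ex1 and W_ex2, and the shapes (GAP to an H-vector, K weight vectors, softmax across the K kernels) follow the formulas above; all names and the random stand-in weights are assumptions for illustration:

```python
import numpy as np

def strip_attention(feature_maps, rd=2):
    """Per-strip weights for K feature maps of shape (H, W, C).

    GAP over width and channels gives an H-vector (s_fv); two random
    projections stand in for the learned 1x1 convolutions W_ex1/W_ex2;
    a softmax across the K maps yields the third feature vectors s_at.
    """
    rng = np.random.default_rng(0)
    K, H = len(feature_maps), feature_maps[0].shape[0]
    f_add = sum(feature_maps)                        # second feature map
    s_fv = f_add.mean(axis=(1, 2))                   # first feature vector, (H,)
    w_ex1 = rng.normal(size=(H // rd, H))            # stand-in for W_ex1
    hidden = 1.0 / (1.0 + np.exp(-(w_ex1 @ s_fv)))   # sigmoid, per the text
    s_fd = np.stack([rng.normal(size=(H, H // rd)) @ hidden for _ in range(K)])
    e = np.exp(s_fd - s_fd.max(axis=0))              # stable softmax over k
    return e / e.sum(axis=0)                         # (K, H), sums to 1 per strip

s_at = strip_attention([k * np.ones((4, 8, 3)) for k in range(4)])
print(s_at.shape)  # (4, 4)
```

The softmax guarantees that, for every horizontal strip, the K weights are non-negative and sum to one, so each strip's final feature is a convex combination of the K kernel responses.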
And step S300, based on the third feature vector set, weighting and summing the first feature map sequence set to generate a third feature map sequence.
In some embodiments, the executing subject may perform weighting and summing processing on the first feature map sequence set to generate the third feature map sequence, and may include the following steps:
and performing weighted calculation based on the first feature map sequence set and the third feature vector corresponding to each first feature map in the first feature map sequence set to obtain a weighted feature map sequence set.
And adding the weighted feature maps with the same sequence number in the weighted feature map sequence set to obtain a third feature map sequence.
In some optional implementations of some embodiments, the executing subject may perform weighting and summing processing on the first feature map sequence set to generate a third feature map sequence, and may include the following steps:
the third feature vector generated in step S200Using a bilinear interpolation methodDimension extension to H dimension, followed by use of a third feature vectorFor feature map FkAre weighted and summed to form the final third feature map FsccThe method comprises the following specific operations:
wherein, FsccA third characteristic diagram is shown. k represents a serial number. K represents the number of convolution kernels. FkAnd showing a first characteristic diagram in the convolution process corresponding to the kth convolution kernel. g () represents a bilinear interpolation algorithm. satRepresenting a third feature vector.And representing a third feature vector in the convolution process corresponding to the kth convolution kernel. The dimension of expression isA set of tensors of (a). r isdRepresenting a first parameter. r iseRepresenting the second parameter. H denotes a panoramic image height value.
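This weighted fusion can be sketched in NumPy as follows; 1-D linear interpolation (`np.interp`) stands in for the bilinear g(·), and the function name is an assumption for illustration:

```python
import numpy as np

def weighted_fuse(feature_maps, strip_weights):
    """Weight each feature map's horizontal strips and sum the maps.

    Each weight vector (length h <= H) is stretched to H rows by linear
    interpolation, a 1-D stand-in for the bilinear g(.); it then scales
    every row (horizontal strip) of its feature map, and the weighted
    maps are summed into the third feature map F_scc.
    """
    H = feature_maps[0].shape[0]
    f_scc = np.zeros_like(feature_maps[0], dtype=float)
    for f_k, w_k in zip(feature_maps, strip_weights):
        h = len(w_k)
        stretched = np.interp(np.linspace(0.0, h - 1.0, H), np.arange(h), w_k)
        f_scc += stretched[:, None, None] * f_k   # weight whole strips
    return f_scc

# two 4 x 2 x 1 maps with equal half weights -> element-wise average
fused = weighted_fuse(
    [np.ones((4, 2, 1)), 2.0 * np.ones((4, 2, 1))],
    [np.array([0.5, 0.5]), np.array([0.5, 0.5])],
)
print(fused[0, 0, 0])  # 1.5
```

Because the weights vary only along the height axis, every pixel in a given row of F_k receives the same weight, which is exactly the strip-level (rather than channel-level) fusion the method argues for.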
And step S400, inputting the third feature map sequence into a depth estimation module to obtain a depth map.
In some embodiments, the executing entity may input the third feature map sequence into the depth estimation module to obtain the depth map. The third feature map sequence may include a predetermined number of third feature maps. The depth estimation module may be structured as a GAN (Generative Adversarial Network), with a generator of U-Net structure. U-Net is a fully convolutional network algorithm for semantic segmentation; its network structure segments a picture as a whole.
In the encoder part, each coding block downsamples the features of the previous layer to half the original size and doubles the number of channels, and each coding block is followed by a ResNet bottleneck block. The ResNet bottleneck is a neural network module composed of convolutional layers and long skip connections. Similarly, each decoding block of the decoder part upsamples the features of the previous layer by a factor of 2 and halves the number of channels, and is likewise followed by a ResNet bottleneck block. The encoder and decoder are connected by a Residual-in-Residual Block, which contains several basic residual blocks and a long skip connection; it is a manually designed group of neural network layers comprising residual modules, long skip connections and the like. The generator receives the feature maps generated by the strip convolution and generates a depth map estimate. The depth map records the distance of points in the scene relative to the camera, i.e., each pixel value in the depth map may represent the distance between a scene point and the camera. Techniques for acquiring scene depth maps in machine vision systems can be divided into two categories: passive ranging sensing and active depth sensing.
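The halving/doubling behaviour of the coding blocks can be made concrete with a small shape-tracing helper; the input resolution 256 × 512 and channel count 64 below are assumed for illustration, as the text does not fix them:

```python
def encoder_shapes(h, w, c, n_blocks):
    """Trace (height, width, channels) through the encoder: each coding
    block halves the spatial size and doubles the channel count."""
    shapes = [(h, w, c)]
    for _ in range(n_blocks):
        h, w, c = h // 2, w // 2, c * 2
        shapes.append((h, w, c))
    return shapes

print(encoder_shapes(256, 512, 64, 3))
# [(256, 512, 64), (128, 256, 128), (64, 128, 256), (32, 64, 512)]
```

The decoder mirrors this list in reverse, doubling the spatial size and halving the channels at each decoding block.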
And S500, inputting the panoramic image, the depth map and the first feature map sequence set into a defogging module to obtain a defogged panoramic image.
In some embodiments, the execution subject may input the panoramic image, the depth map and the first feature map sequence set into the defogging module to obtain a defogged panoramic image. The defogging module may also be structured as a GAN with a U-Net generator. Its network structure is almost the same as that of the depth estimation module of step S400; the difference is that the depth map estimate of step S400 is added as an additional feature in each layer of the generator's encoder and decoder.
As an example, the execution body described above may be constrained using five loss functions: the GAN generation result loss L_gan, the feature consistency constraint loss L_fm, the perceptual loss L_vgg, the depth estimation result loss L_l2, and the depth estimation multiscale smoothing loss L_edge. The five loss functions are summed, and joint training with the Adam optimization method yields the final trained model. The Adam optimizer is an extension of stochastic gradient descent widely used in deep learning for computer vision and natural language processing. Given a single input image, the trained model outputs the defogged image.
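The five-term objective reduces to a sum, sketched below; the text says the losses are simply summed, so the optional per-term weights are an assumption for illustration:

```python
def total_loss(l_gan, l_fm, l_vgg, l_l2, l_edge, weights=None):
    """Combine the five loss terms named above into a single scalar
    objective for joint training with Adam; unit weights reproduce
    the plain sum described in the text."""
    terms = [l_gan, l_fm, l_vgg, l_l2, l_edge]
    weights = weights or [1.0] * len(terms)
    return sum(w * t for w, t in zip(weights, terms))

print(total_loss(0.5, 0.1, 0.2, 0.1, 0.1))
```

In a training loop this scalar would be backpropagated through both the depth estimation and defogging generators at each Adam step.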
The foregoing description is only of preferred embodiments of the present disclosure and an illustration of the technical principles employed. Those skilled in the art will appreciate that the scope of the invention in the embodiments of the present disclosure is not limited to the specific combination of the above technical features, and also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the inventive concept, for example technical solutions formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the embodiments of the present disclosure.
Claims (4)
1. A panoramic image defogging method comprises the following steps:
step S100, a panoramic image with light intensity smaller than a preset threshold value is given, and the given panoramic image is convolved to generate an initial feature map F0; K-1 rectangular convolution kernels and one square convolution kernel are respectively applied to the initial feature map to generate K first feature map sequences as a first feature map sequence set;
step S200, adding feature maps with the same sequence number in the first feature map sequence set to generate a second feature map sequence, performing global average pooling on the second feature map sequence to generate a first feature vector, and inputting the first feature vector into at least one fully-connected layer to obtain a third feature vector corresponding to each feature map sequence in the first feature map sequence set;
step S300, based on the third feature vector set, weighting and summing the first feature map sequence set to generate a third feature map sequence;
step S400, inputting the third feature map sequence into a depth estimation module to obtain a depth map;
step S500, inputting the panoramic image, the depth map and the first feature map sequence set into a defogging module to obtain a defogged panoramic image.
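Step S100's mixed kernel bank — K-1 rectangular kernels plus one square kernel applied to the initial feature map F0 — can be sketched as follows. The kernel count K = 3, the 1×5 / 5×1 / 3×3 shapes, and the averaging weights are illustrative assumptions; the claim fixes none of them:

```python
import numpy as np

def conv2d_same(img, kernel):
    """Naive 'same'-padded 2-D cross-correlation (illustration only)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, kh - 1 - ph), (pw, kw - 1 - pw)))
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def mixed_kernel_bank(f0, rect_shapes, square_size):
    """Apply K-1 rectangular averaging kernels plus one square
    averaging kernel to the initial feature map F0, yielding the
    K first feature maps."""
    kernels = [np.ones(s) / (s[0] * s[1]) for s in rect_shapes]
    kernels.append(np.ones((square_size, square_size)) / square_size ** 2)
    return [conv2d_same(f0, k) for k in kernels]

f0 = np.random.rand(8, 16)                          # toy initial feature map
fmaps = mixed_kernel_bank(f0, [(1, 5), (5, 1)], 3)  # K = 3 feature maps
print(len(fmaps), fmaps[0].shape)  # 3 (8, 16)
```

The rectangular kernels capture horizontally and vertically elongated context, which is why the claims distinguish them from the single square kernel.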
2. The method according to claim 1, wherein the adding the feature maps with the same sequence number in the first feature map sequence set to generate a second feature map sequence, performing global average pooling on the second feature map sequence to generate a first feature vector, and inputting the first feature vector into at least one fully-connected layer to obtain a third feature vector corresponding to each feature map sequence in the first feature map sequence set comprises:
summing the feature maps with the same sequence number in the first feature map sequence set to obtain a second feature map sequence, wherein the dimension of the first feature map is H × W × C;
for each second feature map F_add in the second feature map sequence, performing global average pooling over the W × C dimensions to obtain a first feature vector s_fv, the specific operations being:

F_add = Σ_{k=1}^{K} F_k

s_fv(h) = (1 / (W × C)) Σ_{w=1}^{W} Σ_{c=1}^{C} F_add(h, w, c), h = 1, …, H

wherein F_add represents a second feature map and belongs to the set of tensors of dimension H × W × C; k represents a serial number; K represents the number of convolution kernels; F_k represents the first feature map in the convolution process corresponding to the k-th convolution kernel; s_fv represents the first feature vector and belongs to the set of tensors of dimension H × 1; w and c represent the width and channel serial numbers; W represents the panoramic image width value; C represents the channel number of the first feature map group; and H represents the panoramic image height value;
obtaining an overall feature representation through global average pooling, and then generating K feature vectors through fully-connected layers for weighting different positions of each feature map, the specific operations being:

s_fd^k = W_ex2^k · δ(W_ex1 · s_fv)

s_at^k(q) = exp(s_fd^k(q)) / Σ_{j=1}^{K} exp(s_fd^j(q)), q = 1, …, H/r_e

wherein s_fd represents a second feature vector; k represents a serial number; s_fd^k represents the second feature vector in the convolution process corresponding to the k-th convolution kernel and belongs to the set of tensors of dimension (H/r_e) × 1; s_fv represents the first feature vector and belongs to the set of tensors of dimension H × 1; W_ex1 denotes the operation of the first 1 × 1 convolution, which maps dimension H to H/r_d; δ[·] denotes the sigmoid function; W_ex2^k denotes the operation of the second 1 × 1 convolution, which maps dimension H/r_d to H/r_e; H denotes the panoramic image height value; r_d denotes a first parameter; r_e denotes a second parameter; s_at^k represents the third feature vector in the convolution process corresponding to the k-th convolution kernel; q represents the corresponding dimension element of the vector; and K represents the number of convolution kernels.
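The pooling-and-excitation computation of claim 2 can be sketched with random stand-in weights. The learned 1×1 convolutions W_ex1 and W_ex2^k are replaced by random projections, rd = 2 and re = 4 are arbitrary choices, and the softmax normalisation across the K branches is an assumption consistent with the per-element weighting described:

```python
import numpy as np

def height_attention(fmaps, rd=2, re=4, seed=0):
    """Sum the K first feature maps, average-pool over width and
    channels to get the H-length vector s_fv, squeeze it with a
    random projection (stand-in for the learned 1x1 convolution
    W_ex1, H -> H/rd) through a sigmoid, expand with K random
    projections (stand-ins for W_ex2^k, H/rd -> H/re), and
    normalise across the K branches with a softmax per position."""
    rng = np.random.default_rng(seed)
    K, H = len(fmaps), fmaps[0].shape[0]
    f_add = np.sum(fmaps, axis=0)              # second feature map
    s_fv = f_add.mean(axis=(1, 2))             # first feature vector, length H
    w1 = rng.standard_normal((H // rd, H))
    s_mid = 1.0 / (1.0 + np.exp(-(w1 @ s_fv)))  # sigmoid squeeze
    w2 = rng.standard_normal((K, H // re, H // rd))
    s_fd = np.einsum('kij,j->ki', w2, s_mid)    # K second vectors, H/re each
    e = np.exp(s_fd - s_fd.max(axis=0))
    return e / e.sum(axis=0)                   # third feature vectors s_at^k

fmaps = [np.random.rand(8, 16, 4) for _ in range(3)]
att = height_attention(fmaps)
print(att.shape, bool(np.allclose(att.sum(axis=0), 1.0)))  # (3, 2) True
```

Pooling over width and channels (rather than over all spatial positions) leaves a per-height weight vector, which suits equirectangular panoramas where distortion varies with latitude.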
3. The method of claim 2, wherein the weighting and summing the first set of feature map sequences based on the third set of feature vectors to generate a third feature map sequence comprises:
the third feature vector s_at^k generated in step S200 is extended from dimension H/r_e to dimension H using a bilinear interpolation method, and the third feature vectors are then used to weight and sum the feature maps F_k to form the final third feature map F_scc, the specific operations being:

F_scc = Σ_{k=1}^{K} B(s_at^k) ⊙ F_k

wherein F_scc denotes the third feature map; k denotes a serial number; K denotes the number of convolution kernels; F_k denotes the first feature map in the convolution process corresponding to the k-th convolution kernel; B(·) denotes the bilinear interpolation algorithm; s_at denotes the third feature vector; s_at^k denotes the third feature vector in the convolution process corresponding to the k-th convolution kernel and belongs to the set of tensors of dimension (H/r_e) × 1; r_d denotes a first parameter; r_e denotes a second parameter; and H denotes the panoramic image height value.
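Claim 3's interpolation-and-fusion step can be sketched in NumPy. One-dimensional linear interpolation stands in for the bilinear method (the weight is a 1-D per-height vector), and broadcasting that weight across width and channels is an assumption:

```python
import numpy as np

def upsample_1d(vec, new_len):
    """Linear interpolation of a short vector to new_len samples
    (1-D stand-in for the bilinear interpolation B(.))."""
    old = np.linspace(0.0, 1.0, len(vec))
    new = np.linspace(0.0, 1.0, new_len)
    return np.interp(new, old, vec)

def weighted_fusion(fmaps, attentions):
    """Stretch each third feature vector s_at^k to height H and use
    it to weight the matching first feature map F_k, summing over k
    to obtain the third feature map F_scc."""
    H = fmaps[0].shape[0]
    f_scc = np.zeros_like(fmaps[0], dtype=float)
    for f_k, s_k in zip(fmaps, attentions):
        f_scc += upsample_1d(s_k, H)[:, None, None] * f_k
    return f_scc

fmaps = [np.ones((8, 4, 2)) * (k + 1) for k in range(3)]  # maps of 1s, 2s, 3s
atts = [np.full(2, 1.0 / 3) for _ in range(3)]            # uniform weights
out = weighted_fusion(fmaps, atts)
print(out.shape, round(float(out[0, 0, 0]), 6))  # (8, 4, 2) 2.0
```

With uniform weights the fusion reduces to a plain average of the K maps, which makes the broadcast-and-sum behaviour easy to verify.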
4. A panoramic image defogging device comprising:
a convolution processing unit configured to be given a panoramic image with light intensity smaller than a preset threshold value and to convolve the given panoramic image to generate an initial feature map F0, and to respectively apply K-1 rectangular convolution kernels and one square convolution kernel to the initial feature map to generate K first feature map sequences as a first feature map sequence set;
the first input unit is configured to add feature maps with the same sequence number in the first feature map sequence set to generate a second feature map sequence, perform global average pooling on the second feature map sequence to generate a first feature vector, and input the first feature vector into at least one fully-connected layer to obtain a third feature vector corresponding to each feature map sequence in the first feature map sequence set;
a summation processing unit configured to weight and sum the first feature map sequence set based on the third feature vector set to generate a third feature map sequence;
the second input unit is configured to input the third feature map sequence to the depth estimation module to obtain a depth map;
and the third input unit is configured to input the panoramic image, the depth map and the first feature map sequence set into the defogging module to obtain a defogged panoramic image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110061876.7A CN112767269B (en) | 2021-01-18 | 2021-01-18 | Panoramic image defogging method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112767269A CN112767269A (en) | 2021-05-07 |
CN112767269B true CN112767269B (en) | 2022-11-01 |
Family
ID=75702733
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110061876.7A Active CN112767269B (en) | 2021-01-18 | 2021-01-18 | Panoramic image defogging method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112767269B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113781363B (en) * | 2021-09-29 | 2024-03-05 | Beihang University | Image enhancement method with adjustable defogging effect
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108830199A (en) * | 2018-05-31 | 2018-11-16 | BOE Technology Group Co., Ltd. | Method, apparatus, readable medium and electronic device for identifying traffic light signals
CN109584188A (en) * | 2019-01-15 | 2019-04-05 | Northeastern University | An image defogging method based on convolutional neural networks
CN109918951A (en) * | 2019-03-12 | 2019-06-21 | Institute of Information Engineering, Chinese Academy of Sciences | An artificial intelligence processor side-channel defense system based on inter-layer fusion
CN112001923A (en) * | 2020-11-02 | 2020-11-27 | National University of Defense Technology | Retina image segmentation method and device
Non-Patent Citations (4)
Title |
---|
A Novel Residual Dense Pyramid Network for Image Dehazing; Shibai Yin et al.; Entropy; 2019-11-15; full text *
Pyramid Global Context Network for Image Dehazing; Dong Zhao et al.; IEEE Transactions on Circuits and Systems for Video Technology; 2020-11-09; full text *
Convolutional neural network image dehazing algorithm based on multi-feature fusion; Xu Yan et al.; Laser & Optoelectronics Progress; 2018-03-31; full text *
Multi-scale feature fusion network based on feature pyramid; Guo Qifan et al.; Chinese Journal of Engineering Mathematics; 2020-10-15 (No. 05); full text *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||