CN110097522B

CN110097522B - Single outdoor image defogging method based on multi-scale convolution neural network

Info

Publication number: CN110097522B
Application number: CN201910397724.7A
Authority: CN
Inventors: 张世辉; 桑榆; 陈宇翔; 张健
Original assignee: Yanshan University; Beijing Institute of Computer Technology and Applications
Current assignee: Yanshan University; Beijing Institute of Computer Technology and Applications
Priority date: 2019-05-14
Filing date: 2019-05-14
Publication date: 2021-03-19
Anticipated expiration: 2039-05-14
Also published as: CN110097522A

Abstract

The invention discloses a single outdoor image defogging method based on a multi-scale convolutional neural network, and belongs to the field of computer vision. The invention comprises the following steps: constructing a training sample set according to the atmospheric scattering model; building a multi-scale convolutional neural network based on the deep learning idea; constructing a target function according to the built multi-scale convolutional neural network; and training the multi-scale convolutional neural network based on the constructed objective function. The invention does not need to acquire the prior knowledge of the outdoor image and can effectively store the information of the edge, the texture, the color, the contrast, the saturation and the like of the image.

Description

Single outdoor image defogging method based on multi-scale convolution neural network

Technical Field

The invention belongs to the field of computer vision, and particularly relates to a single outdoor image defogging method based on a multi-scale convolutional neural network.

Background

Fog is a traditional atmospheric phenomenon formed by particles of water vapor, dust, smoke, and the like. Fog can cause blurring, contrast reduction, saturation deviation of images processed by a vision system, further hinder the performance of visual tasks such as classification, identification, detection and tracking, and even cause failure of related visual tasks. Therefore, how to remove fog from outdoor images becomes a difficult problem in the field of computer vision and is receiving wide attention from scholars.

There are two main types of information handled by existing defogging methods: outdoor video and a single outdoor image. The defogging method based on the outdoor video is relatively few, and the main reason is that the defogging method based on the video firstly needs to divide the video into a plurality of video frames and then sequentially defogges the divided video frames, and essentially still processes a single outdoor image. Therefore, existing defogging methods are generally implemented based on a single outdoor image. Meanwhile, the existing defogging method based on the single outdoor image has the problems of prior knowledge acquisition, edge and texture loss, color, contrast and saturation distortion and the like. The defogging method based on dark channel prior proposed by K.M.He and J.Sun in the article "Single image size removal using dark channel prior. proceedings of the IEEE Conference on Computer Vision and Pattern registration works: IEEE Computer Society,2009: 1956-. The defogging method based on the improved dark channel prior, which is proposed in the article of Chengzhou Zhen and Zhanzhu Guangzhou, "Single image defogging algorithm based on the improved dark channel prior and guided filtering," automatic chemistry report, 2016,42(3):455-465, "has the problems that the dark channel threshold and the maximum value of the mixed dark channel brightness cannot be selected in a self-adaptive manner, and the color of the defogged image is distorted. The methods proposed by C.Z.He and C.D.Zhang in the article "A size dense adaptive texture size removal algorithm. proceedings of the IEEE International Conference on Information and Automation, IEEE Computer Society,2016: 1933-. The methods proposed by B Cai and X Xu in the article "DehazeNet: An End-to-End System for Single Image Haze removal. IEEE Transactions on Image Processing,2016,25(11): 5187-. The methods proposed by z.g.ling and g.f.fan in the article "performance oriented transmission estimation for high quality image smoothing. neuro-compressing, 2017,224(2): 82-95" have problems with distortion of contrast and saturation. The outdoor images, defogged by the methods proposed by Z.G.Li and H.J. in the article "Single Image De-Hazing Using Global Guided Image Filter, IEEE Transactions on Image Processing,2018,27(1): 442-450", are color distorted and have low contrast.

Disclosure of Invention

Aiming at the problems of the existing single outdoor image defogging method, the invention aims to provide the single outdoor image defogging method based on the multi-scale convolutional neural network.

In order to solve the technical problems, the technical scheme adopted by the invention is as follows: a single outdoor image defogging method based on a multi-scale convolution neural network is characterized by comprising the following steps:

(1) obtaining a training sample: acquiring a fog-free image sample, carrying out atomization treatment on the fog-free image sample by using an atmospheric scattering model to obtain a fog image sample, and taking the fog-free image sample and the corresponding fog image sample as training samples;

(2) multi-scale convolutional neural network model: constructing at least three convolution layers in parallel; the output end of each convolution layer is connected with a Max Pooling Pooling layer, and the output end of each Max Pooling Pooling layer is connected with a nonlinear mapping layer based on a modified linear unit ReLu; the output ends of all the nonlinear mapping layers are connected with the characteristic fusion layer; the output end of the characteristic fusion layer is connected with a bilateral filter layer for processing the transmissivity, and the transmissivity output by the bilateral filter layer is used for defogging the foggy image sample input by the convolution layer;

(3) training a multi-scale convolution neural network model: training the multi-scale convolutional neural network model by using the foggy image sample in the step (1) as the input of the multi-scale convolutional neural network model and using the fogless image sample in the step (1) as the discrimination standard of the output of the multi-scale neural network, and aiming at the minimization of an objective function, and performing parameter solution; wherein the objective function is:

wherein, c_i、s_iAnd h_iThe average RGB value, the average contrast and the average saturation corresponding to the ith sample are respectively; the parameter of the multi-scale convolution neural network is phi, and the ith hazy image sample is I_iAnd the fog-free image sample corresponding to the ith fog image sample is J_iThe number of training samples is N;

(4) and (4) carrying out defogging treatment on the foggy image to be treated by utilizing the multi-scale convolution neural network model solved in the step (3).

The further technical scheme is that the number of the convolution layers is three, and the convolution kernels are respectively 7 × 7,5 × 5 and 3 × 3; or four convolution layers, the convolution kernels being 11 × 11,7 × 7,5 × 5 and 3 × 3, respectively.

The further technical scheme is that the objective function is constructed according to Mean Square Error (MSE) and L-2 norm.

The further technical scheme is that the objective function is minimized according to a random gradient descent method.

The further technical scheme is that in the step (1) of obtaining the sample, the calculation method of the fog-free image sample structure corresponding to the fog image sample is as follows:

I_T(x)＝J_T(x)t_T(x)+α_T(1-t_T(x))

wherein, J_T(x) Is a fog-free image sample, i.e. the image collected, t_T(x) Is a transmittance, α_TIs a global atmospheric light value, I_T(x) Is a hazy image sample.

The further technical proposal is that the form of the convolution layer is as follows:

where I (x) is the hazy image sample to be dehazed, q is the convolution kernel size,

is a convolutional layer filterThe wave filter is used for filtering the received signal,

is convolution layer bias, is convolution operation,

is the output of the convolution layer of the multi-scale convolution neural network.

The further technical scheme is that the Max Pooling layer is shown as follows:

wherein the content of the first and second substances,

is the output of the multi-scale convolutional neural network pooling layer,

The further technical scheme is that the nonlinear mapping layer performs nonlinear mapping on the reduced-dimension features to obtain a multi-scale feature map, and the form of the constructed activation layer is as follows:

wherein the content of the first and second substances,

is an active-layer filter that is,

it is the bias of the active layer that,

is the output of the multi-scale convolutional neural network activation layer,

is the output of the multi-scale convolutional neural network pooling layer.

The further technical scheme is that the characteristic fusion layer fuses the multi-scale characteristic graphs so as to obtain the transmissivity

The multi-scale feature map fusion mode is as follows:

wherein λ is₁,λ₂,…,λ_nAre respectively the feature map weight coefficients, h^(q)、c^(q)And s^(q)Respectively obtaining an average RGB value, an average contrast value and an average saturation value of each scale feature map;

is the output of the pooling layer of the n-scale convolutional neural network.

The technical scheme is that the bilateral filter layer utilizes bilateral filter pair transmissivity

Processing is carried out, so that refined transmittance t (x) is obtained, and the calculation method is as follows:

d(ξ,y)＝||ξ-y||₂

wherein y is the transmittance

Where xi is the pixel adjacent to y, c (xi, y) is the space weight function, and σ is_cIs the variance between two pixels, d (ξ, y) is the distance metric between two pixels,

and

respectively, the transmissivity formed by 8 neighborhood pixel blocks taking xi and y as centers,

is a function of similarity weight calculation, σ_sIs the variance between the two transmittances and,

is a function of two transmittance distance measures.

Compared with the prior art, the invention has the advantages that:

(1) based on the deep learning idea, a multi-scale convolutional neural network is built, and the characteristics of feature maps of all scales are fully excavated, so that the information of the defogged outdoor image, such as color, contrast, saturation and the like, is similar to the initial foggy outdoor image.

(2) And processing the transmissivity obtained by fusing the characteristic graphs by utilizing bilateral filtering, so that the edge and texture information of the defogged outdoor image is completely stored.

(3) And constructing an objective function based on MSE and L-2 norm, thereby realizing effective fitting of the multi-scale convolutional neural network and more effective removal of fog in the outdoor image.

Drawings

FIG. 1 is a flow chart of a defogging method according to the present invention;

FIG. 2 is a schematic diagram of a portion of a training sample;

FIG. 3 is a schematic diagram of a 3-scale convolutional neural network structure;

fig. 4 is a schematic diagram of a 4-scale convolutional neural network structure.

Detailed Description

In order to make the technical scheme of the present invention clearer, the present invention is further explained with reference to the accompanying drawings.

The embodiment of the invention discloses a single outdoor image defogging method based on a multi-scale convolution neural network, which is characterized by comprising the following steps of:

In the embodiment of the invention, the number of the convolution layers is three, and the convolution kernels are respectively 7 × 7,5 × 5 and 3 × 3; or four convolution layers, the convolution kernels being 11 × 11,7 × 7,5 × 5 and 3 × 3, respectively.

The objective function in the embodiment of the invention is constructed according to Mean Square Error (MSE) and L-2 norm.

The objective function in the embodiment of the invention is minimized according to a random gradient descent method.

In the embodiment of the invention, in the step (1) of obtaining the sample, a method for calculating a fog image sample corresponding to a fog-free image sample structure is as follows:

I_T(x)＝J_T(x)t_T(x)+α_T(1-t_T(x))

The form of the convolutional layer in the embodiment of the present invention is as follows:

is a convolutional layer filter which is a convolutional layer filter,

is convolution layer bias, is convolution operation,

The Max Pooling layer in the embodiment of the present invention is shown as follows:

wherein the content of the first and second substances,

is the output of the multi-scale convolutional neural network pooling layer,

In the embodiment of the invention, the nonlinear mapping layer performs nonlinear mapping on the reduced-dimension features to obtain a multi-scale feature map, and the form of the constructed activation layer is as follows:

wherein the content of the first and second substances,

is an active-layer filter that is,

it is the bias of the active layer that,

is the output of the multi-scale convolutional neural network activation layer,

is the output of the multi-scale convolutional neural network pooling layer.

The characteristic fusion layer in the embodiment of the invention fuses the multi-scale characteristic graphs to obtain the transmissivity

The multi-scale feature map fusion mode is as follows:

is the output of the pooling layer of the n-scale convolutional neural network.

The bilateral filter layer in the embodiment of the invention utilizes bilateral filter pair transmissivity

d(ξ,y)＝||ξ-y||₂

wherein y is the transmittance

and

is a function of two transmittance distance measures.

In the embodiment of the present invention, as shown in fig. 1, the single outdoor image defogging method based on the multi-scale convolutional neural network includes the following steps:

step 1: and obtaining a training sample according to the atmospheric scattering model, and constructing a training sample data set.

1.1) collecting 3000 fog-free outdoor images under different scenes from the Internet.

1.2) for 3000 outdoor images collected, and setting J_T(x) Is a fog-free outdoor image, i.e. an image collected, t_T(x) Is a transmittance, α_TIs a global atmospheric light value, I_T(x) Is a foggy outdoor image. Different t is selected to ensure that the training sample contains multiple conditions as much as possible_T(x) And global atmospheric light value alpha_TSelecting a fixed value, and defining a foggy outdoor image I_T(x) Is composed of

I_T(x)＝J_T(x)t_T(x)+α_T(1-t_T(x)) (1)

1.3, traversing 3000 outdoor images, and acquiring 3000 foggy outdoor images as training samples according to the above formula, thereby constructing a training sample data set.

Some training samples in the training sample data set and their group Truth are shown in FIG. 2. The first column is a fog-free outdoor image acquired from the internet, and the second column is a fog-containing outdoor image calculated by the formula (1).

Step 2: and constructing a multi-scale convolutional neural network.

2.1) the built convolution layer of the multi-scale convolution neural network consists of convolution kernels with three different scales of 7 x 7,5 x 5 and 3 x 3, a training sample to be defogged is set as I (x), q represents the size of the convolution kernel and belongs to {7,5 and 3},

which represents a convolutional layer filter, is,

representing convolutional layer bias, representing convolution operation, and outputting convolutional layer of multi-scale convolutional neural network

Can be expressed as

2.2) the Pooling layer of the multi-scale neural network built by Max Pooling is constructed, aiming at reducing the dimension of the calculated characteristics after the convolution layer, thereby obtaining the characteristics with translation invariance and rotation invariance, and then the output of the Pooling layer of the multi-scale convolutional neural network

Can be expressed as

2.3) constructing an activation layer by the built multi-scale convolution neural network according to the modified linear unit ReLu, carrying out nonlinear mapping on the reduced-dimension characteristics to obtain a multi-scale characteristic diagram,

which represents the filter of the active layer,

representing the bias of the active layer, the output of the active layer of the multi-scale convolutional neural network

Can be expressed as

2.4) after obtaining the multi-scale characteristic diagram, fully excavating the characteristics of the color, the saturation and the contrast of each scale characteristic diagram, and fusing each scale characteristic diagram, thereby obtaining the transmissivity corresponding to the input image I (x) by calculation

The fusion function is defined as

Wherein, λ, μ and γ are feature map weight coefficients, h^(q)、c^(q)And s^(q)The average RGB value, the average contrast value and the average saturation value of each scale feature map are respectively.

2.5) because the transmissivity calculated by the existing defogging method is rough, the defogged outdoor image generally has incomplete boundary and texture preservation and the likeAnd (5) problems are solved. Therefore, bilateral filtering is selected to process the transmittance obtained by feature map fusion to obtain refined transmittance t (x), and the transmittance is processed

Performing bilateral filtering process may be expressed as

Wherein y is the transmittance

and

is a function of two transmittance distance measures.

2.6) after obtaining the refined transmittance t (x), selecting the global atmospheric light value alpha as the maximum brightness value corresponding to each pixel point in the input image I (x). At the moment, the atmosphere scattering model is deformed, so that the outdoor image J (x) after defogging is obtained through calculation, and the calculation method is that

The multi-scale convolutional neural network structure is shown in fig. 3.

And step 3: and constructing an objective function according to the mean square error MSE and the L-2 norm.

The single outdoor image defogging problem is a typical supervised learning problem, and the supervised learning needs to establish a mapping relation G between the input (foggy outdoor image) and the output (fogless outdoor image) of a convolutional neural network. Setting parameters of the multi-scale convolutional neural network as

The ith training sample is I_iThe group Truth corresponding to the ith training sample is J_iThe number of training samples is N, the parameter phi of the multi-scale convolutional neural network can be obtained by minimizing an objective function, and the objective function is constructed by mean square error MSE and L-2 norm and has the form of

Wherein, c_i、s_iAnd h_iThe average RGB value, the average contrast and the average saturation corresponding to the ith sample are respectively.

And 4, step 4: and training the multi-scale convolutional neural network.

Firstly, 20000 64 multiplied by 64 foggy image blocks are randomly extracted from a constructed training sample set, and each foggy image block has a corresponding Ground Truth; then, minimizing the constructed objective function by utilizing a random gradient descent method; finally, setting a threshold value for the target function, and when the result of the minimized target function is smaller than the set threshold value, namely the parameter phi representing the multi-scale convolutional neural network is determined, finishing the training of the convolutional neural network at the moment, and further realizing the defogging treatment on any outdoor image; as shown in fig. 3, the output of the multi-scale convolutional neural network is a processed picture.

The embodiment of the invention improves the previous embodiment, wherein in the improvement, the convolutional layer for constructing the multi-scale convolutional neural network in the step 2 is composed of convolution kernels with four different scales of 11 × 11,7 × 7,5 × 5 and 3 × 3, a training sample to be defogged is set as I (x), q represents the size of the convolution kernel and belongs to the {11,7,5 and 3}, and the fusion mode of the multi-scale feature map is as follows:

wherein, λ, μ, γ and β are characteristic map weight coefficients, h^(q)、c^(q)And s^(q)The average RGB value, the average contrast value and the average saturation value of each scale feature map are respectively. As shown in fig. 4, the output of the four-scale convolutional neural network is a processed picture.

The above-mentioned embodiments are merely illustrative of the preferred embodiments of the present invention, and do not limit the scope of the present invention, and various modifications and improvements made to the technical solution of the present invention by those skilled in the art without departing from the spirit of the present invention shall fall within the protection scope defined by the claims of the present invention.

Claims

1. A single outdoor image defogging method based on a multi-scale convolution neural network is characterized by comprising the following steps:

2. The single outdoor image defogging method based on the multi-scale convolutional neural network as claimed in claim 1, wherein the number of the convolutional layers is three, and the convolutional kernels are respectively 7 x 7,5 x 5 and 3 x 3; or four convolution layers, the convolution kernels being 11 × 11,7 × 7,5 × 5 and 3 × 3, respectively.

3. The single outdoor image defogging method based on the multi-scale convolutional neural network as claimed in claim 1, wherein the objective function is constructed according to Mean Square Error (MSE) and L-2 norm.

4. The method of claim 1, wherein the objective function is minimized according to a stochastic gradient descent method.

5. The single outdoor image defogging method based on the multi-scale convolutional neural network as claimed in claim 1, wherein in the step (1) of acquiring the samples, the calculation method of the fog-free image sample structure corresponding to the fog image sample is as follows:

I_T(x)＝J_T(x)t_T(x)+α_T(1-t_T(x))

6. The single outdoor image defogging method based on the multi-scale convolutional neural network as claimed in claim 1 or 2, wherein the convolutional layer form is as follows:

is a convolutional layer filter which is a convolutional layer filter,

is convolution layer bias, is convolution operation,

7. The single outdoor image defogging method based on the multi-scale convolutional neural network as claimed in claim 1, wherein the Max Pooling Pooling layer form is as follows:

wherein the content of the first and second substances,

is the output of the multi-scale convolutional neural network pooling layer,

8. The single outdoor image defogging method based on the multi-scale convolutional neural network as claimed in claim 1, wherein the nonlinear mapping layer is used for carrying out nonlinear mapping on the dimensionality reduced features to obtain a multi-scale feature map, and the constructed activation layer is in the following form:

wherein the content of the first and second substances,

is an active-layer filter that is,

it is the bias of the active layer that,

is the output of the multi-scale convolutional neural network activation layer,

is the output of the multi-scale convolutional neural network pooling layer.

9. The single outdoor image defogger based on the multi-scale convolutional neural network as claimed in claim 1A method wherein the feature fusion layer fuses multi-scale feature maps to obtain the transmittance