CN112669242A - Night scene restoration method based on improved image enhancement algorithm and generation countermeasure network - Google Patents
- Publication number: CN112669242A
- Application number: CN202110278464.9A
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Landscapes
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a night scene restoration method based on an improved image enhancement algorithm and a generative adversarial network, comprising the following steps. S1: acquire a night image and enhance it using the MSRCP algorithm. S2: judge whether the enhanced night image needs style migration; if so, go to step S3, otherwise go to step S4. S3: perform style migration, then go to step S4. S4: apply dark channel prior defogging and sharpness processing to the enhanced night image in sequence to complete the night scene restoration. The method is suitable for security monitoring and regional night imaging: the training set is easy to collect and train on, the demands on model generalization are low, performance is particularly strong in single-style regional experiments, and the method is highly practical and feasible.
Description
Technical Field
The invention belongs to the technical field of image processing and machine vision, and particularly relates to a night scene restoration method based on an improved image enhancement algorithm and a generative adversarial network.
Background
With social and economic development, human activity at night has grown steadily richer. Humans and many other organisms have an innate fear of the dark: vision is poor at night, the surroundings cannot be observed well, and defenses are weak. In modern society, many activities and studies are hampered by poor visibility and poor night photography; wars are launched, armies march, and thefts occur frequently at night, and criminals often choose to act under the cover of darkness. At night, the cells that dominate human night vision are markedly less sensitive to color. In weak light, the human visual system can hardly capture peripheral information, and cameras likewise struggle to capture key information hidden in the dark; the purpose of this application is to provide a technique that compensates for this visual deficiency at night. In scientific research, many animals are nocturnal and light-sensitive, and scientists must record photos and videos at night to study their nighttime behavior. To avoid disturbing the animals the lights cannot be turned on, so infrared cameras are generally adopted, but part of the animals' characteristic information is lost; being able to restore the captured night images would therefore be a great help. In the field of public security, crimes occur frequently under the cover of night, and video collected by surveillance equipment in public places is often too dark, so key information is easily missed by security staff watching the monitors, and dim blind spots hinder police investigation and surveillance.
By restoring the captured night scene video, security personnel can spot dangers in the dark and respond quickly; restoring a bright scene lets case handlers quickly identify key information such as a person's clothing color or vehicle, assisting case detection. In the fields of national defense and warfare, armies like to move under the cover of night and launch surprise attacks. If an enemy evades radar monitoring, it is difficult to detect by the weak night vision of a tired sentry alone. If the night scene can be restored, approaching enemies can be discovered quickly, which is of strategic and practical significance. Through research, this application realizes a technique combining an image processing algorithm with neural network deep learning; it can restore night scenes well, recover a daytime style, and help people capture sensitive nighttime key information.
Disclosure of Invention
The invention aims to solve the problem of night scene image restoration by providing a night scene restoration method based on an improved image enhancement algorithm and a generative adversarial network.
The technical scheme of the invention is as follows: a night scene restoration method based on an improved image enhancement algorithm and a generative adversarial network comprises the following steps:
S1: acquire a night image, and perform enhancement processing on it using the MSRCP algorithm;
S2: judge whether the enhanced night image needs style migration; if so, go to step S3, otherwise go to step S4;
S3: train the unpaired image translation neural network of the night image region, input the enhanced night image as the night domain of the training set, perform style migration, and go to step S4;
S4: perform dark channel prior defogging and sharpness processing in sequence on the enhanced night image to complete the night scene restoration.
The invention has the following beneficial effects. The night scene restoration method is suitable for security monitoring and regional night imaging: the training set is easy to collect and train on, the demands on model generalization are low, performance is particularly strong in single-style regional experiments, and the method is highly practical and feasible. The innovative neural network structure greatly relieves the pressure of network training, protects image content, improves the quality of generated images, solves the problem of local overexposure at night, and finally outputs clear, soft images with prominent details.
Further, in step S1, if the acquired nighttime image is a single-channel image, the enhancement processing includes the following sub-steps:
A11: according to the Retinex algorithm, decompose the night image and rearrange terms to obtain the relational expression of the night image, specifically:

log R(x, y) = log S(x, y) − log L(x, y), where S(x, y) = L(x, y) · R(x, y) and L(x, y) = (G ∗ S)(x, y)

wherein S denotes the matrix of the acquired single-channel night image, L denotes the ambient light component matrix of the acquired single-channel night image, R denotes the target object reflection component matrix of the single-channel image, x denotes the abscissa of the pixel coordinate, y denotes the ordinate of the pixel coordinate, S(x, y), L(x, y) and R(x, y) denote the pixel values at coordinate (x, y) of the matrices S, L and R respectively, ∗ denotes a convolution operation, and log denotes a logarithmic operation;
A12: perform a convolution operation on the relational expression of the night image using a Gaussian kernel to obtain the enhanced reflection component of the target object;
A13: map the enhanced reflection component of the target object to the pixel value domain [0, 255] to obtain the enhanced image;
in step A12, the method of performing the convolution operation on the relational expression of the night image using the Gaussian kernel is: select a sigma value and calculate the corresponding weight matrix, then perform the convolution operation with the weight matrix as the Gaussian kernel, where the calculation formula is:

G(x, y) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))

wherein G denotes the Gaussian kernel matrix and G(x, y) denotes the value at coordinate (x, y) of the Gaussian kernel matrix G;
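Steps A11-A13 can be sketched in NumPy as below. This is a minimal single-scale Retinex illustration, not the patent's exact implementation: the kernel radius of 3σ and the min-max mapping onto [0, 255] are common conventions assumed here.

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    """Build a normalized 2-D Gaussian weight matrix for the chosen sigma (step A12)."""
    if radius is None:
        radius = int(3 * sigma)           # 3-sigma radius is an assumption
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return g / g.sum()                    # normalize so the weights sum to 1

def single_scale_retinex(img, sigma):
    """log R = log S - log(G * S): the Gaussian-blurred image estimates the
    ambient light component L, subtracted from S in the log domain (step A11)."""
    s = img.astype(np.float64) + 1.0      # +1 avoids log(0)
    kernel = gaussian_kernel(sigma)
    pad = kernel.shape[0] // 2
    padded = np.pad(s, pad, mode='edge')
    blurred = np.zeros_like(s)
    for i in range(s.shape[0]):           # direct 'same'-size convolution
        for j in range(s.shape[1]):
            blurred[i, j] = (padded[i:i + 2*pad + 1, j:j + 2*pad + 1] * kernel).sum()
    return np.log(s) - np.log(blurred)

def map_to_pixel_range(r):
    """Linearly map the reflectance estimate onto [0, 255] (step A13)."""
    span = r.max() - r.min()
    if span == 0:
        return np.zeros(r.shape, dtype=np.uint8)
    return ((r - r.min()) / span * 255.0).astype(np.uint8)
```

The slow double loop is for clarity only; a practical version would use an FFT or a library convolution.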
in step S1, if the acquired nighttime image is a multi-channel image, the enhancement processing includes the following sub-steps:
B11: according to the Retinex algorithm, decompose the night image and rearrange terms to obtain the relational expression of the night image, specifically:

log R_c(x, y) = log S_c(x, y) − log L_c(x, y), c ∈ {R, G, B}

wherein S_c denotes the matrix of channel c of the acquired multi-channel night image, L_c denotes the ambient light component matrix of channel c of the acquired multi-channel night image, R_c denotes the target object reflection component matrix of channel c of the multi-channel image, and S_c(x, y), L_c(x, y) and R_c(x, y) denote the pixel values at coordinate (x, y) of the matrices S_c, L_c and R_c respectively;
b12: according to the relational expression of the night image, Gaussian filtering and weighted addition are carried out on different channels to obtain an image after enhancement processing;
in step B12, the calculation formula of gaussian filtering and weighted addition on different channels is:
wherein,a matrix of target object reflection components representing the next multi-channel image,representation matrixUpper coordinate isThe value of the pixel of (a) is,the corresponding weight for each channel is represented,representing the acquired night imagesThe coordinates in each channel layer areThe value of the pixel of (a) is,is shown asThe coordinates of each channel layer after Gaussian blur areThe pixel value of (c).
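Step B12 can be sketched as a multi-scale Retinex (MSR) pass over each channel layer. The sigma values and the equal scale weights below are illustrative assumptions, not values specified in the patent:

```python
import numpy as np

def _gaussian_blur(channel, sigma):
    """Separable Gaussian blur of one channel layer (direct 1-D convolutions)."""
    radius = max(1, int(3 * sigma))
    t = np.arange(-radius, radius + 1)
    g = np.exp(-t**2 / (2.0 * sigma**2))
    g /= g.sum()
    h, w = channel.shape
    padded = np.pad(channel, ((0, 0), (radius, radius)), mode='edge')
    horiz = sum(padded[:, j:j + w] * wt for j, wt in enumerate(g))
    padded = np.pad(horiz, ((radius, radius), (0, 0)), mode='edge')
    return sum(padded[i:i + h, :] * wt for i, wt in enumerate(g))

def multi_scale_retinex(img, sigmas=(15, 80, 250), weights=None):
    """Weighted sum over Gaussian scales of log(S_c) - log(G_k * S_c),
    applied independently to every channel layer (step B12 sketch)."""
    if weights is None:
        weights = [1.0 / len(sigmas)] * len(sigmas)   # equal weights assumed
    s = img.astype(np.float64) + 1.0                  # avoid log(0)
    out = np.zeros_like(s)
    for w, sigma in zip(weights, sigmas):
        for c in range(s.shape[2]):                   # per channel layer
            out[:, :, c] += w * (np.log(s[:, :, c])
                                 - np.log(_gaussian_blur(s[:, :, c], sigma)))
    return out
```

Note that on a perfectly uniform image the blurred estimate equals the image itself, so the reflectance output is zero everywhere, as the log-ratio form predicts.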
The beneficial effects of this further scheme are as follows: in the invention, the night image S captured by the camera is considered. When it is a single-channel image (such as a gray-scale image), according to Retinex theory it can be regarded as the product of the ambient illumination component L and the target object reflection component R; R carries the image detail information and is the enhanced image. By moving terms and taking logarithms on both sides, the nature of the incident light can be discarded to recover the original appearance of the object.
Further, in step S3, the unpaired image translation neural network of the night image region is trained with a mini-batch gradient descent method, and the training parameter settings include: setting the batch size; cropping images to 256 × 256 during training; setting an initial learning rate; adjusting the learning rate with an ADAM optimizer; setting the optimizer parameters; and setting the number of training epochs, where the first half of the epochs use the initial learning rate and the learning rate in the second half decreases by the same amount each epoch.
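The constant-then-linear-decay schedule described above can be written as a small helper. The base learning rate of 2e-4 is an illustrative assumption (a common CycleGAN-style default), not a value stated in the patent:

```python
def lr_schedule(epoch, total_epochs, base_lr=2e-4):
    """Constant base_lr for the first half of training, then a linear decay
    by the same amount each epoch, reaching 0 after the final epoch."""
    half = total_epochs // 2
    if epoch < half:
        return base_lr
    # remaining fraction of the decay phase
    return base_lr * (total_epochs - epoch) / (total_epochs - half)
```

With a framework optimizer this would typically be wired in as a per-epoch multiplier (e.g. a lambda-based scheduler) rather than called manually.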
Further, in step S3, the method of performing style migration with the enhanced night image as the night-domain input of the training set is: perform unpaired image translation using cycle consistency, and complete the style migration using a least squares adversarial loss and an identity mapping loss in sequence.
The beneficial effects of this further scheme are as follows: in the invention, paired image training places very strict demands on the training set — every training pair would need the same place by day and by night, which is unrealistic — so a cycle consistency loss is adopted to enable unpaired training, converting images from a source domain X to a target domain Y without paired samples. If the goal is to learn a mapping g: X → Y and only the discriminators are trained with an adversarial loss, the mapping is highly under-constrained, and without pairs no reconstruction error term can be formed. The cycle consistency loss therefore couples g with an inverse mapping f: Y → X and pushes f(g(x)) ≈ x and g(f(y)) ≈ y (an image from one domain should return approximately to its original value after passing through both generator mappings). Meanwhile, to protect the color composition of the output image, an identity mapping consistency loss is introduced, which pushes f(x) ≈ x and g(y) ≈ y.
Further, in step S3, the functional expression for unpaired image translation using cycle consistency is:

L_cyc(g, f) = E_x[ ‖f(g(x)) − x‖₁ ] + E_y[ ‖g(f(y)) − y‖₁ ]

wherein L_cyc denotes the sum of the cycle consistency losses produced by the two generators in the generative adversarial network, x denotes a real sample from the night image domain, y denotes a real sample from the daytime image domain, g denotes the generator responsible for the night-to-day mapping, f denotes the generator responsible for the day-to-night mapping, g(x) denotes the fake daytime picture generated by putting a real night photo into generator g, f(y) denotes the fake night picture generated by putting a real day photo into generator f, and ‖·‖₁ denotes the Manhattan distance operation between images;
the beneficial effects of the further scheme are as follows: in the invention, the constraint with almost the same strength as the pairing training is applied by adopting the cycle consistency loss, the ground real output is considered, the good effect is obtained, and the maximum advantage is that the training set is easy to obtain in a non-pairing way. But the cost of easy to train is also huge, that is, one more discriminator and one more generator, but they only play a role in providing half of the work that produces the cyclic consistency loss. Many unnecessary by-products are generated, parameters are increased, and the training pressure of the machine is higher. However, the effect is not good when the VGG characteristic distance is adopted to replace the cycle consistency loss. Considering that the effect of the cyclic consistency loss is excellent, once the neural network model is trained, the speed of the subsequent recovery has no relation with the complexity of the network training, and the cyclic consistency loss is finally adopted in the method, which is also the only method for improving the constraint of the current image translation unpaired training.
The functional expression of the least squares adversarial loss is:

L_LSGAN(g, D_Y, X, Y) = E_y[ (D_Y(y) − 1)² ] + E_x[ (D_Y(g(x)))² ]

wherein L_LSGAN denotes the least squares adversarial loss produced in the night-to-day mapping, X denotes the night image domain, Y denotes the daytime image domain, D_Y denotes the discriminator distinguishing true from false images of the daytime image domain Y, g(x) denotes the fake daytime picture generated by putting a real night photo into generator g, D_Y(y) denotes the prediction in [0, 1] output when a real daytime sample is input into the discriminator D_Y, and D_Y(g(x)) denotes the prediction in [0, 1] output when the fake daytime photo from generator g is input into the discriminator D_Y;
the beneficial effects of the further scheme are as follows: in the present invention, for style conversion, the post-enhancement generation countermeasure network will learn to convert the image from the enhancement domain x to the daylight domain y without a pairing example. Three losses are necessary here. First, the antagonism loss is provided by the discriminator in order to minimize the distance between the actual daylight scene and the output. Least squares antagonism Loss (LSGAN) loss is used instead of negative log-likelihood or sigmoid functions because this loss is more stable, the model oscillates less, and tends to produce higher quality results.
The functional expression of the identity mapping loss is:

L_idt(g, f) = E_y[ ‖g(y) − y‖₁ ] + E_x[ ‖f(x) − x‖₁ ]

wherein L_idt denotes the sum of the identity mapping losses produced by the two generators in the generative adversarial network, g(y) denotes the fake daytime picture generated by putting a real day photo into generator g, and f(x) denotes the fake night photo generated by putting a real night photo into generator f;
the beneficial effects of the further scheme are as follows: in the present invention, the loss of cyclic consistency, although serving some supervisory role, makes the generated image closer to the ground truth output than the direct learning mapping, still does not provide enough constraints on the color composition of the image. In order to protect the original color of the image from being damaged during the mapping process, another beneficial loss is introduced, which is called self-identity mapping loss. This loss is the real image of the target domain input into its own generator without much change, i.e. push f (x) x and g (y) y, such improvement significantly preserves the image color.
The cycle consistency loss and the identity mapping loss are used to impose constraints on "unpaired" translation. The cycle consistency loss acts like a traditional loss, but the generator's task is not only to protect the content but also to approximate the ground-truth output in the L1 sense. The L2 distance is not adopted in this application, as it has been shown to encourage blurring. In many image translation tasks that do not concern the color composition of the source domain (e.g. image edges to image, caricature to real photo), the cycle consistency loss together with the adversarial loss is sufficient to perform the translation well. But for translation between day and night they still do not constrain the color composition of the original image enough; the identity mapping loss is necessary to protect the color composition from input to output, which would otherwise look strange.
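The identity term and the combination of all three losses can be sketched as below. The relative weights lam_cyc and lam_idt are illustrative assumptions (CycleGAN-style defaults), not values specified in the patent:

```python
import numpy as np

def identity_loss(x, y, g, f):
    """L_idt: a generator fed a real image already in its TARGET domain should
    return it almost unchanged — g (night->day) receives a day photo y, and
    f (day->night) receives a night photo x — preserving color composition."""
    return np.abs(g(y) - y).mean() + np.abs(f(x) - x).mean()

def full_objective(adv, cyc, idt, lam_cyc=10.0, lam_idt=5.0):
    """Weighted total of the adversarial, cycle consistency and identity
    mapping losses; the weights here are assumed, not taken from the patent."""
    return adv + lam_cyc * cyc + lam_idt * idt
```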
Further, in step S3, the unpaired image translation neural network includes a local discriminator and a generator. The generator combines U-net with a residual network and introduces an illumination intensity self-regularization map to apply a self-regularization operation to the night image to be restored and to each feature map produced by the generator network; the local discriminator uses a convolutional neural network to judge whether a randomly cropped 70×70 image patch is real or fake.
The beneficial effects of this further scheme are as follows: in the invention, for the generator network, the skeleton comprises two stride-2 convolution layers, two stride-1/2 deconvolution layers and several residual blocks, with instance normalization; partial image content information is shared between input and corresponding output feature maps, which protects image content and makes the network easy to train. Meanwhile, an illumination intensity self-regularization map is added to improve the illumination distribution, since overly bright objects such as street lamps and billboards are often found in low-illumination night images. The specific method is: for the input RGB image, save a copy of its corresponding gray map; since the gray value represents the image illumination intensity, normalize the pixel values of this copy L to the interval [0, 1], and then use 1 − L as a self-regularizing attention map (its value is lower where the image is brighter). This attention map is resized to match, and multiplied with, the feature map after each convolution, which improves the visual effect of the output as a form of luminance regularization.
For the discriminator network, an ordinary convolutional neural network outputs a relative probability to judge whether a picture is real or fake. A local discriminator of size 70×70 is adopted, which reduces the over-constraint and sharp high-frequency details of a per-pixel discriminator, avoids the huge training parameters of a global discriminator, and relieves machine training pressure. Moreover, the local discriminator more easily recognizes overexposed objects and can fully resolve the surrounding false rendering effect caused by local overexposure (since the self-regularization map already resolves overexposure in the brightness sense, the local discriminator can also erase the surrounding false rendering). To reduce model oscillation, the discriminator parameters are updated using a history of 50 images previously received by the discriminator, i.e. the training batch_size for the discriminator is 50 (here, 50 local 70×70 crops rather than 50 input images), while the generator takes a batch_size of 1-4 (whole pictures) for machine performance reasons; both update their parameters frequently.
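The 1 − L attention construction described above is straightforward to sketch. The BT.601 luma weights used to form the gray copy are an assumption (any grayscale convention would do), and resizing is omitted by assuming the attention map already matches the feature map:

```python
import numpy as np

def self_regularizing_attention(rgb):
    """Build the 1 - L attention map: L is the normalized gray copy of the
    input RGB image (its illumination intensity), so brighter pixels receive
    LOWER attention, suppressing overexposed regions such as street lamps."""
    # ITU-R BT.601 luma weights — a common grayscale convention, assumed here
    gray = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    L = gray / 255.0                      # normalize illumination to [0, 1]
    return 1.0 - L

def apply_attention(feature_map, attention):
    """Multiply the attention map onto every channel of a feature map, as is
    done after each convolution (resizing to match is assumed done already)."""
    return feature_map * attention[..., None]
```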
Further, step S4 includes the following sub-steps:
s41: carrying out dark channel prior defogging treatment on the nighttime image after the enhancement treatment;
S42: perform sharpness processing on the image after dark channel prior defogging to complete the night scene restoration.
The beneficial effects of this further scheme are as follows: in the invention, the enhanced night image generally shows serious fogging, so defogging is performed based on the dark channel prior theory. MSRCP is used to process night scenes, and when the light in the image is evenly distributed and sufficient (such as street scenes at night), the restoration effect is good. In practical applications, however, many night scene pictures contain large areas that are very dark, unlit and information-free; processing these areas with the MSRCP algorithm easily causes false color rendering and fogging effects of varying degrees, with serious noise. In addition, night scenes and pictures are often unclear simply because not enough light reaches the human eye or the camera's sensor, and the image remains unclear even after color restoration. The Retinex algorithm produces false color rendering where dark regions lack information; the output is improved by adjusting the color gain and offset, and it was also found that reducing the image resolution helps suppress the spread of false rendering (a small resolution also better satisfies real-time processing).
Further, step S41 includes the following sub-steps:
S411: determine the dark channel J_dark of the night image from the enhanced night image, with the calculation formula:

J_dark(x) = min_{y ∈ Ω(x)} ( min_{c ∈ {r, g, b}} J^c(y) )

wherein c denotes a channel of the color image, r denotes the color red, g denotes the color green, b denotes the color blue, {r, g, b} denotes the set of the three primary colors, J^r denotes the red channel of the color image, J^g denotes the green channel of the color image, J^b denotes the blue channel of the color image, Ω(x) denotes a window centered on pixel x, and y denotes any pixel belonging to the window Ω(x);
in step S412, the fog image formation model equation is:

I(x) = J(x) · t(x) + A · (1 − t(x))

wherein I denotes the image to be defogged, J denotes the fog-free image, A denotes the global atmospheric light component, and t denotes the transmittance;

based on the dark channel prior theory and the fog image formation model equation, the fog-free image is obtained with the calculation formula:

J(x) = (I(x) − A) / max(t(x), t₀) + A

where t₀ is a small lower bound on the transmittance that prevents division by a value close to zero.
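Steps S411-S412 can be sketched end to end as below. This follows the standard dark-channel-prior recipe; the patch size, the ω = 0.95 haze retention factor, the t₀ = 0.1 floor and the top-0.1% atmospheric light estimate are conventional choices assumed here, not values given in the patent:

```python
import numpy as np

def dark_channel(img, patch=15):
    """J_dark(x): min over the window Omega(x) of the min over {r,g,b}."""
    mins = img.min(axis=2)                       # per-pixel channel minimum
    pad = patch // 2
    padded = np.pad(mins, pad, mode='edge')
    out = np.empty_like(mins)
    for i in range(mins.shape[0]):               # windowed minimum filter
        for j in range(mins.shape[1]):
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out

def defog(img, omega=0.95, t0=0.1, patch=15):
    """Invert the fog model I = J*t + A*(1 - t):
    estimate A, then t = 1 - omega * dark(I / A), then J = (I - A)/max(t, t0) + A."""
    img = img.astype(np.float64)
    dc = dark_channel(img, patch)
    # atmospheric light A: mean color of the brightest 0.1% dark-channel pixels
    n = max(1, int(dc.size * 0.001))
    idx = np.unravel_index(np.argsort(dc, axis=None)[-n:], dc.shape)
    A = img[idx].mean(axis=0)
    t = np.maximum(1.0 - omega * dark_channel(img / A, patch), t0)
    J = (img - A) / t[..., None] + A
    return np.clip(J, 0, 255)
```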
Further, in step S42, the method of sharpness processing for the image is: sharpening followed by filtering.

The sharpening method is: perform a convolution operation on the dark-channel-defogged image using a sharpening kernel, such as the common 3×3 kernel

[ 0 −1 0; −1 5 −1; 0 −1 0 ];

the filtering method is: apply bilateral filtering to the sharpened image using the OpenCV library function bilateralFilter(src, d, sigmaColor, sigmaSpace), where src denotes the input sharpened image, d denotes the size of the filter template, sigmaColor denotes the sigma parameter of the color space, and sigmaSpace denotes the sigma parameter of the coordinate space.
The beneficial effects of this further scheme are as follows: in the present invention, because there is not enough light, night images are generally blurred — the captured image has lost a great deal of information, which cannot be recovered by Retinex alone. A sharpen-then-filter approach is adopted to improve definition. Sharpening alone adds more noise. To remove that noise, blurring techniques such as median filtering and Gaussian filtering give poor results: the processed background becomes blurred or dark and the attempt to improve definition fails. Edge-preserving bilateral filtering is a compromise that combines the spatial proximity and pixel-value similarity of the image, considering both spatial information and gray-level similarity, so as to remove noise while preserving edges. The sharpening kernel is convolved once over the dark-channel-defogged image to enhance its edges, and since the sharpened image's noise increases noticeably, one pass of edge-preserving, denoising bilateral filtering is then applied.
Drawings
FIG. 1 is a flow chart of a night scene restoration method;
FIG. 2 is an experimental diagram of defogging;
FIG. 3 is an experimental graph of self identity mapping loss protection color;
FIG. 4 is a loss function composition diagram;
FIG. 5 is a comparison of two types of discriminators;
FIG. 6 is a comparison graph of the illumination intensity self-norm + local discriminator;
FIG. 7 is a diagram of a restoration process, a generator network structure and a U-net structure with a residual block;
FIG. 8 is an experimental contrast plot for different scenarios;
fig. 9 is a flowchart of step S4.
Detailed Description
The embodiments of the present invention will be further described with reference to the accompanying drawings.
As shown in fig. 1, the present invention provides a night scene restoration method based on an improved image enhancement algorithm and a generative adversarial network, comprising the following steps:
S1: acquire a night image, and perform enhancement processing on it using the MSRCP algorithm;
S2: judge whether the enhanced night image needs style migration; if so, go to step S3, otherwise go to step S4;
S3: train the unpaired image translation neural network of the night image region, input the enhanced night image as the night domain of the training set, perform style migration, and go to step S4;
S4: perform dark channel prior defogging and sharpness processing in sequence on the enhanced night image to complete the night scene restoration.
In the embodiment of the invention, the generative adversarial network (GAN) is a widely used neural network developed on the basis of game theory. The application applies it to the image translation problem to achieve daytime style migration. The technology adopted by the present application is a GAN with a cycle consistency loss, developed from the Pix2Pix translation network. Pix2Pix realizes one-to-one paired image training and is a typical highly constrained supervised training network, which would make it an ideal scheme for the application. But this technique is very demanding on the training set: to realize the night-to-day mapping model, each training pair must show the same place by day and by night, and real acquisition could only be performed with fixed-point cameras, so it is not suitable for a project that needs a large training set.
The cycle consistency loss realizes unpaired training on the basis of Pix2Pix: training only needs two image sets of different styles, which satisfies the requirement that the training set be easy to obtain. This kind of training suits style migration of image colors and textures and can comprehensively restore images to a daytime style. The application aims at improving the details of this specific mapping; in particular, experiments found that overly bright objects such as street lamps at night are easily over-exposed when the daytime style is restored, which is not only uncomfortable in brightness but also renders the surrounding scene whitish. To make the neural network better fit this task, an illumination intensity self-regularization map is introduced as an improvement to the recovery effect. Considering that a night scene image has insufficient light, little information and ambiguous features, directly using night images as the training set works poorly; in particular, pitch-dark regions of different images look much alike, so the neural network can hardly learn the image features and the expected mapping relation. Although a network strong enough could generate a plausible imagined output for dark areas, such output is not real and can hardly be put into practical application.
In the experiments of the present application, direct translation of night images treats many dark places as daytime, so the pictures turn broadly whitish or bluish (depending on the daytime colors in the training set), and the experimental effect is poor. Therefore, the night image restored by the improved Retinex algorithm, instead of the original image, is input into the neural network as the night domain of the training set, achieving a good effect.
In the embodiment of the present invention, as shown in fig. 1, if the acquired nighttime image is a single-channel image in step S1, the enhancement processing includes the following sub-steps:
a11: according to the Retinex algorithm, performing decomposition and term shifting on the night image to obtain the relational expression of the night image, specifically:

log R(x, y) = log S(x, y) − log L(x, y), with S(x, y) = L(x, y) · R(x, y)

wherein S represents the matrix of the acquired single-channel night image, L represents the ambient illumination component matrix of the acquired single-channel night image, R represents the target object reflection component matrix of the single-channel image, x represents the abscissa and y the ordinate of the pixel coordinate, S(x, y), L(x, y) and R(x, y) represent the pixel values at coordinate (x, y) of the respective matrices, * represents a convolution operation (used in step A12), and log represents a logarithmic operation;
a12: carrying out convolution operation on the relational expression of the night image by utilizing a Gaussian core to obtain the reflection component of the target object after enhancement processing;
because the ambient illumination component can not be directly obtained, carrying out convolution operation on a relational expression of the night image by utilizing a Gaussian core to express the ambient illumination component, and substituting the convolution operation into the formula of the step A11 to obtain the target object reflection component after enhancement processing;
a13: mapping the reflection component of the target object after enhancement processing to a pixel value domain [0, 255] to obtain an image after enhancement processing;
In step A12, the method of performing the convolution operation on the relational expression of the night image with a Gaussian kernel includes: selecting a sigma value, calculating the corresponding weight matrix, and performing the convolution operation with this weight matrix as the Gaussian kernel; the calculation formula is:

G(x, y) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))

log R(x, y) = log S(x, y) − log (G(x, y) * S(x, y))

wherein G represents the Gaussian kernel matrix, G(x, y) represents the pixel value at coordinate (x, y) of the Gaussian kernel matrix, and σ represents the selected sigma value;
The target object reflection component matrix R of the single-channel image cannot be obtained directly from the formula of step A11, because the ambient illumination component matrix L of the acquired single-channel night image cannot be measured; therefore G(x, y) * S(x, y) is used instead of L(x, y), i.e. the Gaussian-blurred image replaces the ambient illumination component, and step A13 is then performed. Comparison shows that Gaussian-blur sigma values of [15, 80, 200] for three different scales give the best effect. Different sigma values yield different weight matrices: with a blur radius of 1, the weight matrix is computed from the sigma value and normalized so that its 9 weights sum to 1; the final weight matrix used in the convolution operation is exactly the Gaussian kernel.

The Gaussian kernel matrix G, like the single-channel image matrix, is a two-dimensional matrix, calculated from an artificially specified sigma value. Convolving it with an image matrix blurs the image, so the result represents the ambient illumination component of the image;
in step S1, if the acquired nighttime image is a multi-channel image, the enhancement processing includes the following sub-steps:
b11: according to the Retinex algorithm, performing decomposition and term shifting on the night image to obtain the relational expression of the night image, specifically:

log R_i(x, y) = log S_i(x, y) − log L_i(x, y), with S_i(x, y) = L_i(x, y) · R_i(x, y)

wherein S represents the matrix of the acquired multi-channel night image, L represents the ambient illumination component matrix of the acquired multi-channel night image, R represents the target object reflection component matrix of the multi-channel image, and S_i(x, y), L_i(x, y) and R_i(x, y) represent the pixel values at coordinate (x, y) of the i-th channel of the respective matrices;
b12: according to the relational expression of the night image, Gaussian filtering and weighted addition are carried out on different channels to obtain an image after enhancement processing;
in step B12, the calculation formula of gaussian filtering and weighted addition on different channels is:
log R_i(x, y) = Σ_k w_k · [ log S_i(x, y) − log (G_k(x, y) * S_i(x, y)) ]

wherein R represents the target object reflection component matrix of the processed multi-channel image, R_i(x, y) represents the pixel value at coordinate (x, y) of its i-th channel, w_k represents the corresponding weight of each scale, S_i(x, y) represents the pixel value at coordinate (x, y) of the i-th channel layer of the acquired night image, and G_k(x, y) * S_i(x, y) represents the pixel value at coordinate (x, y) of the i-th channel layer after Gaussian blur at the k-th scale.
In a multi-channel image, the result of each channel is accumulated with that of the next channel in turn, analogous to i = i + 1.
In the invention, the night image S shot by the camera, considered as a single-channel image (such as a gray-scale image), can be regarded according to Retinex theory as the product of the ambient illumination component L and the target object reflection component R. R carries the image detail information and is the enhanced image sought. By shifting terms and taking the logarithm on both sides, the incident light property can be discarded so that the object's original appearance is recovered.
In the embodiment of the present invention, as shown in fig. 1, in step S3, the unpaired image translation neural network for the night image region is trained with a mini-batch gradient descent method, and the training parameter settings include: setting the batch size, cropping the images to 256×256 during training, setting an initial learning rate, adjusting the learning rate with the ADAM optimizer, setting the optimizer parameters, and setting the number of training epochs, wherein the first half of the epochs learn at the initial learning rate and the learning rate in the second half decreases linearly by the same amount each epoch.
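The schedule described above — constant learning rate for the first half of training, then a linear decay by the same amount each epoch — can be sketched as a small function. The total epoch count and base rate here are illustrative placeholders, not values given in the patent.

```python
def learning_rate(epoch, total_epochs=200, base_lr=2e-4):
    """First half trains at the initial rate; the second half decays
    linearly toward zero by the same amount each epoch.
    total_epochs and base_lr are illustrative, not from the patent."""
    half = total_epochs // 2
    if epoch < half:
        return base_lr
    # remaining fraction of the decay window, e.g. epoch 150/200 -> 0.5
    return base_lr * (total_epochs - epoch) / (total_epochs - half)
```

An optimizer wrapper would call `learning_rate(epoch)` at the start of every epoch and assign the result to the ADAM step size.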
In the embodiment of the present invention, as shown in fig. 3, in step S3, the method for performing style migration with the enhanced night image as the night domain input of the training set includes: performing unpaired image translation with the cycle consistency loss, and completing the style migration with the least squares adversarial loss and the identity mapping loss in turn.
In the invention, considering that paired image training places very strict requirements on the training set — each training pair would need the same place by day and by night, which is unrealistic — the cycle consistency loss is adopted to realize unpaired training: images are converted from a source domain X to a target domain Y without paired samples. If the target is to learn a mapping G: X → Y and only the discriminator is trained with an adversarial loss, this mapping is highly under-constrained, yet the data are unpaired and a reconstruction error term is hard to define. The cycle consistency loss therefore uses the inverse mapping F: Y → X to couple them, pushing F(G(x)) ≈ x and G(F(y)) ≈ y (a picture from one domain, after the two generator mappings, approximately returns to its original value). Meanwhile, in order to protect the color composition of the output image, an identity mapping consistency loss is introduced, which pushes G(y) ≈ y and F(x) ≈ x.
In the embodiment of the present invention, as shown in fig. 1, in step S3, the functional expression for the unpaired image translation with cycle consistency is:

L_cyc(G, F) = E_x[ ||F(G(x)) − x||_1 ] + E_y[ ||G(F(y)) − y||_1 ]

wherein L_cyc(G, F) represents the sum of the cycle consistency losses produced by the two generators in the generative adversarial network, x represents a real sample from the night image domain, y represents a real sample from the daytime image domain, G represents the generator responsible for the night-to-day mapping, F represents the generator responsible for the day-to-night mapping, G(x) represents the fake daytime picture generated by putting a real night photograph into generator G, F(y) represents the fake night picture generated by putting a real daytime photograph into generator F, and ||·||_1 represents the Manhattan distance operation between images;
In the invention, the cycle consistency loss applies a constraint almost as strong as paired training and takes the ground-truth output into account, obtaining a good effect; its greatest advantage is that an unpaired training set is easy to obtain. But the cost of easy training is also large: one extra discriminator and one extra generator, which only serve to provide half of the cycle consistency loss. Many unnecessary by-products are produced, the parameters increase, and the training pressure on the machine is higher. Replacing the cycle consistency loss with a VGG feature distance, however, did not work well. Considering that the effect of the cycle consistency loss is excellent, and that once the neural network model is trained the speed of subsequent restoration is unrelated to the complexity of training, the cycle consistency loss is finally adopted; it is also currently the only method that constrains unpaired training for image translation.
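The cycle consistency term above reduces to two L1 round-trip errors, which can be illustrated with plain arrays. This is a toy sketch: the lambda "generators" below only shift brightness and stand in for real networks.

```python
import numpy as np

def l1(a, b):
    # Mean Manhattan (L1) distance between two images.
    return np.abs(a - b).mean()

def cycle_consistency_loss(G, F, x_night, y_day):
    # L_cyc = ||F(G(x)) - x||_1 + ||G(F(y)) - y||_1:
    # both generator round trips must reproduce the original sample.
    return l1(F(G(x_night)), x_night) + l1(G(F(y_day)), y_day)

# Toy generators (assumed): "day" is just "night plus 10 brightness units",
# so the two mappings are exact inverses and the loss is zero.
G = lambda img: img + 10.0
F = lambda img: img - 10.0
```

With a real pair of networks the loss would be positive and its gradient would push the two mappings toward being mutual inverses.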
The functional expression of the least squares adversarial loss is:

L_LSGAN(G, D_Y, X, Y) = E_y[ (D_Y(y) − 1)² ] + E_x[ D_Y(G(x))² ]

wherein L_LSGAN(G, D_Y, X, Y) represents the least squares adversarial loss produced in the night-to-day mapping, X represents the night image domain, Y represents the daytime image domain, D_Y represents the discriminator distinguishing true from false in the daytime image domain, G(x) represents the fake daytime picture generated by putting a real night photograph into generator G, D_Y(y) represents the prediction value in [0, 1] output when a real daytime sample is input into discriminator D_Y, and D_Y(G(x)) represents the prediction value in [0, 1] output when the fake daytime photograph from generator G is input into discriminator D_Y;
x is a real sample from the night image domain and y is a real sample from the daytime image domain; the inputs of the loss function are the real samples x and y, while X and Y denote the image domains the samples come from. Since the adversarial loss is ultimately produced by the discriminator in the generative adversarial network, i.e. what is finally output is the distance between the discriminator's prediction and the true/false label (0 or 1) of the image, X and Y are not function arguments themselves but indicate what is input to the generator or discriminator.
In the present invention, for style conversion, the generative adversarial network after enhancement will learn to convert images from the enhanced domain X to the daytime domain Y without paired examples. Three losses are necessary here. First, the adversarial loss is provided by the discriminator in order to minimize the distance between real daytime scenes and the output. The least squares adversarial loss (LSGAN) is used instead of the negative log-likelihood or sigmoid form because it is more stable, the model oscillates less, and it tends to produce higher-quality results.
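The least-squares form can be written out directly on discriminator scores. A minimal sketch — the arrays below stand in for batches of discriminator outputs in [0, 1]:

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake):
    # Discriminator objective: push scores on real daytime samples to 1
    # and scores on generated fakes to 0 (squared errors, not log-likelihood,
    # which is why the gradients are smoother and training oscillates less).
    return np.mean((d_real - 1.0) ** 2) + np.mean(d_fake ** 2)

def lsgan_g_loss(d_fake):
    # Generator objective: push the discriminator's score on fakes toward 1.
    return np.mean((d_fake - 1.0) ** 2)
```

A perfect discriminator (real → 1, fake → 0) drives `lsgan_d_loss` to zero while leaving the generator with a loss of 1, which is exactly the adversarial tension described above.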
The functional expression of the identity mapping loss is:

L_identity(G, F) = E_y[ ||G(y) − y||_1 ] + E_x[ ||F(x) − x||_1 ]

wherein L_identity(G, F) represents the sum of the identity mapping losses produced by the two generators in the generative adversarial network, G(y) represents the fake daytime picture generated by putting a real daytime photograph into generator G, and F(x) represents the fake night picture generated by putting a real night photograph into generator F;
In the present invention, the cycle consistency loss, although it plays a certain supervisory role and makes the generated image closer to the ground-truth output than a directly learned mapping, still does not provide enough constraint on the color composition of the image. In order to protect the original colors of the image from being damaged during mapping, another beneficial loss called the identity mapping loss is introduced. This loss requires that a real image of the target domain, input into its own generator, change little, i.e. it pushes G(y) ≈ y and F(x) ≈ x; this improvement significantly preserves the image colors.
The cycle consistency loss and the identity mapping loss are used to impose constraints on the "unpaired" translation. The cycle consistency loss acts much like a traditional loss, but the generator's task is not only to preserve content but also to approximate the ground-truth output in the L1 sense. The L2 distance is not considered by this application, as it has been shown to encourage blurriness. In many image translation tasks where the color composition of the source domain does not matter (e.g. image edges to image, caricature to real photo), this loss together with the adversarial loss suffices to perform the translation well. But for translation between day and night they still do not impose sufficient constraints on the color composition of the original image; the identity mapping loss is necessary to protect the color composition from input to output, which would otherwise look strange.
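The identity term is the simplest of the three losses and can be sketched in a few lines. As before, this is a toy: the lambda "generators" stand in for real networks, and `l1` is the same Manhattan distance used by the cycle term.

```python
import numpy as np

def l1(a, b):
    # Mean Manhattan (L1) distance between two images.
    return np.abs(a - b).mean()

def identity_mapping_loss(G, F, x_night, y_day):
    # L_idt = ||G(y) - y||_1 + ||F(x) - x||_1: a generator fed a sample
    # already in its target domain should change it as little as possible,
    # which is what protects the color composition.
    return l1(G(y_day), y_day) + l1(F(x_night), x_night)
```

A generator that shifts every pixel by a constant is penalized by exactly that constant per pixel, so color drifts are directly discouraged.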
In the embodiment of the present invention, fig. 4 shows a schematic diagram of the loss function, representing all the losses arising from the image domain X (the enhanced night images); the converse holds for the daytime domain Y.
In the embodiment of the present invention, as shown in fig. 1, in step S3, the unpaired image translation neural network includes a local discriminator and a generator, wherein the generator combines the U-net and the residual network and introduces an illumination intensity self-regularization map to perform a self-regularizing operation on the night scene image to be restored and on each feature map generated by the generator network; the local discriminator uses a convolutional neural network to determine whether a randomly cropped 70×70 image block is true or false.
In the invention, for the generator network, the skeleton of the network comprises two stride-2 convolution layers, two stride-1/2 deconvolution layers and several residual blocks, with instance normalization; part of the image content information is shared between input and output through the corresponding feature maps, which protects the image content and makes the network easy to train. Meanwhile, an illumination intensity self-regularization map is added to improve the illumination distribution, since overly bright objects such as street lamps and billboards are often found in low-illumination night images. The specific method is: a copy of the gray map corresponding to the input RGB image is kept, and the pixel values of this copy L are normalized to the interval [0, 1], since the gray map values represent the image illumination intensity; then 1 − L is used as a self-regularizing attention map (its value is lower where the image is brighter). This attention map is resized to match, and multiplied with, the feature maps after each convolution; as a form of brightness regularization it improves the visual effect of the output.
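The attention-map construction just described can be sketched in numpy. Assumptions here: the standard luma weights (0.299, 0.587, 0.114) for the gray map, and nearest-neighbor resizing of the attention map to the feature map's spatial size — the patent does not fix either choice.

```python
import numpy as np

def self_regularizing_attention(rgb):
    # Gray map normalized to [0, 1] gives the illumination L; the attention
    # 1 - L is low where the image is already bright, damping those features.
    gray = rgb @ np.array([0.299, 0.587, 0.114])   # assumed luma weights
    L = gray / 255.0
    return 1.0 - L

def apply_attention(feature_map, attention):
    # Resize the attention map (nearest-neighbor, for simplicity) to the
    # feature map's H x W and multiply it into every channel.
    h, w = feature_map.shape[:2]
    ah, aw = attention.shape
    ys = np.arange(h) * ah // h
    xs = np.arange(w) * aw // w
    resized = attention[np.ix_(ys, xs)]
    return feature_map * resized[..., None]
```

A saturated street-lamp region (pixels near 255) gets attention near 0, so its features are suppressed after each convolution, which is the overexposure control described above.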
For the discriminator network, an ordinary convolutional neural network is used to output a relative probability judging whether a picture is true or false. A local discriminator of size 70×70 is adopted, which reduces the over-constraint and sharp high-frequency details of a per-pixel discriminator, avoids the huge training parameters of a global discriminator, and relieves the machine training pressure. Meanwhile, the local discriminator recognizes over-exposed objects more easily and can completely remove the false rendering effect that local overexposure casts on its surroundings (the self-regularization map solves the overexposure in the sense of brightness, while the local discriminator also wipes out the surrounding false rendering effects). In order to reduce model oscillation, the discriminator parameters are updated using a history of the last 50 images it received, i.e. the training batch_size for the discriminator is 50 (here 50 local 70×70 crops rather than 50 whole input images), while the generator takes a batch_size of 1–4 whole pictures for reasons of machine performance; both update their parameters frequently.
In the present invention, as shown by comparing fig. 5 and fig. 6, the local discriminator can recognize over-exposed objects such as night street lamps and billboards, and together with the illumination intensity self-regularization map solves the problem of local overexposure at night well; the per-pixel discriminator cannot recognize these objects and cannot solve the overexposure (even when the self-regularization map is introduced), finds it difficult to learn the daytime style, and cares too much about local details, resulting in abnormal colors. The global discriminator, for its part, has more parameters and is difficult to train.
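The 50-image history used to damp discriminator oscillation can be sketched as a small replay buffer. This is a generic CycleGAN-style buffer under assumptions: the 1/2 swap probability and the exact replacement policy are conventional choices, not spelled out in the patent.

```python
import random

class ImageHistoryBuffer:
    """Keeps the last `capacity` generated crops; with probability 1/2 the
    discriminator trains on a stored crop instead of the newest one,
    which damps model oscillation (capacity 50 in the text above)."""

    def __init__(self, capacity=50):
        self.capacity = capacity
        self.images = []

    def query(self, image):
        # Fill the buffer first; afterwards either swap the new crop in
        # for a random stored one (returning the old crop to train on),
        # or pass the new crop through unchanged.
        if len(self.images) < self.capacity:
            self.images.append(image)
            return image
        if random.random() < 0.5:
            idx = random.randrange(self.capacity)
            old, self.images[idx] = self.images[idx], image
            return old
        return image
```

Each 70×70 crop produced by the generator would be routed through `query` before reaching the discriminator, so roughly half of every discriminator batch comes from recent history.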
In the embodiment of the present invention, as shown in fig. 7, defogging is performed as the last step, because it was found during experiments that directly defogging the enhanced image may cause color distortion, false rendering and artifacts, and may further widen the differences in image style between regions. All of these create learning obstacles for the network. This side effect is particularly evident on multi-style training sets; it is attenuated but not eliminated on single-style training sets, so defogging last is chosen.
In the embodiment of the invention, as shown in fig. 8, it can be seen that the night scene restoration method of the present application is more excellent in the experiment of the single-style region, the quality and identifiability of the generated image are improved, and the finally output image is clear, and the details are obvious and soft.
In embodiments of the present invention, the technique of the application accomplishes the specific task of unpaired night image translation, which benefits from combining image processing algorithms with a deep convolutional generative adversarial network. The application adopts the MSRCP algorithm, suited to the enhancement of night scene images, and remedies several defects of that algorithm for night scene restoration by means of dark channel defogging, sharpening convolution, bilateral filtering, color gain adjustment and bias correction of erroneous rendering, so that the output image is clear and bright. If only night-sensitive information needs to be highlighted, this part of the technique can be used directly for night scene restoration. A generative adversarial network model is then trained on a GPU cloud server, with the enhanced images replacing the night images as training input. The cycle consistency loss is introduced to realize unpaired training and, together with the adversarial loss, accomplishes the style migration. An identity mapping loss is also introduced to protect the image from losing its color composition during conversion. The application adopts a local discriminator and a generator combining U-net with a residual network, which greatly relieves the network training pressure and improves the quality and recognizability of generated images; overexposure is handled with the illumination intensity self-regularizing attention map, and the finally output image is clear, with prominent yet soft details.
In the embodiment of the present invention, as shown in fig. 9, step S4 includes the following sub-steps:
s41: carrying out dark channel prior defogging treatment on the nighttime image after the enhancement treatment;
s42: and performing definition processing on the image subjected to dark channel prior defogging processing to finish night scene restoration.
In the invention, the enhanced night image generally shows serious fogging, so defogging based on the dark channel prior theory is performed. When MSRCP processes night scenes whose light is evenly distributed and sufficient (such as street scenes at night), the restoration effect is good. In practical application, however, many night pictures contain large dim, lightless, information-free areas; processing such areas with the MSRCP algorithm easily causes erroneous color rendering and fogging of varying degrees, with serious noise. In addition, night scenes and pictures are often unclear simply because not enough light reaches human eyes or the camera's sensor, so color restoration alone cannot make them clear. The Retinex algorithm produces erroneously rendered colors where information is lacking in dark places; the application improves the output by adjusting color gain and bias, and also found that reducing the image resolution helps suppress the spread of erroneous rendering (a small resolution also better supports real-time processing).
The Retinex theory in the field of image enhancement simulates the human visual system, computing and enhancing the lit parts of an image; its principle is to recover the original appearance of the object that the light reflects into the human eye. Retinex is not a technique for processing night scenes but a general image enhancement algorithm. Its theory nevertheless inspired this application, because it makes full use of weak light at night and amplifies useful visual information; many night scene restoration experiments with it gave good results. Retinex techniques are various, and after a large number of experiments MSRCP, the multi-scale algorithm with color restoration, was found to be the best for color restoration and enhancement of night scene images, though it still has several defects and shortcomings; the application therefore improves on the basis of MSRCP.
In the embodiment of the present invention, as shown in fig. 2, the defogging method is good at enhancing detail in dark places, but defogging directly after enhancement may cause the color distortion, erroneous rendering and halo artifacts visible in the second-row image, thereby increasing the differences in image style within the training set and making the neural network hard to train. These effects are all shown in fig. 8, so the enhanced, not-yet-defogged images are put into neural network training, and the daytime-style output images are defogged afterwards.
In the embodiment of the present invention, as shown in fig. 9, step S41 includes the following sub-steps:
s411: determining the dark channel J_dark of the night image according to the enhanced night image; the calculation formula is:

J_dark(x) = min_{y ∈ Ω(x)} ( min_{c ∈ {r, g, b}} J^c(y) )

wherein c indexes the channels of the color image, r represents the color red, g represents the color green, b represents the color blue, {r, g, b} represents the set of the three primary colors, J^r represents the red channel of the color image, J^g represents the green channel of the color image, J^b represents the blue channel of the color image, Ω(x) represents a window centered on pixel x, and y represents any pixel belonging to the window Ω(x);
In step S412, the fog imaging model equation is:

I(x) = J(x) · t(x) + A · (1 − t(x))

wherein I(x) represents the image to be defogged, J(x) represents the fog-free image, A represents the global atmospheric light component, and t(x) represents the transmittance;
Based on the dark channel prior theory and the fog imaging model equation, the fog-free image is obtained; the calculation formula is:

J(x) = (I(x) − A) / t(x) + A
The dark channel prior theory states that in most local areas of an outdoor fog-free image there exist pixels whose value in at least one color channel is very low, approaching 0; that is, J_dark → 0.
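Steps S411–S412 can be sketched end to end in numpy. This is a minimal illustration of the standard dark-channel pipeline, not the patent's exact code: the 15-pixel window, the omega = 0.95 haze retention factor, the transmittance floor t0 = 0.1 and the "top 0.1% of the dark channel" atmospheric-light estimate are conventional assumed values.

```python
import numpy as np

def dark_channel(img, patch=15):
    # J_dark(x): per-pixel minimum over r,g,b, then minimum over the
    # patch-sized window Omega(x) around each pixel.
    mins = img.min(axis=2)
    r = patch // 2
    p = np.pad(mins, r, mode="edge")
    h, w = mins.shape
    out = np.empty_like(mins)
    for i in range(h):
        for j in range(w):
            out[i, j] = p[i:i + patch, j:j + patch].min()
    return out

def dehaze(img, omega=0.95, t0=0.1):
    # Estimate A from the brightest dark-channel pixels, then recover
    # J = (I - A) / max(t, t0) + A with t = 1 - omega * dark(I / A).
    img = img.astype(np.float64) / 255.0
    dc = dark_channel(img)
    flat = dc.ravel()
    n = max(1, flat.size // 1000)          # top 0.1% brightest (assumed)
    idx = np.argsort(flat)[-n:]
    A = img.reshape(-1, 3)[idx].max(axis=0)
    t = 1.0 - omega * dark_channel(img / A)
    t = np.maximum(t, t0)[..., None]       # floor avoids dividing by ~0
    J = (img - A) / t + A
    return np.clip(np.rint(J * 255.0), 0, 255).astype(np.uint8)
```

On a genuinely haze-free input the recovery is close to a no-op, while hazy regions (high dark channel) get their veil of atmospheric light subtracted and rescaled.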
In the embodiment of the present invention, as shown in fig. 9, in step S42, the method for performing sharpness processing on the image is as follows: sequentially carrying out sharpening and filtering;
the sharpening method comprises: performing a convolution operation on the image after dark channel defogging with a sharpening kernel, wherein the sharpening kernel is:

  0  −1   0
 −1   5  −1
  0  −1   0
;
the filtering method comprises: performing bilateral filtering on the sharpened image with the OpenCV library function bilateralFilter(src, d, sigmaColor, sigmaSpace), wherein src represents the input sharpened image, d represents the size of the filtering template, sigmaColor represents the sigma parameter of the color space, and sigmaSpace represents the sigma parameter of the coordinate space.
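The sharpening convolution can be sketched directly in numpy (a plain pass of the 3×3 kernel above; edge padding is an assumed boundary choice). The OpenCV denoising call that would follow it is shown only as a comment, since it depends on an installed cv2.

```python
import numpy as np

# Standard 3x3 sharpening kernel: weights sum to 1, so flat regions are
# unchanged while edges (where neighbours differ) are boosted.
SHARPEN_KERNEL = np.array([[ 0, -1,  0],
                           [-1,  5, -1],
                           [ 0, -1,  0]], dtype=np.float64)

def sharpen(img):
    # One convolution pass over a single-channel image, edge padding.
    p = np.pad(img.astype(np.float64), 1, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    for i in range(3):
        for j in range(3):
            out += SHARPEN_KERNEL[i, j] * p[i:i + img.shape[0],
                                            j:j + img.shape[1]]
    return np.clip(out, 0, 255).astype(np.uint8)

# With OpenCV available, the denoising step of the patent would follow as:
#   smoothed = cv2.bilateralFilter(sharpened, d=3,
#                                  sigmaColor=400, sigmaSpace=400)
```

Because the kernel's weights sum to 1, a uniform image passes through unchanged, which is an easy sanity check on the implementation.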
In the present invention, because there is not enough light at night, the captured image loses a large amount of information that Retinex alone cannot recover, so the night image is generally blurred. The method of first sharpening the image and then filtering it is therefore adopted to improve definition. Sharpening alone, however, also amplifies noise. Blurring techniques such as median filtering and Gaussian filtering remove this noise poorly: their result is a blurred or darkened image background, defeating the purpose of improving definition. Edge-preserving bilateral filtering is a compromise that combines the spatial proximity and the pixel value similarity of the image, considering both spatial information and gray-level similarity, so that edges are retained while noise is removed. Accordingly, the image after dark channel defogging is convolved once with the sharpening kernel to enhance its edges and achieve sharpening; since the noise of the sharpened image is noticeably increased, one pass of edge-preserving, denoising bilateral filtering is then applied to the image.
The filtering process adopts the OpenCV library function bilateralFilter(src, d, sigmaColor, sigmaSpace), where src is the input original image, d is the size of the filtering template (the diameter of each pixel neighborhood used during filtering), sigmaColor is the sigma parameter of the color space (the larger it is, the broader the range of colors in a pixel neighborhood that are mixed together, producing larger regions of semi-equal color), and sigmaSpace is the sigma parameter of the coordinate space (the larger it is, the farther apart pixels of similar color influence each other, so that sufficiently similar colors in a larger region become one color). Usually sigmaColor and sigmaSpace are set equal and then adjusted together with d. An overly large filter (d > 5) performs inefficiently, and the input noise is not severe, so the experiments use d = 3. It is widely believed that sigmaColor and sigmaSpace cannot be too large (greater than 150), otherwise the filtering effect is so strong that the image appears cartoonish; however, no such side effect was found in the experiments. Increasing sigmaColor and sigmaSpace kept improving the denoising effect, which converged around 400, achieving a good result.
The working principle and process of the invention are as follows: night scene restoration is divided into two parts, image enhancement and style migration. Image enhancement uses the multi-scale Retinex algorithm with color restoration, MSRCP, improved with dark channel defogging, sharpening, bilateral filtering denoising and reduction of erroneous rendering. The style migration part trains a generative adversarial neural network on a cloud server, realizes unpaired image translation with the cycle consistency loss, completes style migration together with the adversarial loss and the identity mapping loss, and also introduces the illumination intensity self-regularization map as a self-regularizing means to adjust the illumination distribution of generated images, so that the problems of underexposure and overexposure are solved and the final visual effect is soft.
After the study area is determined, the image enhancement algorithm performs the first step of restoration: the color and clarity of the image are greatly enhanced, which already meets most requirements. The unpaired night-to-day image translation neural network is then trained for that area, with the images restored by the enhancement algorithm replacing the original night scenes as the night-domain input of the training set; the final output picture is a bright and clear daytime scene. Once image restoration is realized, night scene video restoration follows easily with opencv + wave + ffmpeg, and real-time restoration is realized with the support of GPU-accelerated computation.
On this basis, video restoration is not complex: wave, driven by Python, extracts and temporarily stores the video's audio; each extracted frame is then batch-restored into an enhanced-mode or daytime-mode picture by the technique above; OpenCV reassembles the picture frames into video; and finally ffmpeg merges the video and audio to complete the restoration. For real-time restoration, the camera feed only needs to be captured and processed frame by frame. For the image enhancement part alone, the computing performance of an ordinary computer can support real-time restoration, provided the picture resolution is not too high; realizing daytime-style restoration or high-resolution picture restoration requires GPU-accelerated computation.
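The frame-by-frame pipeline described above can be sketched as follows (an illustrative outline, not the patented implementation: restore_frame is a placeholder contrast stretch standing in for the enhanced-mode or daytime-mode processing, and an ffmpeg binary on PATH is assumed for the audio demux/mux that the text assigns to wave and ffmpeg):

```python
import subprocess
import numpy as np

def restore_frame(frame):
    # Placeholder restoration: a min-max contrast stretch standing in for the
    # enhanced-mode / daytime-mode processing described in the text.
    f = frame.astype(np.float64)
    lo, hi = f.min(), f.max()
    if hi == lo:
        return frame
    return ((f - lo) / (hi - lo) * 255.0).astype(np.uint8)

def restore_video(src_path, out_path, tmp_video="frames.mp4", tmp_audio="audio.aac"):
    # Assumes opencv-python is installed and an ffmpeg binary is on PATH.
    import cv2
    cap = cv2.VideoCapture(src_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
    writer = None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        restored = restore_frame(frame)
        if writer is None:  # lazily open the writer once the frame size is known
            h, w = restored.shape[:2]
            fourcc = cv2.VideoWriter_fourcc(*"mp4v")
            writer = cv2.VideoWriter(tmp_video, fourcc, fps, (w, h))
        writer.write(restored)
    cap.release()
    if writer is not None:
        writer.release()
    # Demux the original audio, then mux it back onto the restored frames.
    subprocess.run(["ffmpeg", "-y", "-i", src_path, "-vn", "-acodec", "copy", tmp_audio], check=True)
    subprocess.run(["ffmpeg", "-y", "-i", tmp_video, "-i", tmp_audio, "-c", "copy", out_path], check=True)
```

For real-time use, the same restore_frame call is simply applied to each frame read from a cv2.VideoCapture opened on the camera index instead of a file.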
The beneficial effects of the invention are: the night scene restoration method suits security monitoring and regional night photography, the training set is easy to collect and train, the demand on model generalization is low, and the method performs especially well in single-style regional experiments, giving it high practicability and feasibility. The novel neural network structure greatly relieves the pressure of network training, protects image content, improves the quality of the generated images, solves local overexposure at night, and finally outputs clear, soft images with prominent detail.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to help the reader understand the principles of the invention, and the invention is not limited to the specifically recited embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from its spirit, and such changes and combinations remain within the scope of the invention.
Claims (9)
1. A night scene restoration method based on an improved image enhancement algorithm and a generation countermeasure network is characterized by comprising the following steps:
S1: acquiring a night image, and performing enhancement processing on the night image with the MSRCP algorithm;
S2: judging whether the enhanced night image needs style migration; if so, entering step S3, otherwise entering step S4;
S3: training the unpaired image translation neural network of the night image region, inputting the enhanced night image as the night-domain input of the training set, performing style migration, and entering step S4;
S4: carrying out dark channel prior defogging and definition processing on the enhanced night image in sequence to complete the night scene restoration.
2. The night scene restoration method based on the improved image enhancement algorithm and the generation countermeasure network as claimed in claim 1, wherein, if the acquired night image is a single-channel image, the enhancement processing in step S1 comprises the following sub-steps:
A11: according to the Retinex algorithm, decomposing the night image and rearranging terms to obtain the relational expression of the night image, specifically:
log R(x, y) = log S(x, y) − log L(x, y)
wherein S represents the matrix of the acquired single-channel night image, L represents the ambient light component matrix of the acquired single-channel night image, R represents the target object reflection component matrix of the single-channel image, x represents the abscissa of a pixel coordinate, y represents the ordinate of a pixel coordinate, S(x, y), L(x, y) and R(x, y) represent the pixel values at coordinate (x, y) of the matrices S, L and R respectively, * represents the convolution operation, and log represents the logarithm operation;
A12: performing a convolution operation on the relational expression of the night image with a Gaussian kernel to obtain the enhanced reflection component of the target object;
A13: mapping the enhanced reflection component of the target object to the pixel value domain [0, 255] to obtain the enhanced image;
in step A12, the method for performing the convolution operation on the relational expression of the night image with the Gaussian kernel is: selecting a sigma value, calculating the corresponding weight matrix, and performing the convolution operation with the weight matrix as the Gaussian kernel F, so that L(x, y) = F(x, y) * S(x, y), with the calculation formula:
F(x, y) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))
wherein F represents the Gaussian kernel matrix, σ represents the selected sigma value, and F(x, y) represents the value at coordinate (x, y) of the Gaussian kernel matrix F;
in step S1, if the acquired night image is a multi-channel image, the enhancement processing comprises the following sub-steps:
B11: according to the Retinex algorithm, decomposing the night image and rearranging terms to obtain the relational expression of the night image, which for each channel is:
log R(x, y) = log S(x, y) − log L(x, y)
wherein S represents the matrix of the acquired multi-channel night image, L represents the ambient light component matrix of the acquired multi-channel night image, R represents the target object reflection component matrix of the multi-channel image, and S(x, y), L(x, y) and R(x, y) represent the pixel values at coordinate (x, y) of the matrices S, L and R respectively;
B12: according to the relational expression of the night image, performing Gaussian filtering and weighted addition on the different channels to obtain the enhanced image;
in step B12, the calculation formula for Gaussian filtering and weighted addition on the different channels is:
R(x, y) = Σ_k w_k · [log S_k(x, y) − log L_k(x, y)]
wherein R represents the target object reflection component matrix of the multi-channel image, R(x, y) represents the pixel value at coordinate (x, y) of R, w_k represents the weight corresponding to the k-th channel, S_k(x, y) represents the pixel value at coordinate (x, y) in the k-th channel layer of the acquired night image, and L_k(x, y) represents the pixel value at coordinate (x, y) in the k-th channel layer after Gaussian blurring.
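The single-channel relational expression of claim 2 can be sketched in NumPy as a single-scale Retinex pass (illustrative only: it assumes a naive 2-D convolution for the Gaussian blur and includes the final mapping to the pixel-value domain [0, 255]; the helper names are our own):

```python
import numpy as np

def gaussian_kernel(sigma):
    # Weight matrix F(x, y) proportional to exp(-(x^2 + y^2) / (2 sigma^2)),
    # normalised to sum to 1.
    r = max(1, int(3 * sigma))
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return g / g.sum()

def single_scale_retinex(img, sigma=80.0):
    # log R(x, y) = log S(x, y) - log L(x, y), with L = F * S (Gaussian blur).
    s = img.astype(np.float64) + 1.0          # +1 avoids log(0)
    kernel = gaussian_kernel(sigma)
    r = kernel.shape[0] // 2
    padded = np.pad(s, r, mode="edge")
    illum = np.empty_like(s)
    for i in range(s.shape[0]):               # naive 2-D convolution
        for j in range(s.shape[1]):
            illum[i, j] = (padded[i:i + 2 * r + 1, j:j + 2 * r + 1] * kernel).sum()
    log_r = np.log(s) - np.log(illum)
    # Map the reflectance back to the pixel-value domain [0, 255].
    lo, hi = log_r.min(), log_r.max()
    return np.zeros_like(log_r) if hi == lo else (log_r - lo) / (hi - lo) * 255.0
```

The multi-channel case of the claim applies the same pass per channel (usually at several sigmas) and combines the results with the per-channel weights w_k.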
3. The night scene restoration method based on the improved image enhancement algorithm and the generation countermeasure network as claimed in claim 1, wherein in step S3 the unpaired image translation neural network of the night image region is trained with the mini-batch gradient descent method, and the training parameter settings comprise: setting the batch size; cropping images to 256 × 256 during training; setting an initial learning rate; adjusting the learning rate with the ADAM optimizer; setting the optimizer parameters; and setting the number of training epochs, where the first half of the training epochs uses the initial learning rate and the learning rate in the second half decreases linearly by the same amount each epoch.
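The schedule in claim 3 — a constant initial learning rate for the first half of training, then a linear decay — can be written as a small function (the 200-epoch total and 2e-4 initial rate are assumed illustrative defaults, not values stated in the claim):

```python
def learning_rate(epoch, total_epochs=200, initial_lr=2e-4):
    # Constant for the first half of training, then linear decay to zero
    # over the second half (claim 3's schedule; the defaults are assumed).
    half = total_epochs // 2
    if epoch < half:
        return initial_lr
    return initial_lr * (total_epochs - epoch) / (total_epochs - half)
```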
4. The night scene restoration method based on the improved image enhancement algorithm and the generation countermeasure network as claimed in claim 1, wherein in step S3 the method of performing style migration with the enhanced night image as the night-domain input of the training set comprises: performing unpaired image translation with cycle consistency; and completing the style migration with the least squares antagonism loss and the self identity mapping loss in turn.
5. The night scene restoration method based on the improved image enhancement algorithm and the generation countermeasure network as claimed in claim 4, wherein in step S3 the function expression for performing unpaired image translation with cycle consistency is:
L_cyc(G, F) = E_x[ ‖F(G(x)) − x‖₁ ] + E_y[ ‖G(F(y)) − y‖₁ ]
wherein L_cyc(G, F) represents the sum of the cycle consistency losses produced by the two generators of the generation countermeasure network, x represents a real sample from the night image domain, y represents a real sample from the day image domain, G represents the generator responsible for the night-to-day mapping, F represents the generator responsible for the day-to-night mapping, G(x) represents the fake daytime picture generated by putting a real night photo into generator G, F(y) represents the fake night picture generated by putting a real day photo into generator F, and ‖·‖₁ represents the Manhattan distance operation between images;
the functional expression of the least squares antagonism loss is:
L_LSGAN(G, D_Y, X, Y) = E_y[ (D_Y(y) − 1)² ] + E_x[ D_Y(G(x))² ]
wherein L_LSGAN(G, D_Y, X, Y) represents the least squares antagonism loss produced in the night-to-day mapping, X represents the night image domain, Y represents the day image domain, D_Y represents the discriminator distinguishing whether a day-domain image is real or fake, G(x) represents the fake daytime photo generated by putting a real night photo into generator G, D_Y(y) represents the prediction between [0, 1] output when a real day sample is input into discriminator D_Y, and D_Y(G(x)) represents the prediction between [0, 1] output when the fake daytime photo from generator G is input into discriminator D_Y;
the functional expression of the self identity mapping loss is:
L_identity(G, F) = E_y[ ‖G(y) − y‖₁ ] + E_x[ ‖F(x) − x‖₁ ]
wherein L_identity(G, F) represents the sum of the self identity mapping losses produced by the two generators of the generation countermeasure network, G(y) represents the picture generated by putting a real day photo into generator G, and F(x) represents the picture generated by putting a real night photo into generator F.
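For concreteness, the three losses of claim 5 can be written as small NumPy functions (a toy sketch: G, F and D_Y are stand-in callables on arrays, not trained networks, and the function names are our own):

```python
import numpy as np

def l1(a, b):
    # Manhattan (L1) distance between two images, averaged per pixel.
    return float(np.abs(a - b).mean())

def cycle_loss(G, F, x, y):
    # L_cyc = ||F(G(x)) - x||_1 + ||G(F(y)) - y||_1
    return l1(F(G(x)), x) + l1(G(F(y)), y)

def lsgan_loss(D_Y, G, x, y):
    # Least-squares antagonism loss for the night-to-day mapping:
    # E[(D_Y(y) - 1)^2] + E[D_Y(G(x))^2]
    return float(np.mean((D_Y(y) - 1.0)**2) + np.mean(D_Y(G(x))**2))

def identity_loss(G, F, x, y):
    # L_identity = ||G(y) - y||_1 + ||F(x) - x||_1
    return l1(G(y), y) + l1(F(x), x)
```

With perfect identity generators both the cycle and identity terms vanish, which is exactly the behaviour the training objective rewards.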
6. The night scene restoration method based on the improved image enhancement algorithm and the generation countermeasure network as claimed in claim 1, wherein in step S3 the unpaired image translation neural network comprises a local discriminator and a generator; the generator combines U-Net with a residual network and introduces an illumination intensity self-normalization map to perform a self-regularization operation on the night scene image to be restored and on each feature map produced by the generator network; the local discriminator uses a convolutional neural network to judge whether a randomly cropped 70 × 70 image block is real or fake.
7. The night scene restoration method based on the improved image enhancement algorithm and the generated confrontation network as claimed in claim 1, wherein said step S4 comprises the following sub-steps:
s41: carrying out dark channel prior defogging treatment on the nighttime image after the enhancement treatment;
s42: and performing definition processing on the image subjected to dark channel prior defogging processing to finish night scene restoration.
8. The night scene restoration method based on the improved image enhancement algorithm and the generation countermeasure network as claimed in claim 7, wherein said step S41 comprises the following sub-steps:
S411: determining the dark channel J_dark of the night image from the enhanced night image, the calculation formula being:
J_dark(x) = min_{y ∈ Ω(x)} ( min_{c ∈ {r, g, b}} J_c(y) )
wherein c represents a channel of the color image, r represents the color red, g represents the color green, b represents the color blue, {r, g, b} represents the set of the three primary colors, J_r represents the red channel of the color image, J_g represents the green channel of the color image, J_b represents the blue channel of the color image, Ω(x) represents a window centered on pixel x, and y represents any pixel belonging to the window Ω(x);
in step S412, the fog pattern formation model equation is:
I(x) = J(x) t(x) + A (1 − t(x))
wherein I represents the image to be defogged, J represents the fog-free image, A represents the atmospheric light component, and t represents the transmittance;
based on the dark channel prior theory and the fog pattern formation model equation, the fog-free image is obtained with the calculation formula:
J(x) = (I(x) − A) / max(t(x), t₀) + A
wherein t₀ represents a lower bound on the transmittance.
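Claim 8's dark channel and haze-model inversion can be sketched in NumPy as follows (illustrative only; the patch size, the lower bound t0 on the transmittance, and the function names are our assumptions):

```python
import numpy as np

def dark_channel(img, patch=15):
    # J_dark(x) = min over the window Omega(x) of the min over channels {r, g, b}.
    per_pixel_min = img.min(axis=2)           # channel-wise minimum first
    r = patch // 2
    padded = np.pad(per_pixel_min, r, mode="edge")
    h, w = per_pixel_min.shape
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):                    # then the window minimum
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out

def dehaze(img, A, t, t0=0.1):
    # Invert the haze model I = J t + A (1 - t):  J = (I - A) / max(t, t0) + A.
    # t0 keeps the division stable where the transmittance is near zero.
    t = np.maximum(t, t0)[..., None]
    return (img - A) / t + A
```

In the full algorithm the transmittance t is itself estimated from the dark channel and A from the brightest dark-channel pixels; here both are passed in directly to keep the sketch short.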
9. The night scene restoration method based on the improved image enhancement algorithm and the generation countermeasure network as claimed in claim 7, wherein in step S42 the method for performing definition processing on the image is: carrying out sharpening and filtering in sequence;
the sharpening method is: carrying out a convolution operation on the image after dark channel defogging with a sharpening kernel;
the filtering method is: carrying out bilateral filtering on the sharpened image with the OpenCV library function bilateralFilter(src, d, sigmaColor, sigmaSpace), wherein src represents the input sharpened image, d represents the size of the filtering template, sigmaColor represents the sigma parameter of the color space, and sigmaSpace represents the sigma parameter of the coordinate space.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110278464.9A CN112669242A (en) | 2021-03-16 | 2021-03-16 | Night scene restoration method based on improved image enhancement algorithm and generation countermeasure network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112669242A true CN112669242A (en) | 2021-04-16 |
Family
ID=75399481
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110278464.9A Pending CN112669242A (en) | 2021-03-16 | 2021-03-16 | Night scene restoration method based on improved image enhancement algorithm and generation countermeasure network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112669242A (en) |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106548463A (en) * | 2016-10-28 | 2017-03-29 | 大连理工大学 | Based on dark and the sea fog image automatic defogging method and system of Retinex |
CN108038821A (en) * | 2017-11-20 | 2018-05-15 | 河海大学 | A kind of image Style Transfer method based on production confrontation network |
CN108389170A (en) * | 2018-03-07 | 2018-08-10 | 鞍钢集团矿业有限公司 | The image enhancement and denoising method and device of more wide angle cameras overlapping regions |
CN108734189A (en) * | 2017-04-20 | 2018-11-02 | 天津工业大学 | Vehicle License Plate Recognition System based on atmospherical scattering model and deep learning under thick fog weather |
CN109166144A (en) * | 2018-07-20 | 2019-01-08 | 中国海洋大学 | A kind of image depth estimation method based on generation confrontation network |
CN109300090A (en) * | 2018-08-28 | 2019-02-01 | 哈尔滨工业大学(威海) | A kind of single image to the fog method generating network based on sub-pix and condition confrontation |
CN109522847A (en) * | 2018-11-20 | 2019-03-26 | 中车株洲电力机车有限公司 | A kind of track and road barricade object detecting method based on depth map |
CN109671018A (en) * | 2018-12-12 | 2019-04-23 | 华东交通大学 | A kind of image conversion method and system based on production confrontation network and ResNets technology |
US10373023B1 (en) * | 2019-01-28 | 2019-08-06 | StradVision, Inc. | Learning method and learning device for runtime input transformation of real image on real world into virtual image on virtual world, to be used for object detection on real images, by using cycle GAN capable of being applied to domain adaptation |
CN110110576A (en) * | 2019-01-03 | 2019-08-09 | 北京航空航天大学 | A kind of traffic scene thermal infrared semanteme generation method based on twin semantic network |
CN111161178A (en) * | 2019-12-25 | 2020-05-15 | 湖南大学 | Single low-light image enhancement method based on generation type countermeasure network |
CN111723840A (en) * | 2020-05-08 | 2020-09-29 | 天津大学 | Clustering and style migration method for ultrasonic images |
CN112232180A (en) * | 2020-10-14 | 2021-01-15 | 上海海洋大学 | Night underwater fish target detection method |
CN112287779A (en) * | 2020-10-19 | 2021-01-29 | 华南农业大学 | Low-illuminance image natural illuminance reinforcing method and application |
Non-Patent Citations (5)
Title |
---|
JUN-YAN ZHU et al.: "Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks", arXiv:1703.10593v7 *
WEIDONG ZHANG et al.: "Single Image Defogging Based on Multi-Channel Convolutional MSRCR", Digital Object Identifier *
KANG JIAN: "Research on Defogging Methods for Single Foggy Images", China Master's Theses Full-text Database, Information Science and Technology *
OUYANG WENQI et al.: "Mask-2-Human: A Person Image Generation Method Based on Generative Adversarial Networks", China Sciencepaper *
WANG YE: "Research on Single-Image Defogging Based on Cycle-Adversarial Networks", China Master's Theses Full-text Database, Information Science and Technology *
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113191971A (en) * | 2021-04-26 | 2021-07-30 | 贵州电网有限责任公司 | Unmanned aerial vehicle image defogging method based on YUV color space |
CN113205574A (en) * | 2021-04-30 | 2021-08-03 | 武汉大学 | Art character style migration system based on attention system |
CN113269701A (en) * | 2021-05-08 | 2021-08-17 | 大连海事大学 | Low-illumination image enhancement method based on attention guide and kernel selection mechanism |
CN113269701B (en) * | 2021-05-08 | 2024-04-26 | 大连海事大学 | Low-illumination image enhancement method based on attention guidance and kernel selection mechanism |
CN113422928B (en) * | 2021-05-28 | 2022-02-18 | 佛山市诚智鑫信息科技有限公司 | Safety monitoring snapshot method and system |
CN113422928A (en) * | 2021-05-28 | 2021-09-21 | 佛山市诚智鑫信息科技有限公司 | Safety monitoring snapshot method and system |
CN113487493B (en) * | 2021-06-02 | 2023-08-18 | 厦门大学 | GANilla-based SAR image automatic colorization method |
CN113487493A (en) * | 2021-06-02 | 2021-10-08 | 厦门大学 | SAR image automatic colorization method based on GANILA |
CN113409225A (en) * | 2021-07-13 | 2021-09-17 | 北京科技大学 | Retinex-based unmanned aerial vehicle shooting image enhancement algorithm |
CN113409225B (en) * | 2021-07-13 | 2023-12-12 | 北京科技大学 | Retinex-based unmanned aerial vehicle shooting image enhancement algorithm |
CN113610736A (en) * | 2021-07-16 | 2021-11-05 | 华东师范大学 | Night image enhancement method and system based on cyclic generation of residual error network and QTP loss item |
CN113256541A (en) * | 2021-07-16 | 2021-08-13 | 四川泓宝润业工程技术有限公司 | Method for removing water mist from drilling platform monitoring picture by machine learning |
CN113610736B (en) * | 2021-07-16 | 2023-09-19 | 华东师范大学 | Night image enhancement method and system based on cyclic generation of countermeasure residual error network and QTP loss item |
CN113724162A (en) * | 2021-08-31 | 2021-11-30 | 南京邮电大学 | Zero-complementary-light real-time full-color night vision imaging method and system |
CN113724162B (en) * | 2021-08-31 | 2023-09-29 | 南京邮电大学 | Zero-light-supplementing real-time full-color night vision imaging method and system |
CN114219725A (en) * | 2021-11-25 | 2022-03-22 | 中国科学院深圳先进技术研究院 | Image processing method, terminal equipment and computer readable storage medium |
CN114511488A (en) * | 2022-02-19 | 2022-05-17 | 西北工业大学 | Daytime style visualization method for night scene |
CN114511488B (en) * | 2022-02-19 | 2024-02-27 | 西北工业大学 | Daytime style visualization method for night scene |
CN115588051A (en) * | 2022-09-29 | 2023-01-10 | 中国矿业大学(北京) | Automatic calibration method for space positions of laser radar and camera in ore processing link |
CN116579945B (en) * | 2023-05-12 | 2024-02-27 | 西南交通大学 | Night image restoration method based on diffusion model |
CN116579945A (en) * | 2023-05-12 | 2023-08-11 | 西南交通大学 | Night image restoration method based on diffusion model |
CN117237859B (en) * | 2023-11-14 | 2024-02-13 | 南京信息工程大学 | Night expressway foggy day visibility detection method based on low illumination enhancement |
CN117237859A (en) * | 2023-11-14 | 2023-12-15 | 南京信息工程大学 | Night expressway foggy day visibility detection method based on low illumination enhancement |
CN117745563A (en) * | 2024-02-21 | 2024-03-22 | 深圳市格瑞邦科技有限公司 | Dual-camera combined tablet personal computer enhanced display method |
CN117745563B (en) * | 2024-02-21 | 2024-05-14 | 深圳市格瑞邦科技有限公司 | Dual-camera combined tablet personal computer enhanced display method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112669242A (en) | Night scene restoration method based on improved image enhancement algorithm and generation countermeasure network | |
KR102134405B1 (en) | System and Method for Improving Low Light Level Image Using Generative Adversarial Network | |
Lv et al. | Real-time dehazing for image and video | |
CN110163818B (en) | Low-illumination video image enhancement method for maritime unmanned aerial vehicle | |
CN112288658A (en) | Underwater image enhancement method based on multi-residual joint learning | |
CN114119378A (en) | Image fusion method, and training method and device of image fusion model | |
Yuan et al. | A region-wised medium transmission based image dehazing method | |
Wang et al. | MAGAN: Unsupervised low-light image enhancement guided by mixed-attention | |
CN109993804A (en) | A kind of road scene defogging method generating confrontation network based on condition | |
CN110827218B (en) | Airborne image defogging method based on weighted correction of HSV (hue, saturation, value) transmissivity of image | |
Lv et al. | Low-light image enhancement via deep Retinex decomposition and bilateral learning | |
CN114627034A (en) | Image enhancement method, training method of image enhancement model and related equipment | |
Li et al. | Fast region-adaptive defogging and enhancement for outdoor images containing sky | |
Li et al. | Global and adaptive contrast enhancement for low illumination gray images | |
Tang et al. | A local flatness based variational approach to retinex | |
Zheng et al. | Low-light image and video enhancement: A comprehensive survey and beyond | |
CN114627269A (en) | Virtual reality security protection monitoring platform based on degree of depth learning target detection | |
Wang et al. | Nighttime image dehazing using color cast removal and dual path multi-scale fusion strategy | |
Chen et al. | Single-image hdr reconstruction with task-specific network based on channel adaptive RDN | |
Wei et al. | Sidgan: Single image dehazing without paired supervision | |
CN117422631A (en) | Infrared image enhancement method based on adaptive filtering layering | |
Kim et al. | Detail restoration and tone mapping networks for x-ray security inspection | |
Wang et al. | Rapid nighttime haze removal with color-gray layer decomposition | |
Zhang et al. | Nighttime haze removal with illumination correction | |
CN107301625A (en) | Image defogging algorithm based on brightness UNE |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | | Application publication date: 20210416 |